Publications from Computer Vision Laboratory
Journal papers
Deep learning has shown remarkable success in remote sensing change detection (CD), aiming to identify semantic change regions between co-registered satellite image pairs acquired at distinct time stamps. However, existing convolutional neural network (CNN) and transformer-based frameworks often struggle to accurately segment semantic change regions. Moreover, transformer-based methods with standard self-attention suffer from quadratic computational complexity with respect to the image resolution, making them less practical for CD tasks with limited training data. To address these issues, we propose an efficient CD framework, the efficient local-global context aggregation network (ELGC-Net), which leverages rich contextual information to precisely estimate change regions while reducing the model size. Our ELGC-Net comprises a Siamese encoder, fusion modules, and a decoder. The focus of our design is the introduction of an efficient local-global context aggregator (ELGCA) module within the encoder, capturing enhanced global context and local spatial information through a novel pooled-transpose (PT) attention and depthwise convolution, respectively. The PT attention employs pooling operations for robust feature extraction and minimizes computational cost with transposed attention. Extensive experiments on three challenging CD datasets demonstrate that ELGC-Net outperforms existing methods. Compared to the recent transformer-based CD approach (ChangeFormer), ELGC-Net achieves a 1.4% gain in the intersection over union (IoU) metric on the LEVIR-CD dataset, while significantly reducing trainable parameters. Our proposed ELGC-Net sets a new state-of-the-art (SOTA) performance on remote sensing CD benchmarks. Finally, we also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings, while achieving comparable performance.
@article{diva2:1851984,
author = {Noman, Mubashir and Fiaz, Mustansar and Cholakkal, Hisham and Khan, Salman and Khan, Fahad},
title = {{ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection}},
journal = {IEEE Transactions on Geoscience and Remote Sensing},
year = {2024},
volume = {62},
}
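As an editorial illustration of the pooled channel-wise ("transposed") attention idea described in the ELGC-Net abstract above, the following PyTorch sketch pools the spatial dimensions before computing channel-to-channel attention, so the attention cost grows linearly with image resolution. The module name, pooling factor, and layer shapes are illustrative assumptions, not the authors' exact design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PooledTransposedAttention(nn.Module):
    def __init__(self, dim, pool=4):
        super().__init__()
        self.pool = nn.AvgPool2d(pool)            # spatial pooling for cheap, robust queries/keys
        self.to_qk = nn.Conv2d(dim, 2 * dim, 1)   # queries and keys from the pooled features
        self.to_v = nn.Conv2d(dim, dim, 1)        # values keep the full resolution
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k = self.to_qk(self.pool(x)).chunk(2, dim=1)
        q = F.normalize(q.flatten(2), dim=-1)               # (b, c, n_pooled)
        k = F.normalize(k.flatten(2), dim=-1)
        attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)    # (b, c, c): channel-to-channel attention
        out = attn @ self.to_v(x).flatten(2)                 # (b, c, h*w): linear in spatial size
        return self.proj(out.view(b, c, h, w)) + x           # residual connection

A tensor of shape (1, 64, 32, 32) passed through PooledTransposedAttention(64) keeps its shape while mixing information across channels at a cost independent of the full spatial token count.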
During the operation of industrial robots, unusual events may endanger the safety of humans and the quality of production. When collecting data to detect such cases, it is not ensured that data from all potentially occurring errors is included, as unforeseeable events may happen over time. Therefore, anomaly detection (AD) delivers a practical solution, using only normal data to learn to detect unusual events. We introduce a dataset that allows training and benchmarking of anomaly detection methods for robotic applications based on machine data, which will be made publicly available to the research community. As a typical robot task, the dataset includes a pick-and-place application which involves movement, actions of the end effector, and interactions with the objects of the environment. Since several of the contained anomalies are not task-specific but general, evaluations on our dataset are transferable to other robotics applications as well. In addition, we present multivariate time-series flow (MVT-Flow) as a new baseline method for anomaly detection: it relies on deep-learning-based density estimation with normalizing flows, tailored to the data domain by taking its structure into account in the architecture. Our evaluation shows that MVT-Flow outperforms baselines from previous work by a large margin of 6.2% in area under the receiver operating characteristic curve.
@article{diva2:1845839,
author = {Brockmann, Jan Thies and Rudolph, Marco and Rosenhahn, Bodo and Wandt, Bastian},
title = {{The voraus-AD Dataset for Anomaly Detection in Robot Applications}},
journal = {IEEE Transactions on Robotics},
year = {2024},
volume = {40},
pages = {438--451},
}
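For readers unfamiliar with flow-based anomaly scoring, the sketch below shows the generic recipe the MVT-Flow baseline above builds on: a normalizing flow maps an input window to a latent code, and the negative log-likelihood under the flow serves as the anomaly score. The single affine coupling layer and the MLP size are illustrative assumptions; the actual method uses an architecture tailored to the multivariate time-series structure of the robot data.

import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        # small network predicting scale and shift from the first half of the input (dim assumed even)
        self.net = nn.Sequential(nn.Linear(dim // 2, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                           # keep scales bounded for numerical stability
        z = torch.cat([x1, x2 * torch.exp(s) + t], dim=-1)
        log_det = s.sum(dim=-1)                     # log|det Jacobian| of the coupling transform
        return z, log_det

def anomaly_score(flow, x):
    """Negative log-likelihood under the flow; higher values mean more anomalous."""
    z, log_det = flow(x)
    log_pz = (-0.5 * z ** 2 - 0.5 * torch.log(torch.tensor(2 * torch.pi))).sum(dim=-1)
    return -(log_pz + log_det)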
This paper proposes a scribble-based weakly supervised RGB-D salient object detection (SOD) method to relieve the annotation burden from pixel-wise annotations. In view of the ensuing performance drop, we summarize two natural deficiencies of the scribbles and try to alleviate them, namely the weak richness of the pixel training samples (WRPS) and the poor structural integrity of the salient objects (PSIO). WRPS hinders robust saliency perception learning, which can be alleviated via model design for robust feature learning and pseudo-label generation for training sample enrichment. Specifically, we first design a dynamic searching process module as a meta operation to conduct multi-scale and multi-modal feature fusion for the robust RGB-D SOD model construction. Then, a dual-branch consistency learning mechanism is proposed to generate enough pixel training samples for robust saliency perception learning. PSIO makes direct structural learning infeasible since scribbles cannot provide integral structural supervision. Thus, we propose an edge-region structure-refinement loss to recover the structural information and make precise segmentation. We deploy all components and conduct ablation studies on two baselines to validate their effectiveness and generalizability. Experimental results on eight datasets show that our method outperforms other scribble-based SOD models and achieves comparable performance with fully supervised state-of-the-art methods.
@article{diva2:1829437,
author = {Li, Long and Han, Junwei and Liu, Nian and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad},
title = {{Robust Perception and Precise Segmentation for Scribble-Supervised RGB-D Saliency Detection}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2024},
volume = {46},
number = {1},
pages = {479--496},
}
Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and using the magnitude of the reconstruction error as an indicator of the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that incorporates the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on the Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks: MVTec AD, BRATS, Avenue, ShanghaiTech, and Thermal Rare Event.
@article{diva2:1825529,
author = {Madan, Neelu and Ristea, Nicolae-Catalin and Ionescu, Radu Tudor and Nasrollahi, Kamal and Khan, Fahad and Moeslund, Thomas B. and Shah, Mubarak},
title = {{Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2024},
volume = {46},
number = {1},
pages = {525--542},
}
We propose a novel self-supervised Video Object Segmentation (VOS) approach that strives to achieve better visual correspondence across frames and object-background discriminability for accurate object segmentation. Distinct from previous self-supervised VOS methods, our approach is based on a learning loss formulation that takes into account both object and background information to ensure object-background discriminability, rather than using only object appearance. The objective function comprises cutout-based reconstruction (a cutout region is a part of a frame whose pixels are replaced with constant values) and tag prediction loss terms. The cutout-based reconstruction term utilizes a simple cutout scheme to learn the pixel-wise correspondence between the current and previous frames in order to reconstruct the original current frame with the added cutout region in it. The introduced cutout patch guides the model to focus on the reappearance of scene parts, thereby implicitly equipping the model to address occlusion-based scenarios. Next, the tag prediction term encourages object-background separability by grouping tags of all pixels in the cutout region that are similar, while separating them from the tags of the rest of the reconstructed pixels. Additionally, we introduce a zoom-in scheme that addresses the problem of small object segmentation by capturing fine structural information at multiple scales. Our proposed approach, termed CT-VOS, achieves state-of-the-art results on two challenging benchmarks: DAVIS-2017 and Youtube-VOS. A detailed ablation showcases the importance of the proposed loss formulation to effectively establish correspondences and object-background discriminability, and the impact of our zoom-in scheme to accurately segment small-sized objects.
@article{diva2:1816606,
author = {Kini, Jyoti and Khan, Fahad and Khan, Salman and Shah, Mubarak},
title = {{CT-VOS: Cutout prediction and tagging for self-supervised video object segmentation}},
journal = {Computer Vision and Image Understanding},
year = {2024},
volume = {238},
}
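The cutout operation referenced in the CT-VOS abstract above is simple enough to sketch directly; the patch size and fill value below are illustrative assumptions.

import torch

def apply_cutout(frame, size=64, fill=0.0):
    """frame: (C, H, W) tensor; returns a copy with one random square region set to a constant."""
    _, h, w = frame.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    out = frame.clone()
    out[:, top:top + size, left:left + size] = fill
    return out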
For most applications, 2D keypoint detection works well and offers a simple and fast tool to analyse human movements. However, there remain many situations where even the best state-of-the-art algorithms reach their limits and fail to detect human keypoints correctly. Such situations may occur especially when individual body parts are occluded, twisted, or when the whole person is flipped. Especially when analysing injuries in alpine ski racing, such twisted and rotated body positions occur frequently. To improve the detection of keypoints for this application, we developed a novel method that refines keypoint estimates by rotating the input videos. We select the best rotation for every frame with a graph-based global solver. Thereby, we improve keypoint detection of an arbitrary pose estimation algorithm, in particular for 'hard' keypoints. In the current proof-of-concept study, we show that our approach outperforms standard keypoint detection in all categories and metrics, by a large margin in injury-related out-of-balance and fall situations, and that it surpasses previous methods in both performance and robustness. The Injury Ski II dataset was made publicly available, aiming to facilitate the investigation of sports accidents based on computer vision in the future.
@article{diva2:1852743,
author = {Zwölfer, Michael and Heinrich, Dieter and Wandt, Bastian and Rhodin, Helge and Spörri, Jörg and Nachbauer, Werner},
title = {{A graph-based approach can improve keypoint detection of complex poses: a proof-of-concept on injury occurrences in alpine ski racing}},
journal = {Scientific Reports},
year = {2023},
volume = {13},
number = {1},
}
Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. In the generalized variant, the test samples can additionally contain seen categories. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of the zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute level, feature level and cross level (across attribute and feature levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods.
@article{diva2:1840200,
author = {Gupta, Akshita and Narayan, Sanath and Khan, Salman and Khan, Fahad and Shao, Ling and van de Weijer, Joost},
title = {{Generative Multi-Label Zero-Shot Learning}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
number = {12},
pages = {14611--14624},
}
Multiobject tracking (MOT) is the problem of tracking the state of an unknown and time-varying number of objects using noisy measurements, with important applications such as autonomous driving, tracking animal behavior, and defense systems. The MOT task can be divided into two settings, model-based or model-free, depending on whether accurate and tractable models of the environment are available. Model-based MOT has Bayes-optimal closed-form solutions, which can achieve state-of-the-art (SOTA) performance. However, these methods require approximations in challenging scenarios to remain tractable, which impairs their performance. Deep learning (DL) methods offer a promising alternative, but existing DL models are almost exclusively designed for a model-free setting and are not easily translated to the model-based setting. This article proposes a DL-based tracker specifically tailored to the model-based MOT setting and provides a thorough comparison to SOTA alternatives. We show that our DL-based tracker matches the performance of the benchmarks in simple tracking tasks while outperforming the alternatives as the tasks become more challenging. These findings provide strong evidence of the applicability of DL to the model-based setting as well, which we hope will foster further research in this direction.
@article{diva2:1835633,
author = {Pinto, Juliano and Hess, Georg and Ljungbergh, William and Xia, Yuxuan and Wymeersch, Henk and Svensson, Lennart},
title = {{Deep Learning for Model-Based Multiobject Tracking}},
journal = {IEEE Transactions on Aerospace and Electronic Systems},
year = {2023},
volume = {59},
number = {6},
pages = {7363--7379},
}
Motion prediction systems play a crucial role in enabling autonomous vehicles to navigate safely and efficiently in complex traffic scenarios. Graph Neural Network (GNN)-based approaches have emerged as a promising solution for capturing interactions among dynamic agents and static objects. However, they often lack transparency, interpretability and explainability, qualities that are essential for building trust in autonomous driving systems. In this work, we address this challenge by presenting a comprehensive approach to enhance the explainability of graph-based motion prediction systems. We introduce the Explainable Heterogeneous Graph-based Policy (XHGP) model based on a heterogeneous graph representation of the traffic scene and lane-graph traversals. Distinct from other graph-based models, XHGP leverages object-level and type-level attention mechanisms to learn interaction behaviors, providing information about the importance of agents and interactions in the scene. In addition, capitalizing on XHGP's architecture, we investigate the explanations provided by the GNNExplainer and apply counterfactual reasoning to analyze the sensitivity of the model to modifications of the input data. This includes masking scene elements, altering trajectories, and adding or removing dynamic agents. Our proposal advances towards achieving reliable and explainable motion prediction systems, addressing the concerns of users, developers and regulatory agencies alike. The insights gained from our explainability analysis contribute to a better understanding of the relationships between dynamic and static elements in traffic scenarios, facilitating the interpretation of the results as well as the correction of possible errors in motion prediction models, and thus contributing to the development of trustworthy motion prediction systems. The code to reproduce this work is publicly available at https://github.com/sancarlim/Explainable-MP/tree/v1.1.
@article{diva2:1821017,
author = {Limeros, Sandra Carrasco and Majchrowska, Sylwia and Johnander Fax\'{e}n, Joakim and Petersson, Christoffer and Llorca, David Fernandez},
title = {{Towards explainable motion prediction using heterogeneous graph representations}},
journal = {Transportation Research Part C},
year = {2023},
volume = {157},
}
Transformer models have achieved outstanding results on a variety of language tasks, such as text classification, machine translation, and question answering. This success in the field of Natural Language Processing (NLP) has sparked interest in the computer vision community to apply these models to vision and multi-modal learning tasks. However, visual data has a unique structure, which requires rethinking network designs and training methods. As a result, Transformer models and their variations have been successfully used for image recognition, object detection, segmentation, image super-resolution, video understanding, image generation, text-image synthesis, and visual question answering, among other applications.
@article{diva2:1813849,
author = {Khan, Salman and Khan, Fahad and Vaswani, Ashish and Parmar, Niki and Yang, Ming-Hsuan and Shah, Mubarak},
title = {{Guest Editorial Introduction to the Special Section on Transformer Models in Vision}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
number = {11},
pages = {12721--12725},
}
The synthesis of high-resolution remote sensing images based on text descriptions has great potential in many practical application scenarios. Although deep neural networks have achieved great success in many important remote sensing tasks, generating realistic remote sensing images from text descriptions is still very difficult. To address this challenge, we propose a novel text-to-image modern Hopfield network (Txt2Img-MHN). The main idea of Txt2Img-MHN is to conduct hierarchical prototype learning on both text and image embeddings with modern Hopfield layers. Instead of directly learning concrete but highly diverse text-image joint feature representations for different semantics, Txt2Img-MHN aims to learn the most representative prototypes from text-image embeddings, achieving a coarse-to-fine learning strategy. These learned prototypes can then be utilized to represent more complex semantics in the text-to-image generation task. To better evaluate the realism and semantic consistency of the generated images, we further conduct zero-shot classification on real remote sensing data using the classification model trained on synthesized images. Despite its simplicity, we find that the overall accuracy in the zero-shot classification may serve as a good metric to evaluate the ability to generate an image from text. Extensive experiments on the benchmark remote sensing text-image dataset demonstrate that the proposed Txt2Img-MHN can generate more realistic remote sensing images than existing methods. Code and pre-trained models are available online (https://github.com/YonghaoXu/Txt2Img-MHN).
@article{diva2:1807439,
author = {Xu, Yonghao and Yu, Weikang and Ghamisi, Pedram and Kopp, Michael and Hochreiter, Sepp},
title = {{Txt2Img-MHN: Remote Sensing Image Generation From Text Using Modern Hopfield Networks}},
journal = {IEEE Transactions on Image Processing},
year = {2023},
volume = {32},
pages = {5737--5750},
}
Given the often enormous effort required to train GANs, both computationally as well as in dataset collection, the re-use of pretrained GANs greatly increases the potential impact of generative models. Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or from multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the subsequent finetuning and avoids pathologies of other methods, such as mode collapse and lack of flexibility. Furthermore, to prevent overfitting on small target domains, we introduce sparse subnetwork selection, which restricts the set of trainable neurons to those that are relevant for the target dataset. We perform comprehensive experiments on several challenging datasets using various GAN architectures (BigGAN, Progressive GAN, and StyleGAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs.
@article{diva2:1800106,
author = {Wang, Yaxing and Gonzalez-Garcia, Abel and Wu, Chenshen and Herranz, Luis and Khan, Fahad and Jui, Shangling and Yang, Jian and van de Weijer, Joost},
title = {{MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains}},
journal = {International Journal of Computer Vision},
year = {2023},
volume = {132},
number = {2},
pages = {490--514},
}
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers that can capture global context, compared to CNNs with local receptive fields. Inspired by this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges, open problems, and promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
@article{diva2:1791074,
author = {Shamshad, Fahad and Khan, Salman and Zamir, Syed Waqas and Khan, Muhammad Haris and Hayat, Munawar and Khan, Fahad and Fu, Huazhu},
title = {{Transformers in medical imaging: A survey}},
journal = {Medical Image Analysis},
year = {2023},
volume = {88},
}
Predicting the motion of other road agents enables autonomous vehicles to perform safe and efficient path planning. This task is very complex, as the behaviour of road agents depends on many factors and the number of possible future trajectories can be considerable (multi-modal). Most prior approaches proposed to address multi-modal motion prediction are based on complex machine learning systems that have limited interpretability. Moreover, the metrics used in current benchmarks do not evaluate all aspects of the problem, such as the diversity and admissibility of the output. The authors aim to advance towards the design of trustworthy motion prediction systems, based on some of the requirements for the design of Trustworthy Artificial Intelligence. The focus is on evaluation criteria, robustness, and interpretability of outputs. First, the evaluation metrics are comprehensively analysed, the main gaps of current benchmarks are identified, and a new holistic evaluation framework is proposed. Then, a method for the assessment of spatial and temporal robustness is introduced by simulating noise in the perception system. To enhance the interpretability of the outputs and generate more balanced results in the proposed evaluation framework, an intent prediction layer that can be attached to multi-modal motion prediction models is proposed. The effectiveness of this approach is assessed through a survey that explores different elements in the visualisation of the multi-modal trajectories and intentions. The proposed approach and findings make a significant contribution to the development of trustworthy motion prediction systems for autonomous vehicles, advancing the field towards greater safety and reliability.
@article{diva2:1770354,
author = {Limeros, Sandra Carrasco and Majchrowska, Sylwia and Johnander Fax\'{e}n, Joakim and Petersson, Christoffer and Sotelo, Miguel Angel and Llorca, David Fernandez},
title = {{Towards trustworthy multi-modal motion prediction: Holistic evaluation and interpretability of outputs}},
journal = {CAAI Transactions on Intelligence Technology},
year = {2023},
}
Deep Convolutional Neural Networks (CNNs) can easily be fooled by subtle, imperceptible changes to the input images. To address this vulnerability, adversarial training creates perturbation patterns and includes them in the training set to robustify the model. In contrast to existing adversarial training methods that only use class-boundary information (e.g., using a cross-entropy loss), we propose to exploit additional information from the feature space to craft stronger adversaries that are in turn used to learn a robust model. Specifically, we use the style and content information of the target sample from another class, alongside its class-boundary information, to create adversarial perturbations. We apply our proposed multi-task objective in a deeply supervised manner, extracting multi-scale feature knowledge to create maximally separating adversaries. Subsequently, we propose a max-margin adversarial training approach that minimizes the distance between the source image and its adversary and maximizes the distance between the adversary and the target image. Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses, generalizes well to naturally occurring corruptions and data distributional shifts, and retains the model's accuracy on clean examples.
@article{diva2:1758313,
author = {Naseer, Muzammal and Khan, Salman and Hayat, Munawar and Khan, Fahad and Porikli, Fatih},
title = {{Stylized Adversarial Defense}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
number = {5},
pages = {6403--6414},
}
Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems. It entails estimating the trajectory of the target in an image sequence, given only its initial location and segmentation, or its rough approximation in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as dominating tracking paradigms, which have led to significant progress. Following the rapid evolution of visual object tracking in the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results in nine tracking benchmarks. First, we present the background theory of both the DCF and Siamese tracking core formulations. Then, we distinguish and comprehensively review the shared as well as specific open research challenges in both these tracking paradigms. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We finish the survey by presenting recommendations and suggestions for distinguished open challenges based on our analysis.
@article{diva2:1758288,
author = {Javed, Sajid and Danelljan, Martin and Khan, Fahad and Khan, Muhammad Haris and Felsberg, Michael and Matas, Jiri},
title = {{Visual Object Tracking With Discriminative Filters and Siamese Networks: A Survey and Outlook}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
number = {5},
pages = {6552--6574},
}
We propose a novel approach to translate unpaired contrast computed tomography (CT) scans to non-contrast CT scans and vice versa. Solving this task has two important applications: (i) to automatically generate contrast CT scans for patients for whom injecting contrast substance is not an option, and (ii) to enhance the alignment between contrast and non-contrast CT by reducing the differences induced by the contrast substance before registration. Our approach is based on cycle-consistent generative adversarial convolutional transformers, for short, CyTran. Our neural model can be trained on unpaired images, due to the integration of a multi-level cycle-consistency loss. Aside from the standard cycle-consistency loss applied at the image level, we propose to apply additional cycle-consistency losses between intermediate feature representations, which enforces the model to be cycle-consistent at multiple representation levels, leading to superior results. To deal with high-resolution images, we design a hybrid architecture based on convolutional and multi-head attention layers. In addition, we introduce a novel data set, Coltea-Lung-CT-100W, containing 100 3D triphasic lung CT scans (with a total of 37,290 images) collected from 100 female patients (there is one examination per patient). Each scan contains three phases (non-contrast, early portal venous, and late arterial), allowing us to perform experiments to compare our novel approach with state-of-the-art methods for image style transfer. Our empirical results show that CyTran outperforms all competing methods. Moreover, we show that CyTran can be employed as a preliminary step to improve a state-of-the-art medical image alignment method. We release our novel model and data set as open source at https://github.com/ristea/cycletransformer. Our qualitative and subjective human evaluations reveal that CyTran is the only approach that does not introduce visual artifacts during the translation process. We believe this is a key advantage in our application domain, where medical images need to precisely represent the scanned body parts.
@article{diva2:1758145,
author = {Ristea, Nicolae-Catalin and Miron, Andreea-Iuliana and Savencu, Olivian and Georgescu, Mariana-Iuliana and Verga, Nicolae and Khan, Fahad and Ionescu, Radu Tudor},
title = {{CyTran: A cycle-consistent transformer with multi-level consistency for non-contrast to contrast CT translation}},
journal = {Neurocomputing},
year = {2023},
volume = {538},
}
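The multi-level cycle-consistency idea in the CyTran abstract above can be sketched as an image-level L1 cycle term plus additional L1 terms on intermediate feature maps. Which feature maps are compared, the generator interfaces and the weighting below are assumptions made for illustration only.

import torch.nn.functional as F

def multi_level_cycle_loss(g_ab, g_ba, x_a, feat_weight=0.1):
    """g_ab / g_ba translate A->B / B->A and return (output, list of intermediate feature maps)."""
    x_ab, feats_fwd = g_ab(x_a)          # translate to the other domain
    x_aba, feats_bwd = g_ba(x_ab)        # translate back
    loss = F.l1_loss(x_aba, x_a)         # standard image-level cycle consistency
    for f_fwd, f_bwd in zip(feats_fwd, feats_bwd):
        loss = loss + feat_weight * F.l1_loss(f_bwd, f_fwd)   # feature-level cycle consistency
    return loss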
A self-supervised multi-task learning (SSMTL) framework for video anomaly detection was recently introduced in the literature. Due to its highly accurate results, the method attracted the attention of many researchers. In this work, we revisit the self-supervised multi-task learning framework, proposing several updates to the original method. First, we study various detection methods, e.g., based on detecting high-motion regions using optical flow or background subtraction, since we believe the currently used pre-trained YOLOv3 is suboptimal, e.g., objects in motion or objects from unknown classes are never detected. Second, we modernize the 3D convolutional backbone by introducing multi-head self-attention modules, inspired by the recent success of vision transformers. As such, we alternatively introduce both 2D and 3D convolutional vision transformer (CvT) blocks. Third, in our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps through knowledge distillation, solving jigsaw puzzles, estimating body pose through knowledge distillation, predicting masked regions (inpainting), and adversarial learning with pseudo-anomalies. We conduct experiments to assess the performance impact of the introduced changes. Upon finding more promising configurations of the framework, dubbed SSMTL++v1 and SSMTL++v2, we extend our preliminary experiments to more data sets, demonstrating that our performance gains are consistent across all data sets. In most cases, our results on Avenue, ShanghaiTech and UBnormal raise the state-of-the-art performance bar to a new level.
@article{diva2:1746634,
author = {Barbalau, Antonio and Ionescu, Radu Tudor and Georgescu, Mariana-Iuliana and Dueholm, Jacob and Ramachandra, Bharathkumar and Nasrollahi, Kamal and Khan, Fahad and Moeslund, Thomas B. and Shah, Mubarak},
title = {{SSMTL++: Revisiting self-supervised multi-task learning for video anomaly detection}},
journal = {Computer Vision and Image Understanding},
year = {2023},
volume = {229},
}
Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach operates online at over 25 FPS and obtains 16.3 AP on the challenging OVIS benchmark, setting a new state-of-the-art. We further conduct detailed ablative experiments that validate the different aspects of our approach. Code is available at https://github.com/emibr948/RGNNVIS-PlusPlus.
@article{diva2:1714333,
author = {Brissman, Emil and Johnander, Joakim and Danelljan, Martin and Felsberg, Michael},
title = {{Recurrent Graph Neural Networks for Video Instance Segmentation}},
journal = {International Journal of Computer Vision},
year = {2023},
volume = {131},
pages = {471--495},
}
We propose a fast single-stage method for both image and video instance segmentation, called SipMask, that preserves the instance spatial information by performing multiple sub-region mask predictions. The main module in our method is a light-weight spatial preservation (SP) module that generates a separate set of spatial coefficients for the sub-regions within a bounding box, enabling a better delineation of spatially adjacent instances. To better correlate mask prediction with object detection, we further propose a mask alignment weighting loss and a feature alignment scheme. In addition, we identify two issues that impede the performance of single-stage instance segmentation and introduce two modules, a sample selection scheme and an instance refinement module, to address them. Experiments are performed on both the image instance segmentation dataset MS COCO and the video instance segmentation dataset YouTube-VIS. On the MS COCO test-dev set, our method achieves state-of-the-art performance. In terms of real-time capabilities, it outperforms YOLACT by a gain of 3.0% (mask AP) under similar settings, while operating at a comparable speed. On the YouTube-VIS validation set, our method also achieves promising results. The source code is available at https://github.com/JialeCao001/SipMask.
@article{diva2:1679024,
author = {Cao, Jiale and Pang, Yanwei and Anwer, Rao Muhammad and Cholakkal, Hisham and Khan, Fahad Shahbaz and Shao, Ling},
title = {{SipMaskv2: Enhanced Fast Image and Video Instance Segmentation}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2023},
volume = {45},
number = {3},
pages = {3798--3812},
}
The introduction of large training datasets was essential for the recent advancement and success of deep learning methods. Due to the difficulties related to biometric data collection, facial image datasets with biometric trait labels are scarce and usually limited in terms of size and sample diversity. Web-scraping approaches for automatic data collection can produce large amounts of weakly labeled and noisy data. This work is focused on picking out the bad apples from web-scraped facial datasets by automatically removing erroneous samples that impair their usability. The unsupervised facial biometric data filtering method presented in this work greatly reduces label noise levels in web-scraped facial biometric data. Experiments on two large state-of-the-art web-scraped datasets demonstrate the effectiveness of the proposed method with respect to real and apparent age estimation, based on five different age estimation methods. Furthermore, we apply the proposed method, together with a newly devised strategy for merging multiple datasets, to data collected from three major web-based data sources (i.e., IMDb, Wikipedia, Google) and derive the new Biometrically Filtered Famous Figure Dataset, or B3FD. The proposed dataset, which is made publicly available, enables considerable performance gains for all tested age estimation methods and age estimation tasks. This work highlights the importance of training data quality compared to data quantity and the selection of the estimation method.
@article{diva2:1634548,
author = {Be\v{s}eni\'{c}, Kre\v{s}imir and Ahlberg, Jörgen and Pandži\'{c}, Igor S.},
title = {{Picking out the bad apples: unsupervised biometric data filtering for refined age estimation}},
journal = {The Visual Computer},
year = {2023},
volume = {39},
pages = {219--237},
}
Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems. Among their salient benefits, Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences, as compared to recurrent networks such as long short-term memory (LSTM) networks. Different from convolutional networks, Transformers require minimal inductive biases for their design and are naturally suited as set-functions. Furthermore, the straightforward design of Transformers allows processing multiple modalities (e.g., images, videos, text, and speech) using similar processing blocks and demonstrates excellent scalability to very large capacity networks and huge datasets. These strengths have led to exciting progress on a number of vision tasks using Transformer networks. This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline. We start with an introduction to fundamental concepts behind the success of Transformers, i.e., self-attention, large-scale pre-training, and bidirectional feature encoding. We then cover extensive applications of transformers in vision including popular recognition tasks (e.g., image classification, object detection, action recognition, and segmentation), generative modeling, multi-modal tasks (e.g., visual question answering, visual reasoning, and visual grounding), video processing (e.g., activity recognition, video forecasting), low-level vision (e.g., image super-resolution, image enhancement, and colorization), and three-dimensional analysis (e.g., point cloud classification and segmentation). We compare the respective advantages and limitations of popular techniques both in terms of architectural design and their experimental value. Finally, we provide an analysis of open research directions and possible future works. We hope this effort will ignite further interest in the community to solve current challenges toward the application of transformer models in computer vision.
@article{diva2:1716688,
author = {Khan, Salman and Naseer, Muzammal and Hayat, Munawar and Zamir, Syed Waqas and Khan, Fahad and Shah, Mubarak},
title = {{Transformers in Vision: A Survey}},
journal = {ACM Computing Surveys},
year = {2022},
volume = {54},
number = {10},
}
In recent years, Siamese network based trackers have significantly advanced the state-of-the-art in real-time tracking. Despite their success, Siamese trackers tend to suffer from high memory costs, which restrict their applicability to mobile devices with tight memory budgets. To address this issue, we propose a distilled Siamese tracking framework to learn small, fast and accurate trackers (students), which capture critical knowledge from large Siamese trackers (teachers) via a teacher-students knowledge distillation model. This model is intuitively inspired by the one-teacher-versus-multiple-students learning method typically employed in schools. In particular, our model contains a single teacher-student distillation module and a student-student knowledge sharing mechanism. The former is designed using a tracking-specific distillation strategy to transfer knowledge from a teacher to students. The latter is utilized for mutual learning between students to enable in-depth knowledge understanding. Extensive empirical evaluations on several popular Siamese trackers demonstrate the generality and effectiveness of our framework. Moreover, the results on five tracking benchmarks show that the proposed distilled trackers achieve compression rates of up to 18x and frame-rates of 265 FPS, while obtaining tracking accuracy comparable to the base models.
@article{diva2:1714279,
author = {Shen, Jianbing and Liu, Yuanpei and Dong, Xingping and Lu, Xiankai and Khan, Fahad and Hoi, Steven},
title = {{Distilled Siamese Networks for Visual Tracking}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2022},
volume = {44},
number = {12},
pages = {8896--8909},
}
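As background for the teacher-student transfer described above, a generic response-map distillation loss can be written as a temperature-softened KL divergence between teacher and student scores. The temperature and the flattening of the score maps are illustrative assumptions, not the paper's tracking-specific distillation strategy.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between temperature-softened teacher and student score maps."""
    t = temperature
    p_teacher = F.softmax(teacher_logits.flatten(1) / t, dim=1)
    log_p_student = F.log_softmax(student_logits.flatten(1) / t, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)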
In a real-world setting, object instances from new classes can be continuously encountered by object detectors. When existing object detectors are applied to such scenarios, their performance on old classes deteriorates significantly. A few efforts have been reported to address this limitation, all of which apply variants of knowledge distillation to avoid catastrophic forgetting. We note that although distillation helps to retain previous learning, it obstructs fast adaptability to new tasks, which is a critical requirement for incremental learning. In this pursuit, we propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared. This ensures a seamless information transfer via a meta-learned gradient preconditioning that minimizes forgetting and maximizes knowledge transfer. In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new classes, and scales to high-capacity models for object detection. We evaluate our approach on a variety of incremental learning settings defined on the PASCAL-VOC and MS COCO datasets, where it performs favourably against state-of-the-art methods. Code and trained models: https://github.com/JosephKJ/iOD.
@article{diva2:1714189,
author = {Joseph, K. J. and Rajasegaran, Jathushan and Khan, Salman and Khan, Fahad and Balasubramanian, Vineeth N.},
title = {{Incremental Object Detection via Meta-Learning}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2022},
volume = {44},
number = {12},
pages = {9209--9216},
}
Adversarial training (AT) is an effective approach to making deep neural networks robust against adversarial attacks. Recently, different AT defenses have been proposed that not only maintain a high clean accuracy but also show significant robustness against popular and well-studied adversarial attacks, such as projected gradient descent (PGD). High adversarial robustness can also arise if an attack fails to find adversarial gradient directions, a phenomenon known as "gradient masking." In this work, we analyze the effect of label smoothing on AT as one of the potential causes of gradient masking. We then develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed guided projected gradient attack (G-PGA). Our attack approach is based on a "match and deceive" loss that finds optimal adversarial directions through guidance from a surrogate model. Our modified attack does not require random restarts, a large number of attack iterations, or a search for the optimal step size. Furthermore, our proposed G-PGA is generic, so it can be combined with an ensemble attack strategy, as we demonstrate in the case of auto-attack, leading to efficiency and convergence speed improvements. More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
@article{diva2:1689187,
author = {Naseer, Muzammal and Khan, Salman and Porikli, Fatih and Khan, Fahad},
title = {{Guidance Through Surrogate: Toward a Generic Diagnostic Attack}},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
year = {2022},
}
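For context, the plain PGD attack that G-PGA extends is sketched below; the guided "match and deceive" loss from the paper is not reproduced here, and the step size, epsilon and iteration count are illustrative defaults.

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Plain L-infinity PGD with a cross-entropy objective and a random start."""
    x = x.detach()
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()     # ascend the classification loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)         # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)                        # stay in the valid image range
    return x_adv.detach()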
Pedestrian detection is an important but challenging problem in computer vision, especially in human-centric tasks. Over the past decade, significant improvement has been witnessed with the help of handcrafted features and deep features. Here we present a comprehensive survey on recent advances in pedestrian detection. First, we provide a detailed review of single-spectral pedestrian detection that includes handcrafted features based methods and deep features based approaches. For handcrafted features based methods, we present an extensive review of approaches and find that handcrafted features with large degrees of freedom in shape and space have better performance. In the case of deep features based approaches, we split them into pure CNN based methods and those employing both handcrafted and CNN based features. We give the statistical analysis and tendency of these methods, where feature enhanced, part-aware, and post-processing methods have attracted the most attention. In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features under illumination variance. Furthermore, we introduce related datasets and evaluation metrics, and provide an in-depth experimental analysis. We conclude this survey by emphasizing open problems that need to be addressed and highlighting various future directions. Researchers can track an up-to-date list at https://github.com/JialeCao001/PedSurvey.
@article{diva2:1600804,
author = {Cao, Jiale and Pang, Yanwei and Xie, Jin and Khan, Fahad Shahbaz and Shao, Ling},
title = {{From Handcrafted to Deep Features for Pedestrian Detection: A Survey}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2022},
volume = {44},
number = {9},
pages = {4913--4934},
}
Dual-energy computed tomography (CT) can be used in radiotherapy treatment planning for the calculation of absorbed dose distributions. The aim of this work is to evaluate whether there is room for improvement in the accuracy of the Monoenergetic Plus algorithm by Siemens Healthineers. A Siemens SOMATOM Force scanner was used to scan a cylindrical polymethyl methacrylate phantom with four rod inserts made of different materials. Images were reconstructed using ADMIRE and processed with Monoenergetic Plus. The resulting CT numbers were compared with tabulated values and with values simulated by the proof-of-concept algorithm DIRA developed by the authors. Both the Monoenergetic Plus and DIRA algorithms performed well; the accuracy of attenuation coefficients was better than about ±1% at the energy of 70 keV. The worse performance of Monoenergetic Plus compared with DIRA was caused by its (i) two-material decomposition into iodine and water and (ii) imperfect suppression of the beam-hardening artifact in ADMIRE.
@article{diva2:1604198,
author = {Magnusson, Maria and Sandborg, Michael and Alm Carlsson, Gudrun and Henriksson, Lilian and Carlsson Tedgren, Åsa and Malusek, Alexandr},
title = {{Accuracy of CT Numbers Obtained by DIRA and Monoenergetic Plus Algorithms in Dual-Energy Computed Tomography}},
journal = {Radiation Protection Dosimetry},
year = {2021},
volume = {195},
number = {3-4},
pages = {212--217},
}
Automatic segmentation of bones in computed tomography (CT) images is used, for instance, in beam-hardening correction algorithms, where it improves the accuracy of resulting CT numbers. Of special interest are pelvic bones, which, because of their strong attenuation, affect the accuracy of brachytherapy in this region. This work compared the performance of the JJ2016 algorithm with that of the MK2014v2 and JS2018 algorithms; all these algorithms were developed by the authors. Visual comparison and, in the latter case, also Dice similarity coefficients derived from the ground truth were used. It was found that the 3D-based JJ2016 performed better than the 2D-based MK2014v2, mainly because of the more accurate hole filling that benefitted from information in adjacent slices. The neural-network-based JS2018 outperformed both traditional algorithms. It was, however, limited to a resolution of 128³ owing to the limited amount of memory in the graphical processing unit (GPU).
@article{diva2:1602153,
author = {Jeuthe, Julius and Sánchez, Jos\'{e} Carlos González and Magnusson, Maria and Sandborg, Michael and Carlsson Tedgren, Åsa and Malusek, Alexandr},
title = {{Semi-Automated 3D Segmentation of Pelvic Region Bones in CT Volumes for the Annotation of Machine Learning Datasets}},
journal = {Radiation Protection Dosimetry},
year = {2021},
volume = {195},
number = {3-4},
pages = {172--176},
}
The choice of the material basis used for material decomposition in dual-energy computed tomography may affect the quality of reconstructed images. The aim of this work is to investigate how the commonly used bases (water, bone), (water, iodine) and (photoelectric effect, Compton scattering) affect the reconstructed linear attenuation coefficient in the case of the Alvarez–Macovski method. The performance of this method is also compared with the performance of the Dual-energy Iterative Reconstruction Algorithm (DIRA). In both cases, the study is performed using simulations. The results show that the Alvarez–Macovski method produced artefacts when iodine was present in the phantom together with human tissues, since this method can only work with one doublet. It was shown that these artefacts could be avoided with DIRA by using the (water, bone) doublet for tissues and the (water, iodine) doublet for the iodine solution.
@article{diva2:1602016,
author = {Magnusson, Maria and Alm Carlsson, Gudrun and Sandborg, Michael and Carlsson Tedgren, Åsa and Malusek, Alexandr},
title = {{Optimal Selection of Base Materials for Accurate Dual-Energy Computed Tomography: Comparison Between the Alvarez--Macovski Method and DIRA}},
journal = {Radiation Protection Dosimetry},
year = {2021},
volume = {195},
number = {3-4},
pages = {218--224},
}
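For reference, both algorithms rest on a two-material (doublet) decomposition of the linear attenuation coefficient; in the Alvarez-Macovski case the basis functions are the photoelectric and Compton (Klein-Nishina) energy dependences. The notation below is generic, not the papers' exact symbols.

% two-material (doublet) decomposition with basis materials 1 and 2
\mu(E) \approx a_1\,\mu_1(E) + a_2\,\mu_2(E)
% Alvarez--Macovski parameterization: photoelectric (approximately E^{-3}) and Compton (Klein--Nishina) bases
\mu(E) \approx a_{\mathrm{PE}}\,E^{-3} + a_{\mathrm{C}}\,f_{\mathrm{KN}}(E)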
We present a novel learning-based approach to graph representations of road networks employing state-of-the-art graph convolutional neural networks. Our approach is applied to realistic road networks of 17 cities from OpenStreetMap. While edge features are crucial to generate descriptive graph representations of road networks, graph convolutional networks usually rely on node features only. We show that the highly representative edge features can still be integrated into such networks by applying a line graph transformation. We also propose a method for neighborhood sampling based on a topological neighborhood composed of both local and global neighbors. We compare the performance of learning representations using different types of neighborhood aggregation functions in transductive and inductive tasks and in supervised and unsupervised learning. Furthermore, we propose a novel aggregation approach, the Graph Attention Isomorphism Network (GAIN). Our results show that GAIN outperforms state-of-the-art methods on the road type classification problem.
@article{diva2:1581263,
author = {Gharaee, Zahra and Kowshik, Shreyas and Stromann, Oliver and Felsberg, Michael},
title = {{Graph representation learning for road type classification}},
journal = {Pattern Recognition},
year = {2021},
volume = {120},
}
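The line graph transformation mentioned in the abstract above turns each edge of the road graph into a node, so edge attributes (road type, length) can be consumed by a GNN that only reads node features. The toy graph and attribute names below are illustrative; NetworkX is used only to show the transformation.

import networkx as nx

road = nx.Graph()
road.add_edge("junction_a", "junction_b", road_type="residential", length=120.0)
road.add_edge("junction_b", "junction_c", road_type="primary", length=450.0)

line = nx.line_graph(road)                 # each road segment (edge) becomes a node
for u, v in line.nodes:                    # copy the edge attributes onto the line-graph nodes
    line.nodes[(u, v)].update(road.edges[u, v])

print(line.nodes(data=True))               # line-graph nodes now carry road_type and length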
Nowadays, airline ticket prices can vary dynamically and significantly for the same flight, even for nearby seats within the same cabin. Customers are seeking to get the lowest price while airlines are trying to keep their overall revenue as high as possible and maximize their profit. Airlines use various kinds of computational techniques to increase their revenue, such as demand prediction and price discrimination. From the customer side, two kinds of models have been proposed by different researchers to save money for customers: models that predict the optimal time to buy a ticket and models that predict the minimum ticket price. In this paper, we present a review of customer-side and airline-side prediction models. Our review analysis shows that models on both sides rely on a limited set of features such as historical ticket price data, ticket purchase date and departure date. Features extracted from external factors such as social media data and search engine queries are not considered. Therefore, we introduce and discuss the concept of using social media data for ticket/demand prediction.
@article{diva2:1575609,
author = {Abdella, Juhar Ahmed and Zaki, N. M. and Shuaib, Khaled and Khan, Fahad},
title = {{Airline ticket price and demand prediction: A survey}},
journal = {Journal of King Saud University - Computer and Information Sciences},
year = {2021},
volume = {33},
number = {4},
pages = {375--391},
}
Face alignment is the process of determining a face shape given its location and size in an image. It is used as a basis for other facial analysis tasks and for human-machine interaction and augmented reality applications. It is a challenging problem due to the extremely high variability in facial appearance affected by many external (illumination, occlusion, head pose) and internal factors (race, facial expression). However, advances in deep learning combined with domain-related knowledge from previous research have recently demonstrated impressive results, nearly saturating the unconstrained benchmark data sets. The focus is shifting towards reducing the computational burden of face alignment models, since real-time performance is required for such a highly dynamic task. Furthermore, many applications target devices on the edge with limited computational power, which puts even greater emphasis on computational efficiency. We present the latest developments in regression-based approaches that have led towards nearly solving the face alignment problem in unconstrained scenarios. Various regression architectures are systematically explored and recent training techniques are discussed in the context of face alignment. Finally, a benchmark comparison of the most successful methods is presented, taking execution time into account as well, to provide a comprehensive overview of this dynamic research field.
@article{diva2:1501023,
author = {Gogic, Ivan and Ahlberg, Jörgen and Pandzic, Igor S.},
title = {{Regression-based methods for face alignment: A survey}},
journal = {Signal Processing},
year = {2021},
volume = {178},
}
Automatic recognition of an online series of unsegmented actions requires a method for segmentation that determines when an action starts and when it ends. In this paper, a novel approach for recognizing unsegmented actions in online test experiments is proposed. The method uses self-organizing neural networks to build a three-layer cognitive architecture. The unique features of an action sequence are represented as a series of elicited key activations by the first-layer self-organizing map. An average length of a key activation vector is calculated for all action sequences in a training set and adjusted in learning trials to generate input patterns to the second-layer self-organizing map. The pattern vectors are clustered in the second layer, and the clusters are then labeled by an action identity in the third-layer neural network. The experimental results show that although performance drops slightly in online experiments compared to the offline tests, the ability of the proposed architecture to deal with unsegmented action sequences, together with its online performance, makes the system more plausible and practical in real-world scenarios.
@article{diva2:1455370,
author = {Gharaee, Zahra},
title = {{Online recognition of unsegmented actions with hierarchical SOM architecture}},
journal = {Cognitive Processing},
year = {2021},
volume = {22},
pages = {77--91},
}
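The architecture above is built from self-organizing maps; as a minimal, generic illustration of the underlying SOM update (best-matching unit search followed by a neighborhood-weighted prototype update), here is a hedged numpy sketch. It is not the paper's three-layer architecture, and map size, learning rate and neighborhood width are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
map_h, map_w, dim = 10, 10, 3             # 10x10 map of 3-D prototypes
weights = rng.random((map_h, map_w, dim))
grid_y, grid_x = np.mgrid[0:map_h, 0:map_w]

def som_step(x, weights, lr=0.1, sigma=2.0):
    # Best-matching unit: the prototype closest to the input sample.
    dists = np.linalg.norm(weights - x, axis=2)
    by, bx = np.unravel_index(np.argmin(dists), dists.shape)
    # Gaussian neighborhood around the BMU on the map grid.
    h = np.exp(-((grid_y - by) ** 2 + (grid_x - bx) ** 2) / (2 * sigma ** 2))
    # Move prototypes toward the sample, weighted by the neighborhood.
    weights += lr * h[..., None] * (x - weights)
    return by, bx

for _ in range(1000):                      # toy training loop on random data
    som_step(rng.random(dim), weights)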
In this paper, a novel cognitive architecture for action recognition is developed by applying layers of growing grid neural networks. Using these layers makes the system capable of automatically arranging its representational structure. In addition to the expansion of the neural map during the growth phase, the system is provided with prior knowledge of the input space, which increases the processing speed of the learning phase. Apart from the two layers of growing grid networks, the architecture is composed of a preprocessing layer, an ordered vector representation layer and a one-layer supervised neural network. These layers are designed to solve the action recognition problem. The first-layer growing grid receives the input data of human actions and the neural map generates an action pattern vector representing each action sequence by connecting the elicited activation of the trained map. The pattern vectors are then sent to the ordered vector representation layer to build the time-invariant input vectors of key activations for the second-layer growing grid. The second-layer growing grid categorizes the input vectors into the corresponding action clusters/sub-clusters and finally the one-layer supervised neural network labels the shaped clusters with action labels. Three experiments using different datasets of actions show that the system is capable of learning to categorize the actions quickly and efficiently. The performance of the growing grid architecture is compared with the results from a system based on Self-Organizing Maps, showing that the growing grid architecture performs significantly better on the action recognition tasks.
@article{diva2:1437059,
author = {Gharaee, Zahra},
title = {{Hierarchical growing grid networks for skeleton based action recognition}},
journal = {Cognitive Systems Research},
year = {2020},
volume = {63},
pages = {11--29},
}
In many computer vision applications, one acquires images of planar surfaces from two different vantage points. One can use a projective transformation to map pixel coordinates associated with a particular planar surface from one image to another. The transformation, called a homography, can be represented by a 3 × 3 matrix that is unique up to a scale factor. One requires a different homography matrix, scale differences apart, for each planar surface whose two images one wants to relate. However, a collection of homography matrices forms a valid set only if the matrices satisfy consistency constraints implied by the rigidity of the motion and the scene. We explore what it means for a set of homography matrices to be compatible and show that two seemingly disparate definitions are in fact equivalent. Our insight lays the theoretical foundations upon which the derivation of various sets of homography consistency constraints can proceed.
@article{diva2:1436062,
author = {Chojnacki, Wojciech and Szpak, Zygmunt L. and Wadenbäck, Mårten},
title = {{The equivalence of two definitions of compatible homography matrices}},
journal = {Pattern Recognition Letters},
year = {2020},
volume = {135},
pages = {38--43},
}
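For readers unfamiliar with the constraint being analysed, one standard way to express compatibility (stated here as a hedged reminder of textbook material in calibrated coordinates, not as the paper's exact definitions) is that all homographies must be induced by a single rigid motion:

\[
  \mathbf{H}_i \simeq \mathbf{R} + \frac{1}{d_i}\,\mathbf{t}\,\mathbf{n}_i^{\top}, \qquad i = 1,\dots,N,
\]

where \(\mathbf{R}\) and \(\mathbf{t}\) are the rotation and translation shared by all planes, \(\mathbf{n}_i\) is the unit normal of the i-th plane, \(d_i\) its distance to the first camera, and \(\simeq\) denotes equality up to scale. A set of matrices is compatible precisely when such a common \((\mathbf{R}, \mathbf{t})\) exists.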
This letter introduces a framework for evaluation of the losses used in point set registration. In order for a loss to be useful with a local optimizer, such as Levenberg-Marquardt or expectation maximization (EM), it must be monotonic with respect to the sought transformation. This motivates us to introduce monotonicity violation probability (MVP) curves, and to use these to assess monotonicity empirically for many different local distances, such as point-to-point, point-to-plane, and plane-to-plane. We also introduce a local shape-to-shape distance, based on the Wasserstein distance of the local normal distributions. Evaluation is done on a comprehensive benchmark of terrestrial lidar scans from two publicly available datasets. It demonstrates that matching robustness can be improved significantly by using kernel versions of local distances together with inverse-density-based sample weighting.
@article{diva2:1424738,
author = {Tavares, Anderson and Järemo-Lawin, Felix and Forss\'{e}n, Per-Erik},
title = {{Assessing Losses for Point Set Registration}},
journal = {IEEE Robotics and Automation Letters},
year = {2020},
volume = {5},
number = {2},
pages = {3360--3367},
}
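The shape-to-shape distance above builds on the Wasserstein distance between local normal distributions. As a hedged sketch, the closed-form 2-Wasserstein distance between two Gaussians can be computed as follows in Python; this is the textbook formula, not necessarily the kernelized or weighted variant used in the paper.

import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2_squared(mu1, cov1, mu2, cov2):
    # Squared 2-Wasserstein distance between N(mu1, cov1) and N(mu2, cov2).
    s2 = sqrtm(cov2)
    cross = sqrtm(s2 @ cov1 @ s2)
    # Bures term; take the real part to discard numerical imaginary noise.
    bures = np.trace(cov1 + cov2 - 2 * np.real(cross))
    return float(np.sum((mu1 - mu2) ** 2) + bures)

mu1, cov1 = np.zeros(3), np.eye(3)
mu2, cov2 = np.ones(3), 2 * np.eye(3)
print(gaussian_w2_squared(mu1, cov1, mu2, cov2))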
Facial expression recognition applications demand accurate and fast algorithms that can run in real time on platforms with limited computational resources. We propose an algorithm that bridges the gap between precise but slow methods and fast but less precise methods. The algorithm combines gentle boost decision trees and neural networks. The gentle boost decision trees are trained to extract highly discriminative feature vectors (local binary features) for each basic facial expression around distinct facial landmark points. These sparse binary features are concatenated and used to jointly optimize facial expression recognition through a shallow neural network architecture. The joint optimization improves the recognition rates of difficult expressions such as fear and sadness. Furthermore, extensive experiments in both within- and cross-database scenarios have been conducted on relevant benchmark data sets for facial expression recognition: CK+, MMI, JAFFE, and SFEW 2.0. The proposed method (LBF-NN) compares favorably with state-of-the-art algorithms while achieving an order of magnitude improvement in execution time.
@article{diva2:1413990,
author = {Gogic, Ivan and Manhart, Martina and Pandzic, Igor S. and Ahlberg, Jörgen},
title = {{Fast facial expression recognition using local binary features and shallow neural networks}},
journal = {The Visual Computer},
year = {2020},
volume = {36},
number = {1},
pages = {97--112},
}
Deep learning algorithms have improved the speed and quality of segmentation for certain tasks in medical imaging. The aim of this work is to design and evaluate an algorithm capable of segmenting bones in dual-energy CT data sets. A convolutional neural network based on the 3D U-Net architecture was implemented and evaluated using high tube voltage images, mixed images and dual-energy images from 30 patients. The network performed well on all the data sets; the mean Dice coefficient for the test data was larger than 0.963. Of special interest is that it performed better on dual-energy CT volumes compared to mixed images that mimicked images taken at 120 kV. The corresponding increase in the Dice coefficient from 0.965 to 0.966 was small since the enhancements were mainly at the edges of the bones. The method can easily be extended to the segmentation of multi-energy CT data.
@article{diva2:1391067,
author = {Sanchez, Jose Carlos Gonzalez and Magnusson, Maria and Sandborg, Michael and Carlsson Tedgren, Åsa and Malusek, Alexandr},
title = {{Segmentation of bones in medical dual-energy computed tomography volumes using the 3D U-Net}},
journal = {Physica medica (Testo stampato)},
year = {2020},
volume = {69},
pages = {241--247},
}
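For reference, the Dice coefficient quoted above measures the overlap between a predicted binary mask and the ground truth. A minimal numpy version (illustrative only; the smoothing constant is arbitrary):

import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    # Dice = 2|A intersect B| / (|A| + |B|) for binary masks A and B.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), dtype=bool); gt[12:42, 12:42] = True
print(dice_coefficient(pred, gt))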
This paper investigates the problem of position estimation of unmanned surface vessels (USVs) operating in coastal areas or in the archipelago. We propose a position estimation method where the horizon line is extracted in a 360 degrees panoramic image around the USV. We design a convolutional neural network (CNN) architecture to determine an approximate horizon line in the image and implicitly determine the camera orientation (the pitch and roll angles). The panoramic image is warped to compensate for the camera orientation and to generate an image from an approximately level camera. A second CNN architecture is designed to extract the pixelwise horizon line in the warped image. The extracted horizon line is correlated with digital elevation model data in the Fourier domain using a minimum output sum of squared error correlation filter. Finally, we determine the location of the maximum correlation score over the search area to estimate the position of the USV. Comprehensive experiments are performed in field trials conducted over 3 days in the archipelago. Our approach provides excellent results by achieving robust position estimates with global positioning system (GPS)-level accuracy in previously unvisited test areas.
@article{diva2:1384261,
author = {Grelsson, Bertil and Robinson, Andreas and Felsberg, Michael and Khan, Fahad},
title = {{GPS-level accurate camera localization with HorizonNet}},
journal = {Journal of Field Robotics},
year = {2020},
volume = {37},
number = {6},
pages = {951--971},
}
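The localization step above correlates the extracted horizon with DEM-rendered horizons in the Fourier domain. As a hedged and much simplified stand-in, circular cross-correlation of two 1-D horizon profiles via the FFT looks as follows; the actual method uses a MOSSE-type correlation filter over a 2-D search area, which is not reproduced here.

import numpy as np

def circular_cross_correlation(signal, template):
    # Correlation scores of the signal against all circular shifts of the template.
    S = np.fft.fft(signal)
    T = np.fft.fft(template)
    return np.real(np.fft.ifft(S * np.conj(T)))

rng = np.random.default_rng(1)
template = rng.standard_normal(360)        # e.g. horizon elevation per degree of azimuth
observed = np.roll(template, 42) + 0.05 * rng.standard_normal(360)
scores = circular_cross_correlation(observed, template)
print(int(np.argmax(scores)))              # recovers the introduced shift of 42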
Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open research problem with numerous applications in autonomous driving, robotics, and surveillance. In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. We also propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. To integrate structural information, we also investigate fusion strategies to combine depth and RGB information in our normalized convolution network framework. In addition, we introduce the use of output confidence as auxiliary information to improve the results. The capabilities of our normalized convolution network framework are demonstrated for the problem of scene depth completion. Comprehensive experiments are performed on the KITTI-Depth and the NYU-Depth-v2 datasets. The results clearly demonstrate that the proposed approach achieves superior performance while requiring only about 1-5% of the number of parameters compared to the state-of-the-art methods.
@article{diva2:1362784,
author = {Eldesokey, Abdelrahman and Felsberg, Michael and Khan, Fahad Shahbaz},
title = {{Confidence Propagation through CNNs for Guided Sparse Depth Regression}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2020},
volume = {42},
number = {10},
}
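The core operation in the abstract above is normalized convolution over sparse input with confidences. A minimal, non-learned numpy/scipy sketch of one such layer is given below, using a simple box filter as the applicability kernel; the paper learns the kernels, constrains them algebraically and propagates confidence differently.

import numpy as np
from scipy.signal import convolve2d

def normalized_convolution(data, conf, kernel, eps=1e-8):
    # data: sparse measurements (e.g. projected lidar depth), zeros where missing.
    # conf: confidence map in [0, 1], typically 1 where a measurement exists.
    num = convolve2d(data * conf, kernel, mode="same")
    den = convolve2d(conf, kernel, mode="same")
    out = num / (den + eps)
    out_conf = den / kernel.sum()          # crude propagated confidence
    return out, out_conf

rng = np.random.default_rng(0)
dense = rng.random((32, 32))
conf = (rng.random((32, 32)) < 0.05).astype(float)   # roughly 5% of pixels observed
sparse = dense * conf
filled, new_conf = normalized_convolution(sparse, conf, np.ones((5, 5)))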
Mobile manipulation robots have great potential for roles in support of rescuers on disaster-response missions. Robots can operate in places too dangerous for humans and therefore can assist in accomplishing hazardous tasks while their human operators work at a safe distance. We developed a disaster-response system that consists of the highly flexible Centauro robot and suitable control interfaces, including an immersive telepresence suit and support-operator controls offering different levels of autonomy.
@article{diva2:1382353,
author = {Klamt, Tobias and Rodriguez, Diego and Baccelliere, Lorenzo and Chen, Xi and Chiaradia, Domenico and Cichon, Torben and Gabardi, Massimiliano and Guria, Paolo and Holmquist, Karl and Kamedula, Malgorzata and Karaoguz, Hakan and Kashiri, Navvab and Laurenzi, Arturo and Lenz, Christian and Leonardis, Daniele and Hoffman, Enrico Mingo and Muratore, Luca and Pavlichenko, Dmytro and Porcini, Francesco and Ren, Zeyu and Schilling, Fabian and Schwarz, Max and Solazzi, Massimiliano and Felsberg, Michael and Frisoli, Antonio and Gustmann, Michael and Jensfelt, Patric and Nordberg, Klas and Rossmann, Juergen and Suess, Uwe and Tsagarakis, Nikos G. and Behnke, Sven},
title = {{Flexible Disaster Response of Tomorrow: Final Presentation and Evaluation of the CENTAURO System}},
journal = {IEEE robotics \& automation magazine},
year = {2019},
volume = {26},
number = {4},
pages = {59--72},
}
Quantitative dual-energy computed tomography may improve the accuracy of treatment planning in radiation therapy. Of special interest are algorithms that can estimate the material composition of the imaged object. One example of such an algorithm is the 2D model-based iterative reconstruction algorithm DIRA. The aim of this work is to extend this algorithm to 3D so that it can be used with cone beams and helical scanning. In the new algorithm, the parallel FBP method was replaced with the approximate 3D FBP-based PI-method. Its performance was tested using a mathematical phantom consisting of six ellipsoids. The algorithm substantially reduced the beam-hardening artefact and the artefacts caused by approximate reconstruction after six iterations. Compared to Alvarez-Macovski basis material decomposition, DIRA-3D does not require geometrically consistent projections and hence can be used in dual-source CT scanners. Also, it can use several tissue-specific material bases at the same time to represent the imaged object.
@article{diva2:1367890,
author = {Magnusson, Maria and Björnfot, Magnus and Carlsson Tedgren, Åsa and Alm Carlsson, Gudrun and Sandborg, Michael and Malusek, Alexandr},
title = {{DIRA-3D-a model-based iterative algorithm for accurate dual-energy dual-source 3D helical CT}},
journal = {Biomedical Engineering \& Physics Express},
year = {2019},
volume = {5},
number = {6},
}
This paper revisits the problem of continuous-time structure from motion, and introduces a number of extensions that improve convergence and efficiency. The formulation with a C2-continuous spline for the trajectory naturally incorporates inertial measurements, as derivatives of the sought trajectory. We analyze the behavior of split spline interpolation on SO(3) and on R3, and a joint spline on SE(3), and show that the latter implicitly couples the direction of translation and rotation. Such an assumption can make good sense for a camera mounted on a robot arm, but not for hand-held or body-mounted cameras. Our experiments in the Spline Fusion framework show that a split spline on R3 and SO(3) is preferable over an SE(3) spline in all tested cases. Finally, we investigate the problem of landmark reprojection on rolling shutter cameras, and show that the tested reprojection methods give similar quality, whereas their computational load varies by a factor of two.
@article{diva2:1333843,
author = {Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik},
title = {{Trajectory representation and landmark projection for continuous-time structure from motion}},
journal = {The international journal of robotics research},
year = {2019},
volume = {38},
number = {6},
pages = {686--701},
}
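To make the split-versus-joint distinction concrete, below is a hedged first-order analogue of split interpolation in Python: translation interpolated linearly on R3 and rotation interpolated separately on SO(3) with SLERP. The paper uses C2-continuous splines, which this sketch does not reproduce; the poses and times are invented.

import numpy as np
from scipy.spatial.transform import Rotation, Slerp

times = np.array([0.0, 1.0])
translations = np.array([[0.0, 0.0, 0.0],
                         [1.0, 2.0, 0.5]])
rotations = Rotation.from_euler("xyz", [[0, 0, 0], [0, 0, 90]], degrees=True)
slerp = Slerp(times, rotations)            # interpolation on SO(3)

def split_pose(t):
    # Translation: linear interpolation on R^3, independent of the rotation.
    alpha = (t - times[0]) / (times[-1] - times[0])
    p = (1 - alpha) * translations[0] + alpha * translations[-1]
    # Rotation: spherical interpolation on SO(3).
    R = slerp([t])[0]
    return p, R

p, R = split_pose(0.5)
print(p, R.as_euler("xyz", degrees=True))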
The usage of both off-the-shelf and end-to-end trained deep networks has significantly improved the performance of visual tracking on RGB videos. However, the lack of large labeled datasets hampers the usage of convolutional neural networks for tracking in thermal infrared (TIR) images. Therefore, most state-of-the-art methods for tracking on TIR data are still based on handcrafted features. To address this problem, we propose to use image-to-image translation models. These models allow us to translate the abundantly available labeled RGB data to synthetic TIR data. We explore both the usage of paired and unpaired image translation models for this purpose. These methods provide us with a large labeled dataset of synthetic TIR sequences, on which we can train end-to-end optimal features for tracking. To the best of our knowledge, we are the first to train end-to-end features for TIR tracking. We perform extensive experiments on the VOT-TIR2017 dataset. We show that a network trained on a large dataset of synthetic TIR data obtains better performance than one trained on the available real TIR data. Combining both data sources leads to further improvement. In addition, when we combine the network with motion features, we outperform the state of the art with a relative gain of over 10%, clearly showing the efficiency of using synthetic data to train end-to-end TIR trackers.
@article{diva2:1274664,
author = {Zhang, Lichao and Gonzalez-Garcia, Abel and van de Weijer, Joost and Danelljan, Martin and Khan, Fahad},
title = {{Synthetic Data Generation for End-to-End Thermal Infrared Tracking}},
journal = {IEEE Transactions on Image Processing},
year = {2019},
volume = {28},
number = {4},
pages = {1837--1850},
}
Current best local descriptors are learned on a large data set of matching and non-matching keypoint pairs. However, data of this kind are not always available, since the detailed keypoint correspondences can be hard to establish. On the other hand, we can often obtain labels for pairs of keypoint bags. For example, keypoint bags extracted from two images of the same object under different views form a matching pair, and keypoint bags extracted from images of different objects form a non-matching pair. On average, matching pairs should contain more corresponding keypoints than non-matching pairs. We describe an end-to-end differentiable architecture that enables the learning of local keypoint descriptors from such weakly labeled data. In addition, we discuss how to improve the method by incorporating the procedure of mining hard negatives. We also show how our approach can be used to learn convolutional features from unlabeled video signals and 3D models.
@article{diva2:1256386,
author = {Markus, Nenad and Pandzic, Igor S. and Ahlberg, Jörgen},
title = {{Learning Local Descriptors by Optimizing the Keypoint-Correspondence Criterion: Applications to Face Matching, Learning From Unlabeled Videos and 3D-Shape Retrieval}},
journal = {IEEE Transactions on Image Processing},
year = {2019},
volume = {28},
number = {1},
pages = {279--290},
}
Generic visual tracking is a challenging computer vision problem with numerous applications. Most existing approaches rely on appearance information by employing either hand-crafted features or deep RGB features extracted from convolutional neural networks. Despite their success, these approaches struggle in case of ambiguous appearance information, leading to tracking failure. In such cases, we argue that the motion cue provides discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. In this paper, we investigate the impact of deep motion features in a tracking-by-detection framework. We also evaluate the fusion of hand-crafted, deep RGB, and deep motion features and show that they contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly demonstrate that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
@article{diva2:1209805,
author = {Danelljan, Martin and Bhat, Goutam and Gladh, Susanna and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Deep motion and appearance cues for visual tracking}},
journal = {Pattern Recognition Letters},
year = {2019},
volume = {124},
pages = {74--81},
}
We combine the near-sensor image processing concept with address-event representation, leading to an intensity-ranking image sensor (IRIS), and show the benefits of using this type of sensor for image classification. The functionality of IRIS is to output pixel coordinates (X and Y values) continuously as each pixel has collected a certain number of photons. Thus, the pixel outputs will be automatically intensity ranked. By keeping track of the timing of these events, it is possible to record the full dynamic range of the image. However, in many cases this is not necessary; the intensity ranking in itself gives the information needed for the task at hand. This paper describes techniques for classification and proposes a particular variant (groves) that fits the IRIS architecture well, as it can work on the intensity rankings only. Simulation results using the CIFAR-10 dataset compare the results of the proposed method with the more conventional ferns technique. It is concluded that the simultaneous sensing and classification obtainable with the IRIS sensor yields both fast (shorter than full exposure time) and processing-efficient classification.
@article{diva2:1254020,
author = {Ahlberg, Jörgen and Åström, Anders and Forchheimer, Robert},
title = {{Simultaneous sensing, readout, and classification on an intensity-ranking image sensor}},
journal = {International journal of circuit theory and applications},
year = {2018},
volume = {46},
number = {9},
pages = {1606--1619},
}
Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification.
@article{diva2:1209495,
author = {Anwer, Rao Muhammad and Khan, Fahad and van de Weijer, Joost and Molinier, Matthieu and Laaksonen, Jorma},
title = {{Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification}},
journal = {ISPRS journal of photogrammetry and remote sensing (Print)},
year = {2018},
volume = {138},
pages = {74--85},
}
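The mapped coded images above are built from local binary patterns. As a hedged refresher, a basic (non-uniform, non-rotation-invariant) 8-neighbour LBP code per pixel can be computed as below; the paper's mapping to coded images and the TEX-Net fusion are not reproduced here.

import numpy as np

def lbp_8neighbour(img):
    # Basic 3x3 local binary pattern codes for a grayscale image.
    img = img.astype(float)
    center = img[1:-1, 1:-1]
    # Offsets of the 8 neighbours, enumerated clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy: img.shape[0] - 1 + dy,
                        1 + dx: img.shape[1] - 1 + dx]
        # Each neighbour contributes one bit of the 8-bit code.
        codes += (neighbour >= center).astype(np.int32) << bit
    return codes

rng = np.random.default_rng(0)
print(lbp_8neighbour(rng.integers(0, 256, (8, 8))).shape)   # (6, 6) code map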
Color description is one of the fundamental problems of image understanding. One of the popular ways to represent colors is by means of color names. Most existing work on color names focuses on only the eleven basic color terms of the English language. This could be limiting the discriminative power of these representations, and representations based on more color names are expected to perform better. However, there exists no clear strategy to choose additional color names. We collect a dataset of 28 additional color names. To ensure that the resulting color representation has high discriminative power we propose a method to order the additional color names according to their complementary nature with the basic color names. This allows us to compute color name representations with high discriminative power of arbitrary length. In the experiments we show that these new color name descriptors outperform the existing color name descriptor on the task of visual tracking, person re-identification and image classification.
@article{diva2:1188346,
author = {Yu, Lu and Zhang, Lichao and van de Weijer, Joost and Khan, Fahad and Cheng, Yongmei and Alejandro Parraga, C.},
title = {{Beyond Eleven Color Names for Image Understanding}},
journal = {Machine Vision and Applications},
year = {2018},
volume = {29},
number = {2},
pages = {361--373},
}
Most approaches to human attribute and action recognition in still images are based on an image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and in the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.
@article{diva2:1176581,
author = {Khan, Fahad and van de Weijer, Joost and Muhammad Anwer, Rao and Bagdanov, Andrew D. and Felsberg, Michael and Laaksonen, Jorma},
title = {{Scale coding bag of deep features for human attribute and action recognition}},
journal = {Machine Vision and Applications},
year = {2018},
volume = {29},
number = {1},
pages = {55--71},
}
Most methods that address computer vision problems require powerful visual features. Many successful approaches apply techniques motivated from nonparametric statistics. The channel representation provides a framework for nonparametric distribution representation. Although early work has focused on a signal processing view of the representation, the channel representation can be interpreted in probabilistic terms, e.g., representing the distribution of local image orientation. In this paper, a variety of approximative channel-based algorithms for probabilistic problems are presented: a novel efficient algorithm for density reconstruction, a novel and efficient scheme for nonlinear gridding of densities, and finally a novel method for estimating Copula densities. The experimental results provide evidence that by relaxing the requirements for exact solutions, efficient algorithms are obtained.
@article{diva2:1159593,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Approximative Coding Methods for Channel Representations}},
journal = {Journal of Mathematical Imaging and Vision},
year = {2018},
volume = {60},
number = {6},
pages = {929--940},
}
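As a hedged illustration of what a channel representation of a scalar looks like, the sketch below encodes a value with a cos^2 kernel, which is one common choice in this line of work; channel spacing, width and range are arbitrary here, and the decoding schemes that are the actual subject of the paper are not shown.

import numpy as np

def channel_encode(x, centers, width=1.5):
    # Encode scalar x into soft channel activations with a cos^2 kernel.
    d = np.abs(x - centers)
    enc = np.zeros_like(centers, dtype=float)
    active = d < width
    enc[active] = np.cos(np.pi * d[active] / (2 * width)) ** 2
    return enc

centers = np.arange(0.0, 11.0, 1.0)        # channel centers at 0, 1, ..., 10
print(np.round(channel_encode(3.3, centers), 3))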
Visual odometry using only a monocular camera faces more algorithmic challenges than stereo odometry. We present a robust monocular visual odometry framework for automotive applications. An extended propagation-based tracking framework is proposed which yields highly accurate (unscaled) pose estimates. Scale is supplied by ground plane pose estimation employing street pixel labeling using a convolutional neural network (CNN). The proposed framework has been extensively tested on the KITTI dataset and achieves a higher rank than current published state-of-the-art monocular methods in the KITTI odometry benchmark. Unlike other VO/SLAM methods, this result is achieved without loop closing mechanism, without RANSAC and also without multiframe bundle adjustment. Thus, we challenge the common belief that robust systems can only be built using iterative robustification tools like RANSAC.
@article{diva2:1176566,
author = {Fanani, Nolang and Stuerck, Alina and Ochs, Matthias and Bradler, Henry and Mester, Rudolf},
title = {{Predictive monocular odometry (PMO): What is possible without RANSAC and multiframe bundle adjustment?}},
journal = {Image and Vision Computing},
year = {2017},
volume = {68},
}
Purpose: To develop and evaluate, in a proof-of-concept configuration, a novel iterative reconstruction algorithm (DIRA) for quantitative determination of elemental composition of patient tissues for application to brachytherapy with low-energy (< 50 keV) photons and proton therapy. Methods: DIRA was designed as a model-based iterative reconstruction algorithm, which uses filtered backprojection, automatic segmentation and multimaterial tissue decomposition. The evaluation was done for a phantom derived from the voxelized ICRP 110 male phantom. Soft tissues were decomposed into the lipid, protein and water triplet; bones were decomposed into the compact bone and bone marrow doublet. Projections were derived using the Drasim simulation code for an axial scanning configuration resembling a typical DECT (dual-energy CT) scanner with 80 kV and Sn140 kV x-ray spectra. The iterative loop produced mono-energetic images at 50 and 88 keV without beam hardening artifacts. Different noise levels were considered: no noise, a typical noise level in diagnostic imaging, and a reduced noise level corresponding to tenfold higher doses. An uncertainty analysis of the results was performed using type A and B evaluations, and the two approaches were compared. Results: Linear attenuation coefficients averaged over a region were obtained with relative errors less than 0.5% for all evaluated regions. Errors in average mass fractions of the three-material decomposition were less than 0.04 for the no-noise and reduced-noise levels and less than 0.11 for the typical noise level. Mass fractions of individual pixels were strongly affected by noise, which slightly increased after the first iteration but subsequently stabilized. Estimates of uncertainties in mass fractions provided by the type B evaluation differed from the type A estimates by less than 1.5% for most cases. The algorithm was fast; the results converged after five iterations. The algorithmic complexity of forward polyenergetic projection calculation was much reduced by using material doublets and triplets. Conclusions: The simulations indicated that DIRA is capable of determining the elemental composition of tissues, which is needed in brachytherapy with low-energy (< 50 keV) photons and proton therapy. The algorithm provided quantitative monoenergetic images with beam hardening artifacts removed. Its convergence was fast, image sharpness expressed via the modulation transfer function was maintained, and image noise did not increase with the number of iterations.
@article{diva2:1140801,
author = {Malusek, Alexandr and Magnusson, Maria and Sandborg, Michael and Alm Carlsson, Gudrun},
title = {{A model-based iterative reconstruction algorithm DIRA using patient-specific tissue classification via DECT for improved quantitative CT in dose planning}},
journal = {Medical physics (Lancaster)},
year = {2017},
volume = {44},
number = {6},
pages = {2345--2357},
}
Privacy protection may be defined as replacing the original content in an image region with new (less intrusive) content that modifies the target's appearance so as to make it less recognizable. The development of privacy protection techniques also needs to be complemented with an established objective evaluation method to facilitate their assessment and comparison. Generally, existing evaluation methods rely on subjective judgements or assume a specific target type in image data and use target detection and recognition accuracies to assess privacy protection. This work proposes a new annotation-free evaluation method that is neither subjective nor assumes a specific target type. It assesses two key aspects of privacy protection: protection and utility. Protection is quantified as an appearance similarity and utility is measured as a structural similarity between the original and privacy-protected image regions. We performed extensive experimentation using six challenging datasets (comprising 12 video sequences), including a new dataset (with six sequences) that contains visible and thermal imagery. The new dataset, called TST-Priv, is made available online to the research community. We demonstrate the effectiveness of the proposed method by evaluating six image-based privacy protection techniques, and also show comparisons of the proposed method with existing methods.
@article{diva2:1138417,
author = {Nawaz, Tahir and Berg, Amanda and Ferryman, James and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Effective evaluation of privacy protection techniques in visible and thermal imagery}},
journal = {Journal of Electronic Imaging (JEI)},
year = {2017},
volume = {26},
number = {5},
}
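Utility in the abstract above is measured as a structural similarity between the original and the protected region. A hedged sketch using the standard SSIM implementation from scikit-image is shown below; the paper's exact protection and utility scores may be computed differently, and the images here are synthetic stand-ins.

import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
original = rng.random((64, 64))
# Stand-in "privacy-protected" region: the original mixed with noise.
protected = 0.5 * original + 0.5 * rng.random((64, 64))

utility = structural_similarity(original, protected, data_range=1.0)
print(round(utility, 3))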
Mammalian herbivores have important top-down effects on ecological processes and landscapes by generating vegetation changes through grazing and trampling. For free-ranging herbivores on large landscapes, trampling is an important ecological factor. However, whereas grazing is widely studied, low-intensity trampling is rarely studied and quantified. The cold-adapted northern tundra reindeer (Rangifer tarandus) is a wide-ranging keystone herbivore in large open alpine and Arctic ecosystems. Reindeer may largely subsist on different species of slow-growing ground lichens, particularly in winter. Lichens grow in dry, snow-poor habitats with frost, and their varying elasticity makes them suitable for studying trampling. In replicated factorial experiments, high-resolution 3D laser scanning was used to quantify lichen volume loss from trampling by a reindeer hoof. Losses were substantial, that is, about 0.3 dm3 per imprint in dry thick lichen, but depended on the type of lichen mat and humidity. Immediate trampling volume loss was about twice as high in dry compared to humid thin (2–3 cm) lichen mats, and about three times as high in dry vs. humid thick (6–8 cm) lichen mats. There was no significant difference in volume loss between 100% and 50% wetted lichen. Regained volume with time was insignificant for dry lichen, whereas 50% humid lichen regained substantial volumes, and 100% humid lichen regained almost all lost volume, mostly within 10–20 min. Reindeer trampling may thus have anywhere from negligible to devastating effects on exposed lichen forage. During a normal week of foraging, moving 5 km daily across dry 6- to 8-cm-thick continuous lichen mats, one adult reindeer may trample a lichen volume corresponding to about a year's supply of lichen. However, lichen humidity appears to be an important factor for trampling loss, in addition to the extent of reindeer movement.
@article{diva2:1136863,
author = {Heggenes, Jan and Odland, Arvid and Chevalier, Tomas and Ahlberg, Jörgen and Berg, Amanda and Larsson, Håkan and Bjerketvedt, Dag},
title = {{Herbivore grazing--or trampling? Trampling effects by a large ungulate in cold high-latitude ecosystems}},
journal = {Ecology and Evolution},
year = {2017},
volume = {7},
number = {16},
pages = {6423--6431},
}
Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when encountering large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and the VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5 percent in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50 percent higher frame rate compared to the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.
@article{diva2:1129861,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Discriminative Scale Space Tracking}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2017},
volume = {39},
number = {8},
pages = {1561--1575},
}
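Below is a hedged, single-channel sketch of learning a discriminative correlation filter in closed form in the Fourier domain (a MOSSE-style filter with a Gaussian desired output). The actual tracker learns separate multi-channel translation and scale filters with online updates, which this sketch omits.

import numpy as np

def gaussian_response(shape, sigma=2.0):
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    g = np.exp(-((y - h // 2) ** 2 + (x - w // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(g, (-(h // 2), -(w // 2)), axis=(0, 1))   # peak at (0, 0)

def learn_filter(patches, lam=1e-2):
    # Closed-form correlation filter from a list of training patches.
    G = np.fft.fft2(gaussian_response(patches[0].shape))
    num = np.zeros_like(G)
    den = np.zeros_like(G)
    for p in patches:
        F = np.fft.fft2(p)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / (den + lam)

def detect(H, patch):
    # Correlation response of the learned filter on a new patch.
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

rng = np.random.default_rng(0)
target = rng.standard_normal((32, 32))
H = learn_filter([target + 0.05 * rng.standard_normal((32, 32)) for _ in range(5)])
resp = detect(H, np.roll(target, (3, 4), axis=(0, 1)))
print(np.unravel_index(np.argmax(resp), resp.shape))         # close to (3, 4)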
In this work, we introduce a novel tensor-based functional for targeted image enhancement and denoising. Via explicit regularization, our formulation incorporates application-dependent and contextual information using first principles. Few works in the literature treat variational models that describe both application-dependent information and contextual knowledge of the denoising problem. We prove the existence of a minimizer and present results on tensor symmetry constraints, convexity, and geometric interpretation of the proposed functional. We show that our framework excels in applications where nonlinear functions are present, such as in gamma correction and targeted value range filtering. We also study general denoising performance, where we show comparable results to dedicated PDE-based state-of-the-art methods.
@article{diva2:1089909,
author = {Åström, Freddie and Felsberg, Michael and Baravdish, George},
title = {{Mapping-Based Image Diffusion}},
journal = {Journal of Mathematical Imaging and Vision},
year = {2017},
volume = {57},
number = {3},
pages = {293--323},
}
We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps: (a) by using a building segmentation scheme to remove detections on buildings, and (b) by using a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicating areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.
@article{diva2:1054676,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Enhanced analysis of thermographic images for monitoring of district heat pipe networks}},
journal = {Pattern Recognition Letters},
year = {2016},
volume = {83},
number = {2},
pages = {215--223},
}
1. Migratory songbirds carry an inherited capacity to migrate several thousand kilometers each year, crossing continental landmasses and barriers between distant breeding sites and wintering areas. How individual songbirds manage to find their way with extreme precision is still largely unknown. The functional characteristics of biological compasses used by songbird migrants have mainly been investigated by recording the birds' directed migratory activity in circular cages, so-called Emlen funnels. This method is 50 years old and has not received major updates over the past decades. The aim of this work was to compare the results from newly developed digital methods with the established manual methods to evaluate songbird migratory activity and orientation in circular cages. 2. We performed orientation experiments with the European robin (Erithacus rubecula) using modified Emlen funnels equipped with thermal paper, and simultaneously recorded the songbird movements from above. We evaluated and compared the results obtained with five different methods. Two methods have been commonly used in songbird orientation experiments; the other three methods were developed for this study and were based either on evaluation of the thermal paper using automated image analysis, or on the analysis of videos recorded during the experiment. 3. The methods used to evaluate scratches produced by the claws of birds on the thermal papers presented some differences compared with the video analyses. These differences were caused mainly by differences in scatter, as any movement of the bird along the sloping walls of the funnel was recorded on the thermal paper, whereas video evaluations allowed us to detect single takeoff attempts by the birds and to consider only this behavior in the orientation analyses. Using computer vision, we were also able to identify and separately evaluate different behaviors that were impossible to record on the thermal paper. 4. The traditional Emlen funnel is still the most used method to investigate compass orientation in songbirds under controlled conditions. However, new numerical image analysis techniques provide a much higher level of detail of songbirds' migratory behavior and will provide an increasing number of possibilities to evaluate and quantify specific behaviors as new algorithms are developed.
@article{diva2:1046409,
author = {Bianco, Giuseppe and Ilieva, Mihaela and Veibäck, Clas and Öfjäll, Kristoffer and Gadomska, Alicja and Hendeby, Gustaf and Felsberg, Michael and Gustafsson, Fredrik and Åkesson, Susanne},
title = {{Emlen funnel experiments revisited: methods update for studying compass orientation in songbirds}},
journal = {Ecology and Evolution},
year = {2016},
volume = {6},
number = {19},
pages = {6930--6942},
}
New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelisation of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelisation of the model-based iterative reconstruction algorithm DIRA with the aim of significantly shortening the code's execution time. Selected routines were parallelised using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelisation of the code with OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelisation with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained.
@article{diva2:875757,
author = {Örtenberg, Alexander and Magnusson, Maria and Sandborg, Michael and Alm Carlsson, Gudrun and Malusek, Alexandr},
title = {{PARALLELISATION OF THE MODEL-BASED ITERATIVE RECONSTRUCTION ALGORITHM DIRA}},
journal = {Radiation Protection Dosimetry},
year = {2016},
volume = {169},
number = {1-4},
pages = {405--409},
}
Advanced model-based iterative reconstruction algorithms in quantitative computed tomography (CT) perform automatic segmentation of tissues to estimate material properties of the imaged object. Compared with conventional methods, these algorithms may improve quality of reconstructed images and accuracy of radiation treatment planning. Automatic segmentation of tissues is, however, a difficult task. The aim of this work was to develop and evaluate an algorithm that automatically segments tissues in CT images of the male pelvis. The newly developed algorithm (MK2014) combines histogram matching, thresholding, region growing, deformable model and atlas-based registration techniques for the segmentation of bones, adipose tissue, prostate and muscles in CT images. Visual inspection of segmented images showed that the algorithm performed well for the five analysed images. The tissues were identified and outlined with accuracy sufficient for the dual-energy iterative reconstruction algorithm whose aim is to improve the accuracy of radiation treatment planning in brachytherapy of the prostate.
@article{diva2:875370,
author = {Kardell, Martin and Magnusson, Maria and Sandborg, Michael and Alm Carlsson, Gudrun and Jeuthe, Julius and Malusek, Alexandr},
title = {{AUTOMATIC SEGMENTATION OF PELVIS FOR BRACHYTHERAPY OF PROSTATE}},
journal = {Radiation Protection Dosimetry},
year = {2016},
volume = {169},
number = {1-4},
pages = {398--404},
}
Attitude (pitch and roll angle) estimation from visual information is necessary for GPS-free navigation of airborne vehicles. We propose a highly accurate method to estimate the attitude by horizon detection in fisheye images. A Canny edge detector and a probabilistic Hough voting scheme are used to compute an approximate attitude and the corresponding horizon line in the image. Horizon edge pixels are extracted in a band close to the approximate horizon line. The attitude estimates are refined through registration of the extracted edge pixels with the geometrical horizon from a digital elevation map (DEM), in our case the SRTM3 database, extracted at a given approximate position. The proposed method has been evaluated using 1629 images from a flight trial with flight altitudes up to 600 m in an area with ground elevations ranging from sea level up to 500 m. Compared with the ground truth from a filtered inertial measurement unit (IMU)/GPS solution, the standard deviations for the pitch and roll angle errors obtained with 30 Mpixel images are 0.04° and 0.05°, respectively, with mean errors smaller than 0.02°. To achieve the high-accuracy attitude estimates, the ray refraction in the earth's atmosphere has been taken into account. The attitude errors obtained on real images are less than or equal to those achieved on synthetic images for previous methods with DEM refinement, and the errors are about one order of magnitude smaller than for any previous vision-based method without DEM refinement.
@article{diva2:729541,
author = {Grelsson, Bertil and Felsberg, Michael and Isaksson, Folke},
title = {{Highly Accurate Attitude Estimation via Horizon Detection}},
journal = {Journal of Field Robotics},
year = {2016},
volume = {33},
number = {7},
pages = {967--993},
}
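The coarse stage above combines edge detection with probabilistic Hough voting. As a hedged and much simpler stand-in, a straight-line horizon estimate from Canny edges and a standard Hough transform with OpenCV might look as follows; the paper works in fisheye geometry and refines the estimate against a DEM, neither of which is shown.

import numpy as np
import cv2

def coarse_horizon(gray):
    # Return (rho, theta) of the strongest straight line in the image, or None.
    edges = cv2.Canny(gray, 50, 150)
    # Standard Hough transform: 1 px rho resolution, 1 degree theta resolution.
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 80)
    if lines is None:
        return None
    rho, theta = lines[0][0]               # strongest accumulator peak
    return float(rho), float(theta)

# Synthetic test image: bright sky above a darker ground plane.
img = np.zeros((200, 300), dtype=np.uint8)
img[:90, :] = 200
print(coarse_horizon(img))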
Action recognition in still images is a challenging problem in computer vision. To facilitate comparative evaluation independently of person detection, the standard evaluation protocol for action recognition uses an oracle person detector to obtain perfect bounding box information at both training and test time. The assumption is that, in practice, a general person detector will provide candidate bounding boxes for action recognition. In this paper, we argue that this paradigm is suboptimal and that action class labels should already be considered during the detection stage. Motivated by the observation that body pose is strongly conditioned on action class, we show that: 1) the existing state-of-the-art generic person detectors are not adequate for proposing candidate bounding boxes for action classification; 2) due to limited training examples, the direct training of action-specific person detectors is also inadequate; and 3) using only a small number of labeled action examples, the transfer learning is able to adapt an existing detector to propose higher quality bounding boxes for subsequent action classification. To the best of our knowledge, we are the first to investigate transfer learning for the task of action-specific person detection in still images. We perform extensive experiments on two benchmark data sets: 1) Stanford-40 and 2) PASCAL VOC 2012. For the action detection task (i.e., both person localization and classification of the action performed), our approach outperforms methods based on general person detection by 5.7% mean average precision (MAP) on Stanford-40 and 2.1% MAP on PASCAL VOC 2012. Our approach also significantly outperforms the state of the art with a MAP of 45.4% on Stanford-40 and 31.4% on PASCAL VOC 2012. We also evaluate our action detection approach for the task of action classification (i.e., recognizing actions without localizing them). For this task, our approach, without using any ground-truth person localization at test time, outperforms on both data sets state-of-the-art methods, which do use person locations.
@article{diva2:855148,
author = {Khan, Fahad and Xu, Jiaolong and van de Weijer, Joost and Bagdanov, Andrew D. and Muhammad Anwer, Rao and Lopez, Antonio M.},
title = {{Recognizing Actions Through Action-Specific Person Detection}},
journal = {IEEE Transactions on Image Processing},
year = {2015},
volume = {24},
number = {11},
pages = {4422--4432},
}
Visual feature descriptors are essential elements in most computer and robot vision systems. They typically lead to an abstraction of the input data, images, or video, for further processing, such as clustering and machine learning. In clustering applications, the cluster center represents the prototypical descriptor of the cluster and estimates the corresponding signal value, such as color value or dominating flow orientation, by decoding the prototypical descriptor. Machine learning applications determine the relevance of respective descriptors, and a visualization of the corresponding decoded information is very useful for the analysis of the learning algorithm. Thus decoding of feature descriptors is a relevant problem, frequently addressed in recent work. Also, the human brain represents sensorimotor information at a suitable abstraction level through varying activation of neuron populations. In previous work, computational models have been derived that agree with findings of neurophysiological experiments on the representation of visual information by decoding the underlying signals. However, the represented variables have a bias toward centers or boundaries of the tuning curves. Despite the fact that feature descriptors in computer vision are motivated from neuroscience, the respective decoding methods have been derived largely independently. From first principles, we derive unbiased decoding schemes for biologically motivated feature descriptors with a minimum amount of redundancy and suitable invariance properties. These descriptors establish a non-parametric density estimation of the underlying stochastic process with a particular algebraic structure. Based on the resulting algebraic constraints, we show formally how the decoding problem is formulated as an unbiased maximum likelihood estimator and we derive a recurrent inverse diffusion scheme to infer the dominating mode of the distribution. These methods are evaluated in experiments, where stationary points and bias from noisy image data are compared to existing methods.
@article{diva2:850261,
author = {Felsberg, Michael and Öfjäll, Kristoffer and Lenz, Reiner},
title = {{Unbiased decoding of biologically motivated visual feature descriptors}},
journal = {Frontiers in Robotics and AI},
year = {2015},
volume = {2},
number = {20},
}
An image mosaic is an assembly of a large number of small images, usually called tiles, taken from a specific dictionary/codebook. When viewed as a whole, the appearance of a single large image emerges, i.e. each tile approximates a small block of pixels. ASCII art is a related (and older) graphic design technique for producing images from printable characters. Although automatic procedures for both of these visualization schemes have been studied in the past, some are computationally heavy and cannot offer real-time and interactive performance. We propose an algorithm able to reproduce the quality of existing non-photorealistic rendering techniques, in particular ASCII art and image mosaics, obtaining large performance speed-ups. The basic idea is to partition the input image into a rectangular grid and use a decision tree to assign a tile from a pre-determined codebook to each cell. Our implementation can process video streams from webcams in real time and it is suitable for modestly equipped devices. We evaluate our technique by generating the renderings of a variety of images and videos, with good results. The source code of our engine is publicly available.
@article{diva2:845458,
author = {Marku\v{s}, Nenad and Fratarcangeli, Marco and Pandži\'{c}, Igor and Ahlberg, Jörgen},
title = {{Fast Rendering of Image Mosaics and ASCII Art}},
journal = {Computer graphics forum (Print)},
year = {2015},
volume = {34},
number = {6},
pages = {251--261},
}
In this study, we investigate the backward p(x)-parabolic equation as a new methodology to enhance images. We propose a novel iterative regularization procedure for the backward p(x)-parabolic equation based on the nonlinear Landweber method for inverse problems. The proposed scheme can also be extended to the family of iterative regularization methods involving the nonlinear Landweber method. We also investigate the connection between the variable exponent p(x) in the proposed energy functional and the diffusivity function in the corresponding Euler-Lagrange equation. It is well known that the forward problem converges to a constant solution, destroying the image. The purpose of the backward approach is twofold. First, by solving the backward problem through a sequence of forward problems, we obtain a smooth, denoised image. Second, by choosing the initial data properly, we try to reduce the blurriness of the image. Preliminary numerical results for denoising appear to give an improvement over standard methods.
@article{diva2:758277,
author = {Baravdish, George and Svensson, Olof and Åström, Freddie},
title = {{On Backward \emph{p}(\emph{x})-Parabolic Equations for Image Enhancement}},
journal = {Numerical Functional Analysis and Optimization},
year = {2015},
volume = {36},
number = {2},
pages = {147--168},
}
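For context, the nonlinear Landweber method on which the regularization procedure above is based takes, in its generic inverse-problem form (a hedged textbook statement, not the paper's exact scheme), the iteration

\[
  u_{k+1} = u_k - \omega\, K'(u_k)^{*}\bigl(K(u_k) - f\bigr),
\]

where \(K\) is the (nonlinear) forward operator, \(K'(u_k)^{*}\) the adjoint of its linearization, \(f\) the data, and \(\omega > 0\) a step size; the iteration is stopped early, e.g. by a discrepancy principle, to obtain a regularized solution.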
Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram-based texture representations exist in the literature. However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This gain in performance is achieved at the cost of a high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets namely, KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperform other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7.8%, 4.3% and 5.0% on the KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, respectively.
@article{diva2:756961,
author = {Khan, Fahad Shahbaz and Muhammad Anwer, Rao and van de Weijer, Joost and Felsberg, Michael and Laaksonen, Jorma},
title = {{Compact color--texture description for texture classification}},
journal = {Pattern Recognition Letters},
year = {2015},
volume = {51},
pages = {16--22},
}
Computer analysis of visual art, especially paintings, is an interesting cross-disciplinary research domain. Most research on the analysis of paintings involves small to medium-sized datasets with their own specific settings. Interestingly, significant progress has been made in the field of object and scene recognition lately. A key factor in this success is the introduction and availability of benchmark datasets for evaluation. Surprisingly, such a benchmark setup is still missing in the area of computational painting categorization. In this work, we propose a novel large scale dataset of digital paintings. The dataset consists of paintings from 91 different painters. We further show three applications of our dataset namely: artist categorization, style classification and saliency detection. We investigate how local and global features popular in image classification perform for the tasks of artist and style categorization. For both categorization tasks, our experimental results suggest that combining multiple features significantly improves the final performance. We show that state-of-the-art computer vision methods can correctly attribute 50% of unseen paintings in a large dataset to their painter and correctly identify the artistic style in over 60% of the cases. Additionally, we explore the task of saliency detection on paintings and show experimental findings using state-of-the-art saliency estimation algorithms.
@article{diva2:756963,
author = {Khan, Fahad Shahbaz and Beigpour, Shida and van de Weijer, Joost and Felsberg, Michael},
title = {{Painting-91:
a large scale database for computational painting categorization}},
journal = {Machine Vision and Applications},
year = {2014},
volume = {25},
number = {6},
pages = {1385--1397},
}
Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.
@article{diva2:744038,
author = {Khan, Fahad and van de Weijer, Joost and Muhammad Anwer, Rao and Felsberg, Michael and Gatta, Carlo},
title = {{Semantic Pyramids for Gender and Action Recognition}},
journal = {IEEE Transactions on Image Processing},
year = {2014},
volume = {23},
number = {8},
pages = {3633--3645},
}
This article deals with fast and accurate visualization of pushbroom image data from airborne and spaceborne platforms. A pushbroom sensor acquires images in a line-scanning fashion, and this results in scattered input data that needs to be resampled onto a uniform grid for geometrically correct visualization. To this end, we model the anisotropic spatial dependence structure caused by the acquisition process. Several methods for scattered data interpolation are then adapted to handle the induced anisotropic metric and compared for the pushbroom image rectification problem. A trick that exploits the semi-ordered line structure of pushbroom data to improve the computational complexity by several orders of magnitude is also presented.
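A minimal sketch of scattered-data interpolation under an anisotropic metric, here plain inverse-distance weighting with a Mahalanobis-type distance; the metric matrix M and the weighting scheme are illustrative assumptions rather than the adapted methods compared in the paper:

import numpy as np

def anisotropic_idw(points, values, query, M, power=2.0, eps=1e-12):
    """Inverse-distance weighting where distances are measured with the
    anisotropic metric d(p, q)^2 = (p - q)^T M (p - q).

    points : (N, 2) scattered sample positions
    values : (N,)   sample values
    query  : (Q, 2) positions on the target grid
    M      : (2, 2) symmetric positive-definite metric matrix
    """
    diff = query[:, None, :] - points[None, :, :]          # (Q, N, 2)
    d2 = np.einsum('qni,ij,qnj->qn', diff, M, diff)        # squared metric distances
    w = 1.0 / (d2 ** (power / 2.0) + eps)                  # IDW weights
    return (w * values[None, :]).sum(axis=1) / w.sum(axis=1)

# toy usage: an elongated metric penalising one direction more than the other
rng = np.random.default_rng(0)
pts = rng.random((200, 2))
vals = np.sin(6 * pts[:, 0]) + pts[:, 1]
grid = np.stack(np.meshgrid(np.linspace(0, 1, 32),
                            np.linspace(0, 1, 32)), axis=-1).reshape(-1, 2)
M = np.array([[1.0, 0.0], [0.0, 4.0]])
interp = anisotropic_idw(pts, vals, grid, M)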
@article{diva2:729174,
author = {Ringaby, Erik and Forss\'{e}n, Per-Erik and Friman, Ola and Olsvik Opsahl, Thomas and Vegard Haavardsholm, Trym and Kåsen, Ingebjørg},
title = {{Anisotropic Scattered Data Interpolation for Pushbroom Image Rectification}},
journal = {IEEE Transactions on Image Processing},
year = {2014},
volume = {23},
number = {5},
pages = {2302--2314},
}
In this article we investigate the problem of human action recognition in static images. By action recognition we mean a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well for object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. We further demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.
@article{diva2:647854,
author = {Khan, Fahad Shahbaz and Rao, Muhammad Anwer and van de Weijer, Joost and Bagdanov, Andrew and Lopez, Antonio and Felsberg, Michael},
title = {{Coloring Action Recognition in Still Images}},
journal = {International Journal of Computer Vision},
year = {2013},
volume = {105},
number = {3},
pages = {205--221},
}
In this paper, we introduce a novel framework for low-level image processing and analysis. First, we process images with very simple, difference-based filter functions. Second, we fit the 2-parameter Weibull distribution to the filtered output. This maps each image to the 2D Weibull manifold. Third, we exploit the information geometry of this manifold and solve low-level image processing tasks as minimisation problems on point sets. As a proof-of-concept example, we examine the image autofocusing task. We propose appropriate cost functions together with a simple implicitly-constrained manifold optimisation algorithm and show that our framework compares very favourably against common autofocus methods from the literature. In particular, our approach exhibits the best overall performance in terms of combined speed and accuracy.
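A minimal sketch of the mapping onto the 2D Weibull manifold, assuming gradient magnitude as the difference-based filter and a maximum-likelihood fit with SciPy; the filter choice and the fitting routine are illustrative assumptions:

import numpy as np
from scipy.stats import weibull_min

def weibull_coordinates(image):
    """Map a grayscale image to a point (shape, scale) on the 2-parameter
    Weibull manifold by fitting the distribution of a simple difference-based
    filter response (here the gradient magnitude)."""
    gy, gx = np.gradient(image.astype(float))
    response = np.hypot(gx, gy).ravel()
    response = response[response > 0]              # Weibull support is (0, inf)
    shape, _, scale = weibull_min.fit(response, floc=0.0)
    return shape, scale

# toy usage: a sharp and a blurred image map to different points on the manifold
rng = np.random.default_rng(1)
sharp = rng.random((64, 64))
blurred = 0.25 * (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, 1, 1)
                  + np.roll(np.roll(sharp, 1, 0), 1, 1))
print(weibull_coordinates(sharp), weibull_coordinates(blurred))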
@article{diva2:614862,
author = {Zografos, Vasileios and Lenz, Reiner and Felsberg, Michael},
title = {{The Weibull manifold in low-level image processing: an application to automatic image focusing.}},
journal = {Image and Vision Computing},
year = {2013},
volume = {31},
number = {5},
pages = {401--417},
}
Dosimetric accuracy of radiation treatment planning in brachytherapy depends on knowledge of tissue composition. It has been speculated that soft tissues can be decomposed to water, lipid and protein. The aim of our work is to evaluate the accuracy of such tissue decomposition. Selected abdominal soft tissues, whose average elemental compositions were taken from literature, were decomposed using dual energy computed tomography to water, lipid and protein via the three-material decomposition method. The quality of the decomposition was assessed using relative differences between (i) mass energy absorption and (ii) mass energy attenuation coefficients of the analyzed and approximated tissues. It was found that the relative differences were less than 2% for photon energies larger than 10 keV. The differences were notably smaller than the ones for water as the transport and dose scoring medium. The choice of the water, protein and lipid triplet resulted in negative elemental mass fractions for some analyzed tissues. As negative elemental mass fractions cannot be used in general purpose particle transport computer codes using the Monte Carlo method, other triplets should be used for the decomposition. These triplets may further improve the accuracy of the approximation as the differences were mainly caused by the lack of high-Z materials in the water, protein and lipid triplet.
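A minimal sketch of the three-material decomposition step: the mass fractions of water, lipid and protein follow from a small linear system built from two dual-energy measurements and the constraint that the fractions sum to one. The coefficient values below are arbitrary placeholders, not tabulated attenuation data:

import numpy as np

def three_material_decomposition(measured, coeffs):
    """Solve for the mass fractions (w_water, w_lipid, w_protein).

    measured : length-2 vector of dual-energy measurements for the tissue
               (e.g. mass attenuation coefficients at the two effective energies)
    coeffs   : (2, 3) matrix with the corresponding values for the three basis
               materials, one column per material
    """
    A = np.vstack([coeffs, np.ones(3)])   # two measurement rows plus the sum-to-one row
    b = np.append(measured, 1.0)
    fractions = np.linalg.solve(A, b)
    return fractions                      # may contain negative entries, as noted above

# toy usage with placeholder numbers (illustrative only)
coeffs = np.array([[0.20, 0.18, 0.19],
                   [0.15, 0.12, 0.14]])
print(three_material_decomposition(np.array([0.19, 0.14]), coeffs))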
@article{diva2:609387,
author = {Malusek, Alexandr and Karlsson, Mattias and Magnusson, Maria and Alm Carlsson, Gudrun},
title = {{The potential of dual-energy computed tomography for quantitative decomposition of soft tissues to water, protein and lipid in brachytherapy}},
journal = {Physics in Medicine and Biology},
year = {2013},
volume = {58},
number = {4},
pages = {771--785},
}
Perception-action (P-A) learning is an approach to cognitive system building that seeks to reduce the complexity associated with conventional environment-representation/action-planning approaches. Instead, actions are directly mapped onto the perceptual transitions that they bring about, eliminating the need for intermediate representation and significantly reducing training requirements. We here set out a very general learning framework for cognitive systems in which online learning of the P-A mapping may be conducted within a symbolic processing context, so that complex contextual reasoning can influence the P-A mapping. In utilizing a variational calculus approach to define a suitable objective function, the P-A mapping can be treated as an online learning problem via gradient descent using partial derivatives. Our central theoretical result is to demonstrate top-down modulation of low-level perceptual confidences via the Jacobian of the higher levels of a subsumptive P-A hierarchy. Thus, the separation of the Jacobian as a multiplying factor between levels within the objective function naturally enables the integration of abstract symbolic manipulation in the form of fuzzy deductive logic into the P-A mapping learning. We experimentally demonstrate that the resulting framework achieves significantly better accuracy than using P-A learning without top-down modulation. We also demonstrate that it permits novel forms of context-dependent multilevel P-A mapping, applying the mechanism in the context of an intelligent driver assistance system.
@article{diva2:572598,
author = {Windridge, David and Felsberg, Michael and Shaukat, Affan},
title = {{A Framework for Hierarchical Perception--Action Learning Utilizing Fuzzy Reasoning}},
journal = {IEEE transactions on systems, man and cybernetics. Part B. Cybernetics},
year = {2013},
volume = {43},
number = {1},
pages = {155--169},
}
We propose a novel method for iterative learning of point correspondences between image sequences. Points moving on surfaces in 3D space are projected into two images. Given a point in either view, the considered problem is to determine the corresponding location in the other view. The geometry and distortions of the projections are unknown as is the shape of the surface. Given several pairs of point-sets but no access to the 3D scene, correspondence mappings can be found by excessive global optimization or by the fundamental matrix if a perspective projective model is assumed. However, an iterative solution on sequences of point-set pairs with general imaging geometry is preferable. We derive such a method that optimizes the mapping based on Neyman's chi-square divergence between the densities representing the uncertainties of the estimated and the actual locations. The densities are represented as channel vectors computed with a basis function approach. The mapping between these vectors is updated with each new pair of images such that fast convergence and high accuracy are achieved. The resulting algorithm runs in real-time and is superior to state-of-the-art methods in terms of convergence and accuracy in a number of experiments.
@article{diva2:539757,
author = {Felsberg, Michael and Larsson, Fredrik and Wiklund, Johan and Wadströmer, Niclas and Ahlberg, Jörgen},
title = {{Online Learning of Correspondences between Images}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2013},
volume = {35},
number = {1},
pages = {118--129},
}
This article presents a method for rectifying and stabilising video from cell-phones with rolling shutter (RS) cameras. Due to size constraints, cell-phone cameras have constant, or near constant focal length, making them an ideal application for calibrated projective geometry. In contrast to previous RS rectification attempts that model distortions in the image plane, we model the 3D rotation of the camera. We parameterise the camera rotation as a continuous curve, with knots distributed across a short frame interval. Curve parameters are found using non-linear least squares over inter-frame correspondences from a KLT tracker. By smoothing a sequence of reference rotations from the estimated curve, we can at a small extra cost, obtain a high-quality image stabilisation. Using synthetic RS sequences with associated ground-truth, we demonstrate that our rectification improves over two other methods. We also compare our video stabilisation with the methods in iMovie and Deshaker.
@article{diva2:505943,
author = {Ringaby, Erik and Forss\'{e}n, Per-Erik},
title = {{Efficient Video Rectification and Stabilisation for Cell-Phones}},
journal = {International Journal of Computer Vision},
year = {2012},
volume = {96},
number = {3},
pages = {335--352},
}
This work proposes an approach to tracking by regression that uses no hard-coded models and no offline learning stage. The Linear Predictor (LP) tracker has been shown to be highly computationally efficient, resulting in fast tracking. Regression tracking techniques tend to require offline learning to learn suitable regression functions. This work removes the need for offline learning and therefore increases the applicability of the technique. The online-LP tracker can simply be seeded with an initial target location, akin to the ubiquitous Lucas-Kanade algorithm that tracks by registering an image template via minimisation. A fundamental issue for all trackers is the representation of the target appearance and how this representation is able to adapt to changes in target appearance over time. The two proposed methods, LP-SMAT and LP-MED, demonstrate the ability to adapt to large appearance variations by incrementally building an appearance model that identifies modes or aspects of the target appearance and associates these aspects to the Linear Predictor trackers to which they are best suited. Experiments comparing and evaluating regression and registration techniques are presented along with performance evaluations favourably comparing the proposed tracker and appearance model learning methods to other state of the art simultaneous modelling and tracking approaches.
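A minimal sketch of the linear-predictor idea underlying the tracker, assuming the predictor is estimated by least squares from synthetic perturbations of the seed region (the paper's contribution is to learn and adapt such predictors online, without an offline stage):

import numpy as np
from scipy.ndimage import gaussian_filter

def learn_linear_predictor(image, center, support, n_train=300, max_disp=5, rng=None):
    """Learn P such that displacement is approximately P @ (I(support) - I(support + d))."""
    rng = np.random.default_rng() if rng is None else rng
    ref = image[center[1] + support[:, 1], center[0] + support[:, 0]].astype(float)
    D = np.empty((2, n_train))                 # synthetic displacements
    Y = np.empty((len(support), n_train))      # corresponding intensity differences
    for i in range(n_train):
        d = rng.integers(-max_disp, max_disp + 1, size=2)
        shifted = image[center[1] + d[1] + support[:, 1],
                        center[0] + d[0] + support[:, 0]].astype(float)
        D[:, i] = d
        Y[:, i] = ref - shifted
    P = D @ np.linalg.pinv(Y)                  # least-squares linear predictor
    return P, ref

def predict_displacement(P, ref, image, center, support):
    cur = image[center[1] + support[:, 1], center[0] + support[:, 0]].astype(float)
    return P @ (ref - cur)

# toy usage on a smoothed synthetic image
rng = np.random.default_rng(0)
img = gaussian_filter(rng.random((200, 200)), 3.0)
support = rng.integers(-10, 11, size=(100, 2))     # sparse sampling pattern around the target
P, ref = learn_linear_predictor(img, center=(100, 100), support=support, rng=rng)
print(predict_displacement(P, ref, img, center=(102, 97), support=support))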
@article{diva2:444819,
author = {Ellis, Liam and Dowson, Nicholas and Matas, Jiri and Bowden, Richard},
title = {{Linear Regression and Adaptive Appearance Models for Fast Simultaneous Modelling and Tracking}},
journal = {International Journal of Computer Vision},
year = {2011},
volume = {95},
number = {2},
pages = {154--179},
}
Recently, the neural network based diagnosis of medical diseases has attracted a great deal of attention. In this paper a parallel feed-forward neural network structure is used in the prediction of Parkinson's Disease. The main idea of this paper is to use more than one neural network to reduce the possibility of an erroneous decision. The output of each neural network is evaluated by using a rule-based system for the final decision. Another important point in this paper is that during the training process, unlearned data of each neural network is collected and used in the training set of the next neural network. The designed parallel network system significantly increased the robustness of the prediction. A set of nine parallel neural networks yielded an improvement of 8.4% on the prediction of Parkinson's Disease compared to a single network. Furthermore, it is demonstrated that the designed system, to some extent, deals with the problems of imbalanced data sets.
@article{diva2:424219,
author = {Åström, Freddie and Koker, Rasit},
title = {{A parallel neural network approach to prediction of Parkinson´s Disease}},
journal = {Expert systems with applications},
year = {2011},
volume = {38},
number = {10},
pages = {12470--12474},
}
In this article we describe a set of canonical transformations of the image spaces that make the description of three-view geometry very simple. The transformations depend on the three-view geometry, and the canonically transformed trifocal tensor T' takes the form of a sparse array in which 17 elements in well-defined positions are zero. It has a linear relation to the camera matrices and to two of the fundamental matrices, a third-order relation to the third fundamental matrix, a second-order relation to the other two trifocal tensors, and first-order relations to the 10 three-view all-point matching constraints. In this canonical form, it is also simple to determine if the corresponding camera configuration is degenerate or co-linear. An important property of the three canonical transformations of the image spaces is that they are in SO(3). The 9 parameters needed to determine these transformations and the 9 parameters that determine the elements of T' together provide a minimal parameterization of the tensor. It does not have problems with multiple maps or multiple solutions that other parameterizations have, and is therefore simple to use. It also provides an implicit representation of the trifocal internal constraints: the sparse canonical representation of the trifocal tensor can be determined if and only if it is consistent with its internal constraints. In the non-ideal case, the canonical transformation can be determined by solving a minimization problem and a simple algorithm for determining the solution is provided. This allows us to extend the standard linear method for estimation of the trifocal tensor to include a constraint enforcement as a final step, similar to the constraint enforcement of the fundamental matrix.
Experimental evaluation of this extended linear estimation method shows that it significantly reduces the geometric error of the resulting tensor, but on average the algebraic estimation method is even better. For a small percentage of cases, however, the extended linear method gives a smaller geometric error, implying that it can be used as a complement to the algebraic method for these cases.
@article{diva2:409186,
author = {Nordberg, Klas},
title = {{The Key to Three-View Geometry}},
journal = {International Journal of Computer Vision},
year = {2011},
volume = {94},
number = {3},
pages = {282--294},
}
Fourier descriptors (FDs) are a classical but still popular method for contour matching. The key idea is to apply the Fourier transform to a periodic representation of the contour, which results in a shape descriptor in the frequency domain. FDs are most commonly used to compare object silhouettes and object contours; the authors instead use this well-established machinery to describe local regions to be used in an object-recognition framework. Many approaches to matching FDs are based on the magnitude of each FD component, thus ignoring the information contained in the phase. Keeping the phase information requires us to take into account the global rotation of the contour and shifting of the contour samples. The authors show that the sum-of-squared differences of FDs can be computed without explicitly de-rotating the contours. The authors compare correlation-based matching against affine-invariant Fourier descriptors (AFDs) and WARP-matched FDs and demonstrate that the correlation-based approach outperforms AFDs and WARP on real data. As a practical application, the authors demonstrate the proposed correlation-based matching on a road sign recognition task.
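A minimal sketch of correlation-based matching of Fourier descriptors, where the maximum magnitude of the circular cross-correlation of two descriptor vectors absorbs both the unknown starting point and the global rotation; the contour normalisation details are illustrative assumptions:

import numpy as np

def fourier_descriptors(contour):
    """contour : (N, 2) array of ordered boundary points of a closed contour."""
    z = contour[:, 0] + 1j * contour[:, 1]      # complex contour representation
    F = np.fft.fft(z)
    F[0] = 0.0                                  # remove translation
    norm = np.linalg.norm(F)
    return F / (norm if norm > 0 else 1.0)      # scale normalisation

def fd_similarity(A, B):
    """Maximum correlation of two FD vectors over start-point shifts; taking the
    magnitude also absorbs the unknown global rotation (a constant phase)."""
    corr = np.fft.ifft(A * np.conj(B)) * A.size   # circular cross-correlation
    return np.abs(corr).max()

# toy usage: a contour matched against a rotated, shifted-start copy of itself
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
c1 = np.stack([np.cos(t), 0.5 * np.sin(t)], axis=1)
R = np.array([[np.cos(0.7), -np.sin(0.7)], [np.sin(0.7), np.cos(0.7)]])
c2 = np.roll(c1 @ R.T, 11, axis=0)
print(fd_similarity(fourier_descriptors(c1), fourier_descriptors(c2)))   # close to 1
print(fd_similarity(fourier_descriptors(c1),
                    fourier_descriptors(np.random.default_rng(0).random((64, 2)))))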
@article{diva2:397235,
author = {Larsson, Fredrik and Felsberg, Michael and Forssen, Per-Erik},
title = {{Correlating Fourier descriptors of local patches for road sign recognition}},
journal = {IET Computer Vision},
year = {2011},
volume = {5},
number = {4},
pages = {244--254},
}
In this paper, we present a novel scheme for anisotropic diffusion driven by the image autocorrelation function. We show the equivalence of this scheme to a special case of iterated adaptive filtering. By determining the diffusion tensor field from an autocorrelation estimate, we obtain an evolution equation that is computed from a scalar product of the diffusion tensor and the image Hessian. We further propose a set of filters to approximate the Hessian on a minimized spatial support. On standard benchmarks, the resulting method performs favorably in many cases, in particular at low noise levels. In a GPU implementation, video real-time performance is easily achieved.
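A schematic sketch of one explicit update step of this type of tensor-driven diffusion: a structure-tensor (autocorrelation) estimate steers a diffusion tensor, and the update is the scalar product of that tensor with the image Hessian. The mapping from autocorrelation to diffusion tensor and the step size are illustrative choices, not the ones derived in the paper:

import numpy as np
from scipy.ndimage import gaussian_filter

def diffusion_step(u, dt=0.1, sigma=2.0, alpha=1e-3):
    """One explicit update u <- u + dt * <D, H>, with D steered by the local
    autocorrelation (structure tensor) estimate and H the image Hessian."""
    uy, ux = np.gradient(u)
    # structure tensor / local autocorrelation estimate
    Jxx = gaussian_filter(ux * ux, sigma)
    Jxy = gaussian_filter(ux * uy, sigma)
    Jyy = gaussian_filter(uy * uy, sigma)
    trace = Jxx + Jyy + alpha
    # illustrative diffusivity: diffuse less across strong structures
    Dxx = (Jyy + alpha) / trace
    Dyy = (Jxx + alpha) / trace
    Dxy = -Jxy / trace
    # image Hessian
    uxy, uxx = np.gradient(ux)
    uyy, _ = np.gradient(uy)
    return u + dt * (Dxx * uxx + 2.0 * Dxy * uxy + Dyy * uyy)

# toy usage: a few diffusion steps on a noisy smooth image
rng = np.random.default_rng(0)
img = gaussian_filter(rng.random((128, 128)), 4) + 0.1 * rng.standard_normal((128, 128))
for _ in range(20):
    img = diffusion_step(img)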
@article{diva2:395634,
author = {Felsberg, Michael},
title = {{Autocorrelation-Driven Diffusion Filtering}},
journal = {IEEE Transactions on Image Processing},
year = {2011},
volume = {20},
number = {7},
pages = {1797--1806},
}
In this paper, we study object recognition in the embodied setting. More specifically, we study the problem of whether the recognition system will benefit from acquiring another observation of the object under study, or whether it is time to give up, and report the observed object as unknown. We describe the hardware and software of a system that implements recognition and object permanence as two nested perception-action cycles. We have collected three data sets of observation sequences that allow us to perform controlled evaluation of the system behavior. Our recognition system uses a KNN classifier with bag-of-features prototypes. For this classifier, we have designed and compared three different uncertainty measures for target observation. These measures allow the system to (a) decide whether to continue to observe an object or to move on, and to (b) decide whether the observed object is previously seen or novel. The system is able to successfully reject all novel objects as “unknown”, while still recognizing most of the previously seen objects.
@article{diva2:378735,
author = {Wallenberg, Marcus and Forss\'{e}n, Per-Erik},
title = {{Embodied Object Recognition using Adaptive Target Observations}},
journal = {Cognitive Computation},
year = {2010},
volume = {2},
number = {4},
pages = {316--325},
}
Recent years have seen advances in the estimation of full 6 degree-of-freedom object pose from a single 2D image. These advances have often been presented as a result of, or together with, a new local image feature type. This paper examines how the pose accuracy and recognition robustness for such a system varies with the choice of feature type. This is done by evaluating a full 6 degree-of-freedom pose estimation system for 17 different combinations of local descriptors and detectors. The evaluation is done on data sets with photos of challenging 3D objects with simple and complex backgrounds and varying illumination conditions. We examine the performance of the system under varying levels of object occlusion and we find that many features allow considerable object occlusion. From the experiments we can conclude that duplet features, which use pairs of interest points, improve pose estimation accuracy compared to single point features. Interestingly, we can also show that many features previously used for recognition and wide-baseline stereo are unsuitable for pose estimation, one notable example being the affine covariant features, which have proven quite successful in other applications. The data sets and their ground truths are available on the web to allow future comparison with novel algorithms.
@article{diva2:325003,
author = {Viksten, Fredrik and Forss\'{e}n, Per-Erik and Johansson, Björn and Moe, Anders},
title = {{Local Image Descriptors for Full 6 Degree-of-Freedom Object Pose Estimation and Recognition}},
journal = {},
year = {2010},
}
This article presents a computationally efficient approach to the triangulation of 3D points from their projections in two views. The homogeneous coordinates of a 3D point are given as a multi-linear mapping on its homogeneous image coordinates, a computation of low computational complexity. The multi-linear mapping is a tensor, and an element of a projective space, that can be computed directly from the camera matrices and some parameters. These parameters imply that the tensor is not unique: for a given camera pair the subspace K of triangulation tensors is six-dimensional. The triangulation tensor is 3D projective covariant and satisfies a set of internal constraints. Reconstruction of 3D points using the proposed tensor is studied for the non-ideal case, when the image coordinates are perturbed by noise and the epipolar constraint is not satisfied exactly. A particular tensor of K is then the optimal choice for a simple reduction of 3D errors, and we present a computationally efficient approach for determining this tensor. This approach implies that normalizing image coordinate transformations are important for obtaining small 3D errors.
In addition to computing the tensor from the cameras, we also investigate how it can be further optimized relative to error measures in the 3D and 2D spaces. This optimization is evaluated for sets of real 3D + 2D + 2D data by comparing the reconstruction to some of the triangulation methods found in the literature, in particular the so-called optimal method that minimizes 2D L2 errors. The general conclusion is that, depending on the choice of error measure and the optimization implementation, it is possible to find a tensor that produces smaller 3D errors (both L1 and L2) but slightly larger 2D errors than the optimal method does. Alternatively, we may find a tensor that gives approximately comparable results to the optimal method in terms of both 3D and 2D errors. This means that the proposed tensor based method of triangulation is both computationally efficient and can be calibrated to produce small reconstruction or reprojection errors for a given data set.
@article{diva2:271768,
author = {Nordberg, Klas},
title = {{The triangulation tensor}},
journal = {Computer Vision and Image Understanding},
year = {2009},
volume = {113},
number = {9},
pages = {935--945},
}
The major goal of the COSPAL project is to develop an artificial cognitive system architecture, with the ability to autonomously extend its capabilities. Exploratory learning is one strategy that allows an extension of competences as provided by the environment of the system. Whereas classical learning methods aim at best for a parametric generalization, i.e., concluding from a number of samples of a problem class to the problem class itself, exploration aims at applying acquired competences to a new problem class, and at applying generalization on a conceptual level, resulting in new models. Incremental or online learning is a crucial requirement to perform exploratory learning. In the COSPAL project, we mainly investigate reinforcement-type learning methods for exploratory learning, and in this paper we focus on the organization of cognitive systems for efficient operation. Learning is used over the entire system. It is organized in the form of four nested loops, where the outermost loop reflects the user-reinforcement-feedback loop, the intermediate two loops switch between different solution modes at the symbolic and sub-symbolic levels, respectively, and the innermost loop performs the acquired competences in terms of perception-action cycles. We present a system diagram which explains this process in more detail. We discuss the learning strategy in terms of learning scenarios provided by the user. This interaction between user (teacher) and system is a major difference to classical robotics systems, where the system designer places his world model into the system. We believe that this is the key to extendable robust system behavior and successful interaction of humans and artificial cognitive systems. We furthermore address the issue of bootstrapping the system, and, in particular, the visual recognition module. We give some more in-depth details about our recognition method and how feedback from higher levels is implemented. The described system is, however, work in progress and no final results are available yet. The preliminary results achieved so far clearly point towards a successful proof of the architecture concept.
@article{diva2:240894,
author = {Felsberg, Michael and Wiklund, Johan and Granlund, Gösta},
title = {{Exploratory learning structures in artificial cognitive systems}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {11},
pages = {1671--1687},
}
Channel-coded feature maps (CCFMs) represent arbitrary image features using multi-dimensional histograms with soft and overlapping bins. This representation can be seen as a generalization of the SIFT descriptor, where one advantage is that it is better suited for computing derivatives with respect to image transformations. Using these derivatives, a local optimization of image scale, rotation and position relative to a reference view can be computed. If piecewise polynomial bin functions are used, e.g. B-splines, these histograms can be computed by first encoding the data set into a histogram-like representation with non-overlapping multi-dimensional monomials as bin functions. This representation can then be processed using multi-dimensional convolutions to obtain the desired representation. This allows much of the computation for the derivatives to be reused. By comparing the complexity of this method to direct encoding, it is found that the piecewise method is preferable for large images and smaller patches with few channels, which makes it useful, e.g., in early steps of coarse-to-fine approaches.
@article{diva2:240892,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Efficient computation of channel-coded feature maps through piecewise polynomials}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {11},
pages = {1688--1694},
}
@article{diva2:240891,
author = {Granlund, Gösta},
title = {{Special issue on Perception, Action and Learning}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {11},
pages = {1639--1640},
}
In this paper, we present a visual servoing method based on a learned mapping between feature space and control space. Using a suitable recognition algorithm, we present and evaluate a complete method that simultaneously learns the appearance and control of a low-cost robotic arm. The recognition part is trained using an action precedes perception approach. The novelty of this paper, apart from the visual servoing method per se, is the combination of visual servoing with gripper recognition. We show that we can achieve high precision positioning without knowing in advance what the robotic arm looks like or how it is controlled.
@article{diva2:240889,
author = {Larsson, Fredrik and Jonsson, Erik and Felsberg, Michael},
title = {{Simultaneously learning to recognize and control a low-cost robotic arm}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {11},
pages = {1729--1739},
}
This paper presents a method that combines shadow detection and a 3D box model including shadow simulation for estimation of the size and position of vehicles. We define a similarity measure between a simulated image of a 3D box, including the box shadow, and a captured image that is classified into background/foreground/shadow. The similarity measure is used in an optimization procedure to find the optimal box state. It is shown in a number of experiments and examples how the combination of shadow detection and simulation improves the estimation compared to just using detection or simulation, especially when the shadow detection or the simulation is inaccurate. We also describe a tracking system that utilizes the estimated 3D boxes, including highlight detection, a spatial window instead of a time-based window for predicting heading, and refined box size estimates obtained by weighting accumulated estimates depending on view. Finally, we show example results.
@article{diva2:224936,
author = {Johansson, Björn and Wiklund, Johan and Forss\'{e}n, Per-Erik and Granlund, Gösta},
title = {{Combining shadow detection and simulation for estimation of vehicle size and position}},
journal = {Pattern Recognition Letters},
year = {2009},
volume = {30},
number = {8},
pages = {751--759},
}
Intrinsic dimensionality is a concept introduced by statistics and later used in image processing to measure the dimensionality of a data set. In this paper, we introduce a continuous representation of the intrinsic dimension of an image patch in terms of its local spectrum or, equivalently, its gradient field. By making use of a cone structure and barycentric co-ordinates, we can associate three confidences to the three different ideal cases of intrinsic dimensions corresponding to homogeneous image patches, edge-like structures and junctions. The main novelty of our approach is the representation of confidences as prior probabilities which can be used within a probabilistic framework. To show the potential of our continuous representation, we highlight applications in various contexts such as image structure classification, feature detection and localisation, visual scene statistics and optic flow evaluation.
@article{diva2:214515,
author = {Felsberg, Michael and Kalkan, Sinan and Krüger, Norbert},
title = {{Continuous dimensionality characterization of image structures}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {6},
pages = {628--636},
}
This article introduces a new region based feature for object recognition and image matching. In contrast to many other region based features, this one makes use of colour in the feature extraction stage. We perform experiments on the repeatability rate of the features across scale and inclination angle changes, and show that avoiding merging regions connected by only a few pixels improves the repeatability. We introduce two voting schemes that allow us to find correspondences automatically, and compare them with respect to the number of valid correspondences they give, and their inlier ratios. We also demonstrate how the matching procedure can be applied to colour correction.
@article{diva2:133538,
author = {Forssen, Per-Erik and Moe, Anders},
title = {{View matching with blob features}},
journal = {Image and Vision Computing},
year = {2009},
volume = {27},
number = {1-2},
pages = {99--107},
}
@article{diva2:265735,
author = {Felsberg, Michael},
title = {{COSPAL -- A Study on Artificial Cognitive Systems}},
journal = {Engineering \& technology},
year = {2008},
volume = {3},
number = {18},
}
Inspired by the early visual system of many mammalians we consider the construction of, and reconstruction from, an orientation score U_f : R^2 × S^1 → C as a local orientation representation of an image f : R^2 → R. The mapping f ↦ U_f is a wavelet transform W_ψ corresponding to a reducible representation of the Euclidean motion group onto L^2(R^2) and an oriented wavelet ψ ∈ L^2(R^2). This wavelet transform is a special case of a recently developed generalization of the standard wavelet theory and has the practical advantage over the usual wavelet approaches in image analysis (constructed by irreducible representations of the similitude group) that it allows a stable reconstruction from one (single scale) orientation score. Since our wavelet transform is a unitary mapping with stable inverse, we directly relate operations on orientation scores to operations on images in a robust manner.
Furthermore, by geometrical examination of the Euclidean motion group G = R^2 ⋊ T, which is the domain of our orientation scores, we deduce that an operator Φ on orientation scores must be left invariant to ensure that the corresponding operator W_ψ^{-1} Φ W_ψ on images is Euclidean invariant. As an example we consider all linear second order left invariant evolutions on orientation scores corresponding to stochastic processes on G. As an application we detect elongated structures in (medical) images and automatically close the gaps between them.
Finally, we consider robust orientation estimates by means of channel representations, where we combine robust orientation estimation and learning of wavelets resulting in an auto-associative processing of orientation features. Here linear averaging of the channel representation is equivalent to robust orientation estimation and an adaptation of the wavelet to the statistics of the considered image class leads to an auto-associative behavior of the system.
@article{diva2:262428,
author = {Duits, Remco and Felsberg, Michael and Granlund, Gösta and ter Haar Romeny, Bart M.},
title = {{Image Analysis and Reconstruction using a Wavelet Transform Constructed from a Reducible Representation of the Euclidean Motion Group}},
journal = {International Journal of Computer Vision},
year = {2007},
volume = {72},
number = {1},
pages = {79--102},
}
In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases video real-time. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted in each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between the frame-by-frame tracking and the pose recognition.
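A minimal sketch of step (3), the least-squares comparison of a query descriptor against prototype descriptors followed by pose interpolation; the toy descriptor and the unconstrained least-squares solve are illustrative assumptions (the actual P-channel construction is more involved):

import numpy as np

def descriptor(a):
    """Toy stand-in for a P-channel descriptor: a soft histogram that varies
    smoothly with a one-dimensional pose parameter a (illustrative only)."""
    centers = np.linspace(0.0, np.pi, 20)
    return np.exp(-((centers - a) ** 2) / 0.05)

def interpolate_pose(query, prototypes, poses):
    """Least-squares comparison of a query descriptor against prototype
    descriptors (columns of `prototypes`), followed by pose interpolation."""
    w, *_ = np.linalg.lstsq(prototypes, query, rcond=None)
    w = w / w.sum()                      # normalise the weights for interpolation
    return poses.T @ w

# toy usage: a small database of prototype views with known 1D poses
angles = np.linspace(0.0, np.pi, 8)
prototypes = np.stack([descriptor(a) for a in angles], axis=1)
poses = angles[:, None]
print(interpolate_pose(descriptor(0.9), prototypes, poses))   # close to 0.9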
@article{diva2:260354,
author = {Felsberg, Michael and Hedborg, Johan},
title = {{Real-Time View-Based Pose Recognition and Interpolation for Tracking Initialization}},
journal = {Journal of Real-Time Image Processing},
year = {2007},
volume = {2},
number = {2-3},
pages = {103--115},
}
Augmented reality is a growing field, with many diverse applications ranging from TV and film production, to industrial maintenance, medicine, education, entertainment and games. The central idea is to add virtual objects into a real scene, either by displaying them in a see-through head-mounted display, or by superimposing them on an image of the scene captured by a camera. Depending on the application, the added objects might be virtual characters in a TV or film production, instructions for repairing a car engine, or a reconstruction of an archaeological site. For the effect to be believable, the virtual objects must appear rigidly fixed to the real world, which requires the accurate real-time measurement of the position of the camera or the user's head. Present technology cannot achieve this without resorting to systems that require a significant infrastructure in the operating environment, severely restricting the range of possible applications.
@article{diva2:260355,
author = {Felsberg, Michael and Koch, Reinhard},
title = {{Editorial for the special issue on markerless real-time tracking for augmented reality image synthesis}},
journal = {Journal of Real-Time Image Processing},
year = {2007},
volume = {2},
number = {2-3},
pages = {67--68},
}
In order to insert a virtual object into a TV image, the graphics system needs to know precisely how the camera is moving, so that the virtual object can be rendered in the correct place in every frame. Nowadays this can be achieved relatively easily in post-production, or in a studio equipped with a special tracking system. However, for live shooting on location, or in a studio that is not specially equipped, installing such a system can be difficult or uneconomic. To overcome these limitations, the MATRIS project is developing a real-time system for measuring the movement of a camera. The system uses image analysis to track naturally occurring features in the scene, and data from an inertial sensor. No additional sensors, special markers, or camera mounts are required. This paper gives an overview of the system and presents some results.
@article{diva2:259717,
author = {Chandaria, Jigna and Thomas, Graham and Bartczak, Bogumil and Koch, Reinhard and Becker, Mario and Bleser, Gabriele and Stricker, Didier and Wohlleber, Cedric and Gustafsson, Fredrik and Felsberg, Michael and Hol, Jeroen and Schön, Thomas and Skoglund, Johan and Slycke, Per and Smeitz, Sebastiaan},
title = {{Real-Time Camera Tracking in the MATRIS Project}},
journal = {Smpte Journal},
year = {2007},
volume = {116},
number = {7-8},
pages = {266--271},
}
This paper brings together a novel information representation model for use in signal processing and computer vision problems, with a particular algorithmic development of the Landweber iterative algorithm. The information representation model allows a representation of multiple values for a variable as well as an expression for confidence. Both properties are important for effective computation using multi-level models, where a choice between models will be implementable as part of the optimization process. It is shown that in this way the algorithm can deal with a class of high-dimensional, sparse, and constrained least-squares problems, which arise in various computer vision learning tasks, such as object recognition and object pose estimation. While the algorithm has been applied to the solution of such problems, it has so far been used heuristically. In this paper we describe the properties and some of the peculiarities of the channel representation and optimization, and put them on firm mathematical ground. We consider the optimization a convexly constrained weighted least-squares problem and propose for its solution a projected Landweber method which employs oblique projections onto the closed convex constraint set. We formulate the problem, present the algorithm and work out its convergence properties, including a rate-of-convergence result. The results are put in perspective with currently available projected Landweber methods. An application to supervised learning is described, and the method is evaluated in an experiment involving function approximation, as well as application to transient signals.
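A minimal sketch of a projected Landweber iteration for a constrained least-squares problem min ||Ax - b||^2 with x restricted to a convex set C; here C is simply the non-negative orthant with an orthogonal projection, whereas the paper employs oblique projections onto a more general closed convex constraint set:

import numpy as np

def projected_landweber(A, b, n_iter=500, omega=None):
    """Projected Landweber iteration x <- P_C(x + omega * A^T (b - A x)),
    with P_C the (orthogonal) projection onto the non-negative orthant."""
    if omega is None:
        omega = 1.0 / np.linalg.norm(A, 2) ** 2    # step size below 2 / ||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + omega * A.T @ (b - A @ x)          # Landweber (gradient) step
        x = np.maximum(x, 0.0)                     # projection onto the constraint set
    return x

# toy usage: non-negative least squares
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 40))
x_true = np.maximum(rng.standard_normal(40), 0.0)
b = A @ x_true + 0.01 * rng.standard_normal(80)
print(np.linalg.norm(projected_landweber(A, b) - x_true))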
@article{diva2:271150,
author = {Johansson, Björn and Elfving, Tommy and Kozlov, Vladimir and Censor, Y. and Forss\'{e}n, Per-Erik and Granlund, Gösta},
title = {{The application of an oblique-projected Landweber method to a model of supervised learning}},
journal = {Mathematical and computer modelling},
year = {2006},
volume = {43},
number = {7-8},
pages = {892--909},
}
In this paper, we present a new and efficient method to implement robust smoothing of low-level signal features: B-spline channel smoothing. This method consists of three steps: encoding of the signal features into channels, averaging of the channels, and decoding of the channels. We show that linear smoothing of channels is equivalent to robust smoothing of the signal features if we make use of quadratic B-splines to generate the channels. The linear decoding from B-spline channels allows the derivation of a robust error norm, which is very similar to Tukey's biweight error norm. We compare channel smoothing with three other robust smoothing techniques: nonlinear diffusion, bilateral filtering, and mean-shift filtering, both theoretically and on a 2D orientation-data smoothing task. Channel smoothing is found to be superior in four respects: It has a lower computational complexity, it is easy to implement, it chooses the global minimum error instead of the nearest local minimum, and it can also be used on nonlinear spaces, such as orientation space.
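A minimal sketch of the three steps (encode, average, decode) for a 1D signal using quadratic B-spline channels; the decoding below is a simplified weighted average over the channels around the strongest response rather than the exact linear decoding derived in the paper:

import numpy as np

def b2(t):
    """Quadratic B-spline basis function (support [-1.5, 1.5])."""
    t = np.abs(t)
    return np.where(t < 0.5, 0.75 - t ** 2,
                    np.where(t < 1.5, 0.5 * (1.5 - t) ** 2, 0.0))

def encode(x, centers):
    """Channel encoding: one soft, overlapping B-spline channel per center."""
    h = centers[1] - centers[0]                         # channel spacing
    return b2((x[:, None] - centers[None, :]) / h)      # (N, K) channel matrix

def decode(c, centers):
    """Simplified decoding: weighted mean of the three channels around the
    strongest response (the paper derives an exact linear decoding)."""
    k = np.clip(np.argmax(c, axis=1), 1, len(centers) - 2)
    idx = k[:, None] + np.array([-1, 0, 1])
    w = np.take_along_axis(c, idx, axis=1)
    return (w * centers[idx]).sum(axis=1) / w.sum(axis=1)

# robust smoothing of a noisy step signal: encode, average channels locally, decode
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(100), np.ones(100)]) + 0.1 * rng.standard_normal(200)
centers = np.linspace(-0.5, 1.5, 9)
C = encode(x, centers)
kernel = np.ones(15) / 15.0
C_avg = np.stack([np.convolve(C[:, k], kernel, mode='same') for k in range(C.shape[1])], axis=1)
smoothed = decode(C_avg, centers)
print(smoothed[:5], smoothed[-5:])    # close to 0 and close to 1, respectively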
@article{diva2:270945,
author = {Felsberg, Michael and Forssen, P.-E. and Scharr, H.},
title = {{Channel smoothing:
Efficient robust smoothing of low-level signal features}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2006},
volume = {28},
number = {2},
pages = {209--222},
}
A runtime system for implementation of image processing operations is presented. It is designed for working in a flexible and distributed environment related to the software architecture of a newly developed UAV system. The software architecture can be characterized at a coarse scale as a layered system, with a deliberative layer at the top, a reactive layer in the middle, and a processing layer at the bottom. At a finer scale each of the three levels is decomposed into sets of modules which communicate using CORBA, allowing system development and deployment on the UAV to be made in a highly flexible way. Image processing takes place in a dedicated module located in the process layer, and is the main focus of the paper. This module has been designed as a runtime system for data flow graphs, allowing various processing operations to be created online and on demand by the higher levels of the system. The runtime system is implemented in Java, which allows development and deployment to be made on a wide range of hardware/software configurations. Optimizations for particular hardware platforms have been made using Java's native interface.
@article{diva2:268968,
author = {Nordberg, Klas and Doherty, Patrick and Forss\'{e}n, Per-Erik and Wiklund, Johan and Andersson, Per},
title = {{A flexible runtime system for image processing in a distributed computational environment for an unmanned aerial vehicle}},
journal = {International Journal of Pattern Recognition and Artificial Intelligence},
year = {2006},
volume = {20},
number = {5},
pages = {763--780},
}
This paper describes a method for vision-based unmanned aerial vehicle (UAV) motion estimation from multiple planar homographies. The paper also describes the determination of the relative displacement between different UAVs employing techniques for blob feature extraction and matching. It then presents and shows experimental results of the application of the proposed technique to multi-UAV detection of forest fires.
@article{diva2:262438,
author = {Merino, Luis and Wiklund, Johan and Caballero, Fernando and Moe, Anders and Martinez-de Dios, Jose Ramiro and Forss\'{e}n, Per-Erik and Nordberg, Klas and Ollero, Annibal},
title = {{Vision-Based Multi-UAV Position Estimation}},
journal = {IEEE Robotics \& Automation Magazine},
year = {2006},
volume = {13},
number = {3},
pages = {53--62},
}
In medical helical cone-beam CT, it is common that the region-of-interest (ROI) is contained inside the helix cylinder, while the complete object is long and extends outside the top and the bottom of the cylinder. This is the Long Object Problem. Analytical reconstruction methods for helical cone-beam CT have been designed to handle this problem. It has been shown that a moderate amount of over-scanning is sufficient for reconstruction of a certain ROI. The over-scanning projection rays travel both through the ROI and outside it. This is unfortunate for iterative methods since it seems impossible to compute accurate values for the projection rays which travel partly inside and partly outside the ROI. Therefore, it seems that the useful ROI will diminish with every iteration step. We propose the following solution to the problem. Firstly, we reconstruct volume regions also outside the ROI. These volume regions will certainly be incompletely reconstructed, but our experimental results show that they serve well for projection generation. This is rather counter-intuitive and contradictory to our initial assumptions. Secondly, we use careful extrapolation and masking of projection data. This is not a general necessity, but needed for the chosen iterative algorithm, which includes rebinning and iterative filtered backprojection. Our idea here was to use an approximate reconstruction method which gives cone-beam artifacts and then improve the reconstructed result by iterative filtered backprojection. The experimental results seem very encouraging. The cone-beam artifacts can indeed be removed. Even voxels close to the boundary of the ROI are enhanced by the iterative loop as well as those in the middle of the ROI.
@article{diva2:258148,
author = {Magnusson, Maria and Danielsson, Per-Erik and Sunnegårdh, Johan},
title = {{Handling of Long Objects in Iterative Improvement of Non-Exact Reconstruction in Helical Cone-Beam CT}},
journal = {IEEE Transactions on Medical Imaging},
year = {2006},
volume = {25},
number = {7},
pages = {935--940},
}
A fundamental property of cognitive vision systems is that they shall be extendable, which requires that they can both acquire and store information autonomously. The paper discusses the organization of systems to allow this, and proposes an architecture for cognitive vision systems. The architecture consists of two parts. The first part learns, step by step, a mapping from percepts directly onto actions or states. In the learning phase, action precedes perception, as action space is much less complex. This requires a semantic information representation, allowing computation and storage with respect to similarity. The second part uses invariant or symbolic representations, which are derived mainly from system and action states. Through active exploration, a system builds up concept spaces or models. This allows the system to subsequently acquire information using passive observation or language. The structure has been used to learn object properties, and constitutes the basic concepts for the European project COSPAL, within the IST programme.
@article{diva2:258030,
author = {Granlund, Gösta},
title = {{A Cognitive Vision Architecture Integrating Neural Networks with Symbolic Processing}},
journal = {Künstliche Intelligenz},
year = {2006},
number = {2},
pages = {18--24},
}
Tensors have become a popular tool for representation of local orientation and can also be used for estimation of velocity. A number of computational approaches have been presented for tensor estimation which, however, are difficult to analyze or compare since there has been no common framework in which analysis or comparisons can be made. In this article, we propose such a framework based on second-order filters and show how it applies to three different methods for tensor estimation. The framework contains a few conditions on the filters which are sufficient to obtain correctly oriented rank one tensors for the case of simple signals. It also allows the derivation of explicit expressions for the variation of the tensor across oriented structures which, e.g., can be used to formulate conditions for phase invariance.
@article{diva2:266999,
author = {Nordberg, Klas and Farnebäck, Gunnar},
title = {{Estimation of orientation tensors for simple signals by means of second-order filters}},
journal = {Signal Processing: Image Communication},
year = {2005},
volume = {20},
number = {6},
pages = {582--594},
}
The management of environmental and industrial disasters, search and rescue operations, surveillance of natural scenarios, environmental monitoring, and many other field robotics applications require high mobility and the need to reach locations that are difficult to access with ground vehicles. In many cases, the use of aerial vehicles is the best way to approach the objective to get information or to deploy instrumentation. Unmanned air vehicles (UAVs) have significantly increased their flight performance and autonomous onboard processing capabilities in the last ten years. But a single aerial vehicle equipped with a large array of different sensors of various modalities is limited at any time to a single viewpoint. A team of aerial vehicles, however, can simultaneously collect information from multiple locations and exploit the information derived from multiple disparate points. Furthermore, having a team with multiple heterogeneous aerial vehicles offers additional advantages due to the possibility of beneficial complementarities of the vehicles.
@article{diva2:258040,
author = {Ollero, Anibal and Lacroix, Simon and Merino, Luis and Gancet, Jeremi and Wiklund, Johan and Remuß, Volker and Perez, Iker Veiga and Guti\'{e}rrez, Luis G. and Viegas, Domingos Xavier and Benitez, Miguel Angel González and Mallet, Anthony and Alami, Rachid and Chatila, Raja and Hommel, Günter and Lechuga, F. J. Colmenero and Arrue, Begoña C. and Ferruz, Joaquin and Martinez-de Dios, Jose Ramiro and Caballero, Fernando},
title = {{Multiple Eyes in the Skies}},
journal = {IEEE robotics \& automation magazine},
year = {2005},
volume = {12},
number = {2},
pages = {46--57},
}
In this paper we present a novel method to implement the monogenic scale space on a rectangular domain. The monogenic scale space is a vector valued scale space based on the Poisson scale space, which establishes a sophisticated alternative to the Gaussian scale space. Previous implementations of the monogenic scale space are Fourier transform based, and therefore suffer from the implicit periodicity in case of finite domains. The features of the monogenic scale space, including local amplitude, local phase, local orientation, local frequency, and phase congruency, are much easier to interpret in terms of image features evolving through scale than in the Gaussian case. Furthermore, applying results from harmonic analysis, relations between the features are obtained which improve the understanding of image analysis. As applications, we present a very simple but still accurate approach to image reconstruction from local amplitude and local phase and a method for extracting the evolution of lines and edges through scale.
@article{diva2:244285,
author = {Felsberg, Michael and Duits, R. and Florack, L.},
title = {{The Monogenic Scale Space on a Rectangular Domain and its Features}},
journal = {International Journal of Computer Vision},
year = {2005},
volume = {64},
number = {2--3},
}
The effect of scatter on reconstructed image quality in cone beam computed tomography was investigated and a function which can be used in scatter-reduction optimisation tasks was tested. Projections were calculated using the Monte Carlo method in an axially symmetric cone beam geometry consisting of a point source, water phantom and a single row of detector elements. Image reconstruction was performed using the filtered backprojection method. Image quality was assessed by the L2-norm-based difference relative to a reference image derived from (1) weighted linear attenuation coefficients and (2) projections by primary photons. It was found that the former function was strongly affected by the beam hardening artefact and did not properly reflect the amount of scatter, but the latter function increased with increasing beam width, was higher for the larger phantom and exhibited properties which made it a good candidate for scatter-reduction optimisation tasks using polyenergetic beams.
@article{diva2:17697,
author = {Malusek, Alexandr and Magnusson Seger, Maria and Sandborg, Michael and Alm Carlsson, Gudrun},
title = {{Effect of scatter on reconstructed image quality in cone beam CT:
evaluation of a scatter-reduction optimization function}},
journal = {Radiation Protection Dosimetry},
year = {2005},
volume = {114},
number = {1-3},
pages = {337--340},
}
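As a concrete reading of the L2-norm-based figure of merit used in the study above, the sketch below (Python/NumPy; the function name and the optional region-of-interest argument are our own assumptions, not taken from the paper) computes the norm of the difference between a reconstructed image and a reference image, relative to the norm of the reference:

    import numpy as np

    def relative_l2_difference(reconstruction, reference, roi=None):
        # L2 norm of the error relative to the L2 norm of the reference image
        rec = np.asarray(reconstruction, dtype=float)
        ref = np.asarray(reference, dtype=float)
        if roi is not None:
            # optionally restrict the comparison to a boolean region-of-interest mask
            rec, ref = rec[roi], ref[roi]
        return np.linalg.norm(rec - ref) / np.linalg.norm(ref)

With a reference derived from projections by primary photons, this quantity grows with beam width and phantom size, consistent with the behaviour reported above, which is what makes it a candidate objective for scatter-reduction optimisation.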
We introduce a compact coding of image information in terms of local multi-modal image descriptors. This coding allows for an explicit separation of the local image information into different visual sub-modalities: geometric information (orientation) and structural image information (contrast transition and colour). Based on this image representation, we derive a similarity function that compares visual information in each of these sub-modalities. This allows for an investigation of the importance of the different factors for stereo matching on a large data set. From this investigation we conclude that it is the combination of visual modalities that gives the best results. Concrete weights for their relative importance are measured. In addition to these quantitative results, we can demonstrate by our simulations that although our image representation reduces image information by 97% we achieve a matching performance which is comparable to block matching techniques. This shows that our very condensed representation preserves the relevant visual information.
@article{diva2:266619,
author = {Kruger, N. and Felsberg, Michael},
title = {{An explicit and compact coding of geometric and structural image information applied to stereo processing}},
journal = {Pattern Recognition Letters},
year = {2004},
volume = {25},
number = {8},
pages = {849--863},
}
In this paper we address the topics of scale-space and phase-based image processing in a unifying framework. In contrast to the common opinion, the Gaussian kernel is not the unique choice for a linear scale-space. Instead, we chose the Poisson kernel since it is closely related to the monogenic signal, a 2D generalization of the analytic signal, where the Riesz transform replaces the Hilbert transform. The Riesz transform itself yields the flux of the Poisson scale-space and the combination of flux and scale-space, the monogenic scale-space, provides the local features phase-vector and attenuation in scale-space. Under certain assumptions, the latter two again form a monogenic scale-space which gives deeper insight to low-level image processing. In particular, we discuss edge detection by a new approach to phase congruency and its relation to amplitude based methods, reconstruction from local amplitude and local phase, and the evaluation of the local frequency.
@article{diva2:262433,
author = {Felsberg, Michael and Sommer, Gerald},
title = {{The Monogenic Scale-Space:
A Unifying Approach to Phase-Based Image Processing in Scale-Space}},
journal = {Journal of Mathematical Imaging and Vision},
year = {2004},
volume = {21},
}
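For reference, the 2D Poisson kernel and its conjugate (Riesz) counterpart, which generate the Poisson scale-space and its flux discussed above, can be written as follows (standard textbook forms stated with the angular-frequency convention; they are not quoted from the paper):

    \[
    p_s(x, y) = \frac{1}{2\pi}\,\frac{s}{\left(x^2 + y^2 + s^2\right)^{3/2}},
    \qquad
    \hat{p}_s(\boldsymbol{\omega}) = e^{-s\,\lvert\boldsymbol{\omega}\rvert},
    \]
    \[
    \mathbf{q}_s(x, y) = \frac{1}{2\pi}\,\frac{(x, y)}{\left(x^2 + y^2 + s^2\right)^{3/2}},
    \qquad
    \hat{\mathbf{q}}_s(\boldsymbol{\omega}) = -i\,\frac{\boldsymbol{\omega}}{\lvert\boldsymbol{\omega}\rvert}\, e^{-s\,\lvert\boldsymbol{\omega}\rvert}.
    \]

Convolving an image with p_s and q_s for increasing s yields the scale-space and its flux, from which the local features (amplitude, phase, orientation) are derived.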
A method for unrestricted recognition of 3-D objects has been developed. By unrestricted, we imply that the recognition shall be done independently of object position, scale, orientation and pose, against a structured background. It shall not assume any preceding segmentation and allow a reasonable degree of occlusion. The method uses a hierarchy of triplet feature invariants, which are at each level defined by a learning procedure. In the feed-back learning procedure, percepts are mapped upon system states corresponding to manipulation parameters of the object. The method uses a learning architecture employing channel information representation. The paper contains a discussion of how objects can be represented. A structure is proposed to deal with object and contextual properties in a transparent manner.
@article{diva2:258146,
author = {Granlund, Gösta and Moe, Anders},
title = {{Unrestricted Recognition of 3-D Objects for Robotics Using Multi-Level Triplet Invariants}},
journal = {Artificial Intelligence Magazine},
year = {2004},
volume = {25},
number = {2},
pages = {51--67},
}
This paper introduces a two-dimensional generalization of the analytic signal. This novel approach is based on the Riesz transform, which is used instead of the Hilbert transform. The combination of a 2D signal with the Riesz transformed one yields a sophisticated 2D analytic signal, the monogenic signal. The approach is derived analytically from irrotational and solenoidal vector fields. Based on local amplitude and local phase, an appropriate local signal representation is presented which preserves the split of identity, i.e., the invariance–equivariance property of signal decomposition. This is one of the central properties of the 1D analytic signal that decomposes a signal into structural and energetic information. We show that further properties of the analytic signal concerning symmetry, energy, allpass transfer function, and orthogonality are also preserved, and we compare this to the behavior of other approaches for a 2D analytic signal. As a central topic of this paper, a geometric phase interpretation is introduced which is based on the relation between the 1D analytic signal and the 2D monogenic signal established by the Radon transform. Possible applications of this relationship are sketched and references to other applications of the monogenic signal are given. This report is a revised version of the technical report 2009 [7], and therefore supersedes it.
@article{diva2:241550,
author = {Felsberg, Michael and Sommer, Gerald},
title = {{The monogenic signal}},
journal = {IEEE Transactions on Signal Processing},
year = {2001},
volume = {49},
number = {12},
pages = {3136--3144},
}
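A minimal FFT-based sketch of the monogenic signal (Python/NumPy; all function and variable names are ours) computes the two Riesz components through the frequency response -i u/|u| and derives local amplitude, phase and orientation; such an FFT implementation implicitly assumes a periodic image, which is exactly the limitation addressed by the rectangular-domain paper listed further above:

    import numpy as np

    def monogenic_signal(f):
        # Riesz transform via its frequency response -i * u / |u|
        rows, cols = f.shape
        u = np.fft.fftfreq(cols)
        v = np.fft.fftfreq(rows)
        U, V = np.meshgrid(u, v)
        R = np.hypot(U, V)
        R[0, 0] = 1.0                      # avoid division by zero at DC
        F = np.fft.fft2(f)
        r1 = np.real(np.fft.ifft2(F * (-1j * U / R)))
        r2 = np.real(np.fft.ifft2(F * (-1j * V / R)))
        amplitude = np.sqrt(f**2 + r1**2 + r2**2)
        phase = np.arctan2(np.hypot(r1, r2), f)   # local phase
        orientation = np.arctan2(r2, r1)          # local orientation
        return amplitude, phase, orientation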
Estimation of local second-degree variation should be a natural first step in computerized image analysis, just as it seems to be in human vision. A prevailing obstacle is that the second derivatives entangle the three features signal strength (i.e. magnitude or energy), orientation and shape. To disentangle these features we propose a technique where the orientation of an arbitrary pattern f is identified with the rotation required to align the pattern with its prototype p. This is more strictly formulated as solving the derotating equation. The set of all possible prototypes spans the shape-space of second degree variation. This space is one-dimensional for 2D-images, two-dimensional for 3D-images. The derotation decreases the original dimensionality of the response vector from three to two in the 2D-case and from six to three in the 3D-case, in both cases leaving room only for magnitude and shape in the prototype. The solution to the derotation and a full understanding of the result requires i) mapping the derivatives of the pattern f onto the orthonormal basis of spherical harmonics, and ii) identifying the eigenvalues of the Hessian with the derivatives of the prototype p. But once the shape-space is established, the possibilities to put together independent discriminators for magnitude, orientation, and shape are almost limitless.
@article{diva2:241538,
author = {Danielsson, Per-Erik and Lin, Qingfen and Ye, Qin-Zhong},
title = {{Efficient detection of second-degree variations in 2D and 3D images}},
journal = {Journal of Visual Communication and Image Representation},
year = {2001},
volume = {12},
number = {3},
pages = {255--305},
}
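The following rough sketch (Python/SciPy; not the exact algorithm of the paper, and all names and the shape coordinate are our own choices) illustrates how the three entangled features can be separated in the 2D case from Gaussian-smoothed second derivatives, with the orientation obtained as the derotation angle of the Hessian:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def second_degree_features(img, sigma=2.0):
        # second derivatives at scale sigma (x = columns, y = rows)
        fxx = gaussian_filter(img, sigma, order=(0, 2))
        fyy = gaussian_filter(img, sigma, order=(2, 0))
        fxy = gaussian_filter(img, sigma, order=(1, 1))
        # eigenvalues of the Hessian [[fxx, fxy], [fxy, fyy]]
        mean = 0.5 * (fxx + fyy)
        root = np.sqrt((0.5 * (fxx - fyy))**2 + fxy**2)
        lam1, lam2 = mean + root, mean - root
        # order so that |lam1| >= |lam2|
        swap = np.abs(lam2) > np.abs(lam1)
        lam1, lam2 = np.where(swap, lam2, lam1), np.where(swap, lam1, lam2)
        magnitude = np.hypot(lam1, lam2)                      # signal strength
        orientation = 0.5 * np.arctan2(2.0 * fxy, fxx - fyy)  # derotation angle
        shape = np.divide(lam2, lam1, out=np.zeros_like(lam1),
                          where=np.abs(lam1) > 1e-12)         # one possible shape coordinate
        return magnitude, orientation, shape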
There is no indication that it will ever be possible to find some simple trick that miraculously solves most problems in vision. It turns out that the processing system must be able to implement a model structure, the complexity of which is directly related to the structural complexity of the problem under consideration in the external world. It has become increasingly apparent that Vision cannot be treated in isolation from the response generation, because a very high degree of integration is required between different levels of percepts and corresponding response primitives. The response to be produced at a given instance is as much dependent upon the state of the system as upon the percepts impinging upon the system. In addition, it has become apparent that many classical aspects of perception, such as geometry, probably do not belong to the percept domain of a Vision system, but to the response domain. This article will focus on what are considered crucial problems in Vision for robotics for the future, rather than on the classical solutions today. It will discuss hierarchical architectures for combination of percept and response primitives. It will discuss the concept of combined percept-response invariances as important structural elements for Vision. It will be maintained that learning is essential to obtain the necessary flexibility and adaptivity. In consequence, it will be argued that invariances for the purpose of Vision are not abstractly geometrical, but derived from the percept-response interaction with the environment. The issue of information representation becomes extremely important in distributed structures of the types foreseen, where uncertainty of information has to be stated for update of models and associated data. The question of object representation is central to the paper. Equivalence is established between the representations of response, geometry and time. Finally an integrated percept-response structure is proposed for flexible response control.
@article{diva2:241534,
author = {Granlund, Gösta},
title = {{The Complexity of Vision}},
journal = {Signal Processing},
year = {1999},
volume = {74},
number = {1},
pages = {101--126},
}
As welfare diseases become more common all over the world the demand for angiography examinations is increasing rapidly. The development of advanced medical signal processing methods has with few exceptions been concentrated towards CT and MR while traditional contrast-based radiology depends on methods developed for ancient photography techniques, despite the fact that angiography sequences are generally recorded in digital form. This article presents a new approach for processing of angiography sequences based on advanced image processing methods. The developed algorithm automatically processes angiography sequences containing motion artifacts that cannot be processed by conventional methods like digital subtraction angiography (DSA) and pixel shift due to non-uniform motions. The algorithm can in simple terms be described as an ideal pixel-shift filter carrying out shifts of different directions and magnitude according to the local motions in the image. In contrast to conventional methods it is fully automatic, no mask image needs to be defined and the manual pixel-shift operations, which are extremely time consuming, are eliminated. The algorithm is efficient and robust and is designed to run on standard hardware of a powerful workstation, which excludes the need for expensive dedicated angiography platforms. Since there is no need to make additional recordings if the patient moves, the patient is exposed to a smaller amount of radiation and contrast fluid. The most exciting benefits of this method are, however, that it opens up new areas for contrast-based angiography that are not possible to process with conventional methods, e.g. non-uniform motions and multiple layers of moving tissue. Advanced image processing methods provide significantly better image quality and noise suppression but also provide the means to compute flow velocity and visualize the flow dynamics in the arterial trees by e.g. using color. Initial tests have proven that it is possible to discriminate capillary blood flow from angiography data, which opens up interesting possibilities for estimating the blood flow in the heart muscle without use of nuclear methods.
@article{diva2:241545,
author = {Knutsson, Hans and Andersson, Mats T. and Kronander, Torbjörn and Hemmendorff, Magnus},
title = {{Spatio-temporal filtering of digital angiography image data}},
journal = {Computer Methods and Programs in Biomedicine},
year = {1998},
volume = {57},
number = {1-2},
pages = {115--123},
}
This paper deals with a new framework for analyzing the formal mathematical correspondence between quantum mechanics and time-frequency representations of a signal. It is also shown that joint time-frequency distributions have a close link with Heisenberg uncertainty relations if the observables are taken as fuzzy entities. This result contradicts the arguments of Cohen [IEEE Proc. 77(7):941 (1989)] regarding the time-frequency distributions and the uncertainty relation. It is postulated that these mechanisms will be of crucial importance in highly fragmented computation structures, such as neural networks, as they may exhibit a strong mutual interaction between data and operator.
@article{diva2:241551,
author = {Roy, Sisir and Kundu, Malay K. and Granlund, Gösta H.},
title = {{Uncertainty Relations and Time-Frequency Distributions for Unsharp Observables}},
journal = {Information Sciences},
year = {1996},
volume = {89},
number = {3-4},
pages = {193--209},
}
@article{diva2:241531,
author = {Westelius, Carl-Johan and Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Focus of Attention Mechanisms using Normalized Convolution}},
journal = {IEEE transactions on robotics and automation},
year = {1996},
}
In this paper, we discuss certain issues regarding robot vision. The main theme will be the importance of the choice of information representation. We will see the implications at different parts of a robot vision structure. We deal with aspects of pre-attentive versus attentive vision, control mechanisms for low level focus of attention, and representation of motion as the orientation of hyperplanes in multidimensional time-space. Issues of scale will be touched upon, and finally, a depth-from-stereo algorithm based on quadrature filter phase is presented.
@article{diva2:241554,
author = {Granlund, Gösta H. and Knutsson, Hans and Westelius, Carl-Johan and Wiklund, Johan},
title = {{Issues in Robot Vision}},
journal = {Image and Vision Computing},
year = {1994},
volume = {12},
number = {3},
pages = {131--148},
}
A study of recurrent associative memories with exclusively short-range connections is presented. To increase the capacity, higher order couplings are used. We study capacity and pattern completion ability of networks consisting of units with binary (±1) output. Results show that perfect learning of random patterns is difficult for very short coupling ranges, and that the average expected capacities (allowing small errors) in these cases are much smaller than the theoretical maximum, 2 bits per coupling. However, it is also shown that by choosing ranges longer than certain limit sizes, depending on network size and order, we can come close to the theoretical capacity limit. We indicate that these limit sizes increase very slowly with net size. Thus, couplings to at least 28 and 36 neighbors suffice for second order networks with 400 and 90,000 units, respectively. From simulations it is found that even networks with coupling ranges below the limit size are able to complete input patterns with more than 10% errors. Especially remarkable is the ability to correct inputs with large local errors (part of the pattern is masked). We present a local learning algorithm for heteroassociation in recurrent networks without hidden units. The algorithm is used in a multinet system to improve pattern completion ability on correlated patterns.
@article{diva2:241555,
author = {Karlholm, Jörgen},
title = {{Associative Memories with Short--Range Higher Order Couplings}},
journal = {Neural Networks},
year = {1993},
volume = {6},
number = {3},
pages = {409--421},
}
A framework for computer-aided analysis of mammograms is described. General computer vision algorithms are combined with application specific procedures in a hierarchical fashion. The system is under development and is currently limited to detection of a few types of suspicious areas. The image features are extracted by using feature extraction methods where wavelet techniques are utilized. A low-pass pyramid representation of the image is convolved with a number of quadrature filters. The filter outputs are combined according to simple local Fourier domain models into parameters describing the local neighborhood with respect to the model. This produces estimates for each pixel describing local size, orientation, Fourier phase, and shape with confidence measures associated to each parameter. Tentative object descriptions are then extracted from the pixel-based features by application-specific procedures with knowledge of relevant structures in mammograms. The orientation, relative brightness and shape of the object are obtained by selection of the pixel feature estimates which best describe the object. The list of object descriptions is examined by procedures, where each procedure corresponds to a specific type of suspicious area, e.g. clusters of microcalcifications.
@article{diva2:241536,
author = {Bårman, Håkan and Granlund, Gösta H. and Haglund, Leif},
title = {{Feature Extraction for Computer-Aided Analysis of Mammograms}},
journal = {International journal of pattern recognition and artificial intelligence},
year = {1993},
volume = {7},
number = {6},
pages = {1339--1356},
}
The problem of detection of orientation in finite dimensional Euclidean spaces is solved in the least squares sense. In particular, the theory is developed for the case when such orientation computations are necessary at all local neighborhoods of the n-dimensional Euclidean space. Detection of orientation is shown to correspond to fitting an axis or a plane to the Fourier transform of an n-dimensional structure. The solution of this problem is related to the solution of a well-known matrix eigenvalue problem. Moreover, it is shown that the necessary computations can be performed in the spatial domain without actually doing a Fourier transformation. Along with the orientation estimate, a certainty measure, based on the error of the fit, is proposed. Two applications in image analysis are considered: texture segmentation and optical flow. An implementation for 2-D (texture features) as well as 3-D (optical flow) is presented. In the case of 2-D, the method exploits the properties of the complex number field to by-pass the eigenvalue analysis, improving the speed and the numerical stability of the method. The theory is verified by experiments which confirm accurate orientation estimates and reliable certainty measures in the presence of noise. The comparative results indicate that the proposed theory produces algorithms computing robust texture features as well as optical flow. The computations are highly parallelizable and can be used in realtime image analysis since they utilize only elementary functions in a closed form (up to dimension 4) and Cartesian separable convolutions.
@article{diva2:241552,
author = {Bigun, Josef and Granlund, Gösta H. and Wiklund, Johan},
title = {{Multidimensional orientation estimation with applications to texture analysis and optical flow}},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {1991},
volume = {13},
number = {8},
pages = {775--790},
}
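In the 2D case, the eigenvalue-free route via the complex number field mentioned in the abstract can be sketched as follows (Python/SciPy; parameter names and smoothing choices are our own assumptions):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def local_orientation_2d(img, grad_sigma=1.0, tensor_sigma=3.0):
        gx = gaussian_filter(img, grad_sigma, order=(0, 1))   # d/dx
        gy = gaussian_filter(img, grad_sigma, order=(1, 0))   # d/dy
        z = (gx + 1j * gy) ** 2          # squaring maps gradient angle phi to 2*phi
        z = gaussian_filter(z.real, tensor_sigma) + 1j * gaussian_filter(z.imag, tensor_sigma)
        orientation = 0.5 * np.angle(z)  # dominant local orientation
        certainty = np.abs(z)            # small where orientations cancel (isotropic areas)
        return orientation, certainty

Averaging in the double-angle representation is what avoids the explicit eigenvalue analysis; the magnitude of the averaged complex field directly serves as the certainty measure.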
A new low-level vision primitive based on logarithmic spirals is presented for various image processing tasks. The detection of such primitives is equivalent to detection of lines and edges in another coordinate system which has been used to model the mapping of the visual field to the striate cortex. Algorithms detecting the proposed primitives and pointing out a matched subclass are presented along with necessary theory. As a result, if the local structure is describable by the proposed primitives then a certainty parameter based on a well-defined mismatch (error) function will indicate this. Moreover, the best fit of a subclass of the proposed primitives in the least squares sense will be computed. The resulting images are unthresholded. They are computed by means of simple convolutions and pixelwise arithmetic operations which make the algorithms suitable for real time image processing applications. Since the resulting images contain information about the local structure, they can be used as feature images in applications like remote sensing, texture analysis, and object recognition. Experimental results on the latter including synthetic as well as natural images are presented along with noise sensitivity tests. The results exhibit good detection properties for the subclasses of the modeled primitives along with uniform and reliable behavior of the corresponding certainty measures.
@article{diva2:241546,
author = {Bigün, Josef},
title = {{A Structure Feature for Some Image Processing Applications Based on Spiral Functions}},
journal = {Computer Vision, Graphics and Image Processing},
year = {1990},
volume = {51},
number = {2},
pages = {166--194},
}
Recent physiological research has indicated that the visual system makes use of units responsive to Gabor signals in the analysis of visual stimuli. Such functions effect a tradeoff between pure spatial- and frequency-domain descriptions. The authors explain the use of such representations in vision, considered as a process of inference from the retinal signals to a symbolic description. The appropriate mathematical structure for the inference is that of the subspaces of the signal vector space, a feature which it shares with quantum mechanics. The theory is derived directly from the fundamental constraints on visual inference. It is then shown to be consistent with many of the known properties of the visual system. In particular, a major feature of the inference system, the occurrence of interference effects, has already been observed in visual system operation.
@article{diva2:241548,
author = {Wilson, Roland and Knutsson, Hans},
title = {{Uncertainty and Inference in the Visual System}},
journal = {IEEE Transactions on Systems, Man and Cybernetics},
year = {1988},
volume = {18},
number = {2},
pages = {305--312},
}
Phototransduction in rod cells is likely to involve an intracellular messenger system that links the absorption of light by rhodopsin to a change in membrane conductance. The direct effect of guanosine 3',5'-monophosphate (cGMP) on excised patches of rod outer segment membrane strongly supports a role for cGMP as an intracellular messenger in phototransduction. It is reported here that magnesium and calcium directly affect the conductance of excised patches of rod membrane in the absence of cGMP and that magnesium, applied to intact rod cells, blocks a component of the cellular light response. The divalent cation-suppressed conductance in excised patches showed outward rectification and cation-selective permeability resembling those of the light-suppressed conductance measured from the intact rod cell. The divalent cation-suppressed conductance was partly blocked by a concentration of the pharmacological agent L-cis-diltiazem that blocked all of the cGMP-activated conductance. Divalent cations may act, together with cGMP, as an intracellular messenger system that mediates the light response of the rod photoreceptor cell.
@article{diva2:241547,
author = {Stern, J. H. and Knutsson, Hans and MacLeish, P. R.},
title = {{Divalent Cations Directly Affect the Conductance of Excised Patches of Rod Photoreceptor Membrane.}},
journal = {Science},
year = {1987},
volume = {236},
number = {4809},
pages = {1674--1678},
}
@article{diva2:241537,
author = {MacLeish, P. R. and Knutsson, Hans and Stern, J. H.},
title = {{The Control of the Rod Outer Segment Conductance by Cyclic-GMP and Divalent Cations.}},
journal = {Photobiochemistry and Photobiophysics},
year = {1986},
volume = {13},
pages = {359--372},
}
A new device for generation of electromagnetic fields at extra low frequencies, to be used in fracture treatment, is described. The device involves a coil and a battery-powered noise generator. An alternating magnetic field of 4 × 10^-4 T (4 gauss) (RMS value) with a frequency range 1–1000 Hz is generated. Results from a controlled randomized study of fresh fractures have shown significant differences (p < 0.01) between the treated group and the control group. The results are encouraging and motivate further investigations with this method.
@article{diva2:241539,
author = {Wahlström, Ola and Knutsson, Hans},
title = {{A Device for Generation of Electromagnetic Fields of Extremely Low Frequency}},
journal = {Journal of Biomedical Engineering},
year = {1984},
volume = {6},
number = {4},
pages = {293--296},
}
A new predictive coder, based on an estimation method which adapts to line and edge features in images, is described. Quantization of the prediction error is performed by a two-level adaptive scheme: an adaptive transform coder, and a threshold coding in both transform and spatial domains. Control information, which determines the behavior of the predictor, is quantized using a simple variable rate technique. The results are improved by pre- and post-filtering using a related noncausal form of the estimator. Acceptable images have been produced in this way at bit rates of less than 0.5 bit/pixel.
@article{diva2:241549,
author = {Wilson, Roland and Knutsson, Hans and Granlund, Gösta H.},
title = {{Anisotropic Non-Stationary Image Estimation and its Applications:
Part II. Predictive Image Coding}},
journal = {IEEE Transactions on Communications},
year = {1983},
volume = {31},
number = {3},
pages = {398--406},
}
A new form of image estimator, which takes account of linear features, is derived using a signal equivalent formulation. The estimator is shown to be a nonstationary linear combination of three stationary estimators. The relation of the estimator to human visual physiology is discussed. A method for estimating the nonstationary control information is described and shown to be effective when the estimation is made from noisy data. A suboptimal approach which is computationally less demanding is presented and used in the restoration of a variety of images corrupted by additive white noise. The results show that the method can improve the quality of noisy images even when the signal-to-noise ratio is very low.
@article{diva2:241544,
author = {Knutsson, Hans and Wilson, Roland and Granlund, Gösta H.},
title = {{Anisotropic Non-Stationary Image Estimation and its Applications:
Part I. Restoration of Noisy Images}},
journal = {IEEE Transactions on Communications},
year = {1983},
volume = {COM--31},
number = {3},
pages = {388--397},
}
In a special radiographic process, ectomography, an image of a slice is produced by simple summation of a set of specially filtered component images, each of which represents one of at least 60 different projections of the object. After being digitized, they are stored, filtered, and summed in a computer. Images representing any slice of any thickness in the object may be produced from the same set of component images. All details within the slice are pictured correctly while details outside are almost completely eliminated.
@article{diva2:241543,
author = {Petersson, Christer U. and Edholm, Paul and Granlund, Gösta H. and Knutsson, Hans E.},
title = {{Ectomography. A New Radiographic Reconstruction Method:
II. Computer Simulated Experiments}},
journal = {IEEE Transactions on Biomedical Engineering},
year = {1980},
volume = {BME--27},
number = {11},
pages = {649--655},
}
The mathematical basis of a new radiographic method is described, by which an arbitrarily thick layer of the patient may be reconstructed. The reconstruction is performed from at least 60 images of the volume under examination. Each of these images, which have to be in digital form, is subjected to a special filtration process of its spatial frequencies. The combination of all the images will form the resulting image of the layer, the ectomogram. The method has been analysed and tested in experiments simulated with a computer.
@article{diva2:241542,
author = {Edholm, Paul and Granlund, Gösta and Knutsson, Hans and Petersson, C.},
title = {{Ectomography:
A New Radiographic Method for Reproducing a Selected Slice of Varying Thickness}},
journal = {Acta Radiologica},
year = {1980},
volume = {21},
number = {4},
pages = {433--442},
}
Radiographic technology has advanced considerably during recent years with the advent of reconstruction techniques allowing visualization of slices through the body. In spite of the advantages of computed tomography compared to conventional radiographic methods, there are still some shortcomings with the method: if a different section of the body is desired, another recording has to be made, the width of the slice reconstructed is fixed, and a full 180° view angle is required.
@article{diva2:241533,
author = {Knutsson, Hans E. and Edholm, Paul and Granlund, Gösta H. and Petersson, Christer U.},
title = {{Ectomography. A New Radiographic Reconstruction Method:
I. Theory and Error Estimates}},
journal = {IEEE Transactions on Biomedical Engineering},
year = {1980},
volume = {BME--27},
number = {11},
pages = {640--645},
}
The problem of finding a general, parallel, and hierarchical operator for picture processing is considered. An operator is defined which at different levels can detect and describe structure as opposed to uniformity within local regions, whatever structure and uniformity may imply at a particular level. The operator performs a mapping from one complex field to another. The important characteristic of this approach is the use of complex fields which allows a global-to-local feedback. In the transformation process the image is simplified. A Fourier implementation of the operator is described and a new transform is defined. The operators become increasingly global on higher levels in order to include adjacent high-level features. A hierarchical structure of such transformations gives a sequential description of structure over increasingly larger regions of the image. The processed information at different levels can be used as input to a classifier. Examples are given of processing results.
@article{diva2:241532,
author = {Granlund, Gösta H.},
title = {{In Search of a General Picture Processing Operator}},
journal = {Computer Graphics and Image Processing},
year = {1978},
volume = {8},
number = {2},
pages = {155--173},
}
The advent of new stains for chromosomes has increased the possibility of implementing useful automated chromosome analysis. The ease with which chromosomes can now be recognized makes it possible to perform detailed statistical analysis of the chromosomes of an individual. This paper describes methods for assembling chromosome information from several cells in such a way that accidental variations due to preparation, etc. can be eliminated and an undistorted set of characteristics of the chromosome complement can be established. This set of characteristics can then be compared with various references, and statements can be made concerning the relationships between variations in the chromosome complement and genetic traits. These same methods can be employed in multiple-cell karyotyping to circumvent the classical problem of touching and overlapping chromosomes. The methods also allow one to achieve very reliable descriptions of the chromosome complement. The importance of appropriate descriptors of the chromosomes is illustrated.
@article{diva2:241535,
author = {Granlund, Gösta H.},
title = {{Statistical Analysis of Chromosome Characteristics}},
journal = {Pattern Recognition},
year = {1974},
volume = {6},
number = {2},
pages = {115--126},
}
The advent of new stains for chromosomes has increased the possibilities that useful automated chromosome analysis can be implemented. The search for appropriate descriptors to use in this process is an important task. Data compression using integrated intensity and density profiles has already shown itself to be valuable. A method is proposed in this paper to describe these profiles as a sum of distribution functions. Every distribution function can be described by a triplet stating peak height, position, and width and it appears that these parameters are directly related to physical processes. The importance of such parameters in statistical chromosome analysis is emphasized. A classification experiment is described in which 240 chromosomes of types 1 to 22, X and Y were classified with an accuracy of 96%.
@article{diva2:241556,
author = {Granlund, Gösta H.},
title = {{The Use of Distribution Functions to Describe Integrated Density Profiles of Human Chromosomes}},
journal = {Journal of Theoretical Biology},
year = {1973},
volume = {40},
number = {3},
pages = {573--589},
}
@article{diva2:241553,
author = {Granlund, Gösta H.},
title = {{Fourier Preprocessing for Hand Print Character Recognition}},
journal = {IEEE Transactions on Computers},
year = {1972},
volume = {C--21},
number = {2},
pages = {195--201},
}
Books
Under the title "Probabilistic and Biologically Inspired Feature Representations," this text collects a substantial amount of work on the topic of channel representations. Channel representations are a biologically motivated, wavelet-like approach to visual feature descriptors: they are local and compact, they form a computational framework, and the represented information can be reconstructed. The first property is shared with many histogram- and signature-based descriptors, the latter property with the related concept of population codes. In their unique combination of properties, channel representations become a visual Swiss army knife—they can be used for image enhancement, visual object tracking, as 2D and 3D descriptors, and for pose estimation. In the chapters of this text, the framework of channel representations will be introduced, its attributes will be elaborated, and further insight into its probabilistic modeling and algorithmic implementation will be given. Channel representations are a useful toolbox to represent visual information for machine learning, as they establish a generic way to compute popular descriptors such as HOG, SIFT, and SHOT. Even in an age of deep learning, they provide a good compromise between hand-designed descriptors and a-priori structureless feature spaces as seen in the layers of deep networks.
@book{diva2:1211520,
author = {Felsberg, Michael},
title = {{Probabilistic and biologically inspired feature representations}},
publisher = {Morgan \& Claypool Publishers},
year = {2018},
address = {San Rafael},
}
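As a small illustration of the idea, the sketch below encodes a scalar value into overlapping cos^2 channels, one common kernel choice in the channel representation literature (the function name, value-range mapping and channel count are our own assumptions):

    import numpy as np

    def encode_cos2_channels(x, n_channels, lo, hi):
        # map the value range [lo, hi] onto channel centres 0 .. n_channels-1
        x = np.asarray(x, dtype=float)
        t = (x - lo) / (hi - lo) * (n_channels - 1)
        centres = np.arange(n_channels)
        d = t[..., None] - centres                 # signed distance to each centre
        k = np.cos(np.pi * d / 3.0) ** 2           # cos^2 kernel with support |d| < 1.5
        k[np.abs(d) >= 1.5] = 0.0
        return k

At most three channels are active for any value and, away from the borders, their coefficients sum to a constant (3/2), which is what makes the encoding both local and reconstructable.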
We, the Guest Editors of this special issue of Pattern Recognition Letters, are pleased to share these contributions with you. The papers included here are based on work from the 22nd International Conference on Pattern Recognition (ICPR) in Stockholm, Sweden, held August 24–28, 2014. The papers selected for this special issue were those winning one of the IAPR awards, as well as one paper by a former student of the winner of the KS Fu Prize, Prof. Jitendra Malik. Taken together, this body of work represents some of the finest research being conducted by the IAPR community worldwide, it builds on a rich legacy of accomplishment by the entire community, and it offers a view to the future, to where we are going as a scientific community.
For each of the award-winning papers, the authors were asked to revise and extend their contributions to full journal length and to provide true added value vis-à-vis the original conference submission. In some cases, the authors elected to modify the titles slightly, and in some cases the list of authors has also been modified. The resulting manuscripts were sent out for full review by a different set of referees than those who reviewed the conference versions. The process, including required revisions, was in accordance with the standing editorial policy of Pattern Recognition Letters, resulting in the final versions accepted and appearing here. These are thoroughly vetted, high-caliber scientific contributions.
It has been our honor to serve as Guest Editors for this special issue. We would like to thank the Editors of Pattern Recognition Letters for allowing us this opportunity. We are especially grateful to Dr. Gabriella Sanniti di Baja for her enthusiasm, support, and her willingness to keep prodding us along to bring the special issue through to completion. We would also like to thank all of those who reviewed the papers, both originally for the conference and subsequently for the journal, and those who served on the ICPR awards and KS Fu Prize committees.
Finally, we express our heartfelt gratitude to all of the authors for taking the time to prepare these versions for our collective enlightenment, sharing their knowledge, innovation, and discoveries with the rest of us.
@book{diva2:916338,
editor = {Chellappa, Rama and Heyden, Anders and Laurendeau, Denis and Felsberg, Michael and Borga, Magnus},
title = {{Special issue on ICPR 2014 awarded papers}},
publisher = {Elsevier},
year = {2016},
}
Signal Processing for Computer Vision is a unique and thorough treatment of the signal processing aspects of filters and operators for low-level computer vision.
Computer vision has progressed considerably over recent years. From methods only applicable to simple images, it has developed to deal with increasingly complex scenes, volumes and time sequences. A substantial part of this book deals with the problem of designing models that can be used for several purposes within computer vision. These partial models have some general properties of invariance generation and generality in model generation.
Signal Processing for Computer Vision is the first book to give a unified treatment of representation and filtering of higher order data, such as vectors and tensors in multidimensional space. Included is a systematic organisation for the implementation of complex models in a hierarchical modular structure and novel material on adaptive filtering using tensor data representation.
Signal Processing for Computer Vision is intended for final year undergraduate and graduate students as well as engineers and researchers in the field of computer vision and image processing.
@book{diva2:302469,
editor = {Granlund, Gösta and Knutsson, Hans},
title = {{Signal Processing for Computer Vision}},
publisher = {Kluwer},
year = {1995},
address = {Dordrecht},
}
Book chapters
The choice of the material base into which the material decomposition is performed in dual-energy computed tomography may affect the quality of reconstructed images. Resulting inaccuracies may lower their diagnostic value, or, if the data are used for radiation treatment planning, the accuracy of such plans. The aim of this work is to investigate how the commonly used (water, bone) (WB), (water, iodine) (WI), and (approximate photoelectric effect, Compton scattering) (PC) doublets affect the reconstructed linear attenuation coefficient in the case of the Alvarez–Macovski (AM) method. The performance of this method is also compared to the performance of the dual-energy iterative reconstruction algorithm DIRA. In both cases, the study is performed using simulations.
The results show that the PC and WB doublets accurately predicted the linear attenuation coefficient (LAC) values for human tissues and elements with Z = 1, …, 20, in the 20–150 keV range, though there was a small (<5%) discrepancy in the 20–35 keV range. The WI doublet did not represent the tissues as well as PC and WB; the largest discrepancies (>50% in some cases) were in the 20–40 keV range.
LACs reconstructed with the AM method and DIRA followed this trend. The AM method produced artifacts when iodine was present in the phantom together with human tissues, since it can only work with one doublet at a time. It was shown that these artifacts could be avoided with DIRA using different doublets at different spatial positions, i.e., WB for soft and bone tissue and WI for the iodine solution.
@incollection{diva2:1761777,
author = {Magnusson, Maria and Alm Carlsson, Gudrun and Sandborg, Michael and Carlsson Tedgren, Åsa and Malusek, Alexandr},
title = {{On the Choice of Base Materials for Alvarez--Macovski and DIRA Dual-energy Reconstruction Algorithms in CT}},
booktitle = {Photon Counting Computed Tomography},
year = {2023},
pages = {153--175},
publisher = {Springer},
address = {Cham},
}
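The base-material question above concerns the two-function expansion underlying the Alvarez–Macovski method, which (in a standard formulation, not quoted from the chapter) writes the linear attenuation coefficient as

    \[
    \mu(\mathbf{x}, E) \;\approx\; a_1(\mathbf{x})\, f_1(E) \;+\; a_2(\mathbf{x})\, f_2(E),
    \]

where the doublet (f_1, f_2) is either a pair of interaction cross sections, e.g. an approximate photoelectric dependence (roughly proportional to E^-3) together with the Klein–Nishina function for Compton scattering (the PC doublet), or the tabulated attenuation curves of two materials such as water and bone (WB) or water and iodine (WI); the spatially varying coefficients a_1, a_2 are estimated from the two energy measurements.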
@incollection{diva2:1696385,
author = {Felsberg, Michael},
title = {{Visual tracking:
Tracking in scenes containing multiple moving objects}},
booktitle = {Advanced Methods and Deep Learning in Computer Vision},
year = {2022},
pages = {305--336},
publisher = {Elsevier},
address = {London},
}
Scene reconstruction, i.e. the process of creating a 3D representation (mesh) of some real world scene, has recently become easier with the advent of cheap RGB-D sensors (e.g. the Microsoft Kinect).
Many such sensors use rolling shutter cameras, which produce geometrically distorted images when they are moving. To mitigate these rolling shutter distortions, we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor.
For scene reconstruction we use the Kinect Fusion algorithm to produce meshes. We create meshes from both raw and rectified depth scans, and these are then compared to a ground truth mesh. The types of motion we investigate are: pan, tilt and wobble (shaking) motions.
As our method relies on gyroscope readings, the amount of computations required is negligible compared to the cost of running Kinect Fusion.
This chapter is an extension of a paper at the IEEE Workshop on Robot Vision [10]. Compared to that paper, we have improved the rectification to also correct for lens distortion, and use a coarse-to-fine search to find the time shift more quickly. We have extended our experiments to also investigate the effects of lens distortion, and to use more accurate ground truth. The experiments demonstrate that correction of rolling shutter effects yields a larger improvement of the 3D model than correction for lens distortion.
@incollection{diva2:789457,
author = {Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik and Törnqvist, David},
title = {{Improving RGB-D Scene Reconstruction using Rolling Shutter Rectification}},
booktitle = {New Development in Robot Vision},
year = {2015},
pages = {55--71},
publisher = {Springer Berlin/Heidelberg},
}
Online learning of vision-based robot control requires appropriate activation strategies during operation. In this chapter we present such a learning approach with applications to two areas of vision-based robot control. In the first setting, self-evaluation is possible for the learning system and the system autonomously switches to learning mode for producing the necessary training data by exploration. The other application is in a setting where external information is required for determining the correctness of an action. Therefore, an operator provides training data when required, leading to an automatic mode switch to online learning from demonstration. In experiments for the first setting, the system is able to autonomously learn the inverse kinematics of a robotic arm. We propose improvements producing more informative training data compared to random exploration. This reduces training time and limits learning to regions where the learnt mapping is used. The learnt region is extended autonomously on demand. In experiments for the second setting, we present an autonomous driving system learning a mapping from visual input to control signals, which is trained by manually steering the robot. After the initial training period, the system seamlessly continues autonomously. Manual control can be taken back at any time for providing additional training.
@incollection{diva2:750041,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Online Learning of Vision-Based Robot Control during Autonomous Operation}},
booktitle = {New Development in Robot Vision},
year = {2015},
pages = {137--156},
publisher = {Springer Berlin/Heidelberg},
}
This volume does much more than survey modern advanced color processing. Starting with a historical perspective on ways we have classified color, it sets out the latest numerical techniques for analyzing and processing colors, the leading edge in our search to accurately record and print what we see. The human eye perceives only a fraction of available light wavelengths, yet we live in a multicolor world of myriad shining hues. Colors rich in metaphorical associations make us "purple with rage" or "green with envy" and cause us to "see red." Defining colors has been the work of centuries, culminating in today's complex mathematical coding that nonetheless remains a work in progress: only recently have we possessed the computing capacity to process the algebraic matrices that reproduce color more accurately. With chapters on dihedral color and image spectrometers, this book provides technicians and researchers with the knowledge they need to grasp the intricacies of today's color imaging.
@incollection{diva2:609832,
author = {Lenz, Reiner and Zografos, Vasileios and Solli, Martin},
title = {{Dihedral Color Filtering}},
booktitle = {Advanced Color Image Processing and Analysis},
year = {2013},
pages = {119--145},
publisher = {Springer},
}
This book presents a mathematical methodology for image analysis tasks at the edge of current research, including anisotropic diffusion filtering of tensor fields. Instead of specific applications, it explores methodological structures on which they are built.
@incollection{diva2:491401,
author = {Felsberg, Michael},
title = {{Adaptive Filtering using Channel Representations}},
booktitle = {Mathematical Methods for Signal and Image Analysis and Representation},
year = {2012},
pages = {31--48},
publisher = {Springer London},
}
This chapter is devoted to the cooperation of multiple UAVs for environment perception. First, probabilistic methods for multi-UAV cooperative perception are analyzed. Then, the problem of multi-UAV detection, localization and tracking is described, and local image processing techniques are presented. Finally, the chapter shows two approaches based on the Information Filter and on evidence grid representations.
@incollection{diva2:275335,
author = {Merino, Luis and Caballero, Fernando and Ferruz, Joaquín and Wiklund, Johan and Forssen, Per-Erik and Ollero, Anibal},
title = {{Multi-UAV Cooperative Perception Techniques}},
booktitle = {Multiple Heterogeneous Unmanned Aerial Vehicles},
year = {2007},
pages = {67--110},
publisher = {Springer},
address = {Berlin / Heidelberg},
}
This chapter presents a vision-based method for unmanned aerial vehicle (UAV) motion estimation that uses as input an image motion field obtained from matches of point-like features. The chapter enhances vision-based techniques developed for single UAV localization and demonstrates how they can be modified to deal with the problem of multi-UAV relative position estimation. The proposed approach is built upon the assumption that if different UAVs identify, using their cameras, common objects in a scene, the relative pose displacement between the UAVs can be computed from these correspondences under certain assumptions. However, although point-like features are suitable for local UAV motion estimation, finding matches between images collected using different cameras is a difficult task that may be overcome using blob features. Results justify the proposed approach.
@incollection{diva2:273757,
author = {Merino, Luis and Caballero, Fernando and Forss\'{e}n, Per-Erik and Wiklund, Johan and Ferruz, Joaquín and Martinez-de Dios, Jose Ramiro and Moe, Anders and Nordberg, Klas and Ollero, Anibal},
title = {{Single and Multi-UAV Relative Position Estimation Based on Natural Landmarks}},
booktitle = {Advances in Unmanned Aerial Vehicles},
year = {2007},
pages = {267--307},
publisher = {Springer},
address = {Netherlands},
}
This volume is a post-event proceedings volume and contains selected papers based on the presentations given, and the lively discussions that ensued, during a seminar held in Dagstuhl Castle, Germany, in October 2003. Co-sponsored by ECVision, the cognitive vision network of excellence, it was organized to further strengthen cooperation between research groups from different countries working in the field of cognitive vision systems.
@incollection{diva2:241587,
author = {Granlund, Gösta},
title = {{Organization of Architectures for Cognitive Vision Systems}},
booktitle = {Cognitive Vision Systems},
year = {2006},
pages = {37--55},
publisher = {Springer},
}
This paper presents novel results from an ongoing feasibility study of fully 3D X-ray scanning of Pinus Sylvestris (Scots Pine) logs. Logs are assumed to be translated through two identical and static cone beam systems with the beams rotated 90 degrees relative to each other, providing a dual set of 2D-projections. For reasons of both cost and speed, each 2D-detector in these two systems consists of a limited number of line detectors. The quality of the reconstructed images is far from perfect, due to sparse detector data and missing projection angles. In spite of this we show that by employing a shape- and direction discriminative technique based on second derivatives, we are able to enhance knot-like features in these data. In the enhanced images it is then possible to detect and localize the pith for each whorl of knots, and subsequently also to perform a full segmentation of the knots in the heartwood.
@incollection{diva2:269474,
author = {Flood, Katarina and Danielsson, Per-Erik and Magnusson Seger, Maria},
title = {{On 3D scanning, reconstruction, enhancement, and segmentation of logs}},
booktitle = {Image Analysis},
year = {2003},
pages = {733--740},
publisher = {Springer Berlin/Heidelberg},
}
This paper presents a novel two-frame motion estimation algorithm. The first step is to approximate each neighborhood of both frames by quadratic polynomials, which can be done efficiently using the polynomial expansion transform. From observing how an exact polynomial transforms under translation a method to estimate displacement fields from the polynomial expansion coefficients is derived and after a series of refinements leads to a robust algorithm. Evaluation on the Yosemite sequence shows good results.
@incollection{diva2:269471,
author = {Farnebäck, Gunnar},
title = {{Two-frame motion estimation based on polynomial expansion}},
booktitle = {Image Analysis},
year = {2003},
pages = {363--370},
publisher = {Springer Berlin/Heidelberg},
}
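OpenCV ships a dense optical flow implementation based on this algorithm; a minimal usage sketch (file names and parameter values below are placeholders, not taken from the paper) looks as follows:

    import cv2

    # two consecutive grayscale frames (paths are placeholders)
    prev_frame = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
    next_frame = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # arguments: previous image, next image, initial flow (None),
    # pyramid scale, pyramid levels, window size, iterations,
    # polynomial neighbourhood size, polynomial Gaussian sigma, flags
    flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    # per-pixel displacement magnitude and direction
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])

The two polynomial arguments control the local quadratic fit, while the pyramid parameters and iteration count provide the iterative, coarse-to-fine refinement mentioned in the abstract.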
In most, if not all, fast marching methods published hitherto, the input cost function and the output arrival time are sampled on exactly the same grid. But since the input data samples are differences of the output samples, we found it natural to separate the input and output grid by half a sampling unit in all coordinates (two or three). We also employ the 8-neighborhood (26-neighborhood in the 3D-case) in the basic updating step of the algorithm. Some simple numerical experiments verify that the modified method improves the accuracy considerably. Moreover, we feel that the modified method lends itself more naturally to image processing applications like tracking and segmentation.
@incollection{diva2:269477,
author = {Danielsson, Per-Erik and Lin, Qingfen},
title = {{A modified fast marching method}},
booktitle = {Image Analysis},
year = {2003},
pages = {1154--1161},
publisher = {Springer Berlin/Heidelberg},
}
Perceptual experiments indicate that corners and curvature are very important features in the process of recognition. This paper presents a new method to efficiently detect rotational symmetries, which describe complex curvature such as corners, circles, star- and spiral patterns. The method is designed to give selective and sparse responses. It works in three steps: first, extract local orientation from a gray-scale or color image; second, correlate the orientation image with rotational symmetry filters; and third, let the filter responses inhibit each other in order to get more selective responses. The correlations can be made efficient by separating the 2D-filters into a small number of 1D-filters. These symmetries can serve as feature points at a high abstraction level for use in hierarchical matching structures for 3D-estimation, object recognition, etc.
@incollection{diva2:269988,
author = {Johansson, Björn and Granlund, Gösta},
title = {{Fast selective detection of rotational symmetries using normalized inhibition}},
booktitle = {Proceedings of the 6th European Conference on Computer Vision, Dublin, Ireland, June 26 - July 1, Part I},
year = {2000},
pages = {871--887},
publisher = {Springer},
address = {London},
}
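A rough sketch of the first two steps follows (Python/SciPy; the filter window, symmetry order and all names are our own assumptions, and the normalized inhibition step is omitted): the local orientation is computed in double-angle form and then correlated with a complex rotational symmetry filter:

    import numpy as np
    from scipy.ndimage import gaussian_filter
    from scipy.signal import fftconvolve

    def rotational_symmetry_response(img, n=2, grad_sigma=1.0, radius=8):
        # step 1: double-angle orientation image z = (gx + i*gy)^2
        gx = gaussian_filter(img, grad_sigma, order=(0, 1))
        gy = gaussian_filter(img, grad_sigma, order=(1, 0))
        z = (gx + 1j * gy) ** 2
        # step 2: correlate with a windowed rotational symmetry filter w(r) * exp(i*n*phi)
        y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        r = np.hypot(x, y)
        w = np.exp(-0.5 * (r / (radius / 2.0)) ** 2) * (r > 0)
        b = w * np.exp(1j * n * np.arctan2(y, x))
        # correlation implemented as convolution with the flipped, conjugated filter
        return fftconvolve(z, np.conj(b[::-1, ::-1]), mode="same")

For n = 2 the response magnitude is large near the centres of circle-, star- and spiral-like patterns, with the response phase distinguishing between them; the separable 1D implementation and the inhibition stage described above are what make the full method fast and selective.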
Most of the processing in vision today uses spatially invariant operations. This gives efficient and compact computing structures, with the conventional convenient separation between data and operations. This also goes well with conventional Cartesian representation of data. Currently, there is a trend towards context dependent processing in various forms. This implies that operations will no longer be spatially invariant, but vary over the image dependent upon the image content. There are many ways in which such a contextual control can be implemented. Mechanisms can be added for the modification of operator behavior within the conventional computing structure. This has been done e.g. for the implementation of adaptive filtering. In order to obtain sufficient flexibility and power in the computing structure, it is necessary to go further than that. To achieve sufficiently good adaptivity, it is necessary to ensure that sufficiently complex control strategies can be represented. It is becoming increasingly apparent that this cannot be achieved through prescription or program specification of rules, the reason being that these rules will be dauntingly complex and cannot be dealt with in sufficient detail. The requirement of spatially variant processing in turn implies the requirement of a spatially variant information representation; otherwise a sufficiently effective and flexible contextual control cannot be implemented. This paper outlines a new structure for effective space variant processing. It utilises a new type of localized information representation, which can be viewed as outputs from band pass filters such as wavelets. A unique and important feature is that convex regions can be built up from a single layer of associating nodes. The specification of operations is made through learning or action controlled association.
@incollection{diva2:246014,
author = {Granlund, Gösta H.},
title = {{An Associative Perception-Action Structure using a Localized Space Variant Information Representation}},
booktitle = {Algebraic Frames for the Perception-Action Cycle},
year = {2000},
pages = {48--68},
publisher = {Springer},
}
@incollection{diva2:275341,
author = {Ulvklo, Morgan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Texture Gradient in Sparse Texture Fields}},
booktitle = {Theory and Applications of Image Analysis II},
year = {1996},
publisher = {World Scientific},
}
This chapter deals with texture analysis, an important application of the methods described in earlier chapters. It introduces ideas from preattentive vision, which gives clues for the extraction of texture primitives. There is also a discussion on how to handle features whose significance varies with spatial position.
@incollection{diva2:405418,
author = {Ulvklo, Morgan},
title = {{Texture Analysis}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {399--418},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter is not original, but presents methods for linear classification in the tradition of N. J. Nilsson as well as R. O. Duda and P. E. Hart. Part of the motivation for including this well-known material is to allow the vision structure to be brought to a logical conclusion in which feature properties are combined to form responses or class statements. Another motivation developed here is to display the similarity in structure between convolution operations and linear discriminant functions. This brings all operations for feature extraction and classification to the use of a common component, linear discriminants. This is also illustrated in the form of perceptrons, which allows a transition to the modern theory of neural networks.
@incollection{diva2:405416,
author = {Granlund, Gösta H. and Karlholm, Jörgen},
title = {{Classification and Response Generation}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {367--397},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter discusses techniques for processing of higher order data such as vector and tensor fields. As abstraction implies a more complex descriptor, developing methods for processing of higher order data is an essential part of any hierarchical or layered approach to vision. The chapter focuses on models for extracting local symmetries and discontinuities in higher order fields.
@incollection{diva2:405412,
author = {Westin, Carl-Fredrik},
title = {{Vector and Tensor Field Filtering}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {343--365},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter presents a computationally efficient technique for adaptive filtering of n-dimensional signals.
The approach is based on the local signal description given by the orientation tensor discussed in Chapter 6. The adaptive filter output is synthesized as a tensor-controlled weighted summation of shift-invariant filter outputs. Several examples of adaptive filtering in two and three dimensions are given. The chapter contains original results on the extension of the techniques to n dimensions.
@incollection{diva2:405410,
author = {Knutsson, Hans and Haglund, Leif},
title = {{Adaptive Filtering}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {309--342},
publisher = {Kluwer},
address = {Dordrecht},
}
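A minimal sketch of the general idea described in the adaptive-filtering chapter above: the adaptive output is a per-pixel weighted sum of fixed, shift-invariant filter outputs, where the weights would come from a local (orientation tensor based) signal description. The weighting here is supplied as a plain array and the filters are toy derivative kernels; this is an illustrative simplification under those assumptions, not the book's actual algorithm.

import numpy as np
from scipy.ndimage import convolve

def adaptive_combine(image, kernels, weight_maps):
    """Weighted per-pixel combination of fixed (shift-invariant) filter outputs.

    kernels     : list of K small 2D arrays (the fixed filter bank)
    weight_maps : array (K, H, W) of per-pixel weights, e.g. derived from a
                  local orientation tensor (here simply given as input).
    """
    outputs = np.stack([convolve(image, k, mode="nearest") for k in kernels])
    return np.sum(weight_maps * outputs, axis=0)

# Toy usage with two derivative-like kernels and uniform weights.
img = np.random.default_rng(0).normal(size=(64, 64))
kx = np.array([[-1.0, 0.0, 1.0]])
ky = kx.T
w = np.full((2, 64, 64), 0.5)
result = adaptive_combine(img, [kx, ky], w)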
This chapter considers what the important properties are for an information representation to behave well in various transformations. There is an extended discussion on the necessity to separate between class membership and certainty of a signal.
@incollection{diva2:405408,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Representation and Averaging}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {297--308},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter deals with the estimation of local frequency and bandwidth. Local frequency is an important concept which provides an indication of the appropriate range of scales for subsequent analysis. A number of one-dimensional and two-dimensional examples of local frequency and bandwidth estimation are given.
@incollection{diva2:405405,
author = {Knutsson, Hans and Westin, Carl-Fredrik},
title = {{Local Frequency}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {279--295},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter deals with the concept of phase and phase representation in multiple dimensions. Phase is an important concept, which emerges in several contexts in vision. The chapter provides a detailed treatment of phase properties in various situations and deals with how to maintain continuity in phase representation. An example is given of how local phase at different scales can be used for disparity estimation. The chapter contains original material on the representation of phase for signals in three dimensions and higher dimensions.
@incollection{diva2:405404,
author = {Westelius, Carl-Johan},
title = {{Local Phase Estimation}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {259--278},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter introduces the use of tensors in estimation of local structure and orientation. The tensor representation is shown to be crucial to unambiguous and continuous representation of local orientation in multiple dimensions. In addition to orientation the tensor representation also conveys the degree and type of local anisotropy. The orientation estimation approach is developed in detail for two, three and four dimensions and is shown to be extendable to higher dimensions. Examples and performance measures are given for processing of images, volumes and time sequences.
@incollection{diva2:405403,
author = {Knutsson, Hans and Andersson, Mats and Haglund, Leif and Wiklund, Johan},
title = {{Orientation and Velocity}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {219--258},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter presents a method for obtaining an optimal n-dimensional set of filter coefficients for any given frequency response. An optimality criterion is defined that enables different frequencies to be given individual weights. Appropriate forms of frequency weight functions are discussed and a number of optimization examples are given.
@incollection{diva2:405402,
author = {Knutsson, Hans},
title = {{Kernel Optimization}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {199--218},
publisher = {Kluwer},
address = {Dordrecht},
}
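One common reading of the kernel optimization problem described above is a weighted least-squares fit: find spatial filter coefficients whose frequency response best matches a desired response under a per-frequency weighting. The 1D sketch below follows that reading; the frequency grid, weighting function, desired response, and the assumption of a real, even kernel are all invented for the example and are not the chapter's actual formulation.

import numpy as np

def optimize_kernel(n_taps, freqs, desired, weights):
    """Solve min_c || W (F c - desired) ||^2 for real, even kernel coefficients c.

    freqs   : sampled frequencies in [-pi, pi]
    desired : desired (real) frequency response at those frequencies
    weights : per-frequency importance weights
    """
    taps = np.arange(n_taps) - (n_taps - 1) / 2      # centered tap positions
    F = np.cos(np.outer(freqs, taps))                # real/even basis: F[k, n] = cos(w_k * n)
    w_sqrt = np.sqrt(weights)
    c, *_ = np.linalg.lstsq(w_sqrt[:, None] * F, w_sqrt * desired, rcond=None)
    return c

freqs = np.linspace(-np.pi, np.pi, 257)
desired = (np.abs(freqs) < 1.0).astype(float)        # ideal low-pass, toy target
weights = 1.0 / (np.abs(freqs) + 0.1)                # emphasize low frequencies
kernel = optimize_kernel(9, freqs, desired, weights)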
This chapter is on Fourier methods, with a particular emphasis on definitions and theorems essential to the understanding of filtering procedures in multi-dimensional spaces. This is a central issue in computer vision.
@incollection{diva2:405398,
author = {Nordberg, Klas},
title = {{Fourier Transforms}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {117--197},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter gives an introductory treatment of operations and representations for low-level features in multi-dimensional spaces. An important issue is how to combine contributions from several filters to provide robust statements in accordance with certain low-level models. This chapter gives an introduction to the problems of unambiguous mappings in multi-dimensional spaces.
@incollection{diva2:405376,
author = {Granlund, Gösta H. and Wiklund, Johan},
title = {{Low Level Operations}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {97--116},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter gives an overview of important biological vision mechanisms. Although a great deal is known about neural processing of visual information, most essential questions about biological vision remain as yet unanswered. Nonetheless, the knowledge available has already provided useful guidance to the organization of effective machine vision systems.
@incollection{diva2:405374,
author = {Granlund, Gösta H. and Karlholm, Jörgen and Westelius, Carl-Johan and Westin, Carl-Fredrik},
title = {{Biological Vision}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {41--95},
publisher = {Kluwer},
address = {Dordrecht},
}
This chapter establishes the motivation and use of hierarchical operation structures to provide a systematic organization for the implementation of complicated models. The chapter gives an intuitive treatment of most aspects that are considered in the later chapters.
@incollection{diva2:405373,
author = {Granlund, Gösta H.},
title = {{Introduction and Overview}},
booktitle = {Signal Processing for Computer Vision},
year = {1995},
pages = {1--39},
publisher = {Kluwer},
address = {Dordrecht},
}
@incollection{diva2:275350,
author = {Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta},
title = {{Low Level Focus of Attention Mechanisms}},
booktitle = {Vision as Process},
year = {1995},
publisher = {Springer},
address = {Berlin},
}
@incollection{diva2:275348,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Line Extraction using Tensors}},
booktitle = {Vision as Process},
year = {1995},
publisher = {Springer},
address = {Berlin},
}
We apply the 3D-orientation tensor representation to construct an object tracking algorithm. 2D-line normal flow is estimated by computing the eigenvector associated with the largest eigenvalue of 3D (two spatial dimensions plus time) tensors with a planar structure. The object's true 2D velocity is computed by averaging tensors with consistent normal flows, generating a 3D line representation that corresponds to a 2D point in motion. Flow induced by camera rotation is compensated for by ignoring points with velocity consistent with the ego-rotation. A region-of-interest growing process based on motion consistency generates estimates of object size and position. The literature on optical flow estimation is vast. Descriptions and performance studies of a number of different techniques are given in and the monographs by Fleet and Jahne. We will only briefly describe the particular methods used in the present study. Details on the tensor field representation a...
@incollection{diva2:275346,
author = {Karlholm, Jörgen and Westelius, Carl-Johan and Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Object Tracking Based on the Orientation Tensor Concept}},
booktitle = {Theory and Applications of Image Analysis II},
year = {1995},
pages = {267--278},
publisher = {World Scientific Publishing},
address = {Singapore},
}
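The first step described in the abstract above, estimating normal flow from the eigenvector of the largest eigenvalue of a 3D spatio-temporal tensor with planar structure, can be sketched as follows. The tensor is simply given as a symmetric 3x3 array here; how it is estimated from quadrature filters is outside this sketch, and the toy tensor at the end is constructed for illustration only.

import numpy as np

def normal_flow_from_tensor(T):
    """Given a 3x3 spatio-temporal orientation tensor T (planar structure
    assumed), return the 2D normal velocity of the moving line.

    The eigenvector e = (ex, ey, et) of the largest eigenvalue is the plane
    normal; the normal velocity is -et * (ex, ey) / (ex^2 + ey^2)."""
    vals, vecs = np.linalg.eigh(T)      # eigenvalues in ascending order
    ex, ey, et = vecs[:, -1]            # eigenvector of the largest eigenvalue
    denom = ex**2 + ey**2
    if denom < 1e-12:
        return np.zeros(2)              # purely temporal structure: undefined
    return -et * np.array([ex, ey]) / denom

# Toy tensor for a line moving with normal velocity (0.5, 0): the plane normal
# is proportional to (1, 0, -0.5) (spatial gradient direction and temporal slope).
n = np.array([1.0, 0.0, -0.5]); n /= np.linalg.norm(n)
T = np.outer(n, n)
print(normal_flow_from_tensor(T))       # approximately [0.5, 0.]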
@incollection{diva2:275344,
author = {Westin, Carl-Fredrik and Granlund, Gösta and Knutsson, Hans},
title = {{Advanced Image Processing: Introduction and Background}},
booktitle = {Vision as Process},
year = {1995},
publisher = {Springer},
address = {Berlin},
}
The problem of estimating depth information from two or more images of a scene is one which has received considerable attention over the years and a wide variety of methods have been proposed to solve it [Barnard and Fischler, 1982; Fleck, 1991]. Methods based on correlation and methods using some form of feature matching between the images have found most widespread use. Of these, the latter have attracted increasing attention since the work of Marr [Marr, 1982], in which the features are zero-crossings on varying scales. These methods share an underlying basis of spatial domain operations.
In recent years, however, increasing interest has been shown in computational models of vision based primarily on a localized frequency domain representation - the Gabor representation [Gabor, 1946; Adelson and Bergen, 1985], first suggested in the context of computer vision by Granlund [Granlund, 1978].
@incollection{diva2:274818,
author = {Westelius, Carl-Johan and Knutsson, Hans and Wiklund, Johan and Westin, Carl-Fredrik},
title = {{Phase-based Disparity Estimation}},
booktitle = {Vision as Process},
year = {1995},
pages = {157--178},
publisher = {Springer-Verlag},
address = {Berlin},
}
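The basic phase-based disparity principle underlying the chapter above is that the disparity between left and right signals is approximately the local phase difference divided by the local frequency. The 1D sketch below illustrates only that principle, using an invented complex Gabor-like kernel and a fixed nominal frequency; the chapter's actual quadrature filters and multi-scale scheme are more elaborate.

import numpy as np

def gabor_filter(x, wavelength=8.0, sigma=4.0):
    return np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * x / wavelength)

def phase_disparity(left, right, wavelength=8.0, sigma=4.0):
    """Estimate per-sample disparity from the phase difference of quadrature
    filter responses: d ~ (phi_left - phi_right) / w0."""
    x = np.arange(-16, 17)
    g = gabor_filter(x, wavelength, sigma)
    ql = np.convolve(left, g, mode="same")
    qr = np.convolve(right, g, mode="same")
    dphi = np.angle(ql * np.conj(qr))       # wrapped phase difference
    w0 = 2 * np.pi / wavelength             # nominal local frequency
    return dphi / w0

# 1D toy example: the right signal is the left signal shifted by 2 samples.
t = np.arange(256)
left = np.sin(2 * np.pi * t / 8.0)
right = np.roll(left, 2)
d = phase_disparity(left, right)
print(np.median(d[32:-32]))                 # close to 2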
A framework for computer-aided analysis of mammograms is described. General computer vision algorithms are combined with application specific procedures in a hierarchical fashion. The system is under development and is currently limited to detection of a few types of suspicious areas. The image features are extracted by using feature extraction methods where wavelet techniques are utilized. A low-pass pyramid representation of the image is convolved with a number of quadrature filters. The filter outputs are combined according to simple local Fourier domain models into parameters describing the local neighborhood with respect to the model. This produces estimates for each pixel describing local size, orientation, Fourier phase, and shape with confidence measures associated with each parameter. Tentative object descriptions are then extracted from the pixel-based features by application-specific procedures with knowledge of relevant structures in mammograms. The orientation, relative brightness and shape of the object are obtained by selection of the pixel feature estimates which best describe the object. The list of object descriptions is examined by procedures, where each procedure corresponds to a specific type of suspicious area, e.g. clusters of microcalcifications.
@incollection{diva2:275345,
author = {Bårman, Håkan and Granlund, Gösta H. and Haglund, Leif},
title = {{Feature Extraction for Computer-Aided Analysis of Mammograms}},
booktitle = {State of the Art in Digital Mammographic Image Analysis},
year = {1994},
publisher = {World Scientific Publishing Co. Ltd},
address = {Singapore},
}
The Application Visualization System software from Advanced Visual Systems Inc. is an interactive visualization environment for scientists, engineers and technical professionals. This report contains a short overview of the AVS software package and a discussion of its general performance. The software package has been actively used at the Computer Vision Laboratory, Linköping University, during the last three years. The AVS package has been used in many applications. Examples are generating images from a virtual environment, simulation of a controllable robot with a stereo camera head, and visualization of multidimensional data structures. Lately, we have also used AVS for handling communication between different processes which may be distributed on different machines. AVS was primarily developed as a tool for visualization of complex data sets. However, another important aspect of the software is that it can be used as an advanced workbench for controlling networks of Unix processes (including external ones on different machine types) using simple visual programming.
@incollection{diva2:274829,
author = {Westelius, Carl-Johan and Wiklund, Johan and Westin, Carl-Fredrik},
title = {{Prototyping, Visualization and Simulation Using the Application Visualization System}},
booktitle = {Experimental Environments for Computer Vision and Image Processing},
year = {1994},
pages = {33--62},
publisher = {World Scientific Publishing Co. Pte. Ltd.},
address = {Singapore},
}
@incollection{diva2:275349,
author = {Haglund, Leif and Bårman, Håkan and Knutsson, Hans},
title = {{Estimation of Velocity and Acceleration in Time Sequences}},
booktitle = {Theory \& Applications of Image Analysis},
year = {1992},
pages = {223--236},
publisher = {World Scientific Publishing Co},
address = {Singapore},
}
@incollection{diva2:275343,
author = {Andersson, Mats and Knutsson, Hans},
title = {{Orientation Estimation in Ambiguous Neighbourhoods}},
booktitle = {Theory \& Applications of Image Analysis},
year = {1992},
pages = {189--210},
publisher = {World Scientific Publishing Co},
address = {Singapore},
}
@incollection{diva2:275355,
author = {Granlund, Gösta H. and Arvidsson, Jan},
title = {{The GOP Image Computer}},
booktitle = {Fundamentals in Computer Vision},
year = {1983},
publisher = {Cambridge University Press},
address = {Cambridge},
}
@incollection{diva2:275353,
author = {Granlund, Gösta H. and Knutsson, Hans},
title = {{Contrast of Structured and Homogenous Representations}},
booktitle = {Physical and Biological Processing of Images},
year = {1983},
pages = {282--303},
publisher = {Springer Verlag},
address = {Berlin},
}
@incollection{diva2:275351,
author = {Granlund, Gösta H. and Knutsson, Hans and Wilson, Roland},
title = {{Image Enhancement}},
booktitle = {Fundamentals in Computer Vision},
year = {1983},
pages = {57--68},
publisher = {Cambridge University Press},
address = {Cambridge},
}
@incollection{diva2:275339,
author = {Knutsson, Hans and Edholm, Paul and Granlund, Gösta H.},
title = {{Aspects of 3-D Reconstruction by Fourier Techniques}},
booktitle = {Digital Signal Processing},
year = {1980},
publisher = {Academic Press},
address = {London},
}
@incollection{diva2:275340,
author = {Granlund, Gösta H.},
title = {{The Use of Distribution Functions to Describe Integrated Profiles of Human Chromosomes}},
booktitle = {Chromosome Identification, Proceedings of the 23rd Nobel Symposium},
year = {1973},
publisher = {Academic Press},
address = {New York},
}
Conference papers
Image-level weakly-supervised semantic segmentation (WSSS) reduces the usually vast data annotation cost by relying on surrogate segmentation masks during training. The typical approach involves training an image classification network using global average pooling (GAP) on convolutional feature maps. This enables the estimation of object locations based on class activation maps (CAMs), which identify the importance of image regions. The CAMs are then used to generate pseudo-labels, in the form of segmentation masks, to supervise a segmentation model in the absence of pixel-level ground truth. Our work is based on two techniques for improving CAMs: importance sampling, which is a substitute for GAP, and the feature similarity loss, which utilizes a heuristic that object contours almost always align with color edges in images. However, both are based on the multinomial posterior with softmax, and implicitly assume that classes are mutually exclusive, which turns out to be suboptimal in our experiments. Thus, we reformulate both techniques based on binomial posteriors of multiple independent binary problems. This has two benefits: their performance is improved and they become more general, resulting in an add-on method that can boost virtually any WSSS method. This is demonstrated on a wide variety of baselines on the PASCAL VOC dataset, improving the region similarity and contour quality of all implemented state-of-the-art methods. Experiments on the MS COCO dataset further show that our proposed add-on is well-suited for large-scale settings. Our code implementation is available at https://github.com/arvijj/hfpl.
@inproceedings{diva2:1851761,
author = {Jonnarth, Arvi and Zhang, Yushan and Felsberg, Michael},
title = {{High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation}},
booktitle = {2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year = {2024},
pages = {999--1008},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
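The reformulation described in the abstract above, replacing a multinomial (softmax) posterior with independent binomial posteriors per class, amounts to scoring each class with its own sigmoid and a binary cross-entropy term rather than one softmax cross-entropy over mutually exclusive classes. The sketch below only illustrates that general switch; it is not the authors' training code, and the shapes and labels are invented.

import torch
import torch.nn.functional as F

logits = torch.randn(4, 20)                 # (batch, num_classes) image-level class scores
targets = torch.zeros(4, 20)
targets[:, [3, 7]] = 1.0                    # multi-label ground truth (classes can co-occur)

# Multinomial view: softmax assumes exactly one class is present per image.
multinomial_post = logits.softmax(dim=1)

# Binomial view: each class gets an independent Bernoulli posterior, so
# several classes can have high probability at the same time.
binomial_post = logits.sigmoid()
loss = F.binary_cross_entropy_with_logits(logits, targets)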
The field of visual object tracking is dominated by methods that combine simple tracking algorithms and ad hoc schemes. Probabilistic tracking algorithms, which are leading in other fields, are surprisingly absent from the leaderboards. We found that accounting for distance in target kinematics, exploiting detector confidence and modelling non-uniform clutter characteristics are critical for a probabilistic tracker to work in visual tracking. Previous probabilistic methods fail to address most or all of these aspects, which we believe is why they fall so far behind current state-of-the-art (SOTA) methods (there are no probabilistic trackers in the MOT17 top 100). To rekindle progress among probabilistic approaches, we propose a set of pragmatic models addressing these challenges, and demonstrate how they can be incorporated into a probabilistic framework. We present BASE (Bayesian Approximation Single-hypothesis Estimator), a simple, performant and easily extendible visual tracker, achieving state-of-the-art (SOTA) on MOT17 and MOT20, without using Re-Id. Code available at https://github.com/ffi-no/paper-base-visapp-2024.
@inproceedings{diva2:1843135,
author = {Larsen, Martin and Rolfsfjord, Sigmund and Gusland, Daniel and Ahlberg, Jörgen and Mathiassen, Kim},
title = {{BASE: Probably a Better Approach to Visual Multi-Object Tracking}},
booktitle = {Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Rome, Italy, 2024},
year = {2024},
series = {VISIGRAPP},
pages = {110--121},
publisher = {SciTePress},
}
Taking a better look at subjects of interest helps humans to improve confidence in their age estimation. Unlike still images, sequences offer spatio-temporal dynamic information that contains many cues related to age progression. A review of previous work on video-based age estimation indicates that this is an underexplored field of research. This may be caused by a lack of well-defined and publicly accessible video benchmark protocol, as well as the absence of video-oriented training data. To address the former issue, we propose a carefully designed video age estimation benchmark protocol and make it publicly available. To address the latter issue, we design a video-specific age estimation method that leverages pseudo-labeling and semi-supervised learning. Our results show that the proposed method outperforms image-based baselines on both offline and online benchmark protocols, while the online estimation stability is improved by more than 50%.
@inproceedings{diva2:1843081,
author = {Be\v{s}eni\'{c}, Kre\v{s}imir and Ahlberg, Jörgen and Pandži\'{c}, Igor},
title = {{Let Me Take a Better Look: Towards Video-Based Age Estimation}},
booktitle = {Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - ICPRAM, Rome, Italy},
year = {2024},
series = {ICPRAM},
pages = {57--59},
}
Magnetic resonance imaging (MRI) is a slow diagnostic technique due to its time-consuming acquisition speed. To address this, parallel imaging and compressed sensing methods were developed. Parallel imaging acquires multiple anatomy views simultaneously, while compressed sensing acquires fewer samples than traditional methods. However, reconstructing images from undersampled multi-coil data remains challenging. Existing methods concatenate input slices and adjacent slices along the channel dimension to gather more information for MRI reconstruction. Implicit feature alignment within adjacent slices is crucial for optimal reconstruction performance. Hence, we propose MFormer: an accelerated MRI reconstruction transformer with cascading MFormer blocks containing multi-scale Dynamic Deformable Swin Transformer (DST) modules. Unlike other methods, our DST modules implicitly align adjacent slice features using dynamic deformable convolution and extract local and non-local features before merging information. We adapt to input variations by aggregating deformable convolution kernel weights and biases through a dynamic weight predictor. Extensive experiments on Stanford2D, Stanford3D, and large-scale FastMRI datasets show the merits of our contributions, achieving state-of-the-art MRI reconstruction performance. Our code and models are available at https://github.com/wafaAlghallabi/MFomer.
@inproceedings{diva2:1825526,
author = {Alghallabi, Wafa and Dudhane, Akshay and Zamir, Waqas and Khan, Salman and Khan, Fahad},
title = {{Accelerated MRI Reconstruction via Dynamic Deformable Alignment Based Transformer}},
booktitle = {MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT I},
year = {2024},
series = {Lecture Notes in Computer Science},
pages = {104--114},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
We present a method to efficiently generate 3D-aware high-resolution images that are view-consistent across multiple target views. The proposed multiplane neural radiance model, named GMNR, consists of a novel α-guided view-dependent representation (α-VdR) module for learning view-dependent information. The α-VdR module, facilitated by an α-guided pixel sampling technique, computes the view-dependent representation efficiently by learning viewing direction and position coefficients. Moreover, we propose a view-consistency loss to enforce photometric similarity across multiple views. The GMNR model can generate 3D-aware high-resolution images that are view-consistent across multiple camera poses, while maintaining the computational efficiency in terms of both training and inference time. Experiments on three datasets demonstrate the effectiveness of the proposed modules, leading to favorable results in terms of both generation quality and inference time, compared to existing approaches. Our GMNR model generates 3D-aware images of 1024 x 1024 pixels with 17.6 FPS on a single V100. Code: https://github.com/VIROBO-15/GMNR
@inproceedings{diva2:1852131,
author = {Kumar, Amandeep and Bhunia, Ankan Kumar and Narayan, Sanath and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Salman and Yang, Ming-Hsuan and Khan, Fahad},
title = {{Generative Multiplane Neural Radiance for 3D-Aware Image Generation}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV},
year = {2023},
series = {IEEE International Conference on Computer Vision},
pages = {7354--7364},
publisher = {IEEE COMPUTER SOC},
}
3D instance segmentation has recently garnered increased attention. Typical deep learning methods adopt point grouping schemes followed by hand-designed geometric clustering. Inspired by the success of transformers for various 3D tasks, newer hybrid approaches have utilized transformer decoders coupled with convolutional backbones that operate on voxelized scenes. However, due to the nature of sparse feature backbones, the extracted features provided to the transformer decoder are lacking in spatial understanding. Thus, such approaches often predict spatially separate objects as single instances. To this end, we introduce a novel approach for 3D point cloud instance segmentation that addresses the challenge of generating distinct instance masks for objects that share similar appearances but are spatially separated. Our method leverages spatial and semantic supervision with query refinement to improve the performance of hybrid 3D instance segmentation models. Specifically, we provide the transformer block with spatial features to facilitate differentiation between similar object queries and incorporate semantic supervision to enhance prediction accuracy based on object class. Our proposed approach outperforms existing methods on the validation sets of ScanNet V2 and ScanNet200 datasets, establishing a new state-of-the-art for this task.
@inproceedings{diva2:1852130,
author = {Al Khatib, Salwa and Boudjoghra, Mohamed El Amine and Lahoud, Jean and Khan, Fahad},
title = {{3D Instance Segmentation via Enhanced Spatial and Semantic Supervision}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV},
year = {2023},
series = {IEEE International Conference on Computer Vision},
pages = {541--550},
publisher = {IEEE COMPUTER SOC},
}
Recent video recognition models utilize Transformer models for long-range spatio-temporal context modeling. Video transformer designs are based on self-attention that can model global context at a high computational cost. In comparison, convolutional designs for videos offer an efficient alternative but lack long-range dependency modeling. Towards achieving the best of both designs, this work proposes Video-FocalNet, an effective and efficient architecture for video recognition that models both local and global contexts. Video-FocalNet is based on a spatio-temporal focal modulation architecture that reverses the interaction and aggregation steps of self-attention for better efficiency. Further, the aggregation step and the interaction step are both implemented using efficient convolution and element-wise multiplication operations that are computationally less expensive than their self-attention counterparts on video representations. We extensively explore the design space of focal modulation-based spatio-temporal context modeling and demonstrate our parallel spatial and temporal encoding design to be the optimal choice. Video-FocalNets perform favorably well against the state-of-the-art transformer-based models for video recognition on five large-scale datasets (Kinetics-400, Kinetics-600, SS-v2, Diving-48, and ActivityNet-1.3) at a lower computational cost. Our code/models are released at https://github.com/TalalWasim/Video-FocalNets.
@inproceedings{diva2:1852001,
author = {Wasim, Syed Talal and Khattak, Muhammad Uzair and Naseer, Muzammal and Khan, Salman and Shah, Mubarak and Khan, Fahad},
title = {{Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023)},
year = {2023},
series = {IEEE International Conference on Computer Vision},
pages = {13732--13743},
publisher = {IEEE COMPUTER SOC},
}
Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. Conventionally trained using the task-specific objective, i.e., cross-entropy loss, prompts tend to overfit downstream data distributions and find it challenging to capture task-agnostic general features from the frozen CLIP. This leads to the loss of the model's original generalization capability. To address this issue, our work introduces a self-regularization framework for prompting called PromptSRC (Prompting with Self-regulating Constraints). PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations using a three-pronged approach by: (a) regulating prompted representations via mutual agreement maximization with the frozen model, (b) regulating with self-ensemble of prompts over the training trajectory to encode their complementary strengths, and (c) regulating with textual diversity to mitigate sample diversity imbalance with the visual branch. To the best of our knowledge, this is the first regularization framework for prompt learning that avoids overfitting by jointly attending to pre-trained model features, the training trajectory during prompting, and the textual diversity. PromptSRC explicitly steers the prompts to learn a representation space that maximizes performance on downstream tasks without compromising CLIP generalization. We perform extensive experiments on 4 benchmarks where PromptSRC overall performs favorably well compared to the existing methods. Our code and pre-trained models are publicly available at: https://github.com/muzairkhattak/PromptSRC.
@inproceedings{diva2:1851995,
author = {Khattak, Muhammad Uzair and Wasim, Syed Talal and Naseer, Muzammal and Khan, Salman and Yang, Ming-Hsuan and Khan, Fahad},
title = {{Self-regulating Prompts: Foundational Model Adaptation without Forgetting}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023)},
year = {2023},
series = {IEEE International Conference on Computer Vision},
pages = {15144--15154},
publisher = {IEEE COMPUTER SOC},
}
Ensembles of independently trained deep neural networks yield uncertainty estimates that rival Bayesian networks in performance. They also offer sizable improvements in terms of predictive performance over single models. However, deep ensembles are not commonly used in environments with a limited computational budget - such as autonomous driving - since the complexity grows linearly with the number of ensemble members. An important observation that can be made for robotics applications, such as autonomous driving, is that data is typically sequential. For instance, when an object is to be recognized, an autonomous vehicle typically observes a sequence of images, rather than a single image. This raises the question: could the deep ensemble be spread over time? In this work, we propose and analyze Deep Ensembles Spread Over Time (DESOT). The idea is to apply only a single ensemble member to each data point in the sequence, and fuse the predictions over a sequence of data points. We implement and experiment with DESOT for traffic sign classification, where sequences of tracked image patches are to be classified. We find that DESOT obtains the benefits of deep ensembles, in terms of predictive and uncertainty estimation performance, while avoiding the added computational cost. Moreover, DESOT is simple to implement and does not require sequences during training. Finally, we find that DESOT, like deep ensembles, outperforms single models for out-of-distribution detection.
@inproceedings{diva2:1847538,
author = {Meding, Isak and Bodin, Alexander and Tonderski, Adam and Johnander Fax\'{e}n, Joakim and Petersson, Christoffer and Svensson, Lennart},
title = {{You can have your ensemble and run it too - Deep Ensembles Spread Over Time}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW},
year = {2023},
series = {IEEE International Conference on Computer Vision Workshops},
pages = {4022--4031},
publisher = {IEEE COMPUTER SOC},
}
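The core mechanism in the DESOT abstract above, applying one ensemble member per frame of a sequence and fusing the predictions over time, can be sketched in a few lines. Everything here (cycling through members, fusing by averaging softmax probabilities) is an assumption chosen for illustration; consult the paper for the actual fusion rule.

import torch

def desot_predict(models, frames):
    """Apply ensemble member i to frame i (cycling through the members) and
    average the class probabilities over the sequence."""
    probs = []
    for i, frame in enumerate(frames):
        model = models[i % len(models)]     # one ensemble member per time step
        with torch.no_grad():
            probs.append(model(frame.unsqueeze(0)).softmax(dim=1))
    return torch.stack(probs).mean(dim=0)   # fused prediction for the whole track

# Hypothetical usage: `models` is a list of independently trained classifiers,
# `frames` a sequence of tracked image patches for one object.
# fused = desot_predict(models, frames)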
The Visual Object Tracking Segmentation VOTS2023 challenge is the eleventh annual tracker benchmarking activity of the VOT initiative. This challenge is the first to merge short-term and long-term as well as single-target and multiple-target tracking with segmentation masks as the only target location specification. A new dataset was created; the ground truth has been withheld to prevent overfitting. New performance measures and evaluation protocols have been created along with a new toolkit and an evaluation server. Results of the presented 47 trackers indicate that modern tracking frameworks are well-suited to deal with convergence of short-term and long-term tracking and that multiple and single target tracking can be considered a single problem. A leaderboard, with participating trackers details, the source code, the datasets, and the evaluation kit are publicly available at the challenge website(1).
@inproceedings{diva2:1847535,
author = {Kristan, Matej and Matas, Jiri and Danelljan, Martin and Felsberg, Michael and Chang, Hyung Jin and Zajc, Luka Cehovin and Lukezic, Alan and Drbohlav, Ondrej and Zhang, Zhongqun and Tran, Khanh-Tung and Vu, Xuan-Son and Bjorklund, Johanna and Mayer, Christoph and Zhang, Yushan and Ke, Lei and Zhao, Jie and Fernandez, Gustavo and Al-Shakarji, Noor and An, Dong and Arens, Michael and Becker, Stefan and Bhat, Goutam and Bullinger, Sebastian and Chan, Antoni B. and Chang, Shijie and Chen, Hanyuan and Chen, Xin and Chen, Yan and Chen, Zhenyu and Cheng, Yangming and Cui, Yutao and Deng, Chunyuan and Dong, Jiahua and Dunnhofer, Matteo and Feng, Wei and Fu, Jianlong and Gao, Jie and Han, Ruize and Hao, Zeqi and He, Jun-Yan and He, Keji and He, Zhenyu and Hu, Xiantao and Huang, Kaer and Huang, Yuqing and Jiang, Yi and Kang, Ben and Lan, Jin-Peng and Lee, Hyungjun and Li, Chenyang and Li, Jiahao and Li, Ning and Li, Wangkai and Li, Xiaodi and Li, Xin and Liu, Pengyu and Liu, Yue and Lu, Huchuan and Luo, Bin and Luo, Ping and Ma, Yinchao and Miao, Deshui and Micheloni, Christian and Palaniappan, Kannappan and Park, Hancheol and Paul, Matthieu and Peng, HouWen and Qian, Zekun and Rahmon, Gani and Scherer-Negenborn, Norbert and Shao, Pengcheng and Shin, Wooksu and Kazemi, Elham Soltani and Song, Tianhui and Stiefelhagen, Rainer and Sun, Rui and Tang, Chuanming and Tang, Zhangyong and Toubal, Imad Eddine and Valmadre, Jack and van de Weijer, Joost and Van Gool, Luc and Vira, Jash and Vujasinovic, Stephane and Wan, Cheng and Wan, Jia and Wang, Dong and Wang, Fei and Wang, Feifan and Wang, He and Wang, Limin and Wang, Song and Wang, Yaowei and Wang, Zhepeng and Wu, Gangshan and Wu, Jiannan and Wu, Qiangqiang and Wu, Xiaojun and Xiao, Anqi and Xie, Jinxia and Xu, Chenlong and Xu, Min and Xu, Tianyang and Xu, Yuanyou and Yan, Bin and Yang, Dawei and Yang, Ming-Hsuan and Yang, Tianyu and Yang, Yi and Yang, Zongxin and Yin, Xuanwu and Yu, Fisher and Yu, Hongyuan and Yu, Qianjin and Yu, Weichen and Yuan, YongSheng and Yuan, Zehuan and Zhang, Jianlin and Zhang, Lu and Zhang, Tianzhu and Zhao, Guodongfang and Zhao, Shaochuan and Zheng, Yaozong and Zhong, Bineng and Zhu, Jiawen and Zhu, Xuefeng and Zhuang, Yueting and Zong, ChengAo and Zuo, Kunlong},
title = {{The First Visual Object Tracking Segmentation VOTS2023 Challenge Results}},
booktitle = {2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW},
year = {2023},
series = {IEEE International Conference on Computer Vision Workshops},
pages = {1788--1810},
publisher = {IEEE COMPUTER SOC},
}
In this work, we propose a few-shot colorectal tissue image generation method for addressing the scarcity of histopathological training data for rare cancer tissues. Our few-shot generation method, named XM-GAN, takes one base and a pair of reference tissue images as input and generates high-quality yet diverse images. Within our XM-GAN, a novel controllable fusion block densely aggregates local regions of reference images based on their similarity to those in the base image, resulting in locally consistent features. To the best of our knowledge, we are the first to investigate few-shot generation in colorectal tissue images. We evaluate our few-shot colorectal tissue image generation by performing extensive qualitative, quantitative and subject specialist (pathologist) based evaluations. Specifically, in specialist-based evaluation, pathologists could differentiate between our XM-GAN generated tissue images and real images only 55% of the time. Moreover, we utilize these generated images as data augmentation to address the few-shot tissue image classification task, achieving a gain of 4.4% in terms of mean accuracy over the vanilla few-shot classifier. Code: https://github.com/VIROBO-15/XM-GAN.
@inproceedings{diva2:1830191,
author = {Kumar, Amandeep and Bhunia, Ankan Kumar and Narayan, Sanath and Cholakkal, Hisham and Anwer, Rao Muhammad and Laaksonen, Jorma and Khan, Fahad},
title = {{Cross-Modulated Few-Shot Image Generation for Colorectal Tissue Classification}},
booktitle = {MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT III},
year = {2023},
series = {Lecture Notes in Computer Science},
pages = {128--137},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
Accurate 3D mitochondria instance segmentation in electron microscopy (EM) is a challenging problem and serves as a prerequisite to empirically analyze their distributions and morphology. Most existing approaches employ 3D convolutions to obtain representative features. However, these convolution-based approaches struggle to effectively capture long-range dependencies in the volume mitochondria data, due to their limited local receptive field. To address this, we propose a hybrid encoder-decoder framework based on a split spatio-temporal attention module that efficiently computes spatial and temporal self-attentions in parallel, which are later fused through a deformable convolution. Further, we introduce a semantic foreground-background adversarial loss during training that aids in delineating the region of mitochondria instances from the background clutter. Our extensive experiments on three benchmarks, Lucchi, MitoEM-R and MitoEM-H, reveal the benefits of the proposed contributions achieving state-of-the-art results on all three datasets. Our code and models are available at https://github.com/OmkarThawakar/STT-UNET.
@inproceedings{diva2:1827916,
author = {Thawakar, Omkar and Anwer, Rao Muhammad and Laaksonen, Jorma and Reiner, Orly and Shah, Mubarak and Khan, Fahad},
title = {{3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers}},
booktitle = {MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VIII},
year = {2023},
series = {Lecture Notes in Computer Science},
pages = {613--623},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
Detecting breast lesions in videos is crucial for computer-aided diagnosis. Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operation. We argue that such a strategy struggles to effectively perform deep feature aggregation and ignores the useful local information. To tackle these issues, we propose a spatial-temporal deformable attention based framework, named STNet. Our STNet introduces a spatial-temporal deformable attention module to perform local spatial-temporal feature fusion. The spatial-temporal deformable attention module enables deep feature aggregation in each stage of both encoder and decoder. To further accelerate the detection speed, we introduce an encoder feature shuffle strategy for multi-frame prediction during inference. In our encoder feature shuffle strategy, we share the backbone and encoder features, and shuffle encoder features for the decoder to generate the predictions of multiple frames. The experiments on the public breast lesion ultrasound video dataset show that our STNet obtains a state-of-the-art detection performance, while operating at twice the inference speed. The code and model are available at https://github.com/AlfredQin/STNet.
@inproceedings{diva2:1827911,
author = {Qin, Chao and Cao, Jiale and Fu, Huazhu and Anwer, Rao Muhammad and Khan, Fahad},
title = {{A Spatial-Temporal Deformable Attention Based Framework for Breast Lesion Detection in Videos}},
booktitle = {MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT II},
year = {2023},
series = {Lecture Notes in Computer Science},
pages = {479--488},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
It is imperative to ensure the robustness of deep learning models in critical applications such as healthcare. While recent advances in deep learning have improved the performance of volumetric medical image segmentation models, these models cannot be deployed for real-world applications immediately due to their vulnerability to adversarial attacks. We present a 3D frequency domain adversarial attack for volumetric medical image segmentation models and demonstrate its advantages over conventional input or voxel domain attacks. Using our proposed attack, we introduce a novel frequency domain adversarial training approach for optimizing a robust model against voxel and frequency domain attacks. Moreover, we propose a frequency consistency loss to regulate our frequency domain adversarial training that achieves a better tradeoff between the model's performance on clean and adversarial samples. Code is available at https://github.com/asif-hanif/vafa.
@inproceedings{diva2:1827822,
author = {Hanif, Asif and Naseer, Muzammal and Khan, Salman and Shah, Mubarak and Khan, Fahad},
title = {{Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation}},
booktitle = {MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT II},
year = {2023},
series = {Lecture Notes in Computer Science},
pages = {457--467},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
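To make the notion of a frequency domain attack above concrete, the sketch below perturbs the Fourier coefficients of a 3D volume and transforms back to the voxel domain. This is a generic, random frequency-domain perturbation for illustration only, not the paper's optimized adversarial attack or its frequency consistency loss; the volume, epsilon, and value range are invented.

import torch

def frequency_perturb(volume, epsilon=0.05):
    """Generic frequency-domain perturbation of a 3D volume: perturb the
    Fourier coefficients, transform back, and keep values in a valid range."""
    spec = torch.fft.fftn(volume, dim=(-3, -2, -1))
    noise = torch.randn_like(spec.real) + 1j * torch.randn_like(spec.real)
    adv = torch.fft.ifftn(spec + epsilon * noise * spec.abs(), dim=(-3, -2, -1)).real
    return adv.clamp(0.0, 1.0)              # toy volume assumed to lie in [0, 1]

vol = torch.rand(1, 64, 64, 64)             # toy CT/MRI-like volume
adv_vol = frequency_perturb(vol)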
Adopting contrastive image-text pretrained models like CLIP towards video classification has gained attention due to its cost-effectiveness and competitive performance. However, recent works in this area face a trade-off. Fine-tuning the pretrained model to achieve strong supervised performance results in low zero-shot generalization. Similarly, freezing the backbone to retain zero-shot capability causes a significant drop in supervised accuracy. Because of this, recent works in the literature typically train separate models for supervised and zero-shot action recognition. In this work, we propose a multimodal prompt learning scheme that works to balance the supervised and zero-shot performance under a single unified training. Our prompting approach on the vision side caters for three aspects: 1) Global video-level prompts to model the data distribution; 2) Local frame-level prompts to provide per-frame discriminative conditioning; and 3) a summary prompt to extract a condensed video representation. Additionally, we define a prompting scheme on the text side to augment the textual context. Through this prompting scheme, we can achieve state-of-the-art zero-shot performance on Kinetics-600, HMDB51 and UCF101 while remaining competitive in the supervised setting. By keeping the pretrained backbone frozen, we optimize a much lower number of parameters and retain the existing general representation which helps achieve the strong zero-shot performance. Our codes/models will be released at https://github.com/TalalWasim/Vita-CLIP.
@inproceedings{diva2:1815364,
author = {Wasim, Syed Talal and Naseer, Muzammal and Khan, Salman and Khan, Fahad and Shah, Mubarak},
title = {{Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {23034--23044},
publisher = {IEEE COMPUTER SOC},
}
Burst image processing has become increasingly popular in recent years. However, it is a challenging task since individual burst images undergo multiple degradations and often have mutual misalignments resulting in ghosting and zipper artifacts. Existing burst restoration methods usually do not consider the mutual correlation and non-local contextual information among burst frames, which tends to limit these approaches in challenging cases. Another key challenge lies in the robust up-sampling of burst frames. The existing up-sampling methods cannot effectively utilize the advantages of single-stage and progressive up-sampling strategies with conventional and/or recent up-samplers at the same time. To address these challenges, we propose a novel Gated Multi-Resolution Transfer Network (GMTNet) to reconstruct a spatially precise high-quality image from a burst of low-quality raw images. GMTNet consists of three modules optimized for burst processing tasks: Multi-scale Burst Feature Alignment (MBFA) for feature denoising and alignment, Transposed-Attention Feature Merging (TAFM) for multi-frame feature aggregation, and Resolution Transfer Feature Up-sampler (RTFU) to up-scale merged features and construct a high-quality output image. Detailed experimental analysis on five datasets validates our approach and sets a new state-of-the-art for burst super-resolution, burst denoising, and low-light burst enhancement. Our codes and models are available at https://github.com/nanmehta/GMTNet.
@inproceedings{diva2:1815361,
author = {Mehta, Nancy and Dudhane, Akshay and Murala, Subrahmanyam and Zamir, Syed Waqas and Khan, Salman and Khan, Fahad},
title = {{Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {22201--22210},
publisher = {IEEE COMPUTER SOC},
}
Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts and require careful selection of prompt templates to perform well. Inspired by the Natural Language Processing (NLP) literature, recent CLIP adaptation approaches learn prompts as the textual inputs to fine-tune CLIP for downstream tasks. We note that using prompting to adapt representations in a single branch of CLIP (language or vision) is sub-optimal since it does not allow the flexibility to dynamically adjust both representation spaces on a downstream task. In this work, we propose Multi-modal Prompt Learning (MaPLe) for both vision and language branches to improve alignment between the vision and language representations. Our design promotes strong coupling between the vision-language prompts to ensure mutual synergy and discourages learning independent uni-modal solutions. Further, we learn separate prompts across different early stages to progressively model the stage-wise feature relationships to allow rich context learning. We evaluate the effectiveness of our approach on three representative tasks of generalization to novel classes, new target datasets and unseen domain shifts. Compared with the state-of-the-art method Co-CoOp, MaPLe exhibits favorable performance and achieves an absolute gain of 3.45% on novel classes and 2.72% on overall harmonic-mean, averaged over 11 diverse image recognition datasets. Our code and pre-trained models are available at https://github.com/muzairkhattak/multimodal-prompt-learning.
@inproceedings{diva2:1815359,
author = {Khattak, Muhammad Uzair and Rasheed, Hanoona and Maaz, Muhammad and Khan, Salman and Khan, Fahad},
title = {{MaPLe: Multi-modal Prompt Learning}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {19113--19122},
publisher = {IEEE COMPUTER SOC},
}
Many real-world recognition problems are characterized by long-tailed label distributions. These distributions make representation learning highly challenging due to limited generalization over the tail classes. If the test distribution differs from the training distribution, e.g. uniform versus long-tailed, the problem of the distribution shift needs to be addressed. A recent line of work proposes learning multiple diverse experts to tackle this issue. Ensemble diversity is encouraged by various techniques, e.g. by specializing different experts in the head and the tail classes. In this work, we take an analytical approach and extend the notion of logit adjustment to ensembles to form a Balanced Product of Experts (BalPoE). BalPoE combines a family of experts with different test-time target distributions, generalizing several previous approaches. We show how to properly define these distributions and combine the experts in order to achieve unbiased predictions, by proving that the ensemble is Fisher-consistent for minimizing the balanced error. Our theoretical analysis shows that our balanced ensemble requires calibrated experts, which we achieve in practice using mixup. We conduct extensive experiments and our method obtains new state-of-the-art results on three long-tailed datasets: CIFAR-100-LT, ImageNet-LT, and iNaturalist-2018. Our code is available at https://github.com/emasa/BalPoE-CalibratedLT.
@inproceedings{diva2:1815355,
author = {Sanchez Aimar, Emanuel and Jonnarth, Arvi and Felsberg, Michael and Kuhlmann, Marco},
title = {{Balanced Product of Calibrated Experts for Long-Tailed Recognition}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {19967--19977},
publisher = {IEEE COMPUTER SOC},
}
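The extension of logit adjustment to an ensemble, as described in the BalPoE abstract above, can be roughly illustrated as follows: each expert's logits are shifted by a term involving the log class prior of its own target distribution, and the adjusted experts are combined. The sketch below is a simplified reading with hypothetical priors and tau values; the paper's exact parameterization of the target distributions, the Fisher-consistency argument, and the mixup-based calibration are not reproduced here.

import torch

def adjusted_logits(logits, class_prior, tau=1.0):
    """Logit adjustment: shift logits by tau * log(prior) so that the implied
    posterior is re-balanced towards a chosen target distribution."""
    return logits + tau * torch.log(class_prior)

def ensemble_logits(expert_logits, class_prior, taus):
    """Average experts whose logits are adjusted with different tau values,
    i.e. different test-time target distributions (a simplified reading of a
    balanced product of experts)."""
    adjusted = [adjusted_logits(l, class_prior, t) for l, t in zip(expert_logits, taus)]
    return torch.stack(adjusted).mean(dim=0)

# Toy usage with an invented long-tailed prior and three experts.
prior = torch.tensor([0.7, 0.2, 0.1])
experts = [torch.randn(8, 3) for _ in range(3)]
logits = ensemble_logits(experts, prior, taus=[0.0, 1.0, 2.0])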
Deep neural networks (DNNs) have enabled astounding progress in several vision-based problems. Despite showing high predictive accuracy, recently, several works have revealed that they tend to provide overconfident predictions and thus are poorly calibrated. The majority of the works addressing the miscalibration of DNNs fall under the scope of classification and consider only in-domain predictions. However, there is little to no progress in studying the calibration of DNN-based object detection models, which are central to many vision-based safety-critical applications. In this paper, inspired by the train-time calibration methods, we propose a novel auxiliary loss formulation that explicitly aims to align the class confidence of bounding boxes with the accurateness of predictions (i.e. precision). Since the original formulation of our loss depends on the counts of true positives and false positives in a mini-batch, we develop a differentiable proxy of our loss that can be used during training with other application-specific loss functions. We perform extensive experiments on challenging in-domain and out-domain scenarios with six benchmark datasets including MS-COCO, Cityscapes, Sim10k, and BDD100k. Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios. Our source code and pre-trained models are available at https://github.com/akhtarvision/bpc_calibration
@inproceedings{diva2:1813979,
author = {Munir, Muhammad Akhtar and Khan, Muhammad Haris and Khan, Salman and Khan, Fahad},
title = {{Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {11474--11483},
publisher = {IEEE COMPUTER SOC},
}
Recent advances in 3D-aware generative models (3D-aware GANs) combined with Neural Radiance Fields (NeRF) have achieved impressive results. However, no prior work investigates 3D-aware GANs for 3D consistent multi-class image-to-image (3D-aware I2I) translation. Naively using 2D-I2I translation methods suffers from unrealistic shape/identity change. To perform 3D-aware multi-class I2I translation, we decouple this learning process into a multi-class 3D-aware GAN step and a 3D-aware I2I translation step. In the first step, we propose two novel techniques: a new conditional architecture and an effective training strategy. In the second step, based on the well-trained multi-class 3D-aware GAN architecture that preserves view-consistency, we construct a 3D-aware I2I translation system. To further reduce the view-consistency problems, we propose several new techniques, including a U-net-like adaptor network design, a hierarchical representation constraint and a relative regularization loss. In extensive experiments on two datasets, quantitative and qualitative results demonstrate that we successfully perform 3D-aware I2I translation with multi-view consistency. Code is available in 3DI2I.
@inproceedings{diva2:1813978,
author = {Li, Senmao and van de Weijer, Joost and Wang, Yaxing and Khan, Fahad and Liu, Meiqin and Yang, Jian},
title = {{3D-Aware Multi-Class Image-to-Image Translation with NeRFs}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {12652--12662},
publisher = {IEEE COMPUTER SOC},
}
Although existing semi-supervised learning models achieve remarkable success in learning with unannotated in-distribution data, they mostly fail to learn on unlabeled data sampled from novel semantic classes due to their closed-set assumption. In this work, we target a pragmatic but under-explored Generalized Novel Category Discovery (GNCD) setting. The GNCD setting aims to categorize unlabeled training data coming from known and novel classes by leveraging the information of partially labeled known classes. We propose a two-stage Contrastive Affinity Learning method with auxiliary visual Prompts, dubbed PromptCAL, to address this challenging problem. Our approach discovers reliable pairwise sample affinities to learn better semantic clustering of both known and novel classes for the class token and visual prompts. First, we propose a discriminative prompt regularization loss to reinforce semantic discriminativeness of prompt-adapted pre-trained vision transformer for refined affinity relationships. Besides, we propose contrastive affinity learning to calibrate semantic representations based on our iterative semi-supervised affinity graph generation method for semantically-enhanced supervision. Extensive experimental evaluation demonstrates that our PromptCAL method is more effective in discovering novel classes even with limited annotations and surpasses the current state-of-the-art on generic and fine-grained benchmarks (e.g., with nearly 11% gain on CUB-200, and 9% on ImageNet-100) on overall accuracy. Our code is available at https://github.com/sheng-eatamath/PromptCAL.
@inproceedings{diva2:1811916,
author = {Zhang, Sheng and Khan, Salman and Shen, Zhiqiang and Naseer, Muzammal and Chen, Guangyi and Khan, Fahad},
title = {{PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for Generalized Novel Category Discovery}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {3479--3488},
publisher = {IEEE COMPUTER SOC},
}
Large-scale multi-modal training with image-text pairs imparts strong generalization to the CLIP model. Since training on a similar scale for videos is infeasible, recent approaches focus on the effective transfer of image-based CLIP to the video domain. In this pursuit, new parametric modules are added to learn temporal information and inter-frame relationships which require meticulous design efforts. Furthermore, when the resulting models are learned on videos, they tend to overfit on the given task distribution and lack generalization. This begs the following question: How to effectively transfer image-level CLIP representations to videos? In this work, we show that a simple Video Fine-tuned CLIP (ViFi-CLIP) baseline is generally sufficient to bridge the domain gap from images to videos. Our qualitative analysis illustrates that the frame-level processing from the CLIP image-encoder followed by feature pooling and similarity matching with corresponding text embeddings helps in implicitly modeling the temporal cues within ViFi-CLIP. Such fine-tuning helps the model to focus on scene dynamics, moving objects and inter-object relationships. For low-data regimes where full fine-tuning is not viable, we propose a bridge and prompt approach that first uses fine-tuning to bridge the domain gap and then learns prompts on the language and vision sides to adapt CLIP representations. We extensively evaluate this simple yet strong baseline on zero-shot, base-to-novel generalization, few-shot and fully supervised settings across five video benchmarks. Our code and pre-trained models are available at https://github.com/muzairkhattak/ViFi-CLIP.
@inproceedings{diva2:1811912,
author = {Rasheed, Hanoona and Khattak, Muhammad Uzair and Maaz, Muhammad and Khan, Salman and Khan, Fahad},
title = {{Fine-tuned CLIP Models are Efficient Video Learners}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {6545--6554},
publisher = {IEEE COMPUTER SOC},
}
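The frame-level recipe described in the ViFi-CLIP abstract above (encode each frame with the CLIP image encoder, pool the frame features, and match the pooled video feature against text embeddings) can be sketched generically. Here image_encoder, text_features, and the temperature are placeholders for illustration, not the actual ViFi-CLIP interfaces.

import torch
import torch.nn.functional as F

def video_logits(image_encoder, frames, text_features, temperature=0.01):
    """frames: (T, 3, H, W) video clip; text_features: (C, D) class embeddings.

    Encode the frames independently, average-pool over time, and score classes
    by cosine similarity, mirroring the frame-level CLIP recipe described above."""
    with torch.no_grad():
        frame_feats = image_encoder(frames)                # (T, D) per-frame features
    video_feat = F.normalize(frame_feats.mean(dim=0), dim=-1)
    text_feats = F.normalize(text_features, dim=-1)
    return video_feat @ text_feats.t() / temperature       # (C,) class logits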
Most previous co-salient object detection works mainly focus on extracting co-salient cues via mining the consistency relations across images while ignoring explicit exploration of background regions. In this paper, we propose a Discriminative co-saliency and background Mining Transformer framework (DMT) based on several economical multi-grained correlation modules to explicitly mine both co-saliency and background information and effectively model their discrimination. Specifically, we first propose a region-to-region correlation module for introducing inter-image relations to pixel-wise segmentation features while maintaining computational efficiency. Then, we use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation and co-saliency token-to-token correlation modules. We also design a token-guided feature refinement module to enhance the discriminability of the segmentation features under the guidance of the learned tokens. We perform iterative mutual promotion for the segmentation feature extraction and token construction. Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method. The source code is available at: https://github.com/dragonlee258079/DMT.
@inproceedings{diva2:1811899,
author = {Li, Long and Han, Junwei and Zhang, Ni and Liu, Nian and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad},
title = {{Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {7247--7256},
publisher = {IEEE COMPUTER SOC},
}
On a shutter press, modern handheld cameras capture multiple images in rapid succession and merge them to generate a single image. However, individual frames in a burst are misaligned due to inevitable motions and contain multiple degradations. The challenge is to properly align the successive image shots and merge their complementary information to achieve high-quality outputs. Towards this direction, we propose Burstormer: a novel transformer-based architecture for burst image restoration and enhancement. In comparison to existing works, our approach exploits multi-scale local and non-local features to achieve improved alignment and feature fusion. Our key idea is to enable inter-frame communication in the burst neighborhood for information aggregation and progressive fusion while modeling the burst-wide context. However, the input burst frames need to be properly aligned before fusing their information. Therefore, we propose an enhanced deformable alignment module for aligning burst features with regard to the reference frame. Unlike existing methods, the proposed alignment module not only aligns burst features but also exchanges feature information and maintains focused communication with the reference frame through the proposed reference-based feature enrichment mechanism, which facilitates handling complex motions. After multi-level alignment and enrichment, we re-emphasize inter-frame communication within the burst using a cyclic burst sampling module. Finally, the inter-frame information is aggregated using the proposed burst feature fusion module followed by progressive upsampling. Our Burstormer outperforms state-of-the-art methods on burst super-resolution, burst denoising and burst low-light enhancement. Our codes and pre-trained models are available at https://github.com/akshaydudhane16/Burstormer.
@inproceedings{diva2:1811866,
author = {Dudhane, Akshay and Zamir, Syed Waqas and Khan, Salman and Khan, Fahad and Yang, Ming-Hsuan},
title = {{Burstormer: Burst Image Restoration and Enhancement Transformer}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5703--5712},
publisher = {IEEE COMPUTER SOC},
}
The pose-guided person image generation task requires synthesizing photorealistic images of humans in arbitrary poses. The existing approaches use generative adversarial networks that do not necessarily maintain realistic textures or need dense correspondences that struggle to handle complex deformations and severe occlusions. In this work, we show how denoising diffusion models can be applied for high-fidelity person image synthesis with strong sample diversity and enhanced mode coverage of the learnt data distribution. Our proposed Person Image Diffusion Model (PIDM) disintegrates the complex transfer problem into a series of simpler forward-backward denoising steps. This helps in learning plausible source-to-target transformation trajectories that result in faithful textures and undistorted appearance details. We introduce a texture diffusion module based on cross-attention to accurately model the correspondences between appearance and pose information available in source and target images. Further, we propose disentangled classifier-free guidance to ensure close resemblance between the conditional inputs and the synthesized output in terms of both pose and appearance information. Our extensive results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios. We also show how our generated images can help in downstream tasks. Code is available at https://github.com/ankanbhunia/PIDM.
@inproceedings{diva2:1811861,
author = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Laaksonen, Jorma and Shah, Mubarak and Khan, Fahad},
title = {{Person Image Synthesis via Denoising Diffusion Model}},
booktitle = {2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR},
year = {2023},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5968--5976},
publisher = {IEEE COMPUTER SOC},
}
Ensembles of independently trained deep neural networks yield uncertainty estimates that rival Bayesian networks in performance. They also offer sizable improvements in terms of predictive performance over single models. However, deep ensembles are not commonly used in environments with limited computational budget, such as autonomous driving, since the complexity grows linearly with the number of ensemble members. An important observation that can be made for robotics applications, such as autonomous driving, is that data is typically sequential. For instance, when an object is to be recognized, an autonomous vehicle typically observes a sequence of images, rather than a single image. This raises the question: could the deep ensemble be spread over time? In this work, we propose and analyze Deep Ensembles Spread Over Time (DESOT). The idea is to apply only a single ensemble member to each data point in the sequence, and fuse the predictions over a sequence of data points. We implement and experiment with DESOT for traffic sign classification, where sequences of tracked image patches are to be classified. We find that DESOT obtains the benefits of deep ensembles, in terms of predictive and uncertainty estimation performance, while avoiding the added computational cost. Moreover, DESOT is simple to implement and does not require sequences during training. Finally, we find that DESOT, like deep ensembles, outperforms single models for out-of-distribution detection.
@inproceedings{diva2:1810786,
author = {Meding, Isak and Bodin, Alexander and Tonderski, Adam and Johnander, Joakim and Petersson, Christoffer and Svensson, Lennart},
title = {{You can have your ensemble and run it too -- Deep Ensembles Spread Over Time}},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
year = {2023},
pages = {4020--4029},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
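A minimal sketch of the ensemble-spread-over-time idea from the DESOT abstract above: one ensemble member is applied per frame of a tracked sequence and the per-frame class probabilities are fused by averaging, so the per-frame cost stays that of a single model. The round-robin member assignment and the averaging fusion rule are assumptions for illustration.

import torch
import torch.nn.functional as F

def desot_predict(models, sequence):
    """models: list of M independently trained classifiers.
    sequence: (T, C, H, W) image patches of one tracked object.
    Applies member t % M to frame t and averages the class probabilities,
    so the per-frame cost equals that of a single model."""
    probs = []
    for t, frame in enumerate(sequence):
        member = models[t % len(models)]
        with torch.no_grad():
            probs.append(F.softmax(member(frame.unsqueeze(0)), dim=-1))
    return torch.cat(probs, dim=0).mean(dim=0)  # fused class distribution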
We propose to leverage optical flow features for higher generalization power in semi-supervised video object segmentation. Optical flow is usually exploited as additional guidance information in many computer vision tasks. However, its use in video object segmentation has mainly been limited to unsupervised settings or to warping or refining the previously predicted masks. Different from the latter, we propose to directly leverage the optical flow features in the target representation. We show that this enriched representation improves the encoder-decoder approach to the segmentation task. A model to extract the combined information from the optical flow and the image is proposed, which is then used as input to the target model and the decoder network. Unlike previous methods, e.g. in tracking where concatenation is used to integrate information from image data and optical flow, a simple yet effective attention mechanism is exploited in our work. Experiments on DAVIS 2017 and YouTube-VOS 2019 show that integrating the information extracted from optical flow into the original image branch results in a strong performance gain, especially for unseen classes, which demonstrates its higher generalization power.
@inproceedings{diva2:1810690,
author = {Zhang, Yushan and Robinson, Andreas and Magnusson, Maria and Felsberg, Michael},
title = {{Leveraging Optical Flow Features for Higher Generalization Power in Video Object Segmentation}},
booktitle = {2023 IEEE International Conference on Image Processing},
year = {2023},
pages = {326--330},
publisher = {IEEE},
}
Images fed to a deep neural network have in general undergone several handcrafted image signal processing (ISP) operations, all of which have been optimized to produce visually pleasing images. In this work, we investigate the hypothesis that the intermediate representation of visually pleasing images is sub-optimal for downstream computer vision tasks compared to the RAW image representation. We suggest that the operations of the ISP instead should be optimized towards the end task, by learning the parameters of the operations jointly during training. We extend previous works on this topic and propose a new learnable operation that enables an object detector to achieve superior performance when compared to both previous works and traditional RGB images. In experiments on the open PASCALRAW dataset, we empirically confirm our hypothesis.
@inproceedings{diva2:1809798,
author = {Ljungbergh, William and Johnander, Joakim and Petersson, Christoffer and Felsberg, Michael},
title = {{Raw or Cooked? Object Detection on RAW Images}},
booktitle = {Image Analysis},
year = {2023},
series = {Lecture Notes in Computer Science},
volume = {13885},
pages = {374--385},
publisher = {Springer},
}
Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the 2D-3D lifting step, which results in overly confident 3D pose predictors. To this end, we propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image. Compared to similar approaches, our diffusion model is straightforward and avoids intensive hyperparameter tuning, complex network structures, mode collapse, and unstable training. Moreover, we tackle the problem of over-simplification of the intermediate representation in the common two-step approaches, which first estimate a distribution of 2D joint locations via joint-wise heatmaps and subsequently use their argmax for the 3D pose estimation step. Since such a simplification of the heatmaps discards valid information about joint locations that may be correct despite being assigned a low likelihood, we propose to represent the heatmaps as a set of 2D joint candidate samples. To extract information about the original distribution from these samples, we introduce our embedding transformer which conditions the diffusion model. Experimentally, we show that DiffPose improves upon the state of the art for multi-hypothesis pose estimation by 3-5% for simple poses and outperforms it by a large margin for highly ambiguous poses.
@inproceedings{diva2:1806371,
author = {Holmquist, Karl and Wandt, Bastian},
title = {{Diffpose: Multi-hypothesis human pose estimation using diffusion models}},
booktitle = {ICCV 2023, Paris, France, October 4-6, 2023.},
year = {2023},
}
The present work proposes the use of point cloud differential entropy as a method for reverse engineering quality assessment. This quality assessment can be used to measure the deviation of objects made with additive manufacturing or CNC techniques. The quality of the execution is intended as a measure of how much the geometry of the obtained object deviates from the original CAD model. This paper proposes the use of the quality index of the CorAl method to assess the quality of an object compared to its original CAD. This index, based on differential entropy, approaches 0 as the obtained object approaches the original geometry. The advantage of this method is to have a global synthetic index. It is, however, possible to produce entropy maps of the individual points to identify the areas with the greatest deviation. The method is robust for comparing point clouds of different densities. Objects obtained by additive manufacturing with different print qualities were used. The quality index evaluated for each object, as defined in the CorAl method, turns out to be progressively closer to 0 as the quality of the piece's construction increases.
@inproceedings{diva2:1806262,
author = {Barberi, Emmanuele and Cucinotta, Filippo and Forss\'{e}n, Per-Erik and Raffaele, Marcello and Salmeri, Fabio},
title = {{A differential entropy-based method for reverse engineering quality assessment}},
booktitle = {ADM 2023 International Conference, Florence, Italy 6-8 September 2023},
year = {2023},
}
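A sketch of the entropy computation underlying the quality index discussed above, assuming the point clouds are given as NumPy arrays of shape (N, 3): each point gets the differential entropy of a local Gaussian fit, h = 0.5 ln((2 pi e)^3 det Sigma), and a CorAl-style index compares the entropies of the joint cloud with those of the separate clouds. The neighbourhood radius, the minimum neighbour count and the final aggregation are illustrative choices, not the exact procedure of the paper.

import numpy as np
from scipy.spatial import cKDTree

def point_entropies(cloud, radius=0.01):
    """Per-point differential entropy of a local Gaussian fit:
    h = 0.5 * ln((2*pi*e)^3 * det(Sigma)), Sigma estimated from neighbours within radius."""
    tree = cKDTree(cloud)
    ent = np.full(len(cloud), np.nan)
    for i, p in enumerate(cloud):
        idx = tree.query_ball_point(p, radius)
        if len(idx) >= 5:                       # need enough points for a stable covariance
            cov = np.cov(cloud[idx].T) + 1e-12 * np.eye(3)
            ent[i] = 0.5 * np.log(((2 * np.pi * np.e) ** 3) * np.linalg.det(cov))
    return ent

def quality_index(cad_points, scanned_points, radius=0.01):
    """CorAl-style index: joint-cloud entropy minus the mean of the separate entropies,
    approaching 0 when the scanned object matches the CAD geometry."""
    joint = point_entropies(np.vstack([cad_points, scanned_points]), radius)
    sep = np.concatenate([point_entropies(cad_points, radius),
                          point_entropies(scanned_points, radius)])
    return np.nanmean(joint) - np.nanmean(sep)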
A challenge in image based metrology and forensics is intrinsic camera calibration when the used camera is unavailable. The unavailability raises two questions. The first question is how to find the projection model that describes the camera, and the second is how to detect incorrect models. In this work, we use off-the-shelf extended PnP-methods to find the model from 2D-3D correspondences, and propose a method for model validation. The most common strategy for evaluating a projection model is comparing different models' residual variances; however, this naive strategy cannot distinguish whether the projection model is potentially underfitted or overfitted. To this end, we model the residual errors for each correspondence, individually scale all residuals using a predicted variance and test if the new residuals are drawn from a standard normal distribution. We demonstrate the effectiveness of our proposed validation in experiments on synthetic data, simulating 2D detection and Lidar measurements. Additionally, we provide experiments using data from an actual scene and compare non-camera access and camera access calibrations. Last, we use our method to validate annotations in MegaDepth.
@inproceedings{diva2:1806260,
author = {Brissman, Emil and Forss\'{e}n, Per-Erik and Edstedt, Johan},
title = {{Camera Calibration Without Camera Access - A Robust Validation Technique for Extended PnP Methods}},
booktitle = {22nd Scandinavian Conference, SCIA 2023 Sirkka, Finland, April 18--21, 2023},
year = {2023},
series = {Lecture Notes in Computer Science},
volume = {13885},
pages = {34--49},
}
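The validation idea above, standardizing per-correspondence residuals by a predicted variance and testing for standard normality, can be sketched as follows; the Kolmogorov-Smirnov test used here is a stand-in, as the paper's exact test procedure may differ.

import numpy as np
from scipy import stats

def validate_projection_model(residuals, predicted_std, alpha=0.05):
    """residuals: (N, 2) reprojection errors; predicted_std: (N,) or (N, 2)
    per-correspondence standard deviations predicted by the noise model.
    Scales each residual by its predicted std and tests whether the
    standardized residuals follow a standard normal distribution."""
    z = (residuals / np.asarray(predicted_std).reshape(len(residuals), -1)).ravel()
    statistic, p_value = stats.kstest(z, 'norm')   # compare against N(0, 1)
    return p_value > alpha, p_value                # True if the model is not rejected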
Feature matching is a challenging computer vision task that involves finding correspondences between two images of a 3D scene. In this paper we consider the dense approach instead of the more common sparse paradigm, thus striving to find all correspondences. Perhaps counter-intuitively, dense methods have previously shown inferior performance to their sparse and semi-sparse counterparts for estimation of two-view geometry. This changes with our novel dense method, which outperforms both dense and sparse methods on geometry estimation. The novelty is threefold: First, we propose a kernel regression global matcher. Secondly, we propose warp refinement through stacked feature maps and depthwise convolution kernels. Thirdly, we propose learning dense confidence through consistent depth and a balanced sampling approach for dense confidence maps. Through extensive experiments we confirm that our proposed dense method, Dense Kernelized Feature Matching, sets a new state-of-the-art on multiple geometry estimation benchmarks. In particular, we achieve an improvement on MegaDepth-1500 of +4.9 and +8.9 AUC@5° compared to the best previous sparse method and dense method respectively. Our code is provided at the following repository: https://github.com/Parskatt/DKM.
@inproceedings{diva2:1795945,
author = {Edstedt, Johan and Athanasiadis, Ioannis and Wadenbäck, Mårten and Felsberg, Michael},
title = {{DKM: Dense Kernelized Feature Matching for Geometry Estimation}},
booktitle = {2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2023},
series = {Proceedings: IEEE Conference on Computer Vision and Pattern Recognition},
pages = {17765--17775},
publisher = {IEEE Computer Society},
}
Industrial defect detection is commonly addressed with anomaly detection (AD) methods where no or only incomplete data of potentially occurring defects is available. This work discovers previously unknown problems of student-teacher approaches for AD and proposes a solution, where two neural networks are trained to produce the same output for the defect-free training examples. The core assumption of student-teacher networks is that the distance between the outputs of both networks is larger for anomalies since they are absent in training. However, previous methods suffer from the similarity of student and teacher architecture, such that the distance is undesirably small for anomalies. For this reason, we propose asymmetric student-teacher networks (AST). We train a normalizing flow for density estimation as a teacher and a conventional feed-forward network as a student to trigger large distances for anomalies: The bijectivity of the normalizing flow enforces a divergence of teacher outputs for anomalies compared to normal data. Outside the training distribution the student cannot imitate this divergence due to its fundamentally different architecture. Our AST network compensates for wrongly estimated likelihoods by a normalizing flow, which was alternatively used for anomaly detection in previous work. We show that our method produces state-of-the-art results on the two currently most relevant defect detection datasets, MVTec AD and MVTec 3D-AD, regarding image-level anomaly detection on RGB and 3D data.
@inproceedings{diva2:1792621,
author = {Rudolph, Marco and Wehrbein, Tom and Rosenhahn, Bodo and Wandt, Bastian},
title = {{Asymmetric Student-Teacher Networks for Industrial Anomaly Detection}},
booktitle = {2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)},
year = {2023},
series = {IEEE Winter Conference on Applications of Computer Vision},
pages = {2591--2601},
publisher = {IEEE COMPUTER SOC},
}
Super-resolving medical images can help physicians in providing more accurate diagnostics. In many situations, computed tomography (CT) or magnetic resonance imaging (MRI) techniques capture several scans (modes) during a single investigation, which can jointly be used (in a multimodal fashion) to further boost the quality of super-resolution results. To this end, we propose a novel multimodal multi-head convolutional attention module to super-resolve CT and MRI scans. Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple concatenated input tensors, where the kernel (receptive field) size controls the reduction rate of the spatial attention, and the number of convolutional filters controls the reduction rate of the channel attention, respectively. We introduce multiple attention heads, each head having a distinct receptive field size corresponding to a particular reduction rate for the spatial attention. We integrate our multimodal multi-head convolutional attention (MMHCA) into two deep neural architectures for super-resolution and conduct experiments on three data sets. Our empirical results show the superiority of our attention module over the state-of-the-art attention mechanisms used in super-resolution. Moreover, we conduct an ablation study to assess the impact of the components involved in our attention module, e.g. the number of inputs or the number of heads. Our code is freely available at https://github.com/lilygeorgescu/MHCA.
@inproceedings{diva2:1792453,
author = {Georgescu, Mariana-Iuliana and Ionescu, Radu Tudor and Miron, Andreea-Iuliana and Savencu, Olivian and Ristea, Nicolae-Catalin and Verga, Nicolae and Khan, Fahad},
title = {{Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution}},
booktitle = {2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)},
year = {2023},
series = {IEEE Winter Conference on Applications of Computer Vision},
pages = {2194--2204},
publisher = {IEEE COMPUTER SOC},
}
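A hedged sketch of a multi-head convolutional attention module in the spirit of the MMHCA abstract above: the modalities are concatenated channel-wise and each head applies a strided convolution (whose kernel size sets the spatial reduction), expands back, and gates the input with a sigmoid map. The layer shapes, GELU activation and bilinear resize are assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAttentionHead(nn.Module):
    """One head of convolutional joint spatial-channel attention (illustrative):
    a strided conv contracts space and channels, a transposed conv expands back,
    and a sigmoid map gates the input."""
    def __init__(self, channels, kernel_size, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        pad = kernel_size // 2
        self.squeeze = nn.Conv2d(channels, hidden, kernel_size, stride=2, padding=pad)
        self.expand = nn.ConvTranspose2d(hidden, channels, kernel_size, stride=2,
                                         padding=pad, output_padding=1)
        self.act = nn.GELU()

    def forward(self, x):
        attn = self.expand(self.act(self.squeeze(x)))
        attn = torch.sigmoid(F.interpolate(attn, size=x.shape[-2:], mode='bilinear',
                                           align_corners=False))
        return x * attn

class MultiHeadConvAttention(nn.Module):
    """Concatenates the modality tensors channel-wise and runs heads with
    different kernel sizes, summing the attended outputs.
    `channels` is the total channel count after concatenation."""
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.heads = nn.ModuleList(ConvAttentionHead(channels, k) for k in kernel_sizes)

    def forward(self, inputs):                      # inputs: list of (B, C_i, H, W)
        x = torch.cat(inputs, dim=1)
        return sum(head(x) for head in self.heads)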
Class-Incremental Learning is a challenging problem in machine learning that aims to extend previously trained neural networks with new classes. This is especially useful if the system is able to classify new objects despite the original training data being unavailable. Although the semantic segmentation problem has received less attention than classification, it poses distinct problems and challenges, since previous and future target classes can be unlabeled in the images of a single increment. In this case, the background, past and future classes are correlated and a background shift arises.
In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes. We propose to use Evidential Deep Learning to model the evidence of the classes as a Dirichlet distribution. Our method factorizes the problem into a separate foreground class probability, calculated by the expected value of the Dirichlet distribution, and an unknown class (background) probability corresponding to the uncertainty of the estimate. In our novel formulation, the background probability is implicitly modeled, avoiding the feature space clustering that comes from forcing the model to output a high background score for pixels that are not labeled as objects. Experiments on the incremental Pascal VOC and ADE20k benchmarks show that our method is superior to the state of the art, especially when repeatedly learning new classes with an increasing number of increments.
@inproceedings{diva2:1753366,
author = {Holmquist, Karl and Klas\'{e}n, Lena and Felsberg, Michael},
title = {{Evidential Deep Learning for Class-Incremental Semantic Segmentation}},
booktitle = {Image Analysis. SCIA 2023.},
year = {2023},
series = {Lecture Notes in Computer Science},
volume = {13886},
pages = {32--48},
publisher = {Springer},
}
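The factorization described above has a compact closed form; a sketch, assuming softplus is used as the evidence function: with evidence e = softplus(logits) and Dirichlet parameters alpha = e + 1, the foreground probabilities are the Dirichlet mean alpha_k / S and the unknown (background) probability is the vacuity K / S, where S is the sum of the alpha_k.

import torch
import torch.nn.functional as F

def evidential_probs(logits):
    """logits: (B, K, H, W) per-pixel evidence logits for K foreground classes.
    Evidence e = softplus(logits) parameterizes a Dirichlet with alpha = e + 1.
    Foreground probabilities are the Dirichlet mean and the background/unknown
    probability is the vacuity-style uncertainty K / S."""
    evidence = F.softplus(logits)
    alpha = evidence + 1.0
    strength = alpha.sum(dim=1, keepdim=True)            # S = sum_k alpha_k
    foreground = alpha / strength                        # E[p_k] = alpha_k / S
    unknown = logits.shape[1] / strength                 # u = K / S
    return foreground, unknown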
Following the successful application of vision transformers in multiple computer vision tasks, these models have drawn the attention of the signal processing community. This is because signals are often represented as spectrograms (e.g. through Discrete Fourier Transform) which can be directly provided as input to vision transformers. However, naively applying transformers to spectrograms is suboptimal. Since the axes represent distinct dimensions, i.e. frequency and time, we argue that a better approach is to separate the attention dedicated to each axis. To this end, we propose the Separable Transformer (SepTr), an architecture that employs two transformer blocks in a sequential manner, the first attending to tokens within the same time interval, and the second attending to tokens within the same frequency bin. We conduct experiments on three benchmark data sets, showing that our separable architecture outperforms conventional vision transformers and other state-of-the-art methods. Unlike standard transformers, SepTr linearly scales the number of trainable parameters with the input size, thus having a lower memory footprint. Our code is available as open source at https://github.com/ristea/septr.
@inproceedings{diva2:1744003,
author = {Ristea, Nicolae-Catalin and Ionescu, Radu Tudor and Khan, Fahad},
title = {{SepTr: Separable Transformer for Audio Spectrogram Processing}},
booktitle = {INTERSPEECH 2022},
year = {2022},
series = {Interspeech},
pages = {4103--4107},
publisher = {ISCA-INT SPEECH COMMUNICATION ASSOC},
}
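A minimal sketch of the separable attention pattern described above, applied to spectrogram tokens of shape (batch, frequency, time, dim): the first block attends across tokens sharing a time step, the second across tokens sharing a frequency bin. The use of stock nn.TransformerEncoderLayer blocks is an assumption for brevity.

import torch
import torch.nn as nn

class SeparableSpectrogramEncoder(nn.Module):
    """Illustrative separable attention over a spectrogram (B, F, T, D):
    the first block attends across frequency within each time step,
    the second across time within each frequency bin."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.freq_block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.time_block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)

    def forward(self, x):
        b, f, t, d = x.shape
        x = self.freq_block(x.permute(0, 2, 1, 3).reshape(b * t, f, d))   # tokens share a time step
        x = x.reshape(b, t, f, d).permute(0, 2, 1, 3).reshape(b * f, t, d)
        x = self.time_block(x)                                            # tokens share a frequency bin
        return x.reshape(b, f, t, d)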
Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of coarse sketch composition followed by the incorporation of fine-details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in Frechet inception distance (FID) over state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for related applications of text to creative sketch generation, sketch completion and house layout generation. Code is available at: https://github.com/ ankanbhunia/doodleformer.
@inproceedings{diva2:1740898,
author = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad and Laaksonen, Jorma and Felsberg, Michael},
title = {{DoodleFormer: Creative Sketch Drawing with Transformers}},
booktitle = {COMPUTER VISION - ECCV 2022, PT XVII},
year = {2022},
series = {Lecture Notes in Computer Science},
pages = {338--355},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets available. In this work VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals, is presented. An analysis of the dataset is performed, revealing, among other things, the relation between clip and trailer level annotations. Audiovisual models are trained on the dataset and an in-depth study of modeling choices is conducted. The results show that performance is greatly improved by combining the visual and audio modality, pre-training on large-scale video recognition datasets, and class balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details are available at the webpage https://vidharm.github.io/
@inproceedings{diva2:1738691,
author = {Edstedt, Johan and Berg, Amanda and Felsberg, Michael and Karlsson, Johan and Benavente, Francisca and Novak, Anette and Pihlgren, Gustav Grund},
title = {{VidHarm: A Clip Based Dataset for Harmful Content Detection}},
booktitle = {2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2022},
series = {International Conference on Pattern Recognition},
pages = {1543--1549},
publisher = {IEEE},
}
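Class-balanced sampling, reported above as one of the important training choices, can be sketched with a weighted sampler; the labels argument and the inverse-frequency weighting are assumptions about how the clips and their class indices are exposed.

import torch
from torch.utils.data import WeightedRandomSampler, DataLoader

def class_balanced_loader(dataset, labels, batch_size=32):
    """Builds a loader that samples each class at roughly equal frequency.
    `labels` is the per-clip class index list for `dataset` (an assumption here)."""
    labels = torch.as_tensor(labels)
    class_counts = torch.bincount(labels).float()
    weights = (1.0 / class_counts)[labels]            # inverse-frequency weight per sample
    sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)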
State-of-the-art transformer-based video instance segmentation (VIS) approaches typically utilize either single-scale spatio-temporal features or per-frame multi-scale features during the attention computations. We argue that such an attention computation ignores the multiscale spatio-temporal feature relationships that are crucial to tackle target appearance deformations in videos. To address this issue, we propose a transformer-based VIS framework, named MS-STS VIS, that comprises a novel multi-scale spatio-temporal split (MS-STS) attention module in the encoder. The proposed MS-STS module effectively captures spatio-temporal feature relationships at multiple scales across frames in a video. We further introduce an attention block in the decoder to enhance the temporal consistency of the detected instances in different frames of a video. Moreover, an auxiliary discriminator is introduced during training to ensure better foreground-background separability within the multiscale spatio-temporal feature space. We conduct extensive experiments on two benchmarks: Youtube-VIS (2019 and 2021). Our MS-STS VIS achieves state-of-the-art performance on both benchmarks. When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in literature by 2.7% and by 4.8% at higher overlap threshold of AP75, while being comparable in model size and speed on Youtube-VIS 2019 val. set. When using the Swin Transformer backbone, MS-STS VIS achieves mask AP of 61.0% on Youtube-VIS 2019 val. set.
@inproceedings{diva2:1733507,
author = {Thawakar, Omkar and Narayan, Sanath and Cao, Jiale and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Muhammad Haris and Khan, Salman and Felsberg, Michael and Khan, Fahad},
title = {{Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer}},
booktitle = {COMPUTER VISION, ECCV 2022, PT XXIX},
year = {2022},
series = {Lecture Notes in Computer Science},
pages = {666--681},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
Few-shot segmentation is a challenging dense prediction task, which entails segmenting a novel query image given only a small annotated support set. The key problem is thus to design a method that aggregates detailed information from the support set, while being robust to large variations in appearance and context. To this end, we propose a few-shot segmentation method based on dense Gaussian process (GP) regression. Given the support set, our dense GP learns the mapping from local deep image features to mask values, capable of capturing complex appearance distributions. Furthermore, it provides a principled means of capturing uncertainty, which serves as another powerful cue for the final segmentation, obtained by a CNN decoder. Instead of a one-dimensional mask output, we further exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP. Our approach sets a new state-of-the-art on the PASCAL-5(i) and COCO-20(i) benchmarks, achieving an absolute gain of +8.4 mIoU in the COCO-20(i) 5-shot setting. Furthermore, the segmentation quality of our approach scales gracefully when increasing the support set size, while achieving robust cross-dataset transfer.
@inproceedings{diva2:1733506,
author = {Johnander, Joakim and Edstedt, Johan and Felsberg, Michael and Khan, Fahad and Danelljan, Martin},
title = {{Dense Gaussian Processes for Few-Shot Segmentation}},
booktitle = {COMPUTER VISION, ECCV 2022, PT XXIX},
year = {2022},
series = {Lecture Notes in Computer Science},
pages = {217--234},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
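A closed-form sketch of the dense GP regression step described above, mapping support features to mask encodings under an RBF kernel and returning a posterior mean and variance; the learned kernel, the high-dimensional output encoding and all hyperparameters of the actual method are abstracted away here.

import torch

def gp_posterior(support_feats, support_vals, query_feats, lengthscale=1.0, noise=0.1):
    """Dense GP regression from support features to mask encodings (sketch).
    support_feats: (N, D), support_vals: (N, E), query_feats: (M, D).
    Returns the posterior mean (M, E) and variance (M,) under an RBF kernel."""
    def rbf(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-0.5 * d2 / lengthscale ** 2)

    k_ss = rbf(support_feats, support_feats) + noise * torch.eye(len(support_feats))
    k_qs = rbf(query_feats, support_feats)
    mean = k_qs @ torch.linalg.solve(k_ss, support_vals)
    var = 1.0 - (k_qs * torch.linalg.solve(k_ss, k_qs.t()).t()).sum(dim=1)
    return mean, var.clamp_min(0.0)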
What constitutes an object? This has been a long-standing question in computer vision. Towards this goal, numerous learning-free and learning-based approaches have been developed to score objectness. However, they generally do not scale well across new domains and novel objects. In this paper, we advocate that existing methods lack a top-down supervision signal governed by human-understandable semantics. For the first time in literature, we demonstrate that Multi-modal Vision Transformers (MViT) trained with aligned image-text pairs can effectively bridge this gap. Our extensive experiments across various domains and novel objects show the state-of-the-art performance of MViTs to localize generic objects in images. Based on the observation that existing MViTs do not include multi-scale feature processing and usually require longer training schedules, we develop an efficient MViT architecture using multi-scale deformable attention and late vision-language fusion. We show the significance of MViT proposals in a diverse range of applications including open-world object detection, salient and camouflage object detection, supervised and self-supervised detection tasks. Further, MViTs can adaptively generate proposals given a specific language query and thus offer enhanced interactability.
@inproceedings{diva2:1731528,
author = {Maaz, Muhammad and Rasheed, Hanoona and Khan, Salman and Khan, Fahad and Anwer, Rao Muhammad and Yang, Ming-Hsuan},
title = {{Class-Agnostic Object Detection with Multi-modal Transformer}},
booktitle = {COMPUTER VISION, ECCV 2022, PT X},
year = {2022},
series = {Lecture Notes in Computer Science},
pages = {512--531},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
We propose a novel few-shot action recognition framework, STRM, which enhances class-specific feature discriminability while simultaneously learning higher-order temporal representations. The focus of our approach is a novel spatio-temporal enrichment module that aggregates spatial and temporal contexts with dedicated local patch-level and global frame-level feature enrichment sub-modules. Local patch-level enrichment captures the appearance-based characteristics of actions. On the other hand, global frame-level enrichment explicitly encodes the broad temporal context, thereby capturing the relevant object features over time. The resulting spatio-temporally enriched representations are then utilized to learn the relational matching between query and support action sub-sequences. We further introduce a query-class similarity classifier on the patch-level enriched features to enhance class-specific feature discriminability by reinforcing the feature learning at different stages in the proposed framework. Experiments are performed on four few-shot action recognition benchmarks: Kinetics, SSv2, HMDB51 and UCF101. Our extensive ablation study reveals the benefits of the proposed contributions. Furthermore, our approach sets a new state-of-the-art on all four benchmarks. On the challenging SSv2 benchmark, our approach achieves an absolute gain of 3.5% in classification accuracy, as compared to the best existing method in the literature.
@inproceedings{diva2:1725183,
author = {Thatipelli, Anirudh and Narayan, Sanath and Khan, Salman and Anwer, Rao Muhammad and Khan, Fahad and Ghanem, Bernard},
title = {{Spatio-temporal Relation Modeling for Few-shot Action Recognition}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {19926--19935},
publisher = {IEEE COMPUTER SOC},
}
Road networks are the core infrastructure for connected and autonomous vehicles, but creating meaningful representations for machine learning applications is a challenging task. In this work, we propose to integrate remote sensing vision data into road network data for improved embeddings with graph neural networks. We present a segmentation of road edges based on spatio-temporal road and traffic characteristics, which allows enriching the attribute set of road networks with visual features of satellite imagery and digital surface models. We show that both the segmentation and the integration of vision data can increase performance on a road type classification task, and we achieve state-of-the-art performance on the OSM+DiDi Chuxing dataset on Chengdu, China.
@inproceedings{diva2:1725177,
author = {Stromann, Oliver and Razavi, Alireza and Felsberg, Michael},
title = {{Learning to Integrate Vision Data into Road Network Data}},
booktitle = {2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)},
year = {2022},
series = {International Conference on Acoustics Speech and Signal Processing ICASSP},
pages = {4548--4552},
publisher = {IEEE},
}
Classification networks can be used to localize and segment objects in images by means of class activation maps (CAMs). However, without pixel-level annotations, classification networks are known to (1) mainly focus on discriminative regions, and (2) to produce diffuse CAMs without well-defined prediction contours. In this work, we approach both problems with two contributions for improving CAM learning. First, we incorporate importance sampling based on the class-wise probability mass function induced by the CAMs to produce stochastic image-level class predictions. This results in CAMs which activate over a larger extent of objects. Second, we formulate a feature similarity loss term which aims to match the prediction contours with edges in the image. As a third contribution, we conduct experiments on the PASCAL VOC 2012 benchmark dataset to demonstrate that these modifications significantly increase the performance in terms of contour accuracy, while being comparable to current state-of-the-art methods in terms of region similarity.
@inproceedings{diva2:1725167,
author = {Jonnarth, Arvi and Felsberg, Michael},
title = {{Importance Sampling CAMs for Weakly-Supervised Segmentation}},
booktitle = {2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)},
year = {2022},
series = {International Conference on Acoustics Speech and Signal Processing ICASSP},
pages = {2639--2643},
publisher = {IEEE},
}
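A sketch of the importance-sampling idea above: each class activation map is normalized into a spatial probability mass function, pixel locations are drawn from it, and the activations at the sampled locations form a stochastic image-level score. The number of samples and the mean aggregation are illustrative assumptions.

import torch

def sampled_class_scores(cams, num_samples=16):
    """cams: (B, K, H, W) class activation maps (assumed mostly non-negative).
    For each class, spatial locations are sampled from the pmf induced by the
    CAM, and the activations at the sampled locations are averaged into a
    stochastic image-level score."""
    b, k, h, w = cams.shape
    flat = cams.flatten(2)                                     # (B, K, H*W)
    pmf = flat.clamp_min(0) + 1e-8                             # non-negative sampling weights
    pmf = pmf / pmf.sum(dim=-1, keepdim=True)
    idx = torch.multinomial(pmf.reshape(b * k, -1), num_samples, replacement=True)
    sampled = flat.reshape(b * k, -1).gather(1, idx)           # activations at sampled pixels
    return sampled.mean(dim=1).reshape(b, k)                   # stochastic image-level scores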
Detecting abnormal events in video is commonly framed as a one-class classification task, where training videos contain only normal events, while test videos encompass both normal and abnormal events. In this scenario, anomaly detection is an open-set problem. However, some studies assimilate anomaly detection to action recognition. This is a closed-set scenario that fails to test the capability of systems at detecting new anomaly types. To this end, we propose UBnormal, a new supervised open-set benchmark composed of multiple virtual scenes for video anomaly detection. Unlike existing data sets, we introduce abnormal events annotated at the pixel level at training time, for the first time enabling the use of fully-supervised learning methods for abnormal event detection. To preserve the typical open-set formulation, we make sure to include dis-joint sets of anomaly types in our training and test collections of videos. To our knowledge, UBnormal is the first video anomaly detection benchmark to allow a fair head-to-head comparison between one-class open-set models and supervised closed-set models, as shown in our experiments. Moreover, we provide empirical evidence showing that UB-normal can enhance the performance of a state-of-the-art anomaly detection framework on two prominent data sets, Avenue and ShanghaiTech. Our benchmark is freely available at https://github.com/lilygeorgescu/UBnormal.
@inproceedings{diva2:1725086,
author = {Acsintoae, Andra and Florescu, Andrei and Georgescu, Mariana-Iuliana and Mare, Tudor and Sumedrea, Paul and Ionescu, Radu Tudor and Khan, Fahad and Shah, Mubarak},
title = {{UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {20111--20121},
publisher = {IEEE COMPUTER SOC},
}
Since convolutional neural networks (CNNs) perform well at learning generalizable image priors from largescale data, these models have been extensively applied to image restoration and related tasks. Recently, another class of neural architectures, Transformers, have shown significant performance gains on natural language and high-level vision tasks. While the Transformer model mitigates the shortcomings of CNNs (i.e., limited receptive field and inadaptability to input content), its computational complexity grows quadratically with the spatial resolution, therefore making it infeasible to apply to most image restoration tasks involving high-resolution images. In this work, we propose an efficient Transformer model by making several key designs in the building blocks (multi-head attention and feed-forward network) such that it can capture long-range pixel interactions, while still remaining applicable to large images. Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks, including image deraining, single-image motion deblurring, defocus deblurring (single-image and dual-pixel data), and image denoising (Gaussian grayscale/color denoising, and real image denoising). The source code and pre-trained models are available at https://github.com/swz30/Restormer.
@inproceedings{diva2:1720852,
author = {Zamir, Syed Waqas and Arora, Aditya and Khan, Salman and Hayat, Munawar and Khan, Fahad and Yang, Ming-Hsuan},
title = {{Restormer: Efficient Transformer for High-Resolution Image Restoration}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5718--5729},
publisher = {IEEE COMPUTER SOC},
}
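The key computational trick mentioned above, attention applied across channels rather than across pixels so that the cost grows linearly with spatial resolution, can be sketched as a transposed attention layer; this is a simplified single-layer sketch, not the full Restormer block.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TransposedAttention(nn.Module):
    """Channel-wise ('transposed') self-attention: the attention map is C x C
    instead of (HW) x (HW). Requires channels to be divisible by heads."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.temperature = nn.Parameter(torch.ones(heads, 1, 1))

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)
        reshape = lambda t: t.reshape(b, self.heads, c // self.heads, h * w)
        q, k, v = map(reshape, (q, k, v))
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature    # (B, heads, C/h, C/h)
        out = attn.softmax(dim=-1) @ v                         # mixes channels, not pixels
        return self.proj(out.reshape(b, c, h, w))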
Anomaly detection is commonly pursued as a one-class classification problem, where models can only learn from normal training samples, while being evaluated on both normal and abnormal test samples. Among the successful approaches for anomaly detection, a distinguished category of methods relies on predicting masked information (e.g. patches, future frames, etc.) and leveraging the reconstruction error with respect to the masked information as an abnormality score. Different from related methods, we propose to integrate the reconstruction-based functionality into a novel self-supervised predictive architectural building block. The proposed self-supervised block is generic and can easily be incorporated into various state-of-the-art anomaly detection methods. Our block starts with a convolutional layer with dilated filters, where the center area of the receptive field is masked. The resulting activation maps are passed through a channel attention module. Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field. We demonstrate the generality of our block by integrating it into several state-of-the-art frameworks for anomaly detection on image and video, providing empirical evidence that shows considerable performance improvements on MVTec AD, Avenue, and ShanghaiTech. We release our code as open source at: https://github.com/ristea/sspcab.
@inproceedings{diva2:1720849,
author = {Ristea, Nicolae-Catalin and Madan, Neelu and Ionescu, Radu Tudor and Nasrollahi, Kamal and Khan, Fahad and Moeslund, Thomas B. and Shah, Mubarak},
title = {{Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {13566--13576},
publisher = {IEEE COMPUTER SOC},
}
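A simplified stand-in for the self-supervised block described above: a dilated convolution whose centre weight is zeroed predicts each activation from its surrounding context, a squeeze-and-excitation module supplies channel attention, and a reconstruction loss penalizes the error with respect to the input. The real block masks the centre area of the receptive field with a specific sub-kernel layout; the masking scheme, attention design and loss placement here are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedCenterBlock(nn.Module):
    """Simplified self-supervised predictive block (a sketch, not the authors'
    exact design): a dilated conv with its centre weight zeroed predicts each
    location from context, followed by squeeze-and-excitation channel attention;
    the reconstruction loss measures the error w.r.t. the input."""
    def __init__(self, channels, kernel_size=3, dilation=2, reduction=8):
        super().__init__()
        pad = dilation * (kernel_size // 2)
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding=pad, dilation=dilation)
        mask = torch.ones(1, 1, kernel_size, kernel_size)
        mask[..., kernel_size // 2, kernel_size // 2] = 0.0       # hide the centre of the field
        self.register_buffer('mask', mask)
        self.se = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
                                nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        pred = F.conv2d(x, self.conv.weight * self.mask, self.conv.bias,
                        padding=self.conv.padding, dilation=self.conv.dilation)
        out = pred * self.se(pred)
        recon_loss = F.mse_loss(out, x)          # self-supervised reconstruction objective
        return out, recon_loss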
In this paper, we propose self-supervised training for video transformers using unlabeled video data. From a given video, we create local and global spatiotemporal views with varying spatial sizes and frame rates. Our self-supervised objective seeks to match the features of these different views representing the same video, to be invariant to spatiotemporal variations in actions. To the best of our knowledge, the proposed approach is the first to alleviate the dependency on negative samples or dedicated memory banks in Self-supervised Video Transformer (SVT). Further, owing to the flexibility of Transformer models, SVT supports slow-fast video processing within a single architecture using dynamically adjusted positional encoding and supports long-term relationship modeling along spatiotemporal dimensions. Our approach performs well on four action recognition benchmarks (Kinetics-400, UCF-101, HMDB-51, and SSv2) and converges faster with small batch sizes. Code is available at: https://git.io/J1juJ
@inproceedings{diva2:1720846,
author = {Ranasinghe, Kanchana and Naseer, Muzammal and Khan, Salman and Khan, Fahad and Ryoo, Michael S.},
title = {{Self-supervised Video Transformer}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {2864--2874},
publisher = {IEEE COMPUTER SOC},
}
Deep learning models tend to forget their earlier knowledge while incrementally learning new tasks. This behavior emerges because the parameter updates optimized for the new tasks may not align well with the updates suitable for older tasks. The resulting latent representation mismatch causes forgetting. In this work, we propose ELI: Energy-based Latent Aligner for Incremental Learning, which first learns an energy manifold for the latent representations such that previous task latents will have low energy and the current task latents have high energy values. This learned manifold is used to counter the representational shift that happens during incremental learning. The implicit regularization that is offered by our proposed methodology can be used as a plug-and-play module in existing incremental learning methodologies. We validate this through extensive evaluation on CIFAR-100, ImageNet subset, ImageNet 1k and Pascal VOC datasets. We observe consistent improvement when ELI is added to three prominent methodologies in class-incremental learning, across multiple incremental settings. Further, when added to the state-of-the-art incremental object detector, ELI provides over 5% improvement in detection accuracy, corroborating its effectiveness and complementary advantage to the existing art. Code is available at: https://github.com/JosephKJ/ELI.
@inproceedings{diva2:1720844,
author = {Joseph, K. J. and Khan, Salman and Khan, Fahad and Anwer, Rao Muhammad and Balasubramanian, Vineeth N.},
title = {{Energy-based Latent Aligner for Incremental Learning}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {7442--7451},
publisher = {IEEE COMPUTER SOC},
}
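The latent alignment described above can be sketched as a few gradient steps that push current-task latents towards the low-energy region of a learned energy model; the step count, step size and energy network interface are illustrative assumptions.

import torch

def align_latents(latents, energy_model, steps=5, step_size=0.1):
    """Sketch of an energy-based latent alignment step: latents from the current
    model are moved towards the low-energy region learned for previous tasks by
    a few gradient steps on the energy. The energy network is assumed given."""
    z = latents.detach().clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_model(z).sum()
        grad, = torch.autograd.grad(energy, z)
        z = (z - step_size * grad).detach().requires_grad_(True)
    return z.detach()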
Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification and objectness scoring, to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from 1.8% to 3.3% in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC. Our code is available at https://github.com/akshitac8/OW-DETR.
@inproceedings{diva2:1720843,
author = {Gupta, Akshita and Narayan, Sanath and Joseph, K. J. and Khan, Salman and Khan, Fahad and Shah, Mubarak},
title = {{OW-DETR: Open-world Detection Transformer}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {9225--9234},
publisher = {IEEE COMPUTER SOC},
}
Modern handheld devices can acquire a burst image sequence in quick succession. However, the individual acquired frames suffer from multiple degradations and are misaligned due to camera shake and object motions. The goal of Burst Image Restoration is to effectively combine complementary cues across multiple burst frames to generate high-quality outputs. Towards this goal, we develop a novel approach by solely focusing on the effective information exchange between burst frames, such that the degradations get filtered out while the actual scene details are preserved and enhanced. Our central idea is to create a set of pseudo-burst features that combine complementary information from all the input burst frames to seamlessly exchange information. However, the pseudo-burst cannot be successfully created unless the individual burst frames are properly aligned to discount inter-frame movements. Therefore, our approach initially extracts pre-processed features from each burst frame and matches them using an edge-boosting burst alignment module. The pseudo-burst features are then created and enriched using multi-scale contextual information. Our final step is to adaptively aggregate information from the pseudo-burst features to progressively increase resolution in multiple stages while merging the pseudo-burst features. In comparison to existing works that usually follow a late fusion scheme with single-stage upsampling, our approach performs favorably, delivering state-of-the-art performance on burst super-resolution, burst low-light image enhancement and burst denoising tasks. The source code and pre-trained models are available at https://github.com/akshaydudhane16/BIPNet.
@inproceedings{diva2:1720837,
author = {Dudhane, Akshay and Zamir, Syed Waqas and Khan, Salman and Khan, Fahad and Yang, Ming-Hsuan},
title = {{Burst Image Restoration and Enhancement}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5749--5758},
publisher = {IEEE COMPUTER SOC},
}
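A sketch of the pseudo-burst idea from the abstract above: after alignment, the c-th channel of every burst frame is grouped into one pseudo-burst feature so that each group mixes information from all frames, and a small convolution fuses each group. The grouping-by-channel is the essential point; the fusion layer is an illustrative assumption.

import torch
import torch.nn as nn

class PseudoBurstFusion(nn.Module):
    """Groups the c-th channel of every aligned burst frame into one pseudo-burst
    feature and fuses each group with a lightweight convolution (layer sizes are
    illustrative assumptions)."""
    def __init__(self, num_frames, channels):
        super().__init__()
        self.fuse = nn.Conv2d(num_frames, 1, kernel_size=3, padding=1)
        self.channels = channels

    def forward(self, aligned):                    # aligned: (B, T, C, H, W)
        b, t, c, h, w = aligned.shape
        pseudo = aligned.permute(0, 2, 1, 3, 4)    # (B, C, T, H, W): group by channel index
        pseudo = pseudo.reshape(b * c, t, h, w)
        fused = self.fuse(pseudo).reshape(b, c, h, w)
        return fused                               # burst-wide feature map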
We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture. PSTR comprises a person search-specialized (PSS) module that contains a detection encoder-decoder for person detection along with a discriminative re-id decoder for person re-id. The discriminative re-id decoder utilizes a multi-level supervision scheme with a shared decoder for discriminative re-id feature learning and also comprises a part attention block to encode relationship between different parts of a person. We further introduce a simple multi-scale scheme to support re-id across person instances at different scales. PSTR jointly achieves the diverse objectives of object-level recognition (detection) and instance-level matching (re-id). To the best of our knowledge, we are the first to propose an end-to-end one-step transformer-based person search framework. Experiments are performed on two popular benchmarks: CUHK-SYSU and PRW. Our extensive ablations reveal the merits of the proposed contributions. Further, the proposed PSTR sets a new state-of-the-art on both benchmarks. On the challenging PRW benchmark, PSTR achieves a mean average precision (mAP) score of 56.5%.
@inproceedings{diva2:1720835,
author = {Cao, Jiale and Pang, Yanwei and Anwer, Rao Muhammad and Cholakkal, Hisham and Xie, Jin and Shah, Mubarak and Khan, Fahad},
title = {{PSTR: End-to-End One-Step Person Search With Transformers}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
year = {2022},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {9448--9457},
publisher = {IEEE COMPUTER SOC},
}
Modern digital cameras generally count on image signal processing (ISP) pipelines for producing naturalistic RGB images. Nevertheless, in comparison to DSLR cameras, low-quality images are generally output from portable mobile devices due to their physical limitations. The synthesized low-quality images usually have multiple degradations - low-resolution owing to small camera sensors, mosaic patterns on account of camera filter array and subpixel shifts due to camera motion. Such degradations usually restrain the performance of single image super-resolution methodologies for retrieving a high-resolution (HR) image from a single low-resolution (LR) image. Burst image super-resolution aims at restoring a photo-realistic HR image by capturing the abundant information from multiple LR images. Lately, the soaring popularity of burst photography has made multi-frame processing an attractive solution for overcoming the limitations of single image processing. In our work, we thus propose a generic architecture, the adaptive feature consolidation network (AFCNet), for multi-frame processing. To alleviate the long-range dependency modelling problem that multi-frame approaches struggle to solve, we utilize an encoder-decoder based transformer backbone which learns multi-scale local-global representations. We propose a feature alignment module to align LR burst frame features. Further, the aligned features are fused and reconstructed by the abridged pseudo-burst fusion and adaptive group upsampling modules, respectively. Our proposed approach clearly outperforms the other existing state-of-the-art techniques on benchmark datasets. The experimental results illustrate the effectiveness and generality of our proposed framework in upgrading the visual quality of HR images.
@inproceedings{diva2:1718698,
author = {Mehta, Nancy and Dudhane, Akshay and Murala, Subrahmanyam and Zamir, Syed Waqas and Khan, Salman and Khan, Fahad},
title = {{Adaptive Feature Consolidation Network for Burst Super-Resolution}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2022)},
year = {2022},
pages = {1278--1285},
publisher = {IEEE},
}
Burst super-resolution has received increased attention in recent years due to its applications in mobile photography. By merging information from multiple shifted images of a scene, burst super-resolution aims to recover details which otherwise cannot be obtained using a simple input image. This paper reviews the NTIRE 2022 challenge on burst super-resolution. In the challenge, the participants were tasked with generating a clean RGB image with 4x higher resolution, given a RAW noisy burst as input. That is, the methods need to perform joint denoising, demosaicking, and super-resolution. The challenge consisted of 2 tracks. Track 1 employed synthetic data, where pixel-accurate high-resolution ground truths are available. Track 2 on the other hand used real-world bursts captured from a handheld camera, along with approximately aligned reference images captured using a DSLR. 14 teams participated in the final testing phase. The top performing methods establish a new state-of-the-art on the burst super-resolution task.
@inproceedings{diva2:1718558,
author = {Bhat, Goutam and Danelljan, Martin and Timofte, Radu and Cao, Yizhen and Cao, Yuntian and Chen, Meiya and Chen, Xihao and Cheng, Shen and Dudhane, Akshay and Fan, Haoqiang and Gang, Ruipeng and Gao, Jian and Gu, Yan and Huang, Jie and Huang, Liufeng and Jo, Youngsu and Kang, Sukju and Khan, Salman and Khan, Fahad and Kondo, Yuki and Li, Chenghua and Li, Fangya and Li, Jinjing and Li, Youwei and Li, Zechao and Liu, Chenming and Liu, Shuaicheng and Liu, Zikun and Liu, Zhuoming and Luo, Ziwei and Luo, Zhengxiong and Mehta, Nancy and Murala, Subrahmanyam and Nam, Yoonchan and Nakatani, Chihiro and Ostyakov, Pavel and Pan, Jinshan and Song, Ge and Sun, Jian and Sun, Long and Tang, Jinhui and Ukita, Norimichi and Wen, Zhihong and Wu, Qi and Wu, Xiaohe and Xiao, Zeyu and Xiong, Zhiwei and Xu, Rongjian and Xu, Ruikang and Yan, Youliang and Yang, Jialin and Yang, Wentao and Yang, Zhongbao and Yasue, Fuma and Yao, Mingde and Yu, Lei and Zhang, Cong and Zamir, Syed Waqas and Zhang, Jianxing and Zhang, Shuohao and Zhang, Zhilu and Zheng, Qian and Zhou, Gaofeng and Zhussip, Magauiya and Zou, Xueyi and Zuo, Wangmeng},
title = {{NTIRE 2022 Burst Super-Resolution Challenge}},
booktitle = {2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2022)},
year = {2022},
pages = {1040--1060},
publisher = {IEEE},
}
In this paper, we argue that modern pre-integration methods for inertial measurement units (IMUs) are accurate enough to ignore the drift for short time intervals. This allows us to consider a simplified camera model, which in turn admits further intrinsic calibration. We develop the first-ever solver to jointly solve the relative pose problem with unknown and equal focal length and radial distortion profile while utilizing the IMU data. Furthermore, we show significant speed-up compared to state-of-the-art algorithms, with small or negligible loss in accuracy for partially calibrated setups. The proposed algorithms are tested on both synthetic and real data, where the latter is focused on navigation using unmanned aerial vehicles (UAVs). We evaluate the proposed solvers on different commercially available low-cost UAVs, and demonstrate that the novel assumption on IMU drift is feasible in real-life applications. The extended intrinsic auto-calibration enables us to use distorted input images, making tedious calibration processes obsolete, compared to current state-of-the-art methods. Code available at: https://github.com/marcusvaltonen/DronePoseLib.
@inproceedings{diva2:1699783,
author = {Valtonen Örnhag, Marcus and Persson, Patrik and Wadenbäck, Mårten and Åström, Kalle and Heyden, Anders},
title = {{Trust Your IMU: Consequences of Ignoring the IMU Drift}},
booktitle = {Proceedings 2022 IEEE/CVF Conference on Computer Visionand Pattern Recognition Workshops},
year = {2022},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops},
pages = {4467--4476},
publisher = {IEEE Computer Society},
}
Recent progress towards designing models that can generalize to unseen domains (i.e., domain generalization) or unseen classes (i.e., zero-shot learning) has sparked interest in building models that can tackle both domain shift and semantic shift simultaneously (i.e., zero-shot domain generalization). For models to generalize to unseen classes in unseen domains, it is crucial to learn feature representations that preserve class-level (domain-invariant) as well as domain-specific information. Motivated by the success of generative zero-shot approaches, we propose a feature generative framework integrated with a COntext COnditional Adaptive (COCOA) Batch-Normalization layer to seamlessly integrate class-level semantic and domain-specific information. The generated visual features better capture the underlying data distribution, enabling us to generalize to unseen classes and domains at test-time. We thoroughly evaluate our approach on established large-scale benchmarks - DomainNet, DomainNet-LS (Limited Sources) - as well as a new CUB-Corruptions benchmark, and demonstrate promising performance over baselines and state-of-the-art methods. We show detailed ablations and analysis to verify that our proposed approach indeed allows us to generate better quality visual features relevant for zero-shot domain generalization.
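As a hedged illustration of the context-conditional normalization idea described above (not the authors' exact COCOA layer; module and variable names are hypothetical), a batch-normalization layer whose affine parameters are predicted from a context embedding can be sketched in PyTorch as follows:

import torch
import torch.nn as nn

class ContextConditionalBN(nn.Module):
    # Illustrative sketch: normalize without built-in affine parameters,
    # then apply a scale and shift predicted from a context vector.
    def __init__(self, num_features, context_dim):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.to_gamma = nn.Linear(context_dim, num_features)
        self.to_beta = nn.Linear(context_dim, num_features)

    def forward(self, x, context):
        # x: (B, C, H, W), context: (B, context_dim)
        h = self.bn(x)
        gamma = self.to_gamma(context).unsqueeze(-1).unsqueeze(-1)
        beta = self.to_beta(context).unsqueeze(-1).unsqueeze(-1)
        return (1.0 + gamma) * h + beta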
@inproceedings{diva2:1691012,
author = {Mangla, Puneet and Chandhok, Shivam and Balasubramanian, Vineeth N. and Khan, Fahad},
title = {{COCOA: Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains}},
booktitle = {2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)},
year = {2022},
series = {IEEE Winter Conference on Applications of Computer Vision},
pages = {1618--1627},
publisher = {IEEE COMPUTER SOC},
}
Emerging from low-level vision theory, steerable filters found their counterpart in prior work on steerable convolutional neural networks equivariant to rigid transformations. In our work, we propose a steerable feed-forward learning-based approach that consists of neurons with spherical decision surfaces and operates on point clouds. Such spherical neurons are obtained by conformal embedding of Euclidean space and have recently been revisited in the context of learning representations of point sets. Focusing on 3D geometry, we exploit the isometry property of spherical neurons and derive a 3D steerability constraint. After training spherical neurons to classify point clouds in a canonical orientation, we use a tetrahedron basis to quadruplicate the neurons and construct rotation-equivariant spherical filter banks. We then apply the derived constraint to interpolate the filter bank outputs and, thus, obtain a rotation-invariant network. Finally, we use a synthetic point set and real-world 3D skeleton data to verify our theoretical findings. The code is available at https://github.com/pavlo-melnyk/steerable-3d-neurons.
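For intuition only, the sketch below shows a conformal-style embedding of the kind that gives rise to spherical decision surfaces: a 3D point and a sphere are lifted to 5D vectors whose dot product equals ½(r² − ‖x − c‖²), i.e., it is zero on the sphere, positive inside and negative outside. The exact convention and the steerable construction (tetrahedron basis, interpolation) follow the paper; function names here are hypothetical.

import numpy as np

def embed_point(x):
    # Lift a 3D point to R^5: (x, -1, -||x||^2 / 2).
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, [-1.0, -0.5 * np.dot(x, x)]])

def embed_sphere(center, radius):
    # Lift a sphere so that <P, S> = 0.5 * (r^2 - ||x - c||^2).
    c = np.asarray(center, dtype=float)
    return np.concatenate([c, [0.5 * (np.dot(c, c) - radius ** 2), 1.0]])

P = embed_point([1.0, 0.0, 0.0])
S = embed_sphere([0.0, 0.0, 0.0], 1.0)
print(P @ S)  # 0.0: the point lies exactly on the unit sphere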
@inproceedings{diva2:1686024,
author = {Melnyk, Pavlo and Felsberg, Michael and Wadenbäck, Mårten},
title = {{Steerable 3D Spherical Neurons}},
booktitle = {Proceedings of the 39th International Conference on Machine Learning},
year = {2022},
series = {Proceedings of Machine Learning Research},
pages = {15330--15339},
publisher = {PMLR},
}
@inproceedings{diva2:1679039,
author = {Naseer, Muzammal and Ranasinghe, Kanchana and Khan, Salman and Khan, Fahad Shahbaz and Porikli, Fatih},
title = {{On Improving Adversarial Transferability of Vision Transformers}},
booktitle = {The Tenth International Conference on Learning Representations (Virtual), April 25th through 29th},
year = {2022},
}
This article proposes an architecture which allows the prediction of intention by internally simulating perceptual states represented by action pattern vectors. To this end, associative self-organising neural networks (A-SOM) are utilised to build a hierarchical cognitive architecture for recognition and simulation of skeleton-based human actions. The abilities of the proposed architecture in recognising and predicting actions are evaluated in experiments using three different datasets of 3D actions. Based on the experiments of this article, applying internally simulated perceptual states represented by action pattern vectors improves the performance of the recognition task in all experiments. Furthermore, internal simulation of perception addresses the problem of having limited access to the sensory input, as well as the future prediction of consecutive perceptual sequences. The performance of the system is compared and discussed with a similar architecture using self-organizing neural networks (SOM).
@inproceedings{diva2:1638927,
author = {Gharaee, Zahra},
title = {{Predicting the intended action using internal simulation of perception}},
booktitle = {ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2},
year = {2022},
pages = {626--635},
publisher = {SciTePress},
}
One of the main challenges of applying deep learning for robotics is the difficulty of efficiently adapting to new tasks while still maintaining the same performance on previous tasks. The problem of incrementally learning new tasks commonly struggles with catastrophic forgetting, in which the previous knowledge is lost. Class-incremental learning for semantic segmentation addresses this problem: we want to learn new semantic classes without having access to labeled data for previously learned classes. This is a problem in industry, where few pre-trained models and open datasets match the requirements exactly. In these cases it is both expensive and labour-intensive to collect an entirely new fully-labeled dataset. Instead, collecting a smaller dataset and only labeling the new classes is much more efficient in terms of data collection. In this paper we present the class-incremental learning problem for semantic segmentation, discuss related work in terms of the more thoroughly studied classification task, and experimentally validate the current state-of-the-art for semantic segmentation. This lays the foundation as we discuss some of the problems that still need to be investigated and improved upon in order to reach a new state-of-the-art for class-incremental semantic segmentation.
@inproceedings{diva2:1701982,
author = {Holmquist, Karl and Klas\'{e}n, Lena and Felsberg, Michael},
title = {{Class-Incremental Learning for Semantic Segmentation - A study}},
booktitle = {2021 Swedish Artificial Intelligence Society Workshop (SAIS)},
year = {2021},
pages = {25--28},
publisher = {IEEE},
}
While the untargeted black-box transferability of adversarial perturbations has been extensively studied before, changing an unseen model's decisions to a specific targeted class remains a challenging feat. In this paper, we propose a new generative approach for highly transferable targeted perturbations (TTP). We note that the existing methods are less suitable for this task due to their reliance on class-boundary information that changes from one model to another, thus reducing transferability. In contrast, our approach matches the perturbed image distribution with that of the target class, leading to high targeted transferability rates. To this end, we propose a new objective function that not only aligns the global distributions of source and target images, but also matches the local neighbourhood structure between the two domains. Based on the proposed objective, we train a generator function that can adaptively synthesize perturbations specific to a given input. Our generative approach is independent of the source or target domain labels, while consistently performing well against state-of-the-art methods on a wide range of attack settings. As an example, we achieve 32.63% target transferability from (an adversarially weak) VGG19(BN) to (a strong) WideResNet on the ImageNet val. set, which is 4x higher than the previous best generative attack and 16x better than the instance-specific iterative attack. Code is available at: https://github.com/Muzammal-Naseer/TTP.
@inproceedings{diva2:1691110,
author = {Naseer, Muzammal and Khan, Salman and Hayat, Munawar and Khan, Fahad and Porikli, Fatih},
title = {{On Generating Transferable Targeted Perturbations}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)},
year = {2021},
pages = {7688--7697},
publisher = {IEEE},
}
We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local style patterns. The proposed HWT captures the long- and short-range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style features of each query character. To the best of our knowledge, we are the first to introduce a transformer-based network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and outperforms the state-of-the-art, as demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle text of arbitrary length and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images. Code is available at: https://github.com/ankanbhunia/HandwritingTransformers
@inproceedings{diva2:1691094,
author = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad and Shah, Mubarak},
title = {{Handwriting Transformers}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)},
year = {2021},
pages = {1066--1074},
publisher = {IEEE},
}
Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de-facto objective function. The CE loss encourages the features of a class to have a higher projection score on the true class-vector compared to the negative classes. However, this is a relative constraint and does not explicitly force different class features to be well-separated. Motivated by the observation that ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed Orthogonal Projection Loss (OPL), which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints at the mini-batch level. Compared to other alternatives to CE, OPL offers unique advantages: it adds no learnable parameters, does not require careful negative mining, and is not sensitive to the batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise. Code is available at: https://github.com/kahnchana/opl.
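A rough sketch of an orthogonality-style loss on mini-batch features is given below; the exact OPL formulation and weighting follow the paper, and the gamma value here is a placeholder. Same-class feature pairs are pulled towards cosine similarity 1, while different-class pairs are pushed towards orthogonality (cosine similarity 0):

import torch
import torch.nn.functional as F

def orthogonality_loss(features, labels, gamma=0.5):
    # Illustrative sketch, not the exact published OPL.
    f = F.normalize(features, dim=1)              # (B, D) unit-norm features
    cos = f @ f.t()                               # pairwise cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=features.device)
    pos = same & ~eye                             # same-class pairs, excluding self
    neg = ~same                                   # different-class pairs
    s = cos[pos].mean()                           # should approach 1
    d = cos[neg].abs().mean()                     # should approach 0
    return (1.0 - s) + gamma * d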
@inproceedings{diva2:1679002,
author = {Ranasinghe, Kanchana and Naseer, Muzammal and Hayat, Munawar and Khan, Salman and Khan, Fahad},
title = {{Orthogonal Projection Loss}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)},
year = {2021},
pages = {12313--12323},
publisher = {IEEE},
}
This work proposes a weakly-supervised temporal action localization framework, called D2-Net, which strives to temporally localize actions using video-level supervision. Our main contribution is the introduction of a novel loss formulation, which jointly enhances the discriminability of latent embeddings and robustness of the output temporal class activations with respect to foreground-background noise caused by weak supervision. The proposed formulation comprises a discriminative and a denoising loss term for enhancing temporal action localization. The discriminative term incorporates a classification loss and utilizes a top-down attention mechanism to enhance the separability of latent foreground-background embeddings. The denoising loss term explicitly addresses the foreground-background noise in class activations by simultaneously maximizing intra-video and inter-video mutual information using a bottom-up attention mechanism. As a result, activations in the foreground regions are emphasized whereas those in the background regions are suppressed, thereby leading to more robust predictions. Comprehensive experiments are performed on multiple benchmarks, including THUMOS14 and ActivityNet1.2. Our D2-Net performs favorably in comparison to the existing methods on all datasets, achieving gains as high as 2.3% in terms of mAP at IoU=0.5 on THUMOS14. Source code is available at https://github.com/naraysa/D2-Net.
@inproceedings{diva2:1679001,
author = {Narayan, Sanath and Cholakkal, Hisham and Hayat, Munawar and Khan, Fahad and Yang, Ming-Hsuan and Shao, Ling},
title = {{D2-Net: Weakly-Supervised Action Localization via Discriminative Embeddings and Denoised Activations}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)},
year = {2021},
pages = {13588--13597},
publisher = {IEEE},
}
Multi-label zero-shot learning (ZSL) is a more realistic counterpart of standard single-label ZSL since several objects can co-exist in a natural image. However, the occurrence of multiple objects complicates the reasoning and requires region-specific processing of visual features to preserve their contextual cues. We note that the best existing multi-label ZSL method takes a shared approach towards attending to region features, with a common set of attention maps for all the classes. Such shared maps lead to diffused attention, which does not discriminatively focus on relevant locations when the number of classes is large. Moreover, mapping spatially-pooled visual features to the class semantics leads to inter-class feature entanglement, thus hampering the classification. Here, we propose an alternate approach towards region-based discriminability-preserving multi-label zero-shot classification. Our approach maintains the spatial resolution to preserve region-level characteristics and utilizes a bi-level attention module (BiAM) to enrich the features by incorporating both region and scene context information. The enriched region-level features are then mapped to the class semantics and only their class predictions are spatially pooled to obtain image-level predictions, thereby keeping the multi-class features disentangled. Our approach sets a new state of the art on two large-scale multi-label zero-shot benchmarks: NUS-WIDE and Open Images. On NUS-WIDE, our approach achieves an absolute gain of 6.9% mAP for ZSL, compared to the best published results.
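The design choice of pooling class predictions over regions, rather than pooling the features themselves, can be illustrated with a minimal hypothetical sketch (the bi-level attention enrichment is omitted, and max-pooling stands in for whatever pooling the paper uses):

import torch

def image_level_scores(region_feats, class_embeddings):
    # region_feats: (B, R, D) region-level visual features
    # class_embeddings: (C, D) class semantic vectors
    # Compute per-region class scores, then pool the predictions over regions,
    # keeping per-class evidence disentangled (illustrative sketch).
    region_scores = torch.einsum('brd,cd->brc', region_feats, class_embeddings)
    image_scores = region_scores.max(dim=1).values   # (B, C)
    return image_scores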
@inproceedings{diva2:1678999,
author = {Narayan, Sanath and Gupta, Akshita and Khan, Salman and Khan, Fahad and Shao, Ling and Shah, Mubarak},
title = {{Discriminative Region-based Multi-Label Zero-Shot Learning}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021)},
year = {2021},
pages = {8711--8720},
publisher = {IEEE},
}
Video instance segmentation is one of the core problems in computer vision. Formulating a purely learning-based method, which models the generic track management required to solve the video instance segmentation task, is a highly challenging problem. In this work, we propose a novel learning framework where the entire video instance segmentation problem is modeled jointly. To this end, we design a graph neural network that in each frame jointly processes all detections and a memory of previously seen tracks. Past information is considered and processed via a recurrent connection. We demonstrate the effectiveness of the proposed approach in comprehensive experiments. Our approach, operating at over 25 FPS, outperforms previous video real-time methods. We further conduct detailed ablative experiments that validate the different aspects of our approach.
@inproceedings{diva2:1647780,
author = {Johnander, Joakim and Brissman, Emil and Danelljan, Martin and Felsberg, Michael},
title = {{Video Instance Segmentation with Recurrent Graph Neural Networks}},
booktitle = {Pattern Recognition},
year = {2021},
series = {Lecture Notes in Computer Science},
volume = {13024},
pages = {206--221},
publisher = {Springer},
}
The Visual Object Tracking challenge VOT2021 is the ninth annual tracker benchmarking activity organized by the VOT initiative. Results of 71 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The VOT2021 challenge was composed of four sub-challenges focusing on different tracking domains: (i) VOT-ST2021 challenge focused on short-term tracking in RGB, (ii) VOT-RT2021 challenge focused on "real-time" short-term tracking in RGB, (iii) VOT-LT2021 focused on long-term tracking, namely coping with target disappearance and reappearance, and (iv) VOT-RGBD2021 challenge focused on long-term tracking in RGB and depth imagery. The VOT-ST2021 dataset was refreshed, while VOT-RGBD2021 introduces a training dataset and a sequestered dataset for winner identification. The datasets, the evaluation kit, the results, and the source code for most of the trackers are publicly available at the challenge website.
@inproceedings{diva2:1643014,
author = {Kristan, Matej and Matas, Jiri and Leonardis, Ales and Felsberg, Michael and Pflugfelder, Roman and Kamarainen, Joni-Kristian and Chang, Hyung Jin and Danelljan, Martin and Zajc, Luka Cehovin and Lukezic, Alan and Drbohlav, Ondrej and Kapyla, Jani and Häger, Gustav and Yan, Song and Yang, Jinyu and Zhang, Zhongqun and Fernandez, Gustavo and Abdelpakey, Mohamed and Bhat, Goutam and Cerkezi, Llukman and Cevikalp, Hakan and Chen, Shengyong and Chen, Xin and Cheng, Miao and Cheng, Ziyi and Chiu, Yu-Chen and Cirakman, Ozgun and Cui, Yutao and Dai, Kenan and Dasari, Mohana Murali and Deng, Qili and Dong, Xingping and Du, Daniel K. and Dunnhofer, Matteo and Feng, Zhen-Hua and Feng, Zhiyong and Fu, Zhihong and Ge, Shiming and Gorthi, Rama Krishna and Gu, Yuzhang and Gunsel, Bilge and Guo, Qing and Gurkan, Filiz and Han, Wencheng and Huang, Yanyan and Järemo-Lawin, Felix and Jhang, Shang-Jhih and Ji, Rongrong and Jiang, Cheng and Jiang, Yingjie and Juefei-Xu, Felix and Jun, Yin and Ke, Xiao and Khan, Fahad Shahbaz and Kim, Byeong Hak and Kittler, Josef and Lan, Xiangyuan and Lee, Jun Ha and Leibe, Bastian and Li, Hui and Li, Jianhua and Li, Xianxian and Li, Yuezhou and Liu, Bo and Liu, Chang and Liu, Jingen and Liu, Li and Liu, Qingjie and Lu, Huchuan and Lu, Wei and Luiten, Jonathon and Ma, Jie and Ma, Ziang and Martinel, Niki and Mayer, Christoph and Memarmoghadam, Alireza and Micheloni, Christian and Niu, Yuzhen and Paudel, Danda and Peng, Houwen and Qiu, Shoumeng and Rajiv, Aravindh and Rana, Muhammad and Robinson, Andreas and Saribas, Hasan and Shao, Ling and Shehata, Mohamed and Shen, Furao and Shen, Jianbing and Simonato, Kristian and Song, Xiaoning and Tang, Zhangyong and Timofte, Radu and Torr, Philip and Tsai, Chi-Yi and Uzun, Bedirhan and Van Gool, Luc and Voigtlaender, Paul and Wang, Dong and Wang, Guangting and Wang, Liangliang and Wang, Lijun and Wang, Limin and Wang, Linyuan and Wang, Yong and Wang, Yunhong and Wu, Chenyan and Wu, Gangshan and Wu, Xiao-Jun and Xie, Fei and Xu, Tianyang and Xu, Xiang and Xue, Wanli and Yan, Bin and Yang, Wankou and Yang, Xiaoyun and Ye, Yu and Yin, Jun and Zhang, Chengwei and Zhang, Chunhui and Zhang, Haitao and Zhang, Kaihua and Zhang, Kangkai and Zhang, Xiaohan and Zhang, Xiaolin and Zhang, Xinyu and Zhang, Zhibin and Zhao, Shaochuan and Zhen, Ming and Zhong, Bineng and Zhu, Jiawen and Zhu, Xue-Feng},
title = {{The Ninth Visual Object Tracking VOT2021 Challenge Results}},
booktitle = {2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021)},
year = {2021},
series = {IEEE International Conference on Computer Vision Workshops},
pages = {2711--2738},
publisher = {IEEE COMPUTER SOC},
}
Solving geometric tasks involving point clouds by using machine learning is a challenging problem. Standard feed-forward neural networks combine linear or, if the bias parameter is included, affine layers and activation functions. Their geometric modeling is limited, which motivated the prior work introducing the multilayer hypersphere perceptron (MLHP). Its constituent part, i.e., the hypersphere neuron, is obtained by applying a conformal embedding of Euclidean space. By virtue of Clifford algebra, it can be implemented as the Cartesian dot product of inputs and weights. If the embedding is applied in a manner consistent with the dimensionality of the input space geometry, the decision surfaces of the model units become combinations of hyperspheres and make the decision-making process geometrically interpretable for humans. Our extension of the MLHP model, the multilayer geometric perceptron (MLGP), and its respective layer units, i.e., geometric neurons, are consistent with the 3D geometry and provide a geometric handle of the learned coefficients. In particular, the geometric neuron activations are isometric in 3D, which is necessary for rotation and translation equivariance. When classifying the 3D Tetris shapes, we quantitatively show that our model requires no activation function in the hidden layers other than the embedding to outperform the vanilla multilayer perceptron. In the presence of noise in the data, our model is also superior to the MLHP.
@inproceedings{diva2:1641525,
author = {Melnyk, Pavlo and Felsberg, Michael and Wadenbäck, Mårten},
title = {{Embed Me If You Can: A Geometric Perceptron}},
booktitle = {Proceedings 2021 IEEE/CVF International Conference on Computer Vision ICCV 2021},
year = {2021},
series = {IEEE International Conference on Computer Vision. Proceedings},
pages = {1256--1264},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called Open World Object Detection, where a model is tasked to: 1) identify objects that have not been introduced to it as unknown, without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy-based unknown identification. Our experimental evaluation and ablation studies analyse the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterising unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.
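As a hedged illustration of the energy-based ingredient only (the full ORE pipeline additionally relies on contrastive clustering and learned energy distributions), a free-energy score over the known-class logits can be computed as below; the temperature and threshold are hypothetical:

import torch

def energy_score(logits, temperature=1.0):
    # Free energy of the known-class logits: E(x) = -T * logsumexp(logits / T).
    # Low energy suggests a confident known-class detection; high energy
    # suggests the instance may be unknown.
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

logits = torch.randn(4, 20)                  # 4 detections, 20 known classes
unknown = energy_score(logits) > -1.0        # hypothetical threshold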
@inproceedings{diva2:1637943,
author = {Joseph, K. J. and Khan, Salman and Khan, Fahad and Balasubramanian, Vineeth N.},
title = {{Towards Open World Object Detection}},
booktitle = {2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021},
year = {2021},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {5826--5836},
publisher = {IEEE COMPUTER SOC},
}
We investigate spline-based continuous-time pose trajectory estimation using non-linear explicit motion priors. Current regularization priors either linearize the orientation, rely on the implicit regularization obtained from the chosen spline basis function, or use sampling-based regularization schemes. The latter is a special case of a Riemann sum approximation, and we demonstrate when and why this can fail, and propose a way to avoid these issues. In addition, we provide a number of novel, practically useful theoretical contributions, including requirements on knot spacing for orientation splines, new basis functions for constant velocity extrapolation, and a generalization of the popular P-Spline penalty to orientation. We analyze the properties of the proposed approach using synthetic data. We validate our system on the standard task of visual-inertial calibration, and apply it to stereo visual odometry, where we demonstrate real-time performance on KITTI.
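For reference, the classical Euclidean P-Spline penalty on spline control points is a sum of squared finite differences of a chosen order; a second-order version is written out below in LaTeX. The paper's contribution generalizes such a penalty to orientation-valued control points, which is not captured by this Euclidean sketch.

% Second-order P-spline difference penalty on control points c_i (Euclidean sketch)
P(\mathbf{c}) = \lambda \sum_{i} \left\| \Delta^{2} \mathbf{c}_{i} \right\|^{2},
\qquad
\Delta^{2} \mathbf{c}_{i} = \mathbf{c}_{i+1} - 2\,\mathbf{c}_{i} + \mathbf{c}_{i-1}.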
@inproceedings{diva2:1635574,
author = {Persson, Mikael and Häger, Gustav and Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik},
title = {{Practical Pose Trajectory Splines With Explicit Regularization}},
booktitle = {2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021)},
year = {2021},
series = {International Conference on 3D Vision},
pages = {156--165},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Safe robot navigation in a dynamic environment requires the trajectories of each independently moving object (IMO). We present the novel and effective system Sequential Hierarchical Ransac Estimation (Shire), designed for this purpose. The system uses a stereo camera stream to find the objects and trajectories in real time. Shire detects moving objects using geometric consistency and finds their trajectories using bundle adjustment. Relying on geometric consistency allows the system to handle objects regardless of semantic class, unlike approaches based on semantic segmentation. Most Visual Odometry (VO) systems are inherently limited to a single motion by the choice of tracker. This limitation allows for efficient and robust ego-motion estimation in real time, but precludes tracking the multiple motions sought. Shire instead uses a generic tracker and achieves accurate VO and IMO estimates using track analysis. This removes the restriction to a single motion while retaining the real-time performance required for live navigation. We evaluate the system by bounding box intersection over union and ID persistence on a public dataset, collected from an autonomous test vehicle driving in real traffic. We also show the velocities of estimated IMOs. We investigate variations of the system that provide trade-offs between accuracy, performance and limitations.
@inproceedings{diva2:1602445,
author = {Persson, Mikael and Forss\'{e}n, Per-Erik},
title = {{Independently Moving Object Trajectories from Sequential Hierarchical Ransac}},
booktitle = {VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 5: VISAPP},
year = {2021},
pages = {722--731},
publisher = {SCITEPRESS},
}
We investigate a novel deep-learning-based approach to estimate uncertainty in stereo disparity prediction networks. Current state-of-the-art methods often formulate disparity prediction as a regression problem with a single scalar output in each pixel. This can be problematic in practical applications, as in many cases there might not exist a single well-defined disparity, for example in cases of occlusions or at depth boundaries. While current neural-network-based disparity estimation approaches obtain good performance on benchmarks, the disparity prediction is treated as a black box at inference time. In this paper we show that by formulating the learning problem as a regression with a distribution target, we obtain a robust estimate of the uncertainty in each pixel, while maintaining the performance of the original method. The proposed method is evaluated both on a large-scale standard benchmark, as well as on our own data. We also show that the uncertainty estimate improves significantly by maximizing the uncertainty in those pixels that have no well-defined disparity during learning.
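A minimal sketch of the "regression with a distribution target" formulation, assuming the disparity range is discretized into bins (the bin layout, target construction and loss weighting here are hypothetical):

import torch
import torch.nn.functional as F

def distribution_loss(logits, target_dist):
    # logits: (B, K, H, W) per-pixel scores over K disparity bins
    # target_dist: (B, K, H, W) per-pixel target distribution (sums to 1 over K)
    log_p = F.log_softmax(logits, dim=1)
    return -(target_dist * log_p).sum(dim=1).mean()

def disparity_and_uncertainty(logits, disp_values):
    # Expected disparity and an entropy-based per-pixel uncertainty.
    p = torch.softmax(logits, dim=1)
    disp = (p * disp_values.view(1, -1, 1, 1)).sum(dim=1)
    entropy = -(p * torch.log(p.clamp_min(1e-8))).sum(dim=1)
    return disp, entropy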
@inproceedings{diva2:1599550,
author = {Häger, Gustav and Persson, Mikael and Felsberg, Michael},
title = {{Predicting Disparity Distributions}},
booktitle = {2021 IEEE International Conference on Robotics and Automation (ICRA)},
year = {2021},
series = {IEEE International Conference on Robotics and Automation (ICRA)},
publisher = {IEEE},
}
Visual instance segmentation is a challenging problem, and it becomes even more difficult if the objects of interest vary unconstrained in shape. Some objects are well described by a rectangle; however, this is hardly always the case. Consider, for instance, long, slender objects such as ropes. Anchor-based approaches classify predefined bounding boxes as either negative or positive and thus provide a limited set of shapes that can be handled. Defining anchor-boxes that fit well to all possible shapes leads to an infeasible number of prior boxes. We explore a different approach and propose to train a neural network to compute distance maps along different directions. The network is trained at each pixel to predict the distance to the closest object contour in a given direction. By pooling the distance maps we obtain an approximation to the signed distance function (SDF). The SDF may then be thresholded in order to obtain a foreground-background segmentation. We compare this segmentation to foreground segmentations obtained from the state-of-the-art instance segmentation method YOLACT. On the COCO dataset, our segmentation yields a higher performance in terms of foreground intersection over union (IoU). However, while the distance maps contain information on the individual instances, it is not straightforward to map them to the full instance segmentation. We still believe that this idea is a promising research direction for instance segmentation, as it better captures the different shapes found in the real world.
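Purely as an illustration of how per-direction contour distances can be pooled into a single distance map and thresholded into a foreground mask (the paper's exact pooling and sign convention for the signed variant may differ):

import torch

def pooled_distance_mask(direction_distances, threshold=0.0):
    # direction_distances: (B, N, H, W), predicted distance to the closest
    # object contour along each of N directions.
    # The minimum over directions approximates the distance to the nearest
    # contour; thresholding gives a foreground-background mask. Sketch only.
    pooled = direction_distances.min(dim=1).values   # (B, H, W)
    return pooled, pooled > threshold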
@inproceedings{diva2:1594795,
author = {Brissman, Emil and Johnander, Joakim and Felsberg, Michael},
title = {{Predicting Signed Distance Functions for Visual Instance Segmentation}},
booktitle = {33rd Annual Workshop of the Swedish-Artificial-Intelligence-Society (SAIS)},
year = {2021},
series = {Annual Workshop of the Swedish-Artificial-Intelligence-Society (SAIS)},
pages = {5--10},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
In this paper we present a novel algorithm for onboard radial distortion correction for unmanned aerial vehicles (UAVs) equipped with an inertial measurement unit (IMU), which runs in real time. This approach makes calibration procedures redundant, thus allowing for exchange of optics extemporaneously. By utilizing the IMU data, the cameras can be aligned with the gravity direction. This allows us to work with fewer degrees of freedom, and opens up for further intrinsic calibration. We propose a fast and robust minimal solver for simultaneously estimating the focal length, radial distortion profile and motion parameters from homographies. The proposed solver is tested on both synthetic and real data, and performs better than or on par with state-of-the-art methods relying on pre-calibration procedures. Code available at: https://github.com/marcusvaltonen/HomLib.
@inproceedings{diva2:1594706,
author = {Valtonen Örnhag, Marcus and Persson, Patrik and Wadenbäck, Mårten and Åström, Kalle and Heyden, Anders},
title = {{Efficient Real-Time Radial Distortion Correction for UAVs}},
booktitle = {2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
year = {2021},
series = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
pages = {1750--1759},
}
In this paper we consider a collection of relative pose problems which arise naturally in applications for visual indoor navigation using unmanned aerial vehicles (UAVs). We focus on cases where additional information from an onboard IMU is available and thus provides a partial extrinsic calibration through the gravitational vector. The solvers are designed for a partially calibrated camera, for a variety of realistic indoor scenarios, which makes it possible to navigate using images of the ground floor. Current state-of-the-art solvers use more general assumptions, such as using arbitrary planar structures; however, these solvers do not yield adequate reconstructions for real scenes, nor do they perform fast enough to be incorporated in real-time systems. We show that the proposed solvers enjoy better numerical stability, are faster, and require fewer point correspondences, compared to state-of-the-art approaches. These properties are vital components for robust navigation in real-time systems, and we demonstrate on both synthetic and real data that our method outperforms other solvers, and yields superior motion estimation.
@inproceedings{diva2:1589920,
author = {Örnhag, Marcus Valtonen and Persson, Patrik and Wadenbäck, Mårten and Åström, Kalle and Heyden, Anders},
title = {{Minimal Solvers for Indoor UAV Positioning}},
booktitle = {2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2021},
series = {International Conference on Pattern Recognition},
pages = {1136--1143},
publisher = {IEEE COMPUTER SOC},
}
In this paper, we present a state-of-the-art reinforcement learning method for autonomous driving. Our approach employs temporal difference learning in a Bayesian framework to learn vehicle control signals from sensor data. The agent has access to images from a forward-facing camera, which are pre-processed to generate semantic segmentation maps. We trained our system using both ground truth and estimated semantic segmentation input. Based on our observations from a large set of experiments, we conclude that training the system on ground truth input data leads to better performance than training the system on estimated input, even if estimated input is used for evaluation. The system is trained and evaluated in a realistic simulated urban environment using the CARLA simulator. The simulator also contains a benchmark that allows for comparison with other systems and methods. The required training time of the system is shown to be lower and the performance on the benchmark superior to competing approaches.
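For readers unfamiliar with the temporal-difference component, a textbook TD(0) value update is sketched below; this is a generic illustration with hypothetical states, not the paper's Bayesian formulation.

def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.99):
    # Generic TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    td_error = reward + gamma * V[next_state] - V[state]
    V[state] += alpha * td_error
    return td_error

V = {"s0": 0.0, "s1": 0.0}          # hypothetical value table
td0_update(V, "s0", reward=1.0, next_state="s1")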
@inproceedings{diva2:1589919,
author = {Gharaee, Zahra and Holmquist, Karl and He, Linbo and Felsberg, Michael},
title = {{A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control}},
booktitle = {2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2021},
series = {International Conference on Pattern Recognition},
pages = {3947--3954},
publisher = {IEEE COMPUTER SOC},
}
Optical flow is a regression task where convolutional neural networks (CNNs) have led to major breakthroughs. However, this comes at major computational demands due to the use of cost volumes and pyramidal representations. This was mitigated by producing flow predictions at a quarter of the resolution, which are upsampled using bilinear interpolation at test time. Consequently, fine details are usually lost and post-processing is needed to restore them. We propose the Normalized Convolution UPsampler (NCUP), an efficient joint upsampling approach to produce the full-resolution flow during the training of optical flow CNNs. Our proposed approach formulates the upsampling task as a sparse problem and employs normalized convolutional neural networks to solve it. We evaluate our upsampler against existing joint upsampling approaches when trained end-to-end with a coarse-to-fine optical flow CNN (PWCNet) and show that it outperforms all other approaches on the FlyingChairs dataset while having at least one order of magnitude fewer parameters. Moreover, we test our upsampler with a recurrent optical flow CNN (RAFT) and achieve state-of-the-art results on the Sintel benchmark with ∼6% error reduction, and on-par results on the KITTI dataset, while having 7.5% fewer parameters. Finally, our upsampler shows better generalization capabilities than RAFT when trained and evaluated on different datasets.
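The normalized convolution underlying the upsampler treats the input as sparse data accompanied by a confidence map; a minimal sketch of the standard normalized-convolution formula (not the full NCUP module, and the confidence propagation rule is one common choice) is:

import torch
import torch.nn.functional as F

def normalized_convolution(x, conf, weight, eps=1e-8):
    # x, conf: (B, 1, H, W) sparse data and its confidence;
    # weight: (1, 1, k, k) non-negative applicability kernel.
    # Filter the confidence-weighted data and divide by the filtered confidence.
    pad = weight.shape[-1] // 2
    num = F.conv2d(x * conf, weight, padding=pad)
    den = F.conv2d(conf, weight, padding=pad)
    out = num / (den + eps)
    new_conf = den / weight.sum()    # propagated confidence (one common choice)
    return out, new_conf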
@inproceedings{diva2:1557589,
author = {Eldesokey, Abdelrahman and Felsberg, Michael},
title = {{Normalized Convolution Upsampling for Refined Optical Flow Estimation}},
booktitle = {Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications},
year = {2021},
series = {VISIGRAPP},
pages = {742--752},
publisher = {SciTePress},
}
Semi-supervised video object segmentation is a challenging task that aims to segment a target throughout a video sequence given an initial mask at the first frame. Discriminative approaches have demonstrated competitive performance on this task at a sensible complexity. These approaches typically formulate the problem as a one-versus-one classification between the target and the background. However, in reality, a video sequence usually encompasses a target, background, and possibly other distracting objects. Those objects increase the risk of introducing false positives, especially if they share visual similarities with the target. Therefore, it is more effective to separate distractors from the background, and handle them independently.
We propose a one-versus-many scheme to address this situation by separating distractors into their own class. This separation allows imposing special attention to challenging regions that are most likely to degrade the performance. We demonstrate the prominence of this formulation by modifying the learning-what-to-learn method to be distractor-aware. Our proposed approach sets a new state-of-the-art on the DAVIS val dataset, and improves over the baseline on the DAVIS test-dev benchmark by 4.8 percentage points.
@inproceedings{diva2:1545384,
author = {Robinson, Andreas and Eldesokey, Abdelrahman and Felsberg, Michael},
title = {{Distractor-aware video object segmentation}},
booktitle = {Pattern Recognition. DAGM GCPR 2021},
year = {2021},
series = {Lecture Notes in Computer Science},
volume = {13024},
pages = {222--234},
}
The Visual Object Tracking challenge VOT2020 is the eighth annual tracker benchmarking activity organized by the VOT initiative. Results of 58 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The VOT2020 challenge was composed of five sub-challenges focusing on different tracking domains: (i) VOT-ST2020 challenge focused on short-term tracking in RGB, (ii) VOT-RT2020 challenge focused on “real-time” short-term tracking in RGB, (iii) VOT-LT2020 focused on long-term tracking, namely coping with target disappearance and reappearance, (iv) VOT-RGBT2020 challenge focused on short-term tracking in RGB and thermal imagery and (v) VOT-RGBD2020 challenge focused on long-term tracking in RGB and depth imagery. Only the VOT-ST2020 datasets were refreshed. A significant novelty is the introduction of a new VOT short-term tracking evaluation methodology, and the introduction of segmentation ground truth in the VOT-ST2020 challenge – bounding boxes will no longer be used in the VOT-ST challenges. A new VOT Python toolkit that implements all these novelties was introduced. Performance of the tested trackers typically exceeds standard baselines by far. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).
@inproceedings{diva2:1599875,
author = {Kristan, M. and Leonardis, A. and Matas, J. and Felsberg, Michael and Pflugfelder, R. and Kämäräinen, J.-K. and Danelljan, M. and Zajc, L.C. and Lukežic, A. and Drbohlav, O. and He, Linbo and Zhang, Yushan and Yan, S. and Yang, J. and Fernández, G. and Hauptmann, A. and Memarmoghadam, A. and García-Martín, Á. and Robinson, Andreas and Varfolomieiev, A. and Gebrehiwot, A.H. and Uzun, B. and Yan, B. and Li, B. and Qian, C. and Tsai, C.-Y. and Micheloni, C. and Wang, D. and Wang, F. and Xie, F. and Järemo-Lawin, Felix and Gustafsson, F. and Foresti, G.L. and Bhat, G. and Chen, G. and Ling, H. and Zhang, H. and Cevikalp, H. and Zhao, H. and Bai, H. and Kuchibhotla, H.C. and Saribas, H. and Fan, H. and Ghanei-Yakhdan, H. and Li, H. and Peng, H. and Lu, H. and Li, H. and Khaghani, J. and Bescos, J. and Li, J. and Fu, J. and Yu, J. and Xu, J. and Kittler, J. and Yin, J. and Lee, J. and Yu, K. and Liu, K. and Yang, K. and Dai, K. and Cheng, L. and Zhang, L. and Wang, L. and Wang, L. and Van, Gool L. and Bertinetto, L. and Dunnhofer, M. and Cheng, M. and Dasari, M.M. and Wang, N. and Wang, N. and Zhang, P. and Torr, P.H.S. and Wang, Q. and Timofte, R. and Gorthi, R.K.S. and Choi, S. and Marvasti-Zadeh, S.M. and Zhao, S. and Kasaei, S. and Qiu, S. and Chen, S. and Schön, T.B. and Xu, T. and Lu, W. and Hu, W. and Zhou, W. and Qiu, X. and Ke, X. and Wu, X.-J. and Zhang, X. and Yang, X. and Zhu, X. and Jiang, Y. and Wang, Y. and Chen, Y. and Ye, Y. and Li, Y. and Yao, Y. and Lee, Y. and Gu, Y. and Wang, Z. and Tang, Z. and Feng, Z.-H. and Mai, Z. and Zhang, Z. and Wu, Z. and Ma, Z.},
title = {{The Eighth Visual Object Tracking VOT2020 Challenge Results}},
booktitle = {Computer Vision},
year = {2020},
series = {Lecture Notes in Computer Science},
volume = {12539},
pages = {547--601},
}
Unsupervised learning of anomaly detection in high-dimensional data, such as images, is a challenging problem that has recently been subject to intense research. Through careful modelling of the data distribution of normal samples, it is possible to detect deviant samples, so-called anomalies. Generative Adversarial Networks (GANs) can model the highly complex, high-dimensional data distribution of normal image samples, and have been shown to be a suitable approach to the problem. Previously published GAN-based anomaly detection methods often assume that anomaly-free data is available for training. However, this assumption is not valid in most real-life scenarios, a.k.a. in the wild. In this work, we evaluate the effects of anomaly contamination in the training data on state-of-the-art GAN-based anomaly detection methods. As expected, detection performance deteriorates. To address this performance drop, we propose to add an additional encoder network already at training time and show that joint generator-encoder training stratifies the latent space, mitigating the problem with contaminated data. We show experimentally that the norm of a query image in this stratified latent space becomes a highly significant cue to discriminate anomalies from normal data. The proposed method achieves state-of-the-art performance on CIFAR-10 as well as on a large, previously untested dataset with cell images.
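As a hedged sketch of the scoring rule suggested above, where the norm of an encoded query image in the stratified latent space serves as the anomaly cue; the encoder stands for the jointly trained network and is a placeholder here:

import torch

def latent_norm_score(encoder, images):
    # Anomaly score = norm of the encoded query in latent space; larger norms
    # are taken to indicate more anomalous samples (illustrative sketch).
    with torch.no_grad():
        z = encoder(images)                # (B, latent_dim)
        return z.flatten(1).norm(dim=1)    # (B,) anomaly scores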
@inproceedings{diva2:1539624,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Unsupervised Adversarial Learning of Anomaly Detection in the Wild}},
booktitle = {Proceedings of the 24th European Conference on Artificial Intelligence (ECAI)},
year = {2020},
series = {Frontiers in Artificial Intelligence and Applications},
volume = {325},
pages = {1002--1008},
publisher = {IOS Press},
address = {Amsterdam},
}
Probabilistic methods for point set registration have interesting theoretical properties, such as linear complexity in the number of used points, and they easily generalize to joint registration of multiple point sets. In this work, we improve their registration performance to match the state of the art. This is done by incorporating learned features, by adding a von Mises-Fisher feature model in each mixture component, and by using learned attention weights. We learn these jointly using a registration loss learning strategy (RLL) that directly uses the registration error as a loss, by back-propagating through the registration iterations. This is possible as the probabilistic registration is fully differentiable, and the result is a learning framework that is truly end-to-end. We perform extensive experiments on the 3DMatch and KITTI datasets. The experiments demonstrate that our approach benefits significantly from the integration of the learned features and our learning strategy, outperforming the state-of-the-art on KITTI. Code is available at https://github.com/felja633/RLLReg.
@inproceedings{diva2:1530341,
author = {Järemo-Lawin, Felix and Forss\'{e}n, Per-Erik},
title = {{Registration Loss Learning for Deep Probabilistic Point Set Registration}},
booktitle = {2020 International Conference on 3D Vision (3DV)},
year = {2020},
series = {International Conference on 3D Vision},
pages = {563--572},
publisher = {IEEE},
}
Recent innovations in microelectronic and semiconductor technology enable the creation of smaller and more economical hyperspectral cameras. A filter-equipped camera combined with an advanced scanning module is a game changer that extends the application of miniature hyperspectral imagers to many security domains. This work presents an assessment of the imager L4 from Glana Sensors for detecting concealed targets in woodland areas. Several target detection methods were applied to a collection of scenes acquired under various illumination conditions and containing different materials. The potential and limitations of this new imaging device in the context of difficult target detection in forested areas are evaluated and discussed.
@inproceedings{diva2:1470528,
author = {Gonzalez, Santiago A. Rodriguez and Shimoni, Michal and Plaza, Javier and Plaza, Antonio and Renhorn, Ingmar and Ahlberg, Jörgen},
title = {{The Detection of Concealed Targets in Woodland Areas using Hyperspectral Imagery}},
booktitle = {2020 IEEE Latin American GRSS \& ISPRS Remote Sensing Conference (LAGIRS)},
year = {2020},
pages = {451--455},
publisher = {IEEE},
address = {Santiago, Chile},
}
The focus in deep learning research has mostly been to push the limits of prediction accuracy. However, this was often achieved at the cost of increased complexity, raising concerns about the interpretability and the reliability of deep networks. Recently, increasing attention has been given to untangling the complexity of deep networks and quantifying their uncertainty for different computer vision tasks. In contrast, the task of depth completion has not received enough attention despite the inherently noisy nature of depth sensors. In this work, we therefore focus on modeling the uncertainty of depth data in depth completion starting from the sparse noisy input all the way to the final prediction. We propose a novel approach to identify disturbed measurements in the input by learning an input confidence estimator in a self-supervised manner based on normalized convolutional neural networks (NCNNs). Further, we propose a probabilistic version of NCNNs that produces a statistically meaningful uncertainty measure for the final prediction. When we evaluate our approach on the KITTI dataset for depth completion, we outperform all the existing Bayesian deep learning approaches in terms of prediction accuracy, quality of the uncertainty measure, and computational efficiency. Moreover, our small network with 670k parameters performs on par with conventional approaches with millions of parameters. These results give strong evidence that separating the network into parallel uncertainty and prediction streams leads to state-of-the-art performance with accurate uncertainty estimates.
@inproceedings{diva2:1465186,
author = {Eldesokey, Abdelrahman and Felsberg, Michael and Holmquist, Karl and Persson, Mikael},
title = {{Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Conference on Computer Vision and Pattern Recognition (CVPR)},
pages = {12011--12020},
publisher = {IEEE},
}
Video object segmentation (VOS) is a highly challenging problem, since the target object is only defined by a first-frame reference mask during inference. The problem of how to capture and utilize this limited information to accurately segment the target remains a fundamental research question. We address this by introducing an end-to-end trainable VOS architecture that integrates a differentiable few-shot learner. Our learner is designed to predict a powerful parametric model of the target by minimizing a segmentation error in the first frame. We further go beyond the standard few-shot learning paradigm by learning what our target model should learn in order to maximize segmentation accuracy. We perform extensive experiments on standard benchmarks. Our approach sets a new state-of-the-art on the large-scale YouTube-VOS 2018 dataset by achieving an overall score of 81.5, corresponding to a 2.6% relative improvement over the previous best result. The code and models are available at https://github.com/visionml/pytracking.
@inproceedings{diva2:1462283,
author = {Bhat, Goutam and Järemo-Lawin, Felix and Danelljan, Martin and Robinson, Andreas and Felsberg, Michael and Van Gool, Luc and Timofte, Radu},
title = {{Learning What to Learn for Video Object Segmentation}},
booktitle = {Computer Vision},
year = {2020},
series = {Lecture Notes in Computer Science},
volume = {12347},
pages = {777--794},
}
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time. The main difficulty is to effectively handle appearance changes and similar background objects, while maintaining accurate segmentation. Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting. More recent methods integrate generative target appearance models, but either achieve limited robustness or require large amounts of training data. We propose a novel VOS architecture consisting of two network components. The target appearance model consists of a light-weight module, which is learned during the inference stage using fast optimization techniques to predict a coarse but robust target segmentation. The segmentation model is exclusively trained offline, designed to process the coarse scores into high quality segmentation masks. Our method is fast, easily trainable and remains highly effective in cases of limited training data. We perform extensive experiments on the challenging YouTube-VOS and DAVIS datasets. Our network achieves favorable performance, while operating at higher frame-rates compared to state-of-the-art. Code and trained models are available at https://github.com/andr345/frtm-vos.
@inproceedings{diva2:1458627,
author = {Robinson, Andreas and Järemo-Lawin, Felix and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Learning Fast and Robust Target Models for Video Object Segmentation}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {7404--7413},
publisher = {IEEE},
}
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems, e.g., for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock towards their real-world deployment. Transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training that enhances robustness by modifying the target model's parameters lacks such generalizability. On the other hand, different input-processing based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against unseen adversarial attacks (e.g., by reducing the success rate of the translation-invariant ensemble attack from 82.6% to 31.9% in comparison to previous state-of-the-art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for the case of classification, segmentation and detection.
@inproceedings{diva2:1458576,
author = {Naseer, M. and Khan, S. and Hayat, M. and Khan, Fahad Shahbaz and Porikli, F.},
title = {{A Self-supervised Approach for Adversarial Robustness}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {259--268},
publisher = {IEEE},
}
Understanding interactions between humans and objects is one of the fundamental problems in visual classification and an essential step towards detailed scene understanding. Human-object interaction (HOI) detection strives to localize both the human and an object as well as to identify the complex interactions between them. Most existing HOI detection approaches are instance-centric, where interactions between all possible human-object pairs are predicted based on appearance features and coarse spatial information. We argue that appearance features alone are insufficient to capture complex human-object interactions. In this paper, we therefore propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the interaction. Paired with the densely predicted interaction vectors, the interactions are associated with human and object detections to obtain final predictions. To the best of our knowledge, we are the first to propose an approach where HOI detection is posed as a keypoint detection and grouping problem. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET. Our approach sets a new state-of-the-art on both datasets. Code is available at https://github.com/vaesl/IP-Net.
@inproceedings{diva2:1458572,
author = {Wang, T. and Yang, T. and Danelljan, M. and Khan, Fahad Shahbaz and Zhang, X. and Sun, J.},
title = {{Learning Human-Object Interaction Detection Using Interaction Points}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {4115--4124},
publisher = {IEEE},
}
One of the attractive characteristics of deep neural networks is their ability to transfer knowledge obtained in one domain to other related domains. As a result, high-quality networks can be trained in domains with relatively little training data. This property has been extensively studied for discriminative networks but has received significantly less attention for generative models. Given the often enormous effort required to train GANs, both computationally as well as in the dataset collection, the re-use of pretrained GANs is a desirable objective. We propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates the posterior finetuning and avoids pathologies of other methods such as mode collapse and lack of flexibility. We perform experiments on several complex datasets using various GAN architectures (BigGAN, Progressive GAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs. Our code is available at: https://github.com/yaxingwang/MineGAN.
@inproceedings{diva2:1458547,
author = {Wang, Y. and Gonzalez-Garcia, A. and Berga, D. and Herranz, L. and Khan, Fahad Shahbaz and Weijer, J. van de},
title = {{MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {9329--9338},
publisher = {IEEE},
}
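To make the mining idea above concrete, here is a minimal sketch (not the published MineGAN code): a small miner network remaps noise into the latent input of a frozen pretrained generator and is trained with an ordinary GAN loss against the few available target-domain images. The network sizes, the flattened image representation, and the joint optimizer step are simplifying assumptions.

```python
# Hedged sketch of MineGAN-style mining, not the published code: a miner MLP maps
# noise z to a new latent code for a frozen pretrained generator, so that sampling
# is steered towards the target domain. Sizes and losses are illustrative assumptions.
import torch
import torch.nn as nn

z_dim, img_dim = 128, 3 * 32 * 32

miner = nn.Sequential(nn.Linear(z_dim, z_dim), nn.ReLU(), nn.Linear(z_dim, z_dim))
pretrained_G = nn.Sequential(nn.Linear(z_dim, img_dim), nn.Tanh())   # stand-in generator
for p in pretrained_G.parameters():
    p.requires_grad_(False)                                          # generator stays frozen
D = nn.Linear(img_dim, 1)                                            # stand-in discriminator

opt = torch.optim.Adam(list(miner.parameters()) + list(D.parameters()), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

target_images = torch.randn(16, img_dim)         # the few target-domain images (flattened)
z = torch.randn(16, z_dim)
fake = pretrained_G(miner(z))                     # sample through the mined latent space

# One illustrative update; in practice discriminator and miner are trained alternately.
loss_d = bce(D(target_images), torch.ones(16, 1)) + bce(D(fake.detach()), torch.zeros(16, 1))
loss_g = bce(D(fake), torch.ones(16, 1))
opt.zero_grad(); (loss_d + loss_g).backward(); opt.step()
print(float(loss_d), float(loss_g))
```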
In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: https://github.com/yaxingwang/SEMIT.
@inproceedings{diva2:1458539,
author = {Wang, Y. and Khan, S. and Gonzalez-Garcia, A. and Weijer, J. van de and Khan, Fahad Shahbaz},
title = {{Semi-Supervised Learning for Few-Shot Image-to-Image Translation}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {4452--4461},
publisher = {IEEE},
}
Humans can continuously learn new knowledge as their experience grows. In contrast, previous learning in deep neural networks can quickly fade out when they are trained on a new task. In this paper, we hypothesize that this problem can be avoided by learning a set of generalized parameters that are neither specific to old nor new tasks. In this pursuit, we introduce a novel meta-learning approach that seeks to maintain an equilibrium between all the encountered tasks. This is ensured by a new meta-update rule which avoids catastrophic forgetting. In comparison to previous meta-learning techniques, our approach is task-agnostic. When presented with a continuum of data, our model automatically identifies the task and quickly adapts to it with just a single update. We perform extensive experiments on five datasets in a class-incremental setting, leading to significant improvements over state-of-the-art methods (e.g., a 21.3% boost on CIFAR100 with 10 incremental tasks). Specifically, on large-scale datasets that generally prove difficult cases for incremental learning, our approach delivers absolute gains as high as 19.1% and 7.4% on the ImageNet and MS-Celeb datasets, respectively.
@inproceedings{diva2:1458536,
author = {Rajasegaran, J. and Khan, S. and Hayat, M. and Khan, Fahad Shahbaz and Shah, M.},
title = {{iTAML: An Incremental Task-Agnostic Meta-learning Approach}},
booktitle = {2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
pages = {13585--13594},
publisher = {IEEE},
}
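As a rough illustration of a task-balancing meta-update, the sketch below adapts a model separately to each task seen so far and then moves the meta-parameters towards the average of the adapted weights. This is a Reptile-style stand-in used only for exposition; the actual iTAML meta-update rule differs in its details, and the model size, data, and learning rates are assumptions.

```python
# Hedged illustration of a task-balanced meta-update (Reptile-style stand-in, not
# the exact iTAML rule): adapt to each task, then move towards the average of the
# adapted parameters so no single task dominates. Sizes and rates are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model, meta_lr = nn.Linear(8, 4), 0.5

def adapt_to_task(base, task, inner_steps=3, lr=0.1):
    clone = copy.deepcopy(base)
    opt = torch.optim.SGD(clone.parameters(), lr=lr)
    x, y = task
    for _ in range(inner_steps):
        loss = F.cross_entropy(clone(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return clone

tasks = [(torch.randn(32, 8), torch.randint(0, 4, (32,))) for _ in range(3)]
adapted = [adapt_to_task(model, t) for t in tasks]

with torch.no_grad():
    for name, p in model.named_parameters():
        mean_p = torch.stack([dict(a.named_parameters())[name] for a in adapted]).mean(0)
        p += meta_lr * (mean_p - p)          # equilibrium between all encountered tasks
print(sum(p.numel() for p in model.parameters()))
```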
This paper presents a method for detecting independently moving objects (IMOs) from a monocular camera mounted on a moving car. We use an existing state-of-the-art monocular sparse visual odometry/SLAM framework, and specifically attack the notorious problem of identifying those IMOs which move parallel to the ego-car motion, that is, in an 'epipolar-conformant' way. IMO candidate patches are obtained from an existing CNN-based car instance detector. While crossing IMOs can be identified as such by epipolar consistency checks, IMOs that move parallel to the camera motion are much harder to detect, as their epipolar conformity allows them to be misinterpreted as static objects at a wrong distance. We employ a CNN to provide an appearance-based depth estimate, and the ambiguity problem can be solved through depth verification. The obtained motion labels (IMO/static) are then propagated over time using the combination of motion cues and appearance-based information of the IMO candidate patches. We evaluate the performance of our method on the KITTI dataset.
@inproceedings{diva2:1514674,
author = {Fanani, Nolang and Ochs, Matthias and Mester, Rudolf},
title = {{Detecting Parallel-Moving Objects in the Monocular Case Employing CNN Depth Maps}},
booktitle = {COMPUTER VISION - ECCV 2018 WORKSHOPS, PT III},
year = {2019},
series = {Lecture Notes in Computer Science},
pages = {281--297},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
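The depth-verification step lends itself to a small numeric illustration. The sketch below, with made-up numbers and an assumed threshold, compares the depth that static-world triangulation would assign to an epipolar-conformant patch with a CNN-based appearance depth estimate; a large disagreement flags the patch as independently moving.

```python
# Hedged numeric sketch of depth verification; values and threshold are assumptions.
import numpy as np

def triangulated_depth(disparity_px, baseline_m, focal_px):
    """Depth implied by two-view triangulation under the static-world assumption."""
    return baseline_m * focal_px / max(disparity_px, 1e-6)

def classify_patch(static_depth_m, cnn_depth_m, rel_threshold=0.3):
    """Flag the patch as IMO when the two depth estimates disagree strongly."""
    rel_error = abs(static_depth_m - cnn_depth_m) / cnn_depth_m
    return "IMO" if rel_error > rel_threshold else "static"

# A car driving parallel ahead of the ego-vehicle shows almost no residual motion,
# so triangulation places it far away, while the appearance-based depth says it is close.
z_static = triangulated_depth(disparity_px=2.0, baseline_m=1.2, focal_px=720.0)
z_cnn = 25.0
print(round(z_static, 1), classify_patch(z_static, z_cnn))
```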
The Visual Object Tracking challenge VOT2019 is the seventh annual tracker benchmarking activity organized by the VOT initiative. Results of 81 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis as well as the standard VOT methodology for long-term tracking analysis. The VOT2019 challenge was composed of five challenges focusing on different tracking domains: (i) the VOT-ST2019 challenge focused on short-term tracking in RGB, (ii) the VOT-RT2019 challenge focused on "real-time" short-term tracking in RGB, and (iii) VOT-LT2019 focused on long-term tracking, namely coping with target disappearance and reappearance. Two new challenges have been introduced: (iv) the VOT-RGBT2019 challenge focused on short-term tracking in RGB and thermal imagery and (v) the VOT-RGBD2019 challenge focused on long-term tracking in RGB and depth imagery. The VOT-ST2019, VOT-RT2019 and VOT-LT2019 datasets were refreshed while new datasets were introduced for VOT-RGBT2019 and VOT-RGBD2019. The VOT toolkit has been updated to support both standard short-term, long-term tracking and tracking with multi-channel imagery. Performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website.
@inproceedings{diva2:1466584,
author = {Kristanl, Matej and Matas, Jiri and Leonardis, Ales and Felsberg, Michael and Pflugfelder, Roman and Kamarainen, Joni-Kristian and Zajc, Luka Cehovin and Drbohlav, Ondrej and Lukezic, Alan and Berg, Amanda and Eldesokey, Abdelrahman and Kapyla, Jani and Fernandez, Gustavo and Gonzalez-Garcia, Abel and Memarrnoghadam, Alireza and Lu, Andong and He, Anfeng and Varfolomieiev, Anton and Chan, Antoni and Tripathi, Ardhendu Shekhar and Smeulders, Arnold and Pedasingu, Bala Suraj and Chen, Bao Xin and Zhang, Baopeng and Wu, Baoyuan and Li, Bi and He, Bin and Yan, Bin and Bai, Bing and Li, Bing and Li, Bo and Kim, Bycong Hak and Ma, Chao and Fang, Chen and Qian, Chen and Chen, Cheng and Li, Chenglong and Zhang, Chengquan and Tsai, Chi-Yi and Luo, Chong and Micheloni, Christian and Zhang, Chunhui and Tao, Dacheng and Gupta, Deepak and Song, Dejia and Wang, Dong and Gavves, Efstratios and Yi, Eunu and Khan, Fahad Shahbaz and Zhang, Fangyi and Wang, Fei and Zhao, Fei and De Ath, George and Bhat, Goutam and Chen, Guanqi and Wang, Guangting and Li, Guoxuan and Cevikalp, Hakan and Du, Hao and Zhao, Haojie and Saribas, Hasan and Jung, Ho Min and Bai, Hongliang and Yu, Hongyuan and Peng, Houwen and Lu, Huchuan and Li, Hui and Li, Jiakun and Li, Jianhu and Fu, Jianlong and Chen, Jie and Gao, Jie and Zhao, Jie and Tang, Jin and Li, Jing and Wu, Jingjing and Liu, Jingtuo and Wang, Jinqiao and Qi, Jingqing and Zhang, Jingyue and Tsotsos, John K. and Lee, John Hyuk and van de Weijer, Joost and Kittler, Josef and Lee, Jun Ha and Zhuang, Junfei and Zhang, Kangkai and wang, Kangkang and Dai, Kenan and Chen, Lei and Liu, Lei and Guo, Leida and Zhang, Li and Wang, Liang and Wang, Liangliang and Zhang, Lichao and Wang, Lijun and Zhou, Lijun and Zheng, Linyu and Rout, Litu and Van Gool, Luc and Bertinetto, Luca and Danelljan, Martin and Dunnhofer, Matteo and Ni, Meng and Kim, Min Young and Tang, Ming and Yang, Ming-Hsuan and Paluru, Naveen and Martine, Niki and Xu, Pengfei and Zhang, Pengfei and Zheng, Pengkun and Zhang, Pengyu and Torr, Philip H. S. and Wang, Qi Zhang Qiang and Gua, Qing and Timofte, Radu and Gorthi, Rama Krishna and Everson, Richard and Han, Ruize and Zhang, Ruohan and You, Shan and Zhao, Shao-Chuan and Zhao, Shengwei and Li, Shihu and Li, Shikun and Ge, Shiming and Bai, Shuai and Guan, Shuosen and Xing, Tengfei and Xu, Tianyang and Yang, Tianyu and Zhang, Ting and Vojir, Tomas and Feng, Wei and Hu, Weiming and Wang, Weizhao and Tang, Wenjie and Zeng, Wenjun and Liu, Wenyu and Chen, Xi and Qiu, Xi and Bai, Xiang and Wu, Xiao-Jun and Yang, Xiaoyun and Chen, Xier and Li, Xin and Sun, Xing and Chen, Xingyu and Tian, Xinmei and Tang, Xu and Zhu, Xue-Feng and Huang, Yan and Chen, Yanan and Lian, Yanchao and Gu, Yang and Liu, Yang and Chen, Yanjie and Zhang, Yi and Xu, Yinda and Wang, Yingming and Li, Yingping and Zhou, Yu and Dong, Yuan and Xu, Yufei and Zhang, Yunhua and Li, Yunkun and Luo, Zeyu Wang Zhao and Zhang, Zhaoliang and Feng, Zhen-Hua and He, Zhenyu and Song, Zhichao and Chen, Zhihao and Zhang, Zhipeng and Wu, Zhirong and Xiong, Zhiwei and Huang, Zhongjian and Teng, Zhu and Ni, Zihan},
title = {{The Seventh Visual Object Tracking VOT2019 Challenge Results}},
booktitle = {2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW)},
year = {2019},
series = {IEEE International Conference on Computer Vision Workshops},
pages = {2206--2241},
publisher = {IEEE COMPUTER SOC},
}
Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice employed by state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (ImageNet or Places). Here, we investigate CNN architectures by augmenting RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit the recent advancements in the field of depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network by achieving absolute gains of 9.3%, 4.7%, and 7.3% on the MIT-67, OCIS and SUN-397 datasets, respectively.
@inproceedings{diva2:1466223,
author = {Anwer, Rao Muhammad and Khan, Fahad and Laaksonen, Jorma and Zaki, Nazar},
title = {{Multi-stream Convolutional Networks for Indoor Scene Recognition}},
booktitle = {COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I},
year = {2019},
series = {Lecture Notes in Computer Science},
pages = {196--208},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
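A minimal sketch of the multi-stream idea is given below: three small stand-in backbones process the RGB image, the estimated depth map, and the LBP-coded texture image, and their features are fused before a shared scene classifier. Concatenation is only one of several possible fusion strategies, and all layer sizes here are assumptions.

```python
# Hedged sketch of multi-stream late fusion for indoor scene recognition.
# Backbones are tiny stand-ins; real models would use large pretrained CNNs.
import torch
import torch.nn as nn

def tiny_stream(in_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

class MultiStreamSceneNet(nn.Module):
    def __init__(self, num_classes=67):
        super().__init__()
        self.rgb, self.depth, self.texture = tiny_stream(3), tiny_stream(1), tiny_stream(1)
        self.classifier = nn.Linear(3 * 16, num_classes)

    def forward(self, rgb, depth, lbp):
        # Late fusion by concatenating the per-stream feature vectors.
        fused = torch.cat([self.rgb(rgb), self.depth(depth), self.texture(lbp)], dim=1)
        return self.classifier(fused)

net = MultiStreamSceneNet()
logits = net(torch.randn(2, 3, 224, 224), torch.randn(2, 1, 224, 224), torch.randn(2, 1, 224, 224))
print(logits.shape)
```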
Common object counting in a natural scene is a challenging problem in computer vision with numerous real-world applications. Existing image-level supervised common object counting approaches only predict the global object count and rely on additional instance-level supervision to also determine object locations. We propose an image-level supervised approach that provides both the global object count and the spatial distribution of object instances by constructing an object category density map. Motivated by psychological studies, we further reduce image-level supervision using only limited object count information (up to four). To the best of our knowledge, we are the first to propose image-level supervised density map estimation for common object counting and demonstrate its effectiveness in image-level supervised instance segmentation. Comprehensive experiments are performed on the PASCAL VOC and COCO datasets. Our approach outperforms existing methods, including those using instance-level supervision, on both datasets for common object counting. Moreover, our approach improves state-of-the-art image-level supervised instance segmentation [34] with a relative gain of 17.8% in terms of average best overlap, on the PASCAL VOC 2012 dataset.
@inproceedings{diva2:1458518,
author = {Cholakkal, Hisham and Sun, Guolei and Khan, Fahad Shahbaz and Shao, Ling},
title = {{Object Counting and Instance Segmentation with Image-level Supervision}},
booktitle = {2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), Long Beach, CA, JUN 16-20, 2019},
year = {2019},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {12389--12397},
publisher = {IEEE},
}
Single-stage object detectors have recently gained popularity due to their combined advantage of high detection accuracy and real-time speed. However, while promising results have been achieved by these detectors on standard-sized objects, their performance on small objects is far from satisfactory. To detect very small/large objects, the classical pyramid representation can be exploited, where an image pyramid is used to build a feature pyramid (featurized image pyramid), enabling detection across a range of scales. Existing single-stage detectors avoid such a featurized image pyramid representation due to its memory and time complexity. In this paper, we introduce a light-weight architecture to efficiently produce a featurized image pyramid in a single-stage detection framework. The resulting multi-scale features are then injected into the prediction layers of the detector using an attention module. The performance of our detector is validated on two benchmarks: PASCAL VOC and MS COCO. For a 300 x 300 input, our detector operates at 111 frames per second (FPS) on a Titan X GPU, providing state-of-the-art detection accuracy on the PASCAL VOC 2007 test set. On the MS COCO test set, our detector achieves state-of-the-art results, surpassing all existing single-stage methods in the case of single-scale inference.
@inproceedings{diva2:1458515,
author = {Pang, Yanwei and Wang, Tiancai and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shao, Ling},
title = {{Efficient Featurized Image Pyramid Network for Single Shot Detector}},
booktitle = {2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), Long Beach, CA, JUN 16-20, 2019},
year = {2019},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {7328--7336},
publisher = {IEEE},
}
Pedestrian detection relying on deep convolution neural networks has made significant progress. Though promising results have been achieved on standard pedestrians, the performance on heavily occluded pedestrians remains far from satisfactory. The main culprits are intra-class occlusions involving other pedestrians and inter-class occlusions caused by other objects, such as cars and bicycles. These result in a multitude of occlusion patterns. We propose an approach for occluded pedestrian detection with the following contributions. First, we introduce a novel mask-guided attention network that fits naturally into popular pedestrian detection pipelines. Our attention network emphasizes visible pedestrian regions while suppressing the occluded ones by modulating full body features. Second, we empirically demonstrate that coarse-level segmentation annotations provide a reasonable approximation to their dense pixel-wise counterparts. Experiments are performed on the CityPersons and Caltech datasets. Our approach sets a new state-of-the-art on both datasets. Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results [31] on the heavily occluded HO pedestrian set of the CityPersons test set. Further, on the HO pedestrian set of the Caltech dataset, our method achieves an absolute gain of 5.0% in log-average miss rate, compared to the best reported results [13]. Code and models are available at: https://github.com/Leotju/MGAN.
@inproceedings{diva2:1458513,
author = {Pang, Yanwei and Xie, Jin and Khan, Muhammad Haris and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shao, Ling},
title = {{Mask-Guided Attention Network for Occluded Pedestrian Detection}},
booktitle = {2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019)},
year = {2019},
series = {IEEE International Conference on Computer Vision},
pages = {4966--4974},
publisher = {IEEE COMPUTER SOC},
}
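The sketch below illustrates, under assumed channel counts and a much simplified branch design, how a mask-guided attention map can modulate full-body RoI features: a sigmoid attention map re-weights the features so that occluded regions are damped, while an auxiliary head is meant to be supervised with coarse visibility masks.

```python
# Hedged sketch of mask-guided attention modulation; not the published MGAN network.
# Channel counts, branch depth, and the auxiliary-mask supervision are assumptions.
import torch
import torch.nn as nn

class MaskGuidedAttention(nn.Module):
    def __init__(self, channels=512):
        super().__init__()
        self.att = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, 1), nn.Sigmoid(),
        )
        # Auxiliary head that would be trained against coarse visible-region masks.
        self.mask_head = nn.Conv2d(channels, 1, 1)

    def forward(self, features):
        attention = self.att(features)          # (N, 1, H, W) values in [0, 1]
        modulated = features * attention        # occluded regions are suppressed
        mask_logits = self.mask_head(features)  # coarse visibility prediction
        return modulated, mask_logits

feats = torch.randn(2, 512, 14, 7)              # RoI features of two pedestrian proposals
out, mask_logits = MaskGuidedAttention()(feats)
print(out.shape, mask_logits.shape)
```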
Siamese approaches address the visual tracking problem by extracting an appearance template from the current frame, which is used to localize the target in the next frame. In general, this template is linearly combined with the accumulated template from the previous frame, resulting in an exponential decay of information over time. While such an approach to updating has led to improved results, its simplicity limits the potential gain likely to be obtained by learning to update. Therefore, we propose to replace the handcrafted update function with a method which learns to update. We use a convolutional neural network, called UpdateNet, which, given the initial template, the accumulated template and the template of the current frame, aims to estimate the optimal template for the next frame. The UpdateNet is compact and can easily be integrated into existing Siamese trackers. We demonstrate the generality of the proposed approach by applying it to two Siamese trackers, SiamFC and DaSiamRPN. Extensive experiments on the VOT2016, VOT2018, LaSOT, and TrackingNet datasets demonstrate that our UpdateNet effectively predicts the new target template, outperforming the standard linear update. On the large-scale TrackingNet dataset, our UpdateNet improves the results of DaSiamRPN with an absolute gain of 3.9% in terms of success score. Code and models are available at https://github.com/zhanglichao/updatenet.
@inproceedings{diva2:1458510,
author = {Zhang, Lichao and Gonzalez-Garcia, Abel and van de Weijer, Joost and Danelljan, Martin and Khan, Fahad Shahbaz},
title = {{Learning the Model Update for Siamese Trackers}},
booktitle = {2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), Seoul, SOUTH KOREA, OCT 27-NOV 02, 2019},
year = {2019},
series = {IEEE International Conference on Computer Vision},
pages = {4009--4018},
publisher = {IEEE COMPUTER SOC},
}
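To contrast the standard linear template update with a learned one, here is a minimal sketch. The running-average update and the tiny three-input network are illustrative stand-ins; the actual UpdateNet architecture, template sizes, and blending weight differ.

```python
# Hedged sketch: linear template update vs. a small learned update network, in the
# spirit of UpdateNet. Shapes, layer sizes, and gamma are illustrative assumptions.
import torch
import torch.nn as nn

def linear_update(acc_template, cur_template, gamma=0.01):
    """Standard running-average update: old information decays exponentially."""
    return (1.0 - gamma) * acc_template + gamma * cur_template

class SimpleUpdateNet(nn.Module):
    """Predicts the next template from (initial, accumulated, current) templates."""
    def __init__(self, channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )

    def forward(self, init_t, acc_t, cur_t):
        x = torch.cat([init_t, acc_t, cur_t], dim=1)
        # Residual connection to the initial (ground-truth) template.
        return self.net(x) + init_t

t0 = torch.randn(1, 256, 6, 6)    # initial template from the first frame
acc = t0.clone()                  # accumulated template
cur = torch.randn(1, 256, 6, 6)   # template extracted from the current frame
print(linear_update(acc, cur).shape)
print(SimpleUpdateNet()(t0, acc, cur).shape)
```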
Adversarial examples reveal the blind spots of deep neural networks (DNNs) and represent a major concern for security-critical applications. The transferability of adversarial examples makes real-world attacks possible in black-box settings, where the attacker is forbidden to access the internal parameters of the model. The underlying assumption in most adversary generation methods, whether learning an instance-specific or an instance-agnostic perturbation, is the direct or indirect reliance on the original domain-specific data distribution. In this work, for the first time, we demonstrate the existence of domain-invariant adversaries, thereby showing a common adversarial space among different datasets and models. To this end, we propose a framework capable of launching highly transferable attacks that crafts adversarial patterns to mislead networks trained on entirely different domains. For instance, an adversarial function learned on Paintings, Cartoons or Medical images can successfully perturb ImageNet samples to fool the classifier, with success rates as high as ~99% (ℓ∞ ≤ 10). The core of our proposed adversarial function is a generative network that is trained using a relativistic supervisory signal that enables domain-invariant perturbations. Our approach sets a new state-of-the-art for fooling rates, both under the white-box and black-box scenarios. Furthermore, despite being an instance-agnostic perturbation function, our attack outperforms the conventionally much stronger instance-specific attack methods.
@inproceedings{diva2:1454554,
author = {Naseer, Muzammal and Khan, Salman and Khan, Muhammad Haris and Khan, Fahad and Porikli, Fatih},
title = {{Cross-Domain Transferability of Adversarial Perturbations}},
booktitle = {ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019)},
year = {2019},
series = {Advances in Neural Information Processing Systems},
publisher = {NEURAL INFORMATION PROCESSING SYSTEMS (NIPS)},
}
In this work, we address the problem of semi-supervised video object segmentation, where the task is to segment a target object in every image of the video sequence, given a ground truth only in the first frame. To be successful, it is crucial to robustly handle unpredictable target appearance changes and distracting objects in the background. In this work we obtain a robust and efficient representation of the target by integrating a fast and light-weight discriminative target model into a deep segmentation network. Trained during inference, the target model learns to discriminate between the local appearances of target and background image regions. Its predictions are enhanced to accurate segmentation masks in a subsequent refinement stage. To further improve the segmentation performance, we add a new module trained to generate global target attention vectors, given the input mask and image feature maps. The attention vectors add semantic information about the target from a previous frame to the refinement stage, complementing the predictions provided by the target appearance model. Our method is fast and requires no network fine-tuning. We achieve a combined J and F-score of 70.6 on the DAVIS 2019 test-challenge data.
@inproceedings{diva2:1390580,
author = {Robinson, Andreas and Järemo-Lawin, Felix and Danelljan, Martin and Felsberg, Michael},
title = {{Discriminative Learning and Target Attention for the 2019 DAVIS Challenge on Video Object Segmentation}},
booktitle = {CVPR 2019 workshops},
year = {2019},
}
While recent years have witnessed astonishing improvements in visual tracking robustness, the advancements in tracking accuracy have been limited. As the focus has been directed towards the development of powerful classifiers, the problem of accurate target state estimation has been largely overlooked. In fact, most trackers resort to a simple multi-scale search in order to estimate the target bounding box. We argue that this approach is fundamentally limited since target estimation is a complex task, requiring high-level knowledge about the object. We address this problem by proposing a novel tracking architecture, consisting of dedicated target estimation and classification components. High-level knowledge is incorporated into the target estimation through extensive offline learning. Our target estimation component is trained to predict the overlap between the target object and an estimated bounding box. By carefully integrating target-specific information, our approach achieves previously unseen bounding box accuracy. We further introduce a classification component that is trained online to guarantee high discriminative power in the presence of distractors. Our final tracking framework sets a new state-of-the-art on five challenging benchmarks. On the new large-scale TrackingNet dataset, our tracker ATOM achieves a relative gain of 15% over the previous best approach, while running at over 30 FPS. Code and models are available at https://github.com/visionml/pytracking.
@inproceedings{diva2:1387537,
author = {Danelljan, Martin and Bhat, Goutam and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{ATOM: Accurate tracking by overlap maximization}},
booktitle = {2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019)},
year = {2019},
series = {Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)},
pages = {4655--4664},
publisher = {IEEE},
}
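The overlap-maximization idea can be sketched as gradient ascent of a predicted IoU with respect to the box coordinates. The IoU predictor below is an untrained stand-in (the real one is conditioned on target and image features), and the step size and iteration count are assumptions.

```python
# Hedged sketch of overlap-maximization target estimation: refine a candidate box
# by gradient ascent on a (stand-in) network that predicts its IoU with the target.
import torch
import torch.nn as nn

iou_predictor = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))

def refine_box(box_xywh, steps=5, lr=0.01):
    box = box_xywh.clone().requires_grad_(True)
    for _ in range(steps):
        predicted_iou = iou_predictor(box).sum()
        grad, = torch.autograd.grad(predicted_iou, box)
        with torch.no_grad():
            box += lr * grad          # gradient ascent on the predicted overlap
    return box.detach()

initial_box = torch.tensor([120.0, 80.0, 64.0, 128.0])   # (x, y, w, h)
print(refine_box(initial_box))
```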
The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in the recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis and a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking subchallenge has been introduced to the set of standard VOT sub-challenges. The new subchallenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both standard short-term and the new long-term tracking subchallenges. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).
@inproceedings{diva2:1366619,
author = {Kristan, Matej and Leonardis, Ale\v{s} and Matas, Jirí and Felsberg, Michael and Pflugfelder, Roman and Zajc, Luka Cehovin and Vojírì, Tomá\v{s} and Bhat, Goutam and Lukezi\v{c}, Alan and Eldesokey, Abdelrahman and Fernández, Gustavo and García-Martín, Álvaro and Iglesias-Arias, Álvaro and Alatan, A. Aydin and González-García, Abel and Petrosino, Alfredo and Memarmoghadam, Alireza and Vedaldi, Andrea and Muhi\v{c}, Andrej and He, Anfeng and Smeulders, Arnold and Perera, Asanka G. and Li, Bo and Chen, Boyu and Kim, Changick and Xu, Changsheng and Xiong, Changzhen and Tian, Cheng and Luo, Chong and Sun, Chong and Hao, Cong and Kim, Daijin and Mishra, Deepak and Chen, Deming and Wang, Dong and Wee, Dongyoon and Gavves, Efstratios and Gundogdu, Erhan and Velasco-Salido, Erik and Khan, Fahad Shahbaz and Yang, Fan and Zhao, Fei and Li, Feng and Battistone, Francesco and De Ath, George and Subrahmanyam, Gorthi R. K. S. and Bastos, Guilherme and Ling, Haibin and Galoogahi, Hamed Kiani and Lee, Hankyeol and Li, Haojie and Zhao, Haojie and Fan, Heng and Zhang, Honggang and Possegger, Horst and Li, Houqiang and Lu, Huchuan and Zhi, Hui and Li, Huiyun and Lee, Hyemin and Chang, Hyung Jin and Drummond, Isabela and Valmadre, Jack and Martin, Jaime Spencer and Chahl, Javaan and Choi, Jin Young and Li, Jing and Wang, Jinqiao and Qi, Jinqing and Sung, Jinyoung and Johnander, Joakim and Henriques, Joao and Choi, Jongwon and van de Weijer, Joost and Herranz, Jorge Rodríguez and Martínez, Jos\'{e} M. and Kittler, Josef and Zhuang, Junfei and Gao, Junyu and Grm, Klemen and Zhang, Lichao and Wang, Lijun and Yang, Lingxiao and Rout, Litu and Si, Liu and Bertinetto, Luca and Chu, Lutao and Che, Manqiang and Maresca, Mario Edoardo and Danelljan, Martin and Yang, Ming-Hsuan and Abdelpakey, Mohamed and Shehata, Mohamed and Kang, Myunggu and Lee, Namhoon and Wang, Ning and Miksik, Ondrej and Moallem, P. and Vicente-Moñivar, Pablo and Senna, Pedro and Li, Peixia and Torr, Philip and Raju, Priya Mariam and Ruihe, Qian and Wang, Qiang and Zhou, Qin and Guo, Qing and Martín-Nieto, Rafael and Gorthi, Rama Krishna and Tao, Ran and Bowden, Richard and Everson, Richard and Wang, Runling and Yun, Sangdoo and Choi, Seokeon and Vivas, Sergio and Bai, Shuai and Huang, Shuangping and Wu, Sihang and Hadfield, Simon and Wang, Siwen and Golodetz, Stuart and Ming, Tang and Xu, Tianyang and Zhang, Tianzhu and Fischer, Tobias and Santopietro, Vincenzo and Štruc, Vitomir and Wei, Wang and Zuo, Wangmeng and Feng, Wei and Wu, Wei and Zou, Wei and Hu, Weiming and Zhou, Wengang and Zeng, Wenjun and Zhang, Xiaofan and Wu, Xiaohe and Wu, Xiao-Jun and Tian, Xinmei and Li, Yan and Lu, Yan and Law, Yee Wei and Wu, Yi and Demiris, Yiannis and Yang, Yicai and Jiao, Yifan and Li, Yuhong and Zhang, Yunhua and Sun, Yuxuan and Zhang, Zheng and Zhu, Zheng and Feng, Zhen-Hua and Wang, Zhihui and He, Zhiqun},
title = {{The Sixth Visual Object Tracking VOT2018 Challenge Results}},
booktitle = {Computer Vision -- ECCV 2018 Workshops},
year = {2019},
series = {Lecture Notes in Computer Science},
volume = {11129},
pages = {3--53},
publisher = {Springer Publishing Company},
address = {Cham},
}
Thermal Infrared (TIR) cameras are gaining popularity in many computer vision applications due to their ability to operate under low-light conditions. Images produced by TIR cameras are usually difficult for humans to perceive visually, which limits their usability. Several methods in the literature were proposed to address this problem by transforming TIR images into realistic visible spectrum (VIS) images. However, existing TIR-VIS datasets suffer from imperfect alignment between TIR-VIS image pairs, which degrades the performance of supervised methods. We tackle this problem by learning this transformation using an unsupervised Generative Adversarial Network (GAN) which trains on unpaired TIR and VIS images. When trained and evaluated on the KAIST-MS dataset, our proposed method was shown to produce significantly more realistic and sharp VIS images than the existing state-of-the-art supervised methods. In addition, our proposed method was shown to generalize very well when evaluated on a new dataset of new environments.
@inproceedings{diva2:1365425,
author = {Nyberg, Adam and Eldesokey, Abdelrahman and Bergström, David and Gustafsson, David},
title = {{Unpaired Thermal to Visible Spectrum Transfer using Adversarial Training}},
booktitle = {Computer Vision - Eccv 2018 Workshops, Pt VI},
year = {2019},
series = {Lecture Notes in Computer Science},
volume = {11134},
pages = {657--669},
publisher = {Springer},
}
Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities, human annotation is hard to justify. In such cases, semi-automatic annotation provides an acceptable option.
In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizes a state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.
The method is evaluated on a subset of the RGBT-234 visual-thermal dataset, reducing the workload for a human annotator by approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.
@inproceedings{diva2:1362582,
author = {Berg, Amanda and Johnander, Joakim and Durand de Gevigney, Flavie and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Semi-automatic Annotation of Objects in Visual-Thermal Video}},
booktitle = {2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)},
year = {2019},
series = {IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
One of the fundamental challenges in video object segmentation is to find an effective representation of the target and background appearance. The best performing approaches resort to extensive fine-tuning of a convolutional neural network for this purpose. Besides being prohibitively expensive, this strategy cannot be truly trained end-to-end since the online fine-tuning procedure is not integrated into the offline training of the network. To address these issues, we propose a network architecture that learns a powerful representation of the target and background appearance in a single forward pass. The introduced appearance module learns a probabilistic generative model of target and background feature distributions. Given a new image, it predicts the posterior class probabilities, providing a highly discriminative cue, which is processed in later network modules. Both the learning and prediction stages of our appearance module are fully differentiable, enabling true end-to-end training of the entire segmentation pipeline. Comprehensive experiments demonstrate the effectiveness of the proposed approach on three video object segmentation benchmarks. We close the gap to approaches based on online fine-tuning on DAVIS17, while operating at 15 FPS on a single GPU. Furthermore, our method outperforms all published approaches on the large-scale YouTube-VOS dataset.
@inproceedings{diva2:1361997,
author = {Johnander, Joakim and Danelljan, Martin and Brissman, Emil and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{A generative appearance model for end-to-end video object segmentation}},
booktitle = {2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2019},
series = {Proceedings - IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, IEEE Conference on Computer Vision and Pattern Recognition},
pages = {8945--8954},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
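As a toy illustration of a generative appearance module, the sketch below fits one Gaussian each to target and background features from the first frame and converts new-frame features into posterior target probabilities with Bayes' rule. The paper's module uses a richer, learnable model inside an end-to-end network, so this is only a conceptual stand-in with assumed feature dimensions and priors.

```python
# Hedged sketch: class-conditional Gaussians for target/background features and
# per-pixel posterior target probabilities. Dimensions and priors are assumptions.
import numpy as np

def fit_gaussian(features):
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6
    return mean, var

def log_likelihood(features, mean, var):
    return -0.5 * (((features - mean) ** 2) / var + np.log(2 * np.pi * var)).sum(axis=1)

rng = np.random.default_rng(0)
fg = rng.normal(1.0, 0.5, size=(200, 16))       # first-frame target pixels (toy features)
bg = rng.normal(-1.0, 0.5, size=(800, 16))      # first-frame background pixels
fg_model, bg_model = fit_gaussian(fg), fit_gaussian(bg)

# Posterior P(target | feature) for new-frame features, used as a discriminative cue.
new = rng.normal(0.8, 0.5, size=(5, 16))
log_fg = log_likelihood(new, *fg_model) + np.log(0.2)   # prior from mask area
log_bg = log_likelihood(new, *bg_model) + np.log(0.8)
posterior = 1.0 / (1.0 + np.exp(log_bg - log_fg))
print(posterior)
```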
Trackers based on discriminative correlation filters (DCF) have recently seen widespread success and in this work we dive into their numerical core. DCF-based trackers interleave learning of the target detector and target state inference based on this detector. Whereas the original formulation includes a closed-form solution for the filter learning, recently introduced improvements to the framework no longer have known closed-form solutions. Instead a large-scale linear least squares problem must be solved each time the detector is updated. We analyze the procedure used to optimize the detector and let the popular scheme introduced with ECO serve as a baseline. The ECO implementation is revisited in detail and several mechanisms are provided with alternatives. With comprehensive experiments we show which configurations are superior in terms of tracking capabilities and optimization performance.
@inproceedings{diva2:1361993,
author = {Johnander, Joakim and Bhat, Goutam and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{On the Optimization of Advanced DCF-Trackers}},
booktitle = {Computer Vision -- ECCV 2018 Workshops},
year = {2019},
series = {Lecture Notes in Computer Science},
volume = {11129},
pages = {54--69},
publisher = {Springer Publishing Company},
address = {Cham},
}
We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.
@inproceedings{diva2:1331282,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Visual Spectrum Image Generation from Thermal Infrared}},
booktitle = {Swedish Symposium on Image Analysis},
year = {2019},
}
Availability of large training datasets was essential for the recent advancement and success of deep learning methods. Due to the difficulties related to biometric data collection, datasets with age and gender annotations are scarce and usually limited in terms of size and sample diversity. Web-scraping approaches for automatic data collection can produce large amounts of weakly labeled, noisy data. The unsupervised facial biometric data filtering method presented in this paper greatly reduces label noise levels in web-scraped facial biometric data. Experiments on two large state-of-the-art web-scraped facial datasets demonstrate the effectiveness of the proposed method, with respect to training and validation scores, training convergence, and generalization capabilities of trained age and gender estimators.
@inproceedings{diva2:1292960,
author = {Be\v{s}eni\'{c}, Kre\v{s}imir and Ahlberg, Jörgen and Pandži\'{c}, Igor},
title = {{Unsupervised Facial Biometric Data Filtering for Age and Gender Estimation}},
booktitle = {Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2019)},
year = {2019},
pages = {209--217},
publisher = {SciTePress},
}
In most computer vision applications, convolutional neural networks (CNNs) operate on dense image data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open problem with numerous applications in autonomous driving, robotics, and surveillance. To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. Furthermore, we propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. Comprehensive experiments are performed on the KITTI depth benchmark and the results clearly demonstrate that the proposed approach achieves superior performance while requiring three times fewer parameters than the state-of-the-art methods. Moreover, our approach produces a continuous pixel-wise confidence map enabling information fusion, state inference, and decision support.
@inproceedings{diva2:1233027,
author = {Eldesokey, Abdelrahman and Felsberg, Michael and Khan, Fahad Shahbaz},
title = {{Propagating Confidences through CNNs for Sparse Data Regression}},
booktitle = {British Machine Vision Conference 2018, BMVC 2018},
year = {2019},
publisher = {BMVA Press},
}
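A minimal sketch of a confidence-propagating (normalized) convolution is given below. The softplus non-negativity trick, the kernel size, and the particular output-confidence definition are simplifying assumptions rather than the exact published layer.

```python
# Hedged sketch of confidence-aware (normalized) convolution for sparse depth:
# convolve data jointly with its confidence map, renormalize, and pass an output
# confidence to the next layer. Details are assumptions, not the published design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedConv2d(nn.Module):
    def __init__(self, kernel_size=3):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(1, 1, kernel_size, kernel_size))
        self.pad = kernel_size // 2

    def forward(self, x, conf):
        w = F.softplus(self.weight)                     # keep filter weights non-negative
        num = F.conv2d(x * conf, w, padding=self.pad)   # confidence-weighted data term
        den = F.conv2d(conf, w, padding=self.pad)       # accumulated confidence
        out = num / (den + 1e-8)
        out_conf = den / w.sum()                        # propagate a normalized confidence
        return out, out_conf

sparse_depth = torch.zeros(1, 1, 8, 8)
conf = torch.zeros(1, 1, 8, 8)
sparse_depth[0, 0, 2, 3], conf[0, 0, 2, 3] = 5.0, 1.0   # a single valid measurement
dense, new_conf = NormalizedConv2d()(sparse_depth, conf)
print(dense[0, 0, 2, 3].item(), new_conf.max().item())
```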
We present Lambda Twist; a novel P3P solver which is accurate, fast and robust. Current state-of-the-art P3P solvers find all roots to a quartic and discard geometrically invalid and duplicate solutions in a post-processing step. Instead of solving a quartic, the proposed P3P solver exploits the underlying elliptic equations which can be solved by a fast and numerically accurate diagonalization. This diagonalization requires a single real root of a cubic which is then used to find the, up to four, P3P solutions. Unlike the direct quartic solvers our method never computes geometrically invalid or duplicate solutions.
Extensive evaluation on synthetic data shows that the new solver has better numerical accuracy and is faster compared to the state-of-the-art P3P implementations. Implementation and benchmark are available on GitHub.
@inproceedings{diva2:1365550,
author = {Persson, Mikael and Nordberg, Klas},
title = {{Lambda Twist: An Accurate Fast Robust Perspective Three Point (P3P) Solver}},
booktitle = {Computer Vision -- ECCV 2018},
year = {2018},
series = {Lecture Notes in Computer Science},
volume = {11208},
pages = {334--349},
publisher = {Springer},
address = {Cham},
}
In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of >17% in EAO.
@inproceedings{diva2:1361991,
author = {Bhat, Goutam and Johnander, Joakim and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Unveiling the power of deep tracking}},
booktitle = {Computer Vision -- ECCV 2018},
year = {2018},
series = {Lecture Notes in Computer Science},
volume = {11206},
pages = {493--509},
publisher = {Springer Publishing Company},
address = {Cham},
}
This paper investigates the problem of position estimation of unmanned surface vessels (USVs) operating in coastal areas or in the archipelago. We propose a position estimation method where the horizon line is extracted in a 360 degree panoramic image around the USV. We design a CNN architecture to determine an approximate horizon line in the image and implicitly determine the camera orientation (the pitch and roll angles). The panoramic image is warped to compensate for the camera orientation and to generate an image from an approximately level camera. A second CNN architecture is designed to extract the pixelwise horizon line in the warped image. The extracted horizon line is correlated with digital elevation model (DEM) data in the Fourier domain using a MOSSE correlation filter. Finally, we determine the location of the maximum correlation score over the search area to estimate the position of the USV. Comprehensive experiments are performed in a field trial in the archipelago. Our approach provides promising results by achieving position estimates with GPS-level accuracy.
@inproceedings{diva2:1361978,
author = {Grelsson, Bertil and Robinson, Andreas and Felsberg, Michael and Khan, Fahad Shahbaz},
title = {{HorizonNet for visual terrain navigation}},
booktitle = {Proceedings of 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS)},
year = {2018},
pages = {149--155},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
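The final localization step of the approach above can be illustrated with synthetic data: the extracted 360-degree horizon profile is correlated in the Fourier domain with DEM-predicted profiles at candidate positions, and the position with the strongest correlation peak wins. The data, the candidate grid, and the use of plain circular cross-correlation in place of a trained MOSSE filter are assumptions made only for this sketch.

```python
# Hedged sketch of horizon-to-DEM matching via Fourier-domain correlation.
import numpy as np

def circular_correlation(a, b):
    """Correlate two 360-sample horizon profiles over all cyclic shifts via FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))))

rng = np.random.default_rng(1)
observed = np.sin(np.linspace(0, 4 * np.pi, 360)) + 0.1 * rng.normal(size=360)

# Candidate positions in the search area, each with a DEM-predicted horizon profile.
candidates = {pos: rng.normal(size=360) for pos in [(0, 0), (0, 1), (1, 0)]}
candidates[(2, 3)] = np.sin(np.linspace(0, 4 * np.pi, 360))   # the correct location

scores = {pos: circular_correlation(observed, prof).max()
          for pos, prof in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 1))
```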
Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual tracking. However, these methods still struggle in occlusion and out-of-view scenarios due to the absence of a re-detection component. While such a component requires global knowledge of the scene to ensure robust re-detection of the target, the standard DCF is only trained on the local target neighborhood. In this paper, we augment the state-of-the-art DCF tracking framework with a re-detection component based on a global appearance model. First, we introduce a tracking confidence measure to detect target loss. Next, we propose a hard negative mining strategy to extract background distractor samples, used for training the global model. Finally, we propose a robust re-detection strategy that combines the global and local appearance model predictions. We perform comprehensive experiments on the challenging UAV123 and LTB35 datasets. Our approach shows consistent improvements over the baseline tracker, setting a new state-of-the-art on both datasets.
@inproceedings{diva2:1332807,
author = {Bhat, Goutam and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Combining Local and Global Models for Robust Re-detection}},
booktitle = {Proceedings of AVSS 2018. 2018 IEEE International Conference on Advanced Video and Signal-based Surveillance, Auckland, New Zealand, 27-30 November 2018},
year = {2018},
pages = {25--30},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Recognizing human attributes in unconstrained environments is a challenging computer vision problem. State-of-the-art approaches to human attribute recognition are based on convolutional neural networks (CNNs). The de facto practice when training these CNNs on a large labeled image dataset is to take RGB pixel values of an image as input to the network. In this work, we propose a two-stream part-based deep representation for human attribute classification. Besides the standard RGB stream, we train a deep network by using mapped coded images with explicit texture information, that complements the standard RGB deep model. To integrate human body parts knowledge, we employ the deformable part-based models together with our two-stream deep model. Experiments are performed on the challenging Human Attributes (HAT-27) Dataset consisting of 27 different human attributes. Our results clearly show that (a) the two-stream deep network provides consistent gain in performance over the standard RGB model and (b) that the attribute classification results are further improved with our two-stream part-based deep representations, leading to state-of-the-art results.
@inproceedings{diva2:1265191,
author = {Anwer, Rao Muhammad and Khan, Fahad and Laaksonen, Jorma},
title = {{Two-Stream Part-based Deep Representation for Human Attribute Recognition}},
booktitle = {2018 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB)},
year = {2018},
series = {International Conference on Biometrics},
pages = {90--97},
publisher = {IEEE},
}
Mobile robots have been used for various purposes with different functionalities which require them to freely move in environments containing both static and dynamic obstacles to accomplish given tasks. One of the most relevant capabilities in terms of navigating a mobile robot in such an environment is to find a safe path to a goal position. This paper shows that there exists an accurate solution to the Laplace equation which allows finding a collision-free path and that it can be efficiently calculated for a rectangular bounded domain such as a map which is represented as an image. This is accomplished by the use of the monogenic scale space resulting in a vector field which describes the attracting and repelling forces from the obstacles and the goal. The method is shown to work in reasonably convex domains and by the use of tessellation of the environment map for non-convex environments.
@inproceedings{diva2:1263728,
author = {Holmquist, Karl and Senel, Deniz and Felsberg, Michael},
title = {{Computing a Collision-Free Path using the monogenic scale space}},
booktitle = {2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2018},
series = {International Conference on Intelligent Robots and Systems (IROS)},
pages = {8097--8102},
publisher = {IEEE},
}
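For intuition, the sketch below solves the Laplace equation on a small occupancy grid by plain Jacobi relaxation (instead of the monogenic scale-space solution used in the paper) and extracts a collision-free path by steepest descent on the resulting harmonic potential. The grid size, obstacle layout, and iteration counts are assumptions.

```python
# Hedged sketch of harmonic potential-field path planning on a toy grid map.
import numpy as np

grid_size, goal = 20, (18, 18)
occupied = np.zeros((grid_size, grid_size), dtype=bool)
occupied[5:15, 10] = True                     # a wall-like obstacle

potential = np.ones((grid_size, grid_size))   # borders and obstacles repel (value 1)
for _ in range(5000):                         # Jacobi relaxation of the Laplace equation
    new = potential.copy()
    new[1:-1, 1:-1] = 0.25 * (potential[:-2, 1:-1] + potential[2:, 1:-1] +
                              potential[1:-1, :-2] + potential[1:-1, 2:])
    new[occupied] = 1.0
    new[goal] = 0.0                           # the goal attracts (lowest potential)
    potential = new

pos, path = (1, 1), [(1, 1)]
for _ in range(200):                          # steepest-descent path extraction
    r, c = pos
    neighbors = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0) and 0 < r + dr < grid_size - 1
                 and 0 < c + dc < grid_size - 1]
    pos = min(neighbors, key=lambda p: potential[p])
    path.append(pos)
    if pos == goal:
        break
print(path[-1] == goal, len(path))
```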
Selective laser melting (SLM) enables production of highly intricate components. From this point of view, the capabilities of this technology are known to the industry and have been demonstrated in numerous applications. Nonetheless, for serial production purposes the manufacturing industry has so far been reluctant in substituting its conventional methods with SLM. One underlying reason is the lack of simple and reliable process monitoring methods. This study examines the feasibility of using thermography for process monitoring. To this end, an infra-red (IR) camera was mounted off-axis to monitor and record the temperature of every layer. The recorded temperature curves are analysed and interpreted with respect to different stages of the process. Furthermore, the possibility of detecting variations in laser settings by means of thermography is demonstrated. The results show that once thermal patterns are identified, this data can be utilized for in-process and post-process monitoring of SLM production.
@inproceedings{diva2:1261349,
author = {Hatami, Sepehr and Dahl-Jendelin, Anton and Ahlberg, Jörgen and Nelsson, Claes},
title = {{Selective Laser Melting Process Monitoring by Means of Thermography}},
booktitle = {Proceedings of Euro Powder Metallurgy Congress (Euro PM)},
year = {2018},
publisher = {European Powder Metallurgy Association (EPMA)},
}
Ren et al. [17] recently introduced a method for aggregating multiple decision trees into a strong predictor by interpreting a path taken by a sample down each tree as a binary vector and performing linear regression on top of these vectors stacked together. They provided experimental evidence that the method offers advantages over the usual approaches for combining decision trees (random forests and boosting). The method truly shines when the regression target is a large vector with correlated dimensions, such as a 2D face shape represented with the positions of several facial landmarks. However, we argue that their basic method is not applicable in many practical scenarios due to large memory requirements. This paper shows how this issue can be solved through the use of quantization and architectural changes of the predictor that maps decision tree-derived encodings to the desired output.
@inproceedings{diva2:1261236,
author = {Markus, Nenad and Gogic, Ivan and Pandžic, Igor and Ahlberg, Jörgen},
title = {{Memory-efficient Global Refinement of Decision-Tree Ensembles and its Application to Face Alignment}},
booktitle = {Proceedings of BMVC 2018 and Workshops},
year = {2018},
pages = {1--11},
publisher = {The British Machine Vision Association and Society for Pattern Recognition},
address = {Newcastle upon Tyne, UK},
}
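The core aggregation idea above can be sketched with scikit-learn: each sample is encoded by which leaf it reaches in every tree (carrying the same information as a binary path vector), the per-tree encodings are stacked into one long sparse binary vector, and a ridge regressor is fit on top. The toy data and the omission of the paper's quantization and memory-reduction machinery are assumptions.

```python
# Hedged sketch of global linear refinement on top of tree-ensemble encodings.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = X[:, 0] * 2.0 + np.sin(X[:, 1])                 # toy regression target

forest = RandomForestRegressor(n_estimators=20, max_depth=5, random_state=0).fit(X, y)
leaf_ids = forest.apply(X)                          # (n_samples, n_trees) leaf indices
encoder = OneHotEncoder(handle_unknown="ignore").fit(leaf_ids)
binary_paths = encoder.transform(leaf_ids)          # stacked sparse binary encoding

refiner = Ridge(alpha=1.0).fit(binary_paths, y)     # global linear refinement
print(refiner.score(encoder.transform(forest.apply(X)), y))
```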
We analyze the depth reconstruction precision and sensitivity of two-frame triangulation for the case of general motion, and focus on the case of monocular visual odometry, that is: a single camera looking mostly in the direction of motion. The results confirm intuitive assumptions about the limited triangulation precision close to the focus of expansion.
@inproceedings{diva2:1259596,
author = {Fanani, Nolang and Mester, Rudolf},
title = {{The precision of triangulation in monocular visual odometry}},
booktitle = {2018 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI)},
year = {2018},
series = {IEEE Southwest Symposium on Image Analysis and Interpretation},
pages = {73--76},
publisher = {IEEE},
}
The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs and it is continuously differentiable. However, it remains open whether ELU has the optimal shape and we address the quest for a superior activation function. We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting learned activation function turns out to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function. Experiments on CIFAR-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.
@inproceedings{diva2:1251561,
author = {Grelsson, Bertil and Felsberg, Michael},
title = {{Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs)}},
booktitle = {2018 24th International Conference on Pattern Recognition (ICPR)},
year = {2018},
series = {International Conference on Pattern Recognition},
pages = {517--522},
publisher = {IEEE},
}
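A minimal sketch of a shifted ELU is given below: a standard ELU preceded by a learned input shift, so the activation no longer has to pass through the origin. Implementing the shift per channel rather than per neuron, and the fixed alpha, are simplifications relative to what the paper describes.

```python
# Hedged sketch of a Shifted ELU: ELU applied to a learned, shifted input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShELU(nn.Module):
    def __init__(self, num_channels, alpha=1.0):
        super().__init__()
        # Learned bias shift; per-channel here, per-neuron in the paper.
        self.shift = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.alpha = alpha

    def forward(self, x):
        return F.elu(x + self.shift, alpha=self.alpha)

act = ShELU(num_channels=16)
x = torch.randn(4, 16, 8, 8)
print(act(x).shape)
```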
Recent years have witnessed a significant leap in visual object tracking performance mainly due to powerful features, sophisticated learning methods and the introduction of benchmark datasets. Despite this significant improvement, the evaluation of state-of-the-art object trackers still relies on the classical intersection over union (IoU) score. In this work, we argue that the object tracking evaluations based on classical IoU score are sub-optimal. As our first contribution, we theoretically prove that the IoU score is biased in the case of large target objects and favors over-estimated target prediction sizes. As our second contribution, we propose a new score that is unbiased with respect to target prediction size. We systematically evaluate our proposed approach on benchmark tracking data with variations in relative target size. Our empirical results clearly suggest that the proposed score is unbiased in general.
@inproceedings{diva2:1248643,
author = {Häger, Gustav and Felsberg, Michael and Khan, Fahad Shahbaz},
title = {{Countering bias in tracking evaluations}},
booktitle = {Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications},
year = {2018},
pages = {581--587},
publisher = {Science and Technology Publications, Lda},
}
Probabilistic methods for point set registration have demonstrated competitive results in recent years. These techniques estimate a probability distribution model of the point clouds. While such a representation has shown promise, it is highly sensitive to variations in the density of 3D points. This fundamental problem is primarily caused by changes in the sensor location across point sets. We revisit the foundations of the probabilistic registration paradigm. Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes. Both the probabilistic model of the scene and the registration parameters are inferred by minimizing the Kullback-Leibler divergence in an Expectation Maximization based framework. Our density-adaptive registration successfully handles severe density variations commonly encountered in terrestrial Lidar applications. We perform extensive experiments on several challenging real-world Lidar datasets. The results demonstrate that our approach outperforms state-of-the-art probabilistic methods for multi-view registration, without the need of re-sampling.
@inproceedings{diva2:1233671,
author = {Järemo Lawin, Felix and Danelljan, Martin and Khan, Fahad Shahbaz and Forss\'{e}n, Per-Erik and Felsberg, Michael},
title = {{Density Adaptive Point Set Registration}},
booktitle = {2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2018},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
pages = {3829--3837},
publisher = {IEEE},
}
In this paper we derive and test a probability-based weighting that can balance residuals of different types in spline fitting. In contrast to previous formulations, the proposed spline error weighting scheme also incorporates a prediction of the approximation error of the spline fit. We demonstrate the effectiveness of the prediction in a synthetic experiment, and apply it to visual-inertial fusion on rolling shutter cameras. This results in a method that can estimate 3D structure with metric scale on generic first-person videos. We also propose a quality measure for spline fitting, that can be used to automatically select the knot spacing. Experiments verify that the obtained trajectory quality corresponds well with the requested quality. Finally, by linearly scaling the weights, we show that the proposed spline error weighting minimizes the estimation errors on real sequences, in terms of scale and end-point errors.
@inproceedings{diva2:1230190,
author = {Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik},
title = {{Spline Error Weighting for Robust Visual-Inertial Fusion}},
booktitle = {2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2018},
series = {Computer Vision and Pattern Recognition},
pages = {321--329},
}
Transformation of thermal infrared (TIR) images into visual, i.e. perceptually realistic color (RGB) images, is a challenging problem. TIR cameras have the ability to see in scenarios where vision is severely impaired, for example in total darkness or fog, and they are commonly used, e.g., for surveillance and automotive applications. However, interpretation of TIR images is difficult, especially for untrained operators. Enhancing the TIR image display by transforming it into a plausible, visual, perceptually realistic RGB image presumably facilitates interpretation. Existing grayscale-to-RGB, so-called colorization, methods cannot be applied to TIR images directly, since those methods only estimate the chrominance and not the luminance. In the absence of applicable colorization methods, we propose two fully automatic TIR to visual color image transformation methods, a two-step and an integrated approach, based on Convolutional Neural Networks. The methods require neither pre- nor postprocessing, do not require any user input, and are robust to image pair misalignments. We show that the methods do indeed produce perceptually realistic results on publicly available data, which is assessed both qualitatively and quantitatively.
@inproceedings{diva2:1229000,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Generating Visible Spectrum Images from Thermal Infrared}},
booktitle = {Proceedings 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops CVPRW 2018},
year = {2018},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition workshops},
pages = {1224--1233},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
More and more devices have depth sensors, making RGB+D video (colour+depth video) increasingly common. RGB+D video allows the use of depth image based rendering (DIBR) to render a given scene from different viewpoints, thus making it a useful asset in view prediction for 3D and free-viewpoint video coding. In this paper we evaluate a multitude of algorithms for scattered data interpolation, in order to optimize the performance of DIBR for video coding. This also includes novel contributions like a Kriging refinement step, an edge suppression step to suppress artifacts, and a scale-adaptive kernel. Our evaluation uses the depth extension of the Sintel datasets. Using ground-truth sequences is crucial for such an optimization, as it ensures that all errors and artifacts are caused by the prediction itself rather than noisy or erroneous data. We also present a comparison with the commonly used mesh-based projection.
@inproceedings{diva2:1253223,
author = {Ogniewski, Jens and Forss\'{e}n, Per-Erik},
title = {{Pushing the Limits for View Prediction in Video Coding}},
booktitle = {PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2017), VOL 4},
year = {2017},
pages = {68--76},
publisher = {SCITEPRESS},
}
Research in optical flow estimation has to a large extent focused on achieving the best possible quality with no regard to running time. Nevertheless, in a number of important applications the speed is crucial. To address this problem we present BriefMatch, a real-time optical flow method that is suitable for live applications. The method combines binary features with the search strategy from PatchMatch in order to efficiently find a dense correspondence field between images. We show that the BRIEF descriptor provides better candidates (less outlier-prone) in shorter time, when compared to direct pixel comparisons and the Census transform. This allows us to achieve high quality results from a simple filtering of the initially matched candidates. Currently, BriefMatch has the fastest running time on the Middlebury benchmark, while ranking highest among all methods that run in less than 0.5 seconds.
@inproceedings{diva2:1228880,
author = {Eilertsen, Gabriel and Forss\'{e}n, Per-Erik and Unger, Jonas},
title = {{BriefMatch: Dense binary feature matching for real-time optical flow estimation}},
booktitle = {Proceedings of the Scandinavian Conference on Image Analysis (SCIA17)},
year = {2017},
series = {Lecture Notes in Computer Science},
pages = {221--233},
publisher = {Springer},
}
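A toy sketch of the binary-feature matching idea (random intensity comparisons ranked by Hamming distance) is given below; the PatchMatch-style propagation and the filtering that make up BriefMatch itself are omitted, and the 128-bit sampling pattern is purely illustrative.

```python
import numpy as np

def brief_descriptor(img, y, x, pairs):
    """Binary BRIEF-style descriptor: one bit per intensity comparison.

    img   : 2D grayscale array (ideally pre-smoothed)
    pairs : (K, 4) array of offsets (dy1, dx1, dy2, dx2) around the keypoint
    """
    bits = np.empty(len(pairs), dtype=np.uint8)
    for k, (dy1, dx1, dy2, dx2) in enumerate(pairs):
        bits[k] = img[y + dy1, x + dx1] < img[y + dy2, x + dx2]
    return bits

def hamming(d1, d2):
    """Hamming distance between two binary descriptors."""
    return int(np.count_nonzero(d1 != d2))

rng = np.random.default_rng(0)
pairs = rng.integers(-4, 5, size=(128, 4))         # hypothetical 128-bit pattern
img0 = rng.random((64, 64)).astype(np.float32)
img1 = np.roll(img0, shift=(0, 1), axis=(0, 1))    # toy "motion" of one pixel
d0 = brief_descriptor(img0, 32, 32, pairs)
d1 = brief_descriptor(img1, 32, 33, pairs)
print(hamming(d0, d1))                             # small distance for a true match
```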
The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a real-time tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.
@inproceedings{diva2:1192158,
author = {Kristan, Matej and Leonardis, Ales and Matas, Jiri and Felsberg, Michael and Pflugfelder, Roman and Zajc, Luka Cehovin and Vojir, Tomas and Häger, Gustav and Lukezic, Alan and Eldesokey, Abdelrahman and Fernandez, Gustavo and Garcia-Martin, Alvaro and Muhic, A. and Petrosino, Alfredo and Memarmoghadam, Alireza and Vedaldi, Andrea and Manzanera, Antoine and Tran, Antoine and Alatan, Aydin and Mocanu, Bogdan and Chen, Boyu and Huang, Chang and Xu, Changsheng and Sun, Chong and Du, Dalong and Zhang, David and Du, Dawei and Mishra, Deepak and Gundogdu, Erhan and Velasco-Salido, Erik and Khan, Fahad and Battistone, Francesco and Subrahmanyam, Gorthi R. K. Sai and Bhat, Goutam and Huang, Guan and Bastos, Guilherme and Seetharaman, Guna and Zhang, Hongliang and Li, Houqiang and Lu, Huchuan and Drummond, Isabela and Valmadre, Jack and Jeong, Jae-Chan and Cho, Jae-Il and Lee, Jae-Yeong and Noskova, Jana and Zhu, Jianke and Gao, Jin and Liu, Jingyu and Kim, Ji-Wan and Henriques, Joao F. and Martinez, Jose M. and Zhuang, Junfei and Xing, Junliang and Gao, Junyu and Chen, Kai and Palaniappan, Kannappan and Lebeda, Karel and Gao, Ke and Kitani, Kris M. and Zhang, Lei and Wang, Lijun and Yang, Lingxiao and Wen, Longyin and Bertinetto, Luca and Poostchi, Mahdieh and Danelljan, Martin and Mueller, Matthias and Zhang, Mengdan and Yang, Ming-Hsuan and Xie, Nianhao and Wang, Ning and Miksik, Ondrej and Moallem, P. and Venugopal, Pallavi M. and Senna, Pedro and Torr, Philip H. S. and Wang, Qiang and Yu, Qifeng and Huang, Qingming and Martin-Nieto, Rafael and Bowden, Richard and Liu, Risheng and Tapu, Ruxandra and Hadfield, Simon and Lyu, Siwei and Golodetz, Stuart and Choi, Sunglok and Zhang, Tianzhu and Zaharia, Titus and Santopietro, Vincenzo and Zou, Wei and Hu, Weiming and Tao, Wenbing and Li, Wenbo and Zhou, Wengang and Yu, Xianguo and Bian, Xiao and Li, Yang and Xing, Yifan and Fan, Yingruo and Zhu, Zheng and Zhang, Zhipeng and He, Zhiqun},
title = {{The Visual Object Tracking VOT2017 challenge results}},
booktitle = {2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017)},
year = {2017},
series = {IEEE International Conference on Computer Vision Workshops},
pages = {1949--1972},
publisher = {IEEE},
}
The recognition of individual object instances in single monocular images is still an incompletely solved task. In this work, we propose a new approach for detecting and separating vehicles in the context of autonomous driving. Our method uses a fully convolutional network (FCN) for semantic labeling and for estimating the boundary of each vehicle. Even though a contour is in general a one-pixel-wide structure that cannot be directly learned by a CNN, our network addresses this by predicting areas around the contours. Based on these areas, we separate the individual vehicle instances. In our experiments, we show on two challenging datasets (Cityscapes and KITTI) that we achieve state-of-the-art performance, despite using a subsampling rate of two. Our approach even outperforms all recent works w.r.t. several rating scores.
@inproceedings{diva2:1192153,
author = {van den Brand, Jan and Ochs, Matthias and Mester, Rudolf},
title = {{Instance-Level Segmentation of Vehicles by Deep Contours}},
booktitle = {COMPUTER VISION - ACCV 2016 WORKSHOPS, PT I},
year = {2017},
series = {Lecture Notes in Computer Science},
pages = {477--492},
publisher = {SPRINGER INTERNATIONAL PUBLISHING AG},
}
Monocular visual odometry / SLAM requires the ability to deal with the scale ambiguity problem, or equivalently to transform the estimated unscaled poses into correctly scaled poses. While propagating the scale from frame to frame is possible, it is very prone to the scale drift effect. We address the problem of monocular scale estimation by proposing a multimodal mechanism of prediction, classification, and correction. Our scale correction scheme combines cues from both dense and sparse ground plane estimation; this makes the proposed method robust to varying availability and distribution of trackable ground structure. Instead of optimizing the parameters of the ground plane related homography, we parametrize and optimize the underlying motion parameters directly. Furthermore, we employ classifiers to detect scale outliers based on various features (e.g. moments on residuals). We test our method on the challenging KITTI dataset and show that the proposed method is capable of providing scale estimates that are on par with current state-of-the-art monocular methods without using bundle adjustment or RANSAC.
@inproceedings{diva2:1192092,
author = {Fanani, Nolang and Stuerck, Alina and Barnada, Marc and Mester, Rudolf},
title = {{Multimodal Scale Estimation for Monocular Visual Odometry}},
booktitle = {2017 28TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV 2017)},
year = {2017},
series = {IEEE Intelligent Vehicles Symposium},
pages = {1714--1721},
publisher = {IEEE},
}
Most iterative optimization algorithms for motion, depth estimation or scene reconstruction, both sparse and dense, rely on a coarse and reliable dense initialization to bootstrap their optimization procedure. This makes it important to have techniques that can obtain a dense, albeit approximate, representation of a desired 2D structure (e.g., depth maps, optical flow, disparity maps) from a very sparse measurement of this structure. The method presented here exploits the complete information given by the principal component analysis (PCA): the principal basis and its prior distribution. The method is able to determine a dense reconstruction even if only a very sparse measurement is available. When facing such situations, the number of principal components is typically reduced further, which results in a loss of expressiveness of the basis. We overcome this problem and inject prior knowledge in a maximum a posteriori (MAP) approach. We test our approach on the KITTI and the Virtual KITTI datasets and focus on the interpolation of depth maps for driving scenes. The evaluation shows good agreement with the ground truth and results clearly superior to interpolation by the nearest-neighbor method, which disregards statistical information.
@inproceedings{diva2:1192091,
author = {Ochs, Matthias and Bradler, Henry and Mester, Rudolf},
title = {{Learning Rank Reduced Interpolation with Principal Component Analysis}},
booktitle = {2017 28TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV 2017)},
year = {2017},
series = {IEEE Intelligent Vehicles Symposium},
pages = {1126--1133},
publisher = {IEEE},
}
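The core idea, MAP estimation of PCA coefficients from sparse measurements under a Gaussian prior, can be sketched in a few lines of numpy; the random basis, prior variances and noise level below are placeholders and not the model learned in the paper.

```python
import numpy as np

def map_reconstruct(mu, B, lam, idx, y, noise_var=1e-2):
    """MAP reconstruction of a dense map from sparse samples.

    mu  : (D,)   mean of the training maps (flattened)
    B   : (D, K) principal basis (columns = principal components)
    lam : (K,)   prior variances of the PCA coefficients (eigenvalues)
    idx : (M,)   indices of the sparse measurements
    y   : (M,)   measured values at those indices
    """
    Bs = B[idx]                      # rows of the basis at the measured positions
    r = y - mu[idx]                  # residual w.r.t. the mean map
    A = Bs.T @ Bs / noise_var + np.diag(1.0 / lam)
    c = np.linalg.solve(A, Bs.T @ r / noise_var)
    return mu + B @ c                # dense MAP estimate

# toy usage with a random basis; in practice mu, B, lam come from PCA on training depth maps
rng = np.random.default_rng(0)
D, K, M = 1000, 20, 50
B = np.linalg.qr(rng.standard_normal((D, K)))[0]
mu, lam = rng.standard_normal(D), np.linspace(5.0, 0.5, K)
truth = mu + B @ (rng.standard_normal(K) * np.sqrt(lam))
idx = rng.choice(D, size=M, replace=False)
dense = map_reconstruct(mu, B, lam, idx, truth[idx])
print(np.abs(dense - truth).mean())
```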
Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results. Such methods require voxelizations of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets.
@inproceedings{diva2:1185653,
author = {Järemo-Lawin, Felix and Danelljan, Martin and Tosteberg, Patrik and Bhat, Goutam and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Deep Projective 3D Semantic Segmentation}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2017},
series = {Lecture Notes in Computer Science},
volume = {10424},
pages = {95--107},
publisher = {Springer},
}
Discriminative Correlation Filter (DCF) based methods have shown competitive performance on tracking benchmarks in recent years. Generally, DCF based trackers learn a rigid appearance model of the target. However, this reliance on a single rigid appearance model is insufficient in situations where the target undergoes non-rigid transformations. In this paper, we propose a unified formulation for learning a deformable convolution filter. In our framework, the deformable filter is represented as a linear combination of sub-filters. Both the sub-filter coefficients and their relative locations are inferred jointly in our formulation. Experiments are performed on three challenging tracking benchmarks: OTB-2015, TempleColor and VOT2016. Our approach improves the baseline method, leading to performance comparable to state-of-the-art.
@inproceedings{diva2:1185623,
author = {Johnander, Joakim and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{DCCO: Towards Deformable Continuous Convolution Operators for Visual Tracking}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2017},
series = {Lecture Notes in Computer Science},
volume = {10424},
pages = {55--67},
publisher = {Springer},
}
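A minimal sketch of how a detection score map could be assembled from weighted, shifted sub-filter responses is shown below; the joint inference of coefficients and sub-filter locations, which is the paper's actual contribution, is not reproduced, and the shift convention is illustrative.

```python
import numpy as np

def deformable_response(feat, sub_filters, offsets, weights):
    """Detection score map from a filter built as a weighted sum of shifted sub-filters.

    feat        : (H, W) single-channel feature map
    sub_filters : list of (H, W) filters (zero-padded to the feature size)
    offsets     : list of (dy, dx) sub-filter locations relative to the target centre
    weights     : list of scalar sub-filter coefficients
    """
    F = np.fft.fft2(feat)
    score = np.zeros(feat.shape)
    for h, (dy, dx), w in zip(sub_filters, offsets, weights):
        resp = np.real(np.fft.ifft2(F * np.conj(np.fft.fft2(h))))   # circular correlation
        score += w * np.roll(resp, shift=(-dy, -dx), axis=(0, 1))   # move part response to centre
    return score
```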
Autonomous driving safety is becoming a paramount issue due to the emergence of many autonomous vehicle prototypes. The safety measures ensure that autonomous vehicles are safe to operate among pedestrians, cyclists and conventional vehicles. While safety measures for pedestrians have been widely studied in literature, little attention has been paid to safety measures for cyclists. Visual cyclists analysis is a challenging problem due to the complex structure and dynamic nature of the cyclists. The dynamic model used for cyclists analysis heavily relies on the wheels. In this paper, we investigate the problem of ellipse detection for visual cyclists analysis in the wild. Our first contribution is the introduction of a new challenging annotated dataset for bicycle wheels, collected in real-world urban environment. Our second contribution is a method that combines reliable arcs selection and grouping strategies for ellipse detection. The reliable selection and grouping mechanism leads to robust ellipse detections when combined with the standard least square ellipse fitting approach. Our experiments clearly demonstrate that our method provides improved results, both in terms of accuracy and robustness in challenging urban environment settings.
@inproceedings{diva2:1185617,
author = {Eldesokey, Abdelrahman and Felsberg, Michael and Khan, Fahad Shahbaz},
title = {{Ellipse Detection for Visual Cyclists Analysis ``In the Wild''}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2017},
series = {Lecture Notes in Computer Science},
volume = {10424},
pages = {319--331},
publisher = {Springer},
}
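Only the standard least-squares fitting step mentioned in the abstract is illustrated below (an algebraic conic fit); the arc selection and grouping strategies that constitute the paper's contribution are not shown.

```python
import numpy as np

def fit_conic_lsq(x, y):
    """Algebraic least-squares conic fit a*x^2 + b*xy + c*y^2 + d*x + e*y + f = 0.

    Returns the conic coefficients with unit norm (smallest singular vector).
    A constrained variant (e.g. Fitzgibbon et al.) is needed to guarantee an ellipse.
    """
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(D)
    return vt[-1]

# toy usage: noisy points on an axis-aligned ellipse with semi-axes 3 and 2
t = np.linspace(0, 2 * np.pi, 200)
x = 3.0 * np.cos(t) + 0.01 * np.random.randn(t.size)
y = 2.0 * np.sin(t) + 0.01 * np.random.randn(t.size)
a, b, c, d, e, f = fit_conic_lsq(x, y)
print(b * b - 4 * a * c < 0)   # discriminant < 0 confirms an ellipse
```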
An increasing number of robots and autonomous vehicles are equipped with multiple cameras to achieve surround-view sensing. The estimation of their relative poses, also known as extrinsic parameter calibration, is a challenging problem, particularly in the non-overlapping case. We present a simple and novel extrinsic calibration method based on standard components that compares favorably to existing approaches. We further propose a framework for predicting the performance of different calibration configurations, together with intuitive error metrics. This makes selecting a good camera configuration straightforward. We evaluate on rendered synthetic images and show good results as measured by angular and absolute pose differences, as well as the reprojection error distributions.
@inproceedings{diva2:1185614,
author = {Robinson, Andreas and Persson, Mikael and Felsberg, Michael},
title = {{Robust Accurate Extrinsic Calibration of Static Non-overlapping Cameras}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2017},
series = {Lecture Notes in Computer Science},
volume = {10425},
pages = {342--353},
publisher = {Springer},
}
In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state-of-the-art in tracking. However, in the pursuit of ever increasing tracking performance, their characteristic speed and real-time capability have gradually faded. Further, the increasingly complex models, with a massive number of trainable parameters, have introduced the risk of severe over-fitting. In this work, we tackle the key causes behind the problems of computational complexity and over-fitting, with the aim of simultaneously improving both speed and performance. We revisit the core DCF formulation and introduce: (i) a factorized convolution operator, which drastically reduces the number of parameters in the model; (ii) a compact generative model of the training sample distribution, that significantly reduces memory and time complexity, while providing better diversity of samples; (iii) a conservative model update strategy with improved robustness and reduced complexity. We perform comprehensive experiments on four benchmarks: VOT2016, UAV123, OTB-2015, and Temple-Color. When using expensive deep features, our tracker provides a 20-fold speedup and achieves a 13.0% relative gain in Expected Average Overlap compared to the top ranked method [12] in the VOT2016 challenge. Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.
@inproceedings{diva2:1173604,
author = {Danelljan, Martin and Bhat, Goutam and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{ECO: Efficient Convolution Operators for Tracking}},
booktitle = {Proceedings 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2017},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
volume = {2017},
pages = {6931--6939},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
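As a rough sketch of contribution (i), the factorized convolution operator can be thought of as a learned channel projection followed by correlation with a small set of filters; the numpy code below only illustrates this scoring step under assumed shapes, not the training, the sample model, or the update strategy.

```python
import numpy as np

def factorized_response(x, P, filters):
    """Detection scores with a factorized convolution operator.

    x       : (D, H, W) feature map with D channels
    P       : (D, C)    learned projection matrix, with C much smaller than D
    filters : (C, H, W) correlation filters in the compressed feature space
    """
    z = np.tensordot(P.T, x, axes=1)            # (C, H, W) projected features
    score = np.zeros(x.shape[1:])
    for zc, fc in zip(z, filters):
        Zc, Fc = np.fft.fft2(zc), np.fft.fft2(fc)
        score += np.real(np.fft.ifft2(Zc * np.conj(Fc)))  # circular correlation per channel
    return score
```

Instead of one filter per feature channel, only C filters plus a D-by-C projection are learned, which is where the parameter reduction comes from.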
Many of the latest smart phones and tablets come with integrated depth sensors, that make depth-maps freely available, thus enabling new forms of applications like rendering from different view points. However, efficient compression exploiting the characteristics of depth-maps as well as the requirements of these new applications is still an open issue. In this paper, we evaluate different depth-map compression algorithms, with a focus on tree-based methods and view projection as application.
The contributions of this paper are the following: 1. extensions of existing geometric compression trees, 2. a comparison of a number of different trees, 3. a comparison of them to a state-of-the-art video coder, 4. an evaluation using ground-truth data that considers both depth-maps and predicted frames with arbitrary camera translation and rotation.
Despite our best efforts, and contrary to earlier results, current video depth-map compression outperforms tree-based methods in most cases. The reason for this is likely that previous evaluations focused on low-quality, low-resolution depth maps, while high-resolution depth (as needed in the DIBR setting) has been ignored up until now. We also demonstrate that PSNR on depth-maps is not always a good measure of their utility.
@inproceedings{diva2:1150797,
author = {Ogniewski, Jens and Forss\'{e}n, Per-Erik},
title = {{What is the best depth-map compression for Depth Image Based Rendering?}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2017},
series = {Lecture Notes in Computer Science},
volume = {10425},
pages = {403--415},
publisher = {Springer},
}
The ability to direct visual attention is a fundamental skill for seeing robots. Attention comes in two flavours: the gaze direction (overt attention) and attention to a specific part of the current field of view (covert attention), of which the latter is the focus of the present study. Specifically, we study the effects of attentional masking within pre-trained deep neural networks for the purpose of handling ambiguous scenes containing multiple objects. We investigate several variants of attentional masking on partially pre-trained deep neural networks and evaluate the effects on classification performance and sensitivity to attention mask errors in multi-object scenes. We find that a combined scheme consisting of multi-level masking and blending provides the best trade-off between classification accuracy and insensitivity to masking errors. This proposed approach is denoted multilayer continuous-valued convolutional feature masking (MC-CFM). For reasonably accurate masks it can suppress the influence of distracting objects and reach comparable classification performance to unmasked recognition in cases without distractors.
@inproceedings{diva2:1150792,
author = {Wallenberg, Marcus and Forssen, Per-Erik},
title = {{Attentional Masking for Pre-trained Deep Networks}},
booktitle = {Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS17)},
year = {2017},
pages = {6149--6154},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
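The core masking operation, soft-masking an intermediate activation map with blending, might look as follows under an assumed PyTorch tensor layout; the multi-level scheme and the specific layers used in MC-CFM are not captured by this sketch.

```python
import torch

def mask_and_blend(feat, mask, alpha=0.5):
    """Continuous-valued masking of a convolutional feature map with blending.

    feat  : (N, C, H, W) activations at some intermediate layer
    mask  : (N, 1, H, W) attention mask in [0, 1], resized to the layer's resolution
    alpha : blending factor; alpha=1 is hard masking, alpha=0 leaves features untouched
    """
    return feat * (alpha * mask + (1.0 - alpha))
```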
Traditionally, pose estimation is considered as a two-step problem. First, feature correspondences are determined by direct comparison of image patches, or by associating feature descriptors. In a second step, the relative pose and the coordinates of corresponding points are estimated, most often by minimizing the reprojection error (RPE). RPE optimization is based on a loss function that is merely aware of the feature pixel positions but not of the underlying image intensities. In this paper, we propose a sparse direct method which introduces a loss function that allows us to simultaneously optimize the unscaled relative pose, as well as the set of feature correspondences, directly considering the image intensity values. Furthermore, we show how to integrate statistical prior information on the motion into the optimization process. This constructive inclusion of a Bayesian bias term is particularly efficient in application cases with a strongly predictable (short term) dynamic, e.g. in a driving scenario. In our experiments, we demonstrate that the JET algorithm we propose outperforms the classical reprojection error optimization on two synthetic datasets and on the KITTI dataset. The JET algorithm runs in real-time on a single CPU thread.
@inproceedings{diva2:1129770,
author = {Bradler, Henry and Ochs, Matthias and Fanani, Nolang and Mester, Rudolf},
title = {{Joint Epipolar Tracking (JET): Simultaneous optimization of epipolar geometry and feature correspondences}},
booktitle = {2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017)},
year = {2017},
series = {IEEE Winter Conference on Applications of Computer Vision},
pages = {445--453},
publisher = {IEEE},
}
Hyperspectral remote sensing based on unmanned airborne vehicles is a field increasing in importance. The combined functionality of simultaneous hyperspectral and geometric modeling is less developed. A configuration has been developed that enables the reconstruction of the hyperspectral three-dimensional (3D) environment. The hyperspectral camera is based on a linear variable filter and a high frame rate, high resolution camera enabling point-to-point matching and 3D reconstruction. This allows the information to be combined into a single and complete 3D hyperspectral model. In this paper, we describe the camera and illustrate capabilities and difficulties through real-world experiments.
@inproceedings{diva2:1107480,
author = {Ahlberg, Jörgen and Renhorn, Ingmar and Chevalier, Tomas and Rydell, Joakim and Bergström, David},
title = {{Three-dimensional hyperspectral imaging technique}},
booktitle = {ALGORITHMS AND TECHNOLOGIES FOR MULTISPECTRAL, HYPERSPECTRAL, AND ULTRASPECTRAL IMAGERY XXIII},
year = {2017},
series = {Proceedings of SPIE},
volume = {10198},
publisher = {SPIE - International Society for Optical Engineering},
}
We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a templatebased tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.
@inproceedings{diva2:1090347,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Object Tracking in Thermal Infrared Imagery based on Channel Coded Distribution Fields}},
booktitle = {Swedish Symposium on Image Analysis},
year = {2017},
publisher = {Svenska sällskapet för automatiserad bildanalys (SSBA)},
}
This paper presents a study on a family of local hexagonal and multi-scale operators useful for texture analysis. The hexagonal grid shows an attractive rotation symmetry with uniform neighbour distances. The operator depicts a closed connected curve (1D periodic). It is resized within a scale interval during the conversion from the original square grid to the virtual hexagonal grid. Complementary image features, together with their tangential first-order hexagonal derivatives, are calculated. The magnitude/phase information from the Fourier or Fractional Fourier Transform (FFT, FrFT) is accumulated in thirty different Cartesian (polar for visualisation) and multi-scale domains. Simultaneous phase-correlation of a subset of the data gives an estimate of scaling/rotation relative to the references. Similarity metrics are used for template matching. The sample, unseen by the system, is classified into the group with the maximum fuzzy rank order. An instantiation of a 12-point hexagonal operator (radius=2) is first successfully evaluated on a set of thirteen Brodatz images (no scaling/rotation). Then it is evaluated on the more challenging KTH-TIPS2b texture dataset (scaling/rotation, varying pose/illumination). A confusion matrix and cumulative fuzzy rank order summaries show, for example, that the correct class is top-ranked 44 - 50% and top-three ranked 68 - 76% of all sample images. A similar evaluation, using a box-like 12-point mask on the square grid, gives overall lower accuracies. Finally, the FrFT parameter serves as an additional tuning parameter that influences the accuracies significantly.
@inproceedings{diva2:1147263,
author = {Brandtberg, Tomas},
title = {{Virtual hexagonal and multi-scale operator for fuzzy rank order texture classification using one-dimensional generalised Fourier analysis}},
booktitle = {2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2016},
series = {International Conference on Pattern Recognition},
pages = {2018--2024},
publisher = {IEEE COMPUTER SOC},
}
Current best local descriptors are learned on a large dataset of matching and non-matching keypoint pairs. However, data of this kind is not always available since detailed keypoint correspondences can be hard to establish. On the other hand, we can often obtain labels for pairs of keypoint bags. For example, keypoint bags extracted from two images of the same object under different views form a matching pair, and keypoint bags extracted from images of different objects form a non-matching pair. On average, matching pairs should contain more corresponding keypoints than non-matching pairs. We describe an end-to-end differentiable architecture that enables the learning of local keypoint descriptors from such weakly-labeled data.
@inproceedings{diva2:1147261,
author = {Markus, Nenad and Pandzic, Igor S. and Ahlberg, Jörgen},
title = {{Learning Local Descriptors by Optimizing the Keypoint-Correspondence Criterion}},
booktitle = {2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2016},
series = {International Conference on Pattern Recognition},
pages = {2380--2385},
publisher = {IEEE COMPUTER SOC},
}
Tracking-by-detection methods have demonstrated competitive performance in recent years. In these approaches, the tracking model heavily relies on the quality of the training set. Due to the limited amount of labeled training data, additional samples need to be extracted and labeled by the tracker itself. This often leads to the inclusion of corrupted training samples, due to occlusions, misalignments and other perturbations. Existing tracking-by-detection methods either ignore this problem, or employ a separate component for managing the training set. We propose a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks. Our approach dynamically manages the training set by estimating the quality of the samples. Contrary to existing approaches, we propose a unified formulation by minimizing a single loss over both the target appearance model and the sample quality weights. The joint formulation enables corrupted samples to be down-weighted while increasing the impact of correct ones. Experiments are performed on three benchmarks: OTB-2015 with 100 videos, VOT-2015 with 60 videos, and Temple-Color with 128 videos. On the OTB-2015, our unified formulation significantly improves the baseline, with a gain of 3.8% in mean overlap precision. Finally, our method achieves state-of-the-art results on all three datasets.
@inproceedings{diva2:1104732,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking}},
booktitle = {2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2016},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
volume = {2016},
pages = {1430--1438},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
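A heavily simplified caricature of the idea is sketched below: alternating between a weighted ridge model and per-sample weights that down-weight large residuals. Note that the weight update here is a plain softmax over negative residuals, not the single joint loss optimized in the paper, and all parameter names are illustrative.

```python
import numpy as np

def retrain_with_sample_weights(X, y, reg=1.0, temp=1.0, iters=3):
    """Alternate between a weighted ridge model and per-sample quality weights.

    X : (K, D) one feature vector per training sample
    y : (K,)   desired responses
    """
    K = len(y)
    alpha = np.full(K, 1.0 / K)                       # start with uniform sample weights
    for _ in range(iters):
        # model step: weighted ridge regression
        A = (X * alpha[:, None]).T @ X + reg * np.eye(X.shape[1])
        w = np.linalg.solve(A, (X * alpha[:, None]).T @ y)
        # weight step: down-weight samples with large residuals (likely corrupted samples)
        err = (X @ w - y) ** 2
        alpha = np.exp(-err / temp)
        alpha /= alpha.sum()
    return w, alpha
```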
In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits available color information associated with the points. Our method is based on a model of the joint distribution of 3D-point observations and their color information. The proposed model captures discriminative color information, while being computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information. On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, our proposed model outperforms standard strategies for combining color and 3D-point information, leading to state-of-the-art results.
@inproceedings{diva2:1104730,
author = {Danelljan, Martin and Meneghetti, Giulia and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{A Probabilistic Framework for Color-Based Point Set Registration}},
booktitle = {2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2016},
series = {IEEE Conference on Computer Vision and Pattern Recognition},
volume = {2016},
pages = {1818--1826},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Visual object tracking performance has improved significantly in recent years. Most trackers are based on either of two paradigms: online learning of an appearance model or the use of a pre-trained object detector. Methods based on online learning provide high accuracy, but are prone to model drift. Model drift occurs when the tracker fails to correctly estimate the tracked object's position. Methods based on a detector, on the other hand, typically have good long-term robustness, but reduced accuracy compared to online methods.
Despite the complementarity of the aforementioned approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion between an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting, which includes scenarios that are typically not well represented in standard visual tracking datasets.
@inproceedings{diva2:1104310,
author = {Häger, Gustav and Bhat, Goutam and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael and Rudol, Piotr and Doherty, Patrick},
title = {{Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV}},
booktitle = {Proceedings of the 12th International Symposium on Advances in Visual Computing},
year = {2016},
series = {Lecture Notes in Computer Science},
publisher = {Springer},
}
Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or Color Names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We further show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.
@inproceedings{diva2:1104308,
author = {Gladh, Susanna and Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Deep motion features for visual tracking}},
booktitle = {Proceedings of the 23rd International Conference on, Pattern Recognition (ICPR), 2016},
year = {2016},
pages = {1243--1248},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
3D-point set registration is an active area of research in computer vision. In recent years, probabilistic registration approaches have demonstrated superior performance for many challenging applications. Generally, these probabilistic approaches rely on the spatial distribution of the 3D-points, and only recently has color information been integrated into such a framework, significantly improving registration accuracy. Beyond local color information, high-dimensional 3D shape features have been successfully employed in many applications such as action recognition and 3D object recognition. In this paper, we propose a probabilistic framework to integrate high-dimensional 3D shape features with color information for point set registration. The 3D shape features are distinctive and provide complementary information beneficial for robust registration. We validate our proposed framework by performing comprehensive experiments on the challenging Stanford Lounge dataset, acquired by an RGB-D sensor, and an outdoor dataset captured by a Lidar sensor. The results clearly demonstrate that our approach provides superior results both in terms of robustness and accuracy compared to state-of-the-art probabilistic methods.
@inproceedings{diva2:1104306,
author = {Danelljan, Martin and Meneghetti, Giulia and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Aligning the Dissimilar: A Probabilistic Feature-Based Point Set Registration Approach}},
booktitle = {Proceedings of the 23rd International Conference on Pattern Recognition (ICPR) 2016},
year = {2016},
pages = {247--252},
publisher = {IEEE},
}
We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.
@inproceedings{diva2:1072885,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Channel Coded Distribution Field Tracking for Thermal Infrared Imagery}},
booktitle = {PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016)},
year = {2016},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops},
pages = {1248--1256},
publisher = {IEEE},
}
Random Forests (RF) is a learning technique with very low run-time complexity. It has found a niche application in situations where input data is low-dimensional and computational performance is paramount. We wish to make RFs more useful for high dimensional problems, and to this end, we propose two extensions to RFs: firstly, a feature selection mechanism called correlation-enhancing projections, and secondly, sparse discriminant selection schemes for better accuracy and faster training. We evaluate the proposed extensions by performing age and gender estimation on the MORPH-II dataset, and demonstrate near-equal or improved estimation performance when using these extensions despite a seventy-fold reduction in the number of data dimensions.
@inproceedings{diva2:1068782,
author = {Wallenberg, Marcus and Forss\'{e}n, Per-Erik},
title = {{Improving Random Forests by Correlation-Enhancing Projections and Sample-Based Sparse Discriminant Selection}},
booktitle = {Proceedings 13th Conference on Computer and Robot Vision CRV 2016},
year = {2016},
pages = {222--227},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
One of the major steps in visual environment perception for automotive applications is to track keypoints and to subsequently estimate egomotion and environment structure from the trajectories of these keypoints. This paper presents a propagation based tracking method to obtain the 2D trajectories of keypoints from a sequence of images in a monocular camera setup. Instead of relying on the classical RANSAC to obtain accurate keypoint correspondences, we steer the search for keypoint matches by means of propagating the estimated 3D position of the keypoint into the next frame and verifying the photometric consistency. In this process, we continuously predict, estimate and refine the frame-to-frame relative pose which induces the epipolar relation. Experiments on the KITTI dataset as well as on the synthetic COnGRATS dataset show promising results on the estimated courses and accurate keypoint trajectories.
@inproceedings{diva2:1067522,
author = {Fanani, Nolang and Ochs, Matthias and Bradler, Henry and Mester, Rudolf},
title = {{Keypoint Trajectory Estimation Using Propagation Based Tracking}},
booktitle = {2016 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV)},
year = {2016},
series = {IEEE Intelligent Vehicles Symposium},
pages = {933--939},
publisher = {IEEE},
}
We present a framework that supports the development and evaluation of vision algorithms in the context of driver assistance applications and traffic surveillance. This framework allows the creation of highly realistic image sequences featuring traffic scenarios. The sequences are created with a realistic state of the art vehicle physics model; different kinds of environments are featured, thus providing a wide range of testing scenarios. Due to the physically-based rendering technique and variable camera models employed for the image rendering process, we can simulate different sensor setups and provide appropriate and fully accurate ground truth data.
@inproceedings{diva2:1067521,
author = {Biedermann, Daniel and Ochs, Matthias and Mester, Rudolf},
title = {{Evaluating visual ADAS components on the COnGRATS dataset}},
booktitle = {2016 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV)},
year = {2016},
series = {IEEE Intelligent Vehicles Symposium},
pages = {986--991},
publisher = {IEEE},
}
Correspondence relations between different views of the same scene can be learnt in an unsupervised manner. We address autonomous learning of arbitrary fixed spatial (point-to-point) mappings. Since any such transformation can be represented by a permutation matrix, the signal model is a linear one, whereas the proposed analysis method, mainly based on Canonical Correlation Analysis (CCA), relies on a generalized eigensystem problem, i.e., a nonlinear operation. The learnt transformation is represented implicitly in terms of pairs of learned basis vectors and neither uses nor requires an analytic/parametric expression for the latent mapping. We show how the rank of the signal that is shared among views may be determined from canonical correlations and how the overlapping (=shared) dimensions among the views may be inferred.
@inproceedings{diva2:1067517,
author = {Conrad, Christian and Mester, Rudolf},
title = {{LEARNING RANK REDUCED MAPPINGS USING CANONICAL CORRELATION ANALYSIS}},
booktitle = {2016 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP)},
year = {2016},
publisher = {IEEE},
}
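A plain numpy sketch of standard CCA via whitening and SVD is given below; the paper instead solves the corresponding generalized eigensystem and additionally infers the shared rank, which the toy example only hints at through correlations close to one.

```python
import numpy as np

def cca(X, Y, eps=1e-9):
    """Canonical correlation analysis between two views of the same signal.

    X : (N, p) samples from view 1 (rows are observations)
    Y : (N, q) samples from view 2
    Returns canonical correlations and the paired basis vectors for each view.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0] - 1
    Cxx, Cyy, Cxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(np.maximum(w, eps))) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return s, Wx @ U, Wy @ Vt.T        # correlations near 1 reveal the shared subspace

# toy usage: view 2 is a fixed permutation (point-to-point mapping) of view 1
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))
Y = X[:, rng.permutation(6)] + 0.05 * rng.standard_normal((500, 6))
corr, A, B = cca(X, Y)
print(np.round(corr, 3))               # all close to 1: the shared rank is 6
```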
The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of trackers having been published at major computer vision conferences and journals in recent years. The number of tested state-of-the-art trackers makes VOT2016 the largest and most challenging benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the Appendix. The VOT2016 goes beyond its predecessors by (i) introducing a new semi-automatic ground truth bounding box annotation methodology and (ii) extending the evaluation system with the no-reset experiment.
@inproceedings{diva2:1063965,
author = {Kristan, Matej and Leonardis, Ales and Matas, Jiri and Felsberg, Michael and Pflugfelder, Roman and Cehovin, Luka and Vojir, Tomas and Häger, Gustav and Lukezic, Alan and Fernandez, Gustavo and Gupta, Abhinav and Petrosino, Alfredo and Memarmoghadam, Alireza and Garcia-Martin, Alvaro and Solis Montero, Andres and Vedaldi, Andrea and Robinson, Andreas and Ma, Andy J. and Varfolomieiev, Anton and Alatan, Aydin and Erdem, Aykut and Ghanem, Bernard and Liu, Bin and Han, Bohyung and Martinez, Brais and Chang, Chang-Ming and Xu, Changsheng and Sun, Chong and Kim, Daijin and Chen, Dapeng and Du, Dawei and Mishra, Deepak and Yeung, Dit-Yan and Gundogdu, Erhan and Erdem, Erkut and Khan, Fahad and Porikli, Fatih and Zhao, Fei and Bunyak, Filiz and Battistone, Francesco and Zhu, Gao and Roffo, Giorgio and Sai Subrahmanyam, Gorthi R. K. and Bastos, Guilherme and Seetharaman, Guna and Medeiros, Henry and Li, Hongdong and Qi, Honggang and Bischof, Horst and Possegger, Horst and Lu, Huchuan and Lee, Hyemin and Nam, Hyeonseob and Jin Chang, Hyung and Drummond, Isabela and Valmadre, Jack and Jeong, Jae-chan and Cho, Jae-il and Lee, Jae-Yeong and Zhu, Jianke and Feng, Jiayi and Gao, Jin and Young Choi, Jin and Xiao, Jingjing and Kim, Ji-Wan and Jeong, Jiyeoup and Henriques, Joao F. and Lang, Jochen and Choi, Jongwon and Martinez, Jose M. and Xing, Junliang and Gao, Junyu and Palaniappan, Kannappan and Lebeda, Karel and Gao, Ke and Mikolajczyk, Krystian and Qin, Lei and Wang, Lijun and Wen, Longyin and Bertinetto, Luca and Kumar Rapuru, Madan and Poostchi, Mahdieh and Maresca, Mario and Danelljan, Martin and Mueller, Matthias and Zhang, Mengdan and Arens, Michael and Valstar, Michel and Tang, Ming and Baek, Mooyeol and Haris Khan, Muhammad and Wang, Naiyan and Fan, Nana and Al-Shakarji, Noor and Miksik, Ondrej and Akin, Osman and Moallem, Payman and Senna, Pedro and Torr, Philip H. S. and Yuen, Pong C. and Huang, Qingming and Martin-Nieto, Rafael and Pelapur, Rengarajan and Bowden, Richard and Laganiere, Robert and Stolkin, Rustam and Walsh, Ryan and Krah, Sebastian B. and Li, Shengkun and Zhang, Shengping and Yao, Shizeng and Hadfield, Simon and Melzi, Simone and Lyu, Siwei and Li, Siyi and Becker, Stefan and Golodetz, Stuart and Kakanuru, Sumithra and Choi, Sunglok and Hu, Tao and Mauthner, Thomas and Zhang, Tianzhu and Pridmore, Tony and Santopietro, Vincenzo and Hu, Weiming and Li, Wenbo and Huebner, Wolfgang and Lan, Xiangyuan and Wang, Xiaomeng and Li, Xin and Li, Yang and Demiris, Yiannis and Wang, Yifan and Qi, Yuankai and Yuan, Zejian and Cai, Zexiong and Xu, Zhan and He, Zhenyu and Chi, Zhizhen},
title = {{The Visual Object Tracking VOT2016 Challenge Results}},
booktitle = {COMPUTER VISION - ECCV 2016 WORKSHOPS, PT II},
year = {2016},
series = {Lecture Notes in Computer Science},
volume = {9914},
pages = {777--823},
publisher = {SPRINGER INT PUBLISHING AG},
}
The Thermal Infrared Visual Object Tracking challenge 2016, VOT-TIR2016, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2016 is the second benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2016 challenge is similar to the 2015 challenge, the main difference is the introduction of new, more difficult sequences into the dataset. Furthermore, VOT-TIR2016 evaluation adopted the improvements regarding overlap calculation in VOT2016. Compared to VOT-TIR2015, a significant general improvement of results has been observed, which partly compensate for the more difficult sequences. The dataset, the evaluation kit, as well as the results are publicly available at the challenge website.
@inproceedings{diva2:1063949,
author = {Felsberg, Michael and Kristan, Matej and Matas, Jiri and Leonardis, Ales and Pflugfelder, Roman and Häger, Gustav and Berg, Amanda and Eldesokey, Abdelrahman and Ahlberg, Jörgen and Cehovin, Luka and Vojir, Tomas and Lukezic, Alan and Fernandez, Gustavo and Petrosino, Alfredo and Garcia-Martin, Alvaro and Solis Montero, Andres and Varfolomieiev, Anton and Erdem, Aykut and Han, Bohyung and Chang, Chang-Ming and Du, Dawei and Erdem, Erkut and Khan, Fahad Shahbaz and Porikli, Fatih and Zhao, Fei and Bunyak, Filiz and Battistone, Francesco and Zhu, Gao and Seetharaman, Guna and Li, Hongdong and Qi, Honggang and Bischof, Horst and Possegger, Horst and Nam, Hyeonseob and Valmadre, Jack and Zhu, Jianke and Feng, Jiayi and Lang, Jochen and Martinez, Jose M. and Palaniappan, Kannappan and Lebeda, Karel and Gao, Ke and Mikolajczyk, Krystian and Wen, Longyin and Bertinetto, Luca and Poostchi, Mahdieh and Maresca, Mario and Danelljan, Martin and Arens, Michael and Tang, Ming and Baek, Mooyeol and Fan, Nana and Al-Shakarji, Noor and Miksik, Ondrej and Akin, Osman and Torr, Philip H. S. and Huang, Qingming and Martin-Nieto, Rafael and Pelapur, Rengarajan and Bowden, Richard and Laganiere, Robert and Krah, Sebastian B. and Li, Shengkun and Yao, Shizeng and Hadfield, Simon and Lyu, Siwei and Becker, Stefan and Golodetz, Stuart and Hu, Tao and Mauthner, Thomas and Santopietro, Vincenzo and Li, Wenbo and Huebner, Wolfgang and Li, Xin and Li, Yang and Xu, Zhan and He, Zhenyu},
title = {{The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results}},
booktitle = {Computer Vision -- ECCV 2016 Workshops. ECCV 2016.},
year = {2016},
series = {Lecture Notes in Computer Science},
volume = {9914},
pages = {824--849},
publisher = {SPRINGER INT PUBLISHING AG},
}
In this paper we introduce an efficient method to unwrap multi-frequency phase estimates for time-of-flight ranging. The algorithm generates multiple depth hypotheses and uses a spatial kernel density estimate (KDE) to rank them. The confidence produced by the KDE is also an effective means to detect outliers. We also introduce a new closed-form expression for phase noise prediction, that better fits real data. The method is applied to depth decoding for the Kinect v2 sensor, and compared to the Microsoft Kinect SDK and to the open source driver libfreenect2. The intended Kinect v2 use case is scenes with less than 8m range, and for such cases we observe consistent improvements, while maintaining real-time performance. When extending the depth range to the maximal value of 18.75 m, we get about 52% more valid measurements than libfreenect2. The effect is that the sensor can now be used in large depth scenes, where it was previously not a good choice.
@inproceedings{diva2:1060849,
author = {Järemo-Lawin, Felix and Forss\'{e}n, Per-Erik and Ovr\'{e}n, Hannes},
title = {{Efficient Multi-frequency Phase Unwrapping Using Kernel Density Estimation}},
booktitle = {Computer Vision -- ECCV 2016 14th European Conference, Amsterdam, The Netherlands, October 11--14, 2016, Proceedings, Part IV},
year = {2016},
series = {Lecture Notes in Computer Science},
volume = {9908},
pages = {170--185},
publisher = {Springer},
}
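A per-pixel toy version of the hypothesis-generation and KDE-ranking idea is sketched below; the spatial aggregation of the KDE and the noise model of the paper are omitted, and the modulation frequencies are illustrative values.

```python
import numpy as np

C = 299792458.0  # speed of light [m/s]

def unwrap_depth(phases, freqs, d_max=18.75, sigma=0.05):
    """Pick a depth from several wrapped phase measurements by kernel density ranking.

    phases : wrapped phases in [0, 2*pi), one per modulation frequency
    freqs  : modulation frequencies [Hz]
    Each frequency yields a comb of depth hypotheses; the hypothesis with the
    highest Gaussian KDE over all hypotheses is returned together with its score,
    which can double as an outlier confidence.
    """
    hyps = []
    for phi, f in zip(phases, freqs):
        half_wavelength = C / (2.0 * f)                       # unambiguous depth range
        n = np.arange(int(np.ceil(d_max / half_wavelength)) + 1)
        d = (phi / (2.0 * np.pi) + n) * half_wavelength
        hyps.append(d[d <= d_max])
    all_h = np.concatenate(hyps)
    # Gaussian KDE evaluated at every hypothesis
    scores = np.exp(-0.5 * ((all_h[:, None] - all_h[None, :]) / sigma) ** 2).sum(axis=1)
    best = np.argmax(scores)
    return all_h[best], scores[best]

# toy usage with three hypothetical modulation frequencies and a true depth of 7.3 m
freqs = np.array([80e6, 16e6, 120e6])
true_d = 7.3
phases = (2 * np.pi * true_d / (C / (2 * freqs))) % (2 * np.pi)
print(unwrap_depth(phases, freqs))     # hypotheses from all frequencies agree near 7.3 m
```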
Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1% in mean OP), Temple-Color (+4.6% in mean OP), and VOT2015 (20% relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.
@inproceedings{diva2:1060848,
author = {Danelljan, Martin and Robinson, Andreas and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking}},
booktitle = {Computer Vision -- ECCV 2016},
year = {2016},
series = {Lecture Notes in Computer Science},
volume = {9909},
pages = {472--488},
publisher = {Springer},
address = {Cham},
}
Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization. We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The regions of interest are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both the holistic image and the regions of interest are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards into a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results, outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% on artist and style classification, respectively.
@inproceedings{diva2:1054664,
author = {Muhammad Anwer, Rao and Khan, Fahad and van de Weijer, Joost and Laaksonen, Jorma},
title = {{Combining Holistic and Part-based Deep Representations for Computational Painting Categorization}},
booktitle = {ICMR16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL},
year = {2016},
pages = {339--342},
publisher = {ASSOC COMPUTING MACHINERY},
}
Tracking keypoints through a sequence of images is one of the first major steps of structure-from-motion approaches. This paper presents a robust method to extend from two-frame keypoint matching to multi-frame tracking. The transition from matching to tracking is implemented through the propagation of the predicted 3D position of the keypoint. The uncertainty level of the tracking results is calculated based on the uncertainty of the motion parameters. Experiments on the KITTI dataset as well as on a synthetic dataset show that dense and accurate tracking is attainable.
@inproceedings{diva2:971879,
author = {Fanani, Nolang and Mester, Rudolf},
title = {{Propagation based tracking with uncertainty measurement in automotive applications}},
booktitle = {2016 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI)},
year = {2016},
series = {IEEE Southwest Symposium on Image Analysis and Interpretation},
pages = {117--120},
publisher = {IEEE},
}
Recent years have shown great progress in driving assistance systems, approaching autonomous driving step by step. Many approaches, however, rely on lane markers, which limits the system to larger paved roads and poses problems during winter. In this work we explore an alternative approach to visual road following based on online learning. The system learns the current visual appearance of the road while the vehicle is operated by a human. When driving onto a new type of road, the human driver will drive for a minute while the system learns. After training, the human driver can let go of the controls. The present work proposes a novel approach to online perception-action learning for the specific problem of road following, which interchangeably makes use of supervised learning (by demonstration), instantaneous reinforcement learning, and unsupervised learning (self-reinforcement learning). The proposed method, symbiotic online learning of associations and regression (SOLAR), extends previous work on qHebb-learning in three ways: priors are introduced to enforce mode selection and to drive learning towards particular goals, the qHebb-learning method is complemented with a reinforcement variant, and a self-assessment method based on predictive coding is proposed. The SOLAR algorithm is compared to qHebb-learning and deep learning for the task of road following, implemented on a model RC-car. The system demonstrates an ability to learn to follow paved and gravel roads outdoors. Further, the system is evaluated in a controlled indoor environment which provides quantifiable results. The experiments show that the SOLAR algorithm results in autonomous capabilities that go beyond those of existing methods with respect to speed, accuracy, and functionality.
@inproceedings{diva2:947322,
author = {Öfjäll, Kristoffer and Felsberg, Michael and Robinson, Andreas},
title = {{Visual Autonomous Road Following by Symbiotic Online Learning}},
booktitle = {Intelligent Vehicles Symposium (IV), 2016 IEEE},
year = {2016},
pages = {136--143},
}
Phase correlation is one of the classic methods for sparse motion or displacement estimation. It is renowned in the literature for high precision and insensitivity against illumination variations. We propose several important enhancements to the phase correlation (PhC) method which render it more robust against those situations where a motion measurement is not possible (low structure, too much noise, too different image content in the corresponding measurement windows). This allows the method to perform self-diagnosis in adverse situations. Furthermore, we extend the PhC method by a robust scheme for detecting and classifying the presence of multiple motions and estimating their uncertainties. Experimental results on the Middlebury Stereo Dataset and on the KITTI Optical Flow Dataset show the potential offered by the enhanced method in contrast to the PhC implementation of OpenCV.
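For reference, a plain NumPy sketch of the classical phase correlation the paper builds on: the normalized cross-power spectrum plus a simple peak-ratio confidence. The enhancements described above (self-diagnosis in adverse situations, multiple motion distributions and their uncertainties) are not reproduced here:

import numpy as np

def phase_correlation(window_a, window_b, eps=1e-9):
    # Normalized cross-power spectrum between two equally sized image windows.
    A = np.fft.fft2(window_a.astype(float))
    B = np.fft.fft2(window_b.astype(float))
    cross = A * np.conj(B)
    corr = np.fft.fftshift(np.fft.ifft2(cross / (np.abs(cross) + eps)).real)
    # Displacement: location of the correlation peak relative to the window centre.
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shift = np.array(peak) - np.array(corr.shape) // 2
    # Crude confidence: ratio between the highest and second-highest correlation value.
    top_two = np.sort(corr.ravel())[-2:]
    confidence = top_two[1] / (abs(top_two[0]) + eps)
    return shift, confidence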
@inproceedings{diva2:927608,
author = {Ochs, Matthias and Bradler, Henry and Mester, Rudolf},
title = {{Enhanced Phase Correlation for Reliable and Robust Estimation of Multiple Motion Distributions}},
booktitle = {IMAGE AND VIDEO TECHNOLOGY, PSIVT 2015},
year = {2016},
series = {Lecture Notes in Computer Science},
pages = {368--379},
publisher = {Springer Publishing Company},
}
The Thermal Infrared Visual Object Tracking (VOT-TIR2015) Challenge was organized in conjunction with ICCV2015. It was the first benchmark on short-term, single-target tracking in thermal infrared (TIR) sequences. The challenge aimed at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. It was based on the VOT2013 Challenge, but introduced the following novelties: (i) the utilization of the LTIR (Linköping TIR) dataset, (ii) adaptation of the VOT2013 attributes to thermal data, (iii) a similar evaluation to that of VOT2015. This paper provides an overview of the VOT-TIR2015 Challenge as well as the results of the 24 participating trackers.
@inproceedings{diva2:925830,
author = {Berg, Amanda and Felsberg, Michael and Häger, Gustav and Ahlberg, Jörgen},
title = {{An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge}},
booktitle = {Swedish Symposium on Image Analysis},
year = {2016},
series = {Svenska sällskapet för automatiserad bildanalys (SSBA)},
}
The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT 2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2015 challenge that go beyond its VOT2014 predecessor are: (i) a new VOT2015 dataset twice as large as in VOT2014 with full annotation of targets by rotated bounding boxes and per-frame attribute, (ii) extensions of the VOT2014 evaluation methodology by introduction of a new performance measure. The dataset, the evaluation kit as well as the results are publicly available at the challenge website(1).
@inproceedings{diva2:1078694,
author = {Kristan, Matej and Matas, Jiri and Leonardis, Ales and Felsberg, Michael and Cehovin, Luka and Fernandez, Gustavo and Vojir, Tomas and Häger, Gustav and Nebehay, Georg and Pflugfelder, Roman and Gupta, Abhinav and Bibi, Adel and Lukezic, Alan and Garcia-Martins, Alvaro and Saffari, Amir and Petrosino, Alfredo and Solis Montero, Andres and Varfolomieiev, Anton and Baskurt, Atilla and Zhao, Baojun and Ghanem, Bernard and Martinez, Brais and Lee, ByeongJu and Han, Bohyung and Wang, Chaohui and Garcia, Christophe and Zhang, Chunyuan and Schmid, Cordelia and Tao, Dacheng and Kim, Daijin and Huang, Dafei and Prokhorov, Danil and Du, Dawei and Yeung, Dit-Yan and Ribeiro, Eraldo and Khan, Fahad and Porikli, Fatih and Bunyak, Filiz and Zhu, Gao and Seetharaman, Guna and Kieritz, Hilke and Tuen Yau, Hing and Li, Hongdong and Qi, Honggang and Bischof, Horst and Possegger, Horst and Lee, Hyemin and Nam, Hyeonseob and Bogun, Ivan and Jeong, Jae-chan and Cho, Jae-il and Lee, Jae-Young and Zhu, Jianke and Shi, Jianping and Li, Jiatong and Jia, Jiaya and Feng, Jiayi and Gao, Jin and Young Choi, Jin and Kim, Ji-Wan and Lang, Jochen and Martinez, Jose M. and Choi, Jongwon and Xing, Junliang and Xue, Kai and Palaniappan, Kannappan and Lebeda, Karel and Alahari, Karteek and Gao, Ke and Yun, Kimin and Hong Wong, Kin and Luo, Lei and Ma, Liang and Ke, Lipeng and Wen, Longyin and Bertinetto, Luca and Pootschi, Mandieh and Maresca, Mario and Danelljan, Martin and Wen, Mei and Zhang, Mengdan and Arens, Michael and Valstar, Michel and Tang, Ming and Chang, Ming-Ching and Haris Khan, Muhammad and Fan, Nana and Wang, Naiyan and Miksik, Ondrej and Torr, Philip H. S. and Wang, Qiang and Martin-Nieto, Rafael and Pelapur, Rengarajan and Bowden, Richard and Laganiere, Robert and Moujtahid, Salma and Hare, Sam and Hadfield, Simon and Lyu, Siwei and Li, Siyi and Zhu, Song-Chun and Becker, Stefan and Duffner, Stefan and Hicks, Stephen L. and Golodetz, Stuart and Choi, Sunglok and Wu, Tianfu and Mauthner, Thomas and Pridmore, Tony and Hu, Weiming and Hubner, Wolfgang and Wang, Xiaomeng and Li, Xin and Shi, Xinchu and Zhao, Xu and Mei, Xue and Shizeng, Yao and Hua, Yang and Li, Yang and Lu, Yang and Li, Yuezun and Chen, Zhaoyun and Huang, Zehua and Chen, Zhe and Zhang, Zhe and He, Zhenyu and Hong, Zhibin},
title = {{The Visual Object Tracking VOT2015 challenge results}},
booktitle = {Proceedings 2015 IEEE International Conference on Computer Vision Workshops ICCVW 2015},
year = {2015},
pages = {564--586},
publisher = {IEEE},
}
We present an approach to learn relative photometric differences between pairs of cameras, which have partially overlapping fields of view. This is an important problem, especially for appearance-based correspondence estimation or object identification in multi-camera systems, where grey values observed by different cameras are processed. We model intensity differences between pairs of cameras by means of a low-order polynomial (Gray Value Transfer Function, GVTF) which represents the characteristic curve mapping the grey values s_i produced by camera C_i to the corresponding grey values s_j acquired with camera C_j. While the estimation of the GVTF parameters is straightforward once a set of truly corresponding pairs of grey values is available, the non-trivial task in the GVTF estimation process solved in this paper is the extraction of corresponding grey value pairs in the presence of geometric and photometric errors. We also present a temporal GVTF update scheme to adapt to gradual global illumination changes, e.g., due to the change of daylight.
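A minimal sketch of fitting such a low-order GVTF by repeated least squares with simple outlier rejection, once candidate grey-value pairs are available; the robust extraction of those pairs, which is the actual contribution above, is not shown, and the polynomial order and inlier threshold are illustrative:

import numpy as np

def fit_gvtf(grey_i, grey_j, order=2, n_iter=5, inlier_thresh=10.0):
    # Polynomial mapping from grey values of camera C_i to grey values of camera C_j.
    s = np.asarray(grey_i, dtype=float)
    d = np.asarray(grey_j, dtype=float)
    keep = np.ones(s.shape, dtype=bool)
    for _ in range(n_iter):
        coeffs = np.polyfit(s[keep], d[keep], order)              # least-squares fit
        keep = np.abs(np.polyval(coeffs, s) - d) < inlier_thresh  # drop gross outliers
    return coeffs

# Usage: mapped = np.polyval(fit_gvtf(pairs_i, pairs_j), grey_values_from_camera_i)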
@inproceedings{diva2:1054654,
author = {Conrad, Christian and Mester, Rudolf},
title = {{Learning Relative Photometric Differences of Pairs of Cameras}},
booktitle = {2015 12TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS)},
year = {2015},
publisher = {IEEE},
}
The online estimation of yaw, pitch, and roll of a moving vehicle is an important ingredient for systems which estimate egomotion and the 3D structure of the environment from video acquired in a moving vehicle. We present an approach to estimate these angular changes from monocular visual data, based on the fact that the motion of far-distant points does not depend on translation, but only on the current rotation of the camera. The presented approach does not require features (corners, edges, ...) to be extracted. It also estimates the frame-to-frame illumination changes in parallel, which largely stabilizes the estimation of image correspondences and motion vectors, the central entities needed for computing scene structure, distances, etc. The method is significantly less complex and much faster than a full egomotion computation from features, such as PTAM [6], but it can be used to provide motion priors and reduce search spaces for more complex methods which perform a complete analysis of egomotion and dynamic 3D structure of the scene in which a vehicle moves.
@inproceedings{diva2:971713,
author = {Barnada, Marc and Conrad, Christian and Bradler, Henry and Ochs, Matthias and Mester, Rudolf},
title = {{Estimation of Automotive Pitch, Yaw, and Roll using Enhanced Phase Correlation on Multiple Far-field Windows}},
booktitle = {2015 IEEE Intelligent Vehicles Symposium (IV)},
year = {2015},
pages = {481--486},
publisher = {IEEE},
}
The motion of a driving car is highly constrained and we claim that powerful predictors can be built that learn the typical egomotion statistics, and support the typical tasks of feature matching, tracking, and egomotion estimation. We analyze the statistics of the ground truth data given in the KITTI odometry benchmark sequences and confirm that a coordinated turn motion model, overlaid by moderate vibrations, is a very realistic model. We develop a predictor that is able to significantly reduce the uncertainty about the relative motion when a new image frame comes in. Such predictors can be used to steer the matching process from frame n to frame n + 1. We show that they can also be employed to detect outliers in the temporal sequence of egomotion parameters.
@inproceedings{diva2:955761,
author = {Bradler, Henry and Anne Wiegand, Birthe and Mester, Rudolf},
title = {{The Statistics of Driving Sequences - and what we can learn from them}},
booktitle = {2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOP (ICCVW)},
year = {2015},
pages = {106--114},
publisher = {IEEE},
}
Tracking keypoints through a video sequence is a crucial first step in the processing chain of many visual SLAM approaches. This paper presents a robust initialization method to provide the initial match for a keypoint tracker, from the first frame where a keypoint is detected to the second frame, that is, when no depth information is available yet. We deal explicitly with the case of long displacements. The starting position is obtained through an optimization that employs a distribution of motion priors based on pyramidal phase correlation and epipolar geometry constraints. Experiments on the KITTI dataset demonstrate the significant impact of applying a motion prior to the matching. We provide detailed comparisons to the state-of-the-art methods.
@inproceedings{diva2:935987,
author = {Fanani, Nolang and Barnada, Marc and Mester, Rudolf},
title = {{Motion Priors Estimation for Robust Matching Initialization in Automotive Applications}},
booktitle = {Advances in Visual Computing},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9474},
pages = {115--126},
publisher = {SPRINGER INT PUBLISHING AG},
}
Visual object tracking is a challenging computer vision problem with numerous real-world applications. This paper investigates the impact of convolutional features for the visual tracking problem. We propose to use activations from the convolutional layer of a CNN in discriminative correlation filter based tracking frameworks. These activations have several advantages compared to the standard deep features (fully connected layers). Firstly, they mitigate the need of task-specific fine-tuning. Secondly, they contain structural information crucial for the tracking problem. Lastly, these activations have low dimensionality. We perform comprehensive experiments on three benchmark datasets: OTB, ALOV300++ and the recently introduced VOT2015. Surprisingly, and in contrast to image classification, our results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers. Our results further show that the convolutional features provide improved results compared to standard handcrafted features. Finally, results comparable to state-of-the-art trackers are obtained on all three benchmark datasets.
@inproceedings{diva2:933006,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Convolutional Features for Correlation Filter Based Visual Tracking}},
booktitle = {2015 IEEE International Conference on Computer Vision Workshop (ICCVW)},
year = {2015},
pages = {621--629},
publisher = {IEEE conference proceedings},
}
During recent years, thermal cameras have decreased in both size and cost while improving image quality. The area of use for such cameras has expanded with many exciting applications, many of which require tracking of objects. While being subject to extensive research in the visual domain, tracking in thermal imagery has historically been of interest mainly for military purposes. The available thermal infrared datasets for evaluating methods addressing these problems are few and the ones that do are not challenging enough for today’s tracking algorithms. Therefore, we hereby propose a thermal infrared dataset for evaluation of short-term tracking methods. The dataset consists of 20 sequences which have been collected from multiple sources and the data format used is in accordance with the Visual Object Tracking (VOT) Challenge.
@inproceedings{diva2:925818,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{A thermal infrared dataset for evaluation of short-term tracking methods}},
booktitle = {Swedish Symposium on Image Analysis},
year = {2015},
series = {Svenska sällskapet för automatiserad bildanalys (SSBA)},
}
Reliable detection of obstacles at long range is crucial for the timely response to hazards by fast-moving safety-critical platforms like autonomous cars. We present a novel method for the joint detection and localization of distant obstacles using a stereo vision system on a moving platform. The approach is applicable to both static and moving obstacles and pushes the limits of detection performance as well as localization accuracy. The proposed detection algorithm is based on sound statistical tests using local geometric criteria which implicitly consider non-flat ground surfaces. To achieve maximum performance, it operates directly on image data instead of precomputed stereo disparity maps. A careful experimental evaluation on several datasets shows excellent detection performance and localization accuracy up to very large distances, even for small obstacles. We demonstrate a parallel implementation of the proposed system on a GPU that executes at real-time speeds.
@inproceedings{diva2:919357,
author = {Pinggera, Peter and Franke, Uwe and Mester, Rudolf},
title = {{High-Performance Long Range Obstacle Detection Using Stereo Vision}},
booktitle = {2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)},
year = {2015},
series = {IEEE International Conference on Intelligent Robots and Systems},
pages = {1308--1313},
publisher = {IEEE},
}
The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.
@inproceedings{diva2:917646,
author = {Felsberg, Michael and Berg, Amanda and Häger, Gustav and Ahlberg, Jörgen and Kristan, Matej and Matas, Jiri and Leonardis, Ales and Cehovin, Luka and Fernandez, Gustavo and Vojir, Tomas and Nebehay, Georg and Pflugfelder, Roman and Lukezic, Alan and Garcia-Martin, Alvaro and Saffari, Amir and Li, Ang and Solis Montero, Andres and Zhao, Baojun and Schmid, Cordelia and Chen, Dapeng and Du, Dawei and Shahbaz Khan, Fahad and Porikli, Fatih and Zhu, Gao and Zhu, Guibo and Lu, Hanqing and Kieritz, Hilke and Li, Hongdong and Qi, Honggang and Jeong, Jae-chan and Cho, Jae-il and Lee, Jae-Yeong and Zhu, Jianke and Li, Jiatong and Feng, Jiayi and Wang, Jinqiao and Kim, Ji-Wan and Lang, Jochen and Martinez, Jose M. and Xue, Kai and Alahari, Karteek and Ma, Liang and Ke, Lipeng and Wen, Longyin and Bertinetto, Luca and Danelljan, Martin and Arens, Michael and Tang, Ming and Chang, Ming-Ching and Miksik, Ondrej and Torr, Philip H S and Martin-Nieto, Rafael and Laganiere, Robert and Hare, Sam and Lyu, Siwei and Zhu, Song-Chun and Becker, Stefan and Hicks, Stephen L and Golodetz, Stuart and Choi, Sunglok and Wu, Tianfu and Hubner, Wolfgang and Zhao, Xu and Hua, Yang and Li, Yang and Lu, Yang and Li, Yuezun and Yuan, Zejian and Hong, Zhibin},
title = {{The Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge Results}},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
year = {2015},
series = {IEEE International Conference on Computer Vision. Proceedings},
pages = {639--651},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
In this work we derive a novel framework rendering measured distributions into approximated distributions of their mean. This is achieved by exploiting constraints imposed by the Gauss-Markov theorem from estimation theory, being valid for mono-modal Gaussian distributions. It formulates the relation between the variance of measured samples and the so-called standard error, being the standard deviation of their mean. However, multi-modal distributions are present in numerous image processing scenarios, e.g. local gray value or color distributions at object edges, or orientation or displacement distributions at occlusion boundaries in motion estimation or stereo. Our method not only aims at estimating the modes of these distributions together with their standard error, but at describing the whole multi-modal distribution. We utilize the method of channel representation, a kind of soft histogram also known as population codes, to represent distributions in a non-parametric, generic fashion. Here we apply the proposed scheme to general mono- and multimodal Gaussian distributions to illustrate its effectiveness and compliance with the Gauss-Markov theorem.
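For concreteness, a minimal sketch of the channel representation (soft histogram) referred to above, using the common cos^2 basis; the number of channels and the value range are illustrative, and this is the plain encoding, not the proposed sharpening scheme:

import numpy as np

def channel_encode(samples, n_channels=11, lo=0.0, hi=1.0):
    # Overlapping cos^2 channels: each sample activates (at most) three neighbouring channels.
    centers = np.linspace(lo, hi, n_channels)
    spacing = centers[1] - centers[0]
    dist = np.abs(np.asarray(samples, dtype=float)[:, None] - centers[None, :]) / spacing
    weights = np.where(dist < 1.5, np.cos(np.pi * dist / 3.0) ** 2, 0.0)
    # The summed coefficients describe the whole (possibly multi-modal) distribution.
    return weights.sum(axis=0)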
@inproceedings{diva2:904084,
author = {Åström, Freddie and Felsberg, Michael and Scharr, Hanno},
title = {{Adaptive sharpening of multimodal distributions}},
booktitle = {Colour and Visual Computing Symposium (CVCS), 2015},
year = {2015},
publisher = {IEEE},
}
Visual odometry is one of the most active topics in computer vision. The automotive industry is particularly interested in this field due to the appeal of achieving a high degree of accuracy with inexpensive sensors such as cameras. The best results on this task are currently achieved by systems based on a calibrated stereo camera rig, whereas monocular systems are generally lagging behind in terms of performance. We hypothesise that this is due to stereo visual odometry being an inherently easier problem, rather than due to a higher quality of the state-of-the-art stereo-based algorithms. Under this hypothesis, techniques developed for monocular visual odometry systems would be, in general, more refined and robust since they have to deal with an intrinsically more difficult problem. In this work we present a novel stereo visual odometry system for automotive applications based on advanced monocular techniques. We show that the generalization of these techniques to the stereo case results in a significant improvement of the robustness and accuracy of stereo-based visual odometry. We support our claims with the system's results on the well-known KITTI benchmark, where it achieves the top rank among vision-only systems.
@inproceedings{diva2:859674,
author = {Persson, Mikael and Piccini, Tommaso and Felsberg, Michael and Mester, Rudolf},
title = {{Robust Stereo Visual Odometry from Monocular Techniques}},
booktitle = {2015 IEEE Intelligent Vehicles Symposium (IV)},
year = {2015},
series = {Intelligent Vehicle, IEEE Symposium},
pages = {686--691},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model.
We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.
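In our notation, a hedged sketch of the spatially regularized learning problem described above (a paraphrase, not necessarily the paper's exact formulation): the usual DCF ridge penalty is replaced by a spatial weight w that grows towards the borders of the training patch,

\varepsilon(f) = \sum_{k=1}^{t} \alpha_k \Big\| \sum_{l=1}^{d} x_k^{l} * f^{l} - y_k \Big\|^2 + \sum_{l=1}^{d} \big\| w \cdot f^{l} \big\|^2

where x_k are training samples with weights \alpha_k, y_k are the desired (Gaussian) confidence outputs and the element-wise penalty w suppresses filter coefficients outside the target region. This is what allows the filter to be trained on much larger image regions, and thus more negative data, without the periodic boundary effects corrupting the model.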
@inproceedings{diva2:857265,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Learning Spatially Regularized Correlation Filters for Visual Tracking}},
booktitle = {Proceedings of the International Conference in Computer Vision (ICCV), 2015},
year = {2015},
series = {IEEE International Conference on Computer Vision. Proceedings},
pages = {4310--4318},
publisher = {IEEE Computer Society},
}
Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, the semantic pyramids approach [1] for pose normalization has been shown to provide excellent results for gender and action recognition. The performance of the semantic pyramids approach relies on robust image description and is therefore limited by the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs), or deep features, have been shown to improve the performance over conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attribute classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on the Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets, respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to the best methods in the literature.
@inproceedings{diva2:857230,
author = {Khan, Fahad Shahbaz and Rao, Muhammad Anwer and van de Weijer, Joost and Felsberg, Michael and Laaksonen, Jorma},
title = {{Deep Semantic Pyramids for Human Attributes and Action Recognition}},
booktitle = {Image Analysis},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9127},
pages = {341--353},
publisher = {Springer},
}
@inproceedings{diva2:856870,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Online learning of autonomous driving using channel representations of multi-modal joint distributions}},
booktitle = {Proceedings of SSBA, Swedish Symposium on Image Analysis, 2015},
year = {2015},
publisher = {Swedish Society for automated image analysis},
}
Panorama stitching of sparsely structured scenes is an open research problem. In this setting, feature-based image alignment methods often fail due to shortage of distinct image features. Instead, direct image alignment methods, such as those based on phase correlation, can be applied. In this paper we investigate correlation-based image alignment techniques for panorama stitching of sparsely structured scenes. We propose a novel image alignment approach based on discriminative correlation filters (DCF), which has recently been successfully applied to visual tracking. Two versions of the proposed DCF-based approach are evaluated on two real and one synthetic panorama dataset of sparsely structured indoor environments. All three datasets consist of images taken on a tripod rotating 360 degrees around the vertical axis through the optical center. We show that the proposed DCF-based methods outperform phase correlation-based approaches on these datasets.
@inproceedings{diva2:856868,
author = {Meneghetti, Giulia and Danelljan, Martin and Felsberg, Michael and Nordberg, Klas},
title = {{Image alignment for panorama stitching in sparsely structured environments}},
booktitle = {Image Analysis. SCIA 2015.},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9127},
pages = {428--439},
publisher = {Springer},
}
An open issue in multiple view geometry and structure from motion, applied to real life scenarios, is the sparsity of the matched key-points and of the reconstructed point cloud. We present an approach that can significantly improve the density of measured displacement vectors in a sparse matching or tracking setting, exploiting the partial information of the motion field provided by linear oriented image patches (edgels). Our approach assumes that the epipolar geometry of an image pair already has been computed, either in an earlier feature-based matching step, or by a robustified differential tracker. We exploit key-points of a lower order, edgels, which cannot provide a unique 2D matching, but can be employed if a constraint on the motion is already given. We present a method to extract edgels, which can be effectively tracked given a known camera motion scenario, and show how a constrained version of the Lucas-Kanade tracking procedure can efficiently exploit epipolar geometry to reduce the classical KLT optimization to a 1D search problem. The potential of the proposed methods is shown by experiments performed on real driving sequences.
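To make the dimensionality reduction concrete, the sketch below scans template positions along the epipolar line by brute force; the paper instead solves a constrained Lucas-Kanade problem, so this is only a stand-in with illustrative names and parameters (x is the keypoint as (column, row), F the fundamental matrix):

import numpy as np

def epipolar_line(F, x):
    # Epipolar line l = F x in the second image, scaled so that (l0, l1) is a unit normal.
    l = F @ np.array([x[0], x[1], 1.0])
    return l / np.linalg.norm(l[:2])

def match_along_line(img2, template, F, x, search_px=30):
    l = epipolar_line(F, x)
    p0 = np.array(x, dtype=float)
    p0 -= (l[0] * p0[0] + l[1] * p0[1] + l[2]) * l[:2]   # project the start point onto the line
    direction = np.array([-l[1], l[0]])                  # unit direction along the line
    h, w = template.shape
    best_pos, best_err = None, np.inf
    for t in range(-search_px, search_px + 1):           # the search is now one-dimensional
        cx, cy = np.round(p0 + t * direction).astype(int)
        if cy < h or cx < w or cy + h >= img2.shape[0] or cx + w >= img2.shape[1]:
            continue
        cand = img2[cy - h // 2:cy + h // 2 + 1, cx - w // 2:cx + w // 2 + 1]
        if cand.shape != template.shape:
            continue
        err = np.sum((cand.astype(float) - template.astype(float)) ** 2)
        if err < best_err:
            best_pos, best_err = (cx, cy), err
    return best_pos, best_err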
@inproceedings{diva2:856772,
author = {Piccini, Tommaso and Persson, Mikael and Nordberg, Klas and Felsberg, Michael and Mester, Rudolf},
title = {{Good Edgels to Track: Beating the Aperture Problem with Epipolar Geometry}},
booktitle = {COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {8926},
pages = {652--664},
publisher = {Elsevier},
}
In recent years, short-term single-object tracking has emerged as a popular research topic, as it constitutes the core of more general tracking systems. Many such tracking methods are based on matching a part of the image with a template that is learnt online and represented by, for example, a correlation filter or a distribution field. In order for such a tracker to be able to not only find the position, but also the scale, of the tracked object in the next frame, some kind of scale estimation step is needed. This step is sometimes separate from the position estimation step, but is nevertheless jointly evaluated in de facto benchmarks. However, for practical as well as scientific reasons, the scale estimation step should be evaluated separately – for example, there might in certain situations be other methods more suitable for the task. In this paper, we describe an evaluation method for scale estimation in template-based short-term single-object tracking, and evaluate two state-of-the-art tracking methods where estimation of scale and position are separable.
@inproceedings{diva2:853786,
author = {Ahlberg, Jörgen and Berg, Amanda},
title = {{Evaluating Template Rescaling in Short-Term Single-Object Tracking}},
booktitle = {17th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), Karlsruhe, Germany, August 25, 2015},
year = {2015},
publisher = {IEEE},
}
The Visual Object Tracking challenge 2014, VOT2014, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 38 trackers are presented. The number of tested trackers makes VOT 2014 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2014 challenge that go beyond its VOT2013 predecessor are introduced: (i) a new VOT2014 dataset with full annotation of targets by rotated bounding boxes and per-frame attribute, (ii) extensions of the VOT2013 evaluation methodology, (iii) a new unit for tracking speed assessment less dependent on the hardware and (iv) the VOT2014 evaluation toolkit that significantly speeds up execution of experiments. The dataset, the evaluation kit as well as the results are publicly available at the challenge website (http://votchallenge.net).
@inproceedings{diva2:850764,
author = {Kristan, Matej and Pflugfelder, Roman P. and Leonardis, Ales and Matas, Jiri and Cehovin, Luka and Nebehay, Georg and Vojir, Tomas and Fernandez, Gustavo and Lukezi, Alan and Dimitriev, Aleksandar and Petrosino, Alfredo and Saffari, Amir and Li, Bo and Han, Bohyung and Heng, CherKeng and Garcia, Christophe and Pangersic, Dominik and Häger, Gustav and Khan, Fahad Shahbaz and Oven, Franci and Possegger, Horst and Bischof, Horst and Nam, Hyeonseob and Zhu, Jianke and Li, JiJia and Choi, Jin Young and Choi, Jin-Woo and Henriques, Joao F. and van de Weijer, Joost and Batista, Jorge and Lebeda, Karel and Ofjall, Kristoffer and Yi, Kwang Moo and Qin, Lei and Wen, Longyin and Maresca, Mario Edoardo and Danelljan, Martin and Felsberg, Michael and Cheng, Ming-Ming and Torr, Philip and Huang, Qingming and Bowden, Richard and Hare, Sam and YueYing Lim, Samantha and Hong, Seunghoon and Liao, Shengcai and Hadfield, Simon and Li, Stan Z. and Duffner, Stefan and Golodetz, Stuart and Mauthner, Thomas and Vineet, Vibhav and Lin, Weiyao and Li, Yang and Qi, Yuankai and Lei, Zhen and Niu, ZhiHeng},
title = {{The Visual Object Tracking VOT2014 Challenge Results}},
booktitle = {COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {8926},
pages = {191--217},
publisher = {Springer},
}
Visual object tracking is a classical, but still open research problem in computer vision, with many real-world applications. The problem is challenging due to several factors, such as illumination variation, occlusions, camera motion and appearance changes. Such problems can be alleviated by constructing robust, discriminative and computationally efficient visual features. Recently, biologically-inspired channel representations [Felsberg et al., PAMI 2006] have been shown to provide promising results in many applications ranging from autonomous driving to visual tracking.
This paper investigates the problem of coloring channel representations for visual tracking. We evaluate two strategies, channel concatenation and channel product, to construct channel coded color representations. The proposed channel coded color representations are generic and can be used beyond tracking.
Experiments are performed on 41 challenging benchmark videos. Our experiments clearly suggest that a careful selection of color features, together with an optimal fusion strategy, significantly outperforms the standard luminance-based channel representation. Finally, we show promising results compared to state-of-the-art tracking methods in the literature.
@inproceedings{diva2:850742,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad Shahbaz and Felsberg, Michael},
title = {{Coloring Channel Representations for Visual Tracking}},
booktitle = {19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9127},
pages = {117--129},
publisher = {Springer},
}
Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differ between the visual and thermal benchmarks, confirming the need for the new benchmark.
@inproceedings{diva2:850688,
author = {Berg, Amanda and Ahlberg, Jörgen and Felsberg, Michael},
title = {{A Thermal Object Tracking Benchmark}},
booktitle = {12th IEEE International Conference on Advanced Video- and Signal-based Surveillance, Karlsruhe, Germany, August 25-28 2015},
year = {2015},
publisher = {IEEE},
}
We propose a technique for joint calibration of a wide-angle rolling shutter camera (e.g. a GoPro) and an externally mounted gyroscope. The calibrated parameters are time scaling and offset, relative pose between gyroscope and camera, and gyroscope bias. The parameters are found using non-linear least squares minimisation with the symmetric transfer error as cost function. The primary contribution is methods for robust initialisation of the relative pose and time offset, which are essential for convergence. We also introduce a robust error norm to handle outliers. This results in a technique that works with general video content and does not require any specific setup or calibration patterns. We apply our method to stabilisation of videos recorded by a rolling shutter camera with a rigidly attached gyroscope. After recording, the gyroscope and camera are jointly calibrated using the recorded video itself. The recorded video can then be stabilised using the calibrated parameters. We evaluate the technique on video sequences with varying difficulty and motion frequency content. The experiments demonstrate that our method can be used to produce high quality stabilised videos even under difficult conditions, and that the proposed initialisation ends up within the basin of attraction. We also show that a residual based on the symmetric transfer error is more accurate than residuals based on the recently proposed epipolar plane normal coplanarity constraint.
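To illustrate why a dedicated initialisation is needed, a common coarse initialiser for the time offset is to cross-correlate the gyroscope rotation-rate magnitude with a rotation-rate magnitude derived from the video; the sketch below assumes both signals have already been resampled to a common rate and is not necessarily the procedure used in the paper:

import numpy as np

def init_time_offset(gyro_rate_mag, video_rate_mag, sample_dt):
    # Coarse time offset (in seconds) that maximises the cross-correlation of the two signals.
    a = gyro_rate_mag - np.mean(gyro_rate_mag)
    b = video_rate_mag - np.mean(video_rate_mag)
    corr = np.correlate(a, b, mode='full')
    lag = np.argmax(corr) - (len(b) - 1)   # lag in samples; sign convention depends on which signal leads
    return lag * sample_dt

An offset found this way is only good to roughly one sample, which is exactly why it is suited as an initial value for the subsequent non-linear least squares refinement.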
@inproceedings{diva2:841497,
author = {Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik},
title = {{Gyroscope-based video stabilisation with auto-calibration}},
booktitle = {2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA)},
year = {2015},
series = {IEEE International Conference on Robotics and Automation ICRA},
pages = {2090--2097},
}
We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at a first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel data set.
@inproceedings{diva2:824491,
author = {Berg, Amanda and Öfjäll, Kristoffer and Ahlberg, Jörgen and Felsberg, Michael},
title = {{Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera}},
booktitle = {Image Analysis},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9127},
pages = {492--503},
publisher = {Springer},
}
In this article we provide an overview of color name applications in computer vision. Color names are linguistic labels which humans use to communicate color. Computational color naming learns a mapping from pixel values to color names. In recent years color names have been applied to a wide variety of computer vision applications, including image classification, object recognition, texture classification, visual tracking and action recognition. Here we provide an overview of these results, which show that in general color names outperform photometric invariants as a color representation.
@inproceedings{diva2:818099,
author = {van de Weijer, Joost and Khan, Fahad},
title = {{An Overview of Color Name Applications in Computer Vision}},
booktitle = {COMPUTATIONAL COLOR IMAGING, CCIW 2015},
year = {2015},
pages = {16--22},
publisher = {Springer Verlag (Germany)},
}
Micro unmanned aerial vehicles are becoming increasingly interesting for aiding and collaborating with human agents in a myriad of applications, but in particular they are useful for monitoring inaccessible or dangerous areas. In order to interact with and monitor humans, these systems need robust and real-time computer vision subsystems that can detect and follow persons.
In this work, we propose a low-level active vision framework to accomplish these challenging tasks. Based on the LinkQuad platform, we present a system study that implements the detection and tracking of people under fully autonomous flight conditions, keeping the vehicle within a certain distance of a person. The framework integrates state-of-the-art methods from visual detection and tracking, Bayesian filtering, and AI-based control. The results from our experiments clearly suggest that the proposed framework performs real-time detection and tracking of persons in complex scenarios.
@inproceedings{diva2:796839,
author = {Danelljan, Martin and Khan, Fahad Shahbaz and Felsberg, Michael and Granström, Karl and Heintz, Fredrik and Rudol, Piotr and Wzorek, Mariusz and Kvarnström, Jonas and Doherty, Patrick},
title = {{A Low-Level Active Vision Framework for Collaborative Unmanned Aircraft Systems}},
booktitle = {COMPUTER VISION - ECCV 2014 WORKSHOPS, PT I},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {8925},
pages = {223--237},
publisher = {Springer Publishing Company},
}
We present a novel approach for segmenting different motions from 3D trajectories. Our approach uses the theory of transformation groups to derive a set of invariants of 3D points located on the same rigid object. These invariants are inexpensive to calculate, involving primarily QR factorizations of small matrices. The invariants are easily converted into a set of robust motion affinities and with the use of a local sampling scheme and spectral clustering, they can be incorporated into a highly efficient motion segmentation algorithm. We have also captured a new multi-object 3D motion dataset, on which we have evaluated our approach, and compared against state-of-the-art competing methods from literature. Our results show that our approach outperforms all methods while being robust to perspective distortions and degenerate configurations.
@inproceedings{diva2:789181,
author = {Zografos, Vasileios and Lenz, Reiner and Ringaby, Erik and Felsberg, Michael and Nordberg, Klas},
title = {{Fast segmentation of sparse 3D point trajectories using group theoretical invariants}},
booktitle = {COMPUTER VISION - ACCV 2014, PT IV},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9006},
pages = {675--691},
publisher = {Springer},
}
There are three major issues for visual object trackers: model representation, search, and model update. In this paper we address the last two issues for a specific model representation, grid-based distribution models by means of channel-based distribution fields. In particular, we address the comparison part of searching. Previous work in the area has used standard methods for comparison and update, not exploiting all the possibilities of the representation. In this work we propose two comparison schemes and one update scheme adapted to the distribution model. The proposed schemes significantly improve the accuracy and robustness on the Visual Object Tracking (VOT) 2014 Challenge dataset.
@inproceedings{diva2:787563,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Weighted Update and Comparison for Channel-Based Distribution Field Tracking}},
booktitle = {COMPUTER VISION - ECCV 2014 WORKSHOPS, PT II},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {8926},
pages = {218--231},
publisher = {Springer},
}
We present a novel variational approach to a tensor-based total variation formulation which is called gradient energy total variation, GETV. We introduce the gradient energy tensor into the GETV and show that the corresponding Euler-Lagrange (E-L) equation is a tensor-based partial differential equation of total variation type. Furthermore, we give a proof which shows that GETV is a convex functional. This approach, in contrast to the commonly used structure tensor, enables a formal derivation of the corresponding E-L equation. Experimental results suggest that GETV compares favourably to other state of the art variational denoising methods such as extended anisotropic diffusion (EAD) and total variation (TV) for gray-scale and colour images.
@inproceedings{diva2:764721,
author = {Åström, Freddie and Baravdish, George and Felsberg, Michael},
title = {{A Tensor Variational Formulation of Gradient Energy Total Variation}},
booktitle = {ENERGY MINIMIZATION METHODS IN COMPUTER VISION AND PATTERN RECOGNITION, EMMCVPR 2015},
year = {2015},
series = {Lecture Notes in Computer Science},
pages = {307--320},
publisher = {Springer Berlin/Heidelberg},
}
Many image processing methods such as corner detection, optical flow and iterative enhancement make use of image tensors. Generally, these tensors are estimated using the structure tensor. In this work we show that the gradient energy tensor can be used as an alternative to the structure tensor in several cases. We apply the gradient energy tensor to common image processing applications such as corner detection, optical flow and image enhancement. Our experimental results suggest that the gradient energy tensor enables real-time tensor-based image enhancement using the graphical processing unit (GPU), and we obtain a 40% increase in frame rate without loss of image quality.
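For contrast, a standard structure-tensor baseline (the default estimator the abstract compares against) together with a Harris-style corner response built from it; a minimal sketch with illustrative smoothing parameters, not the gradient energy tensor itself:

import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(img, grad_sigma=1.0, window_sigma=2.0):
    # Outer product of Gaussian-derivative gradients, averaged over a local window.
    ix = gaussian_filter(img.astype(float), grad_sigma, order=(0, 1))
    iy = gaussian_filter(img.astype(float), grad_sigma, order=(1, 0))
    jxx = gaussian_filter(ix * ix, window_sigma)
    jxy = gaussian_filter(ix * iy, window_sigma)
    jyy = gaussian_filter(iy * iy, window_sigma)
    return jxx, jxy, jyy

def corner_response(img, k=0.04):
    # Harris-style response det(J) - k * trace(J)^2 from the tensor components.
    jxx, jxy, jyy = structure_tensor(img)
    return jxx * jyy - jxy ** 2 - k * (jxx + jyy) ** 2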
@inproceedings{diva2:758278,
author = {Åström, Freddie and Felsberg, Michael},
title = {{On the Choice of Tensor Estimation for Corner Detection, Optical Flow and Denoising}},
booktitle = {COMPUTER VISION - ACCV 2014 WORKSHOPS, PT II},
year = {2015},
series = {Lecture Notes in Computer Science},
volume = {9009},
pages = {16--30},
publisher = {Springer},
}
In this paper we address the problem of automatically detecting leakages in underground pipes of district heating networks from images captured by an airborne thermal camera. The basic idea is to classify each relevant image region as a leakage if its temperature exceeds a threshold. This simple approach yields a significant number of false positives. We propose to address this issue by machine learning techniques and provide extensive experimental analysis on real-world data. The results show that this postprocessing step significantly improves the usefulness of the system.
@inproceedings{diva2:925813,
author = {Berg, Amanda and Ahlberg, Jörgen},
title = {{Classifying district heating network leakages in aerial thermal imagery}},
booktitle = {Swedish Symposium on Image Analysis},
year = {2014},
series = {Svenska sällskapet för automatiserad bildanalys (SSBA)},
}
We discuss matching measures (scores and residuals) for comparing image patches under unknown affine photometric (=intensity) transformations. In contrast to existing methods, we derive a fully symmetric matching measure which reflects the fact that both copies of the signal are affected by measurement errors (noise), not only one. As it turns out, this evolves into an eigensystem problem; however a simple direct solution for all entities of interest can be given. We strongly advocate for constraining the estimated gain ratio and the estimated mean value offset to realistic ranges, thus preventing the matching scheme from locking into unrealistic correspondences.
@inproceedings{diva2:853439,
author = {Mester, Rudolf and Conrad, Christian},
title = {{When patches match - a statistical view on matching under illumination variation}},
booktitle = {2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)},
year = {2014},
series = {International Conference on Pattern Recognition},
pages = {4364--4369},
publisher = {IEEE COMPUTER SOC},
}
The present paper analyzes some previously unexplored aspects of motion estimation that are fundamental both for discrete block matching and for differential optical flow approaches à la Lucas-Kanade. It aims at providing a complete estimation-theoretic approach that makes the assumptions about noisy observations of samples from a continuous signal of a certain class explicit. It turns out that motion estimation is a combination of simultaneously estimating the true underlying continuous signal and optimizing the displacement between two hypothetical copies of this unknown signal. Practical schemes such as the current variants of Lucas-Kanade are just approximations to the fundamental estimation problem identified in the present paper. Derivatives appear as derivatives of the continuous signal representation kernels, not as ad hoc discrete derivative masks. The formulation via an explicit signal space defined by kernels is a precondition for analyzing, e.g., the convergence range of iterative displacement estimation procedures, and for systematically choosing preconditioning filters. The paper sets the stage for further in-depth analysis of some fundamental issues that have so far been overlooked or ignored in motion analysis.
@inproceedings{diva2:824773,
author = {Mester, Rudolf},
title = {{Motion Estimation Revisited: an Estimation-Theoretic Approach}},
booktitle = {2014 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI 2014)},
year = {2014},
pages = {113--116},
publisher = {IEEE},
}
Recognizing human actions in still images is a challenging problem in computer vision due to significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image. Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple-scales are binned in a single histogram. We argue that such a scale invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
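A minimal sketch of the relative scale-coding idea: instead of one scale-invariant histogram, visual words are binned into separate small/medium/large histograms that are concatenated; the band thresholds and the per-band normalisation are illustrative choices, not the paper's exact settings:

import numpy as np

def scale_coded_bow(word_ids, feature_scales, box_height, vocab_size,
                    small_frac=0.05, large_frac=0.15):
    # Scale bands defined relative to the person bounding box (the second approach above).
    hists = np.zeros((3, vocab_size))
    for w, s in zip(word_ids, feature_scales):
        rel = s / float(box_height)
        band = 0 if rel < small_frac else (2 if rel > large_frac else 1)
        hists[band, w] += 1.0
    hists /= np.maximum(hists.sum(axis=1, keepdims=True), 1.0)   # per-band L1 normalisation
    return hists.ravel()   # concatenated small/medium/large representation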
@inproceedings{diva2:801569,
author = {Khan, Fahad and Van, De Weijer J. and Bagdanov, A.D. and Felsberg, Michael},
title = {{Scale coding bag-of-words for action recognition}},
booktitle = {Pattern Recognition (ICPR), 2014 22nd International Conference on},
year = {2014},
series = {International Conference on Pattern Recognition},
pages = {1514--1519},
publisher = {Institute of Electrical and Electronics Engineers Inc.},
}
We describe a system for active stabilization of cameras mounted on highly dynamic robots. To focus on careful performance evaluation of the stabilization algorithm, we use a camera mounted on a robotic test platform that can have unknown perturbations in the horizontal plane, a commonly occurring scenario in mobile robotics. We show that the camera can be effectively stabilized using an inertial sensor and a single additional motor, without a joint position sensor. The algorithm uses an adaptive controller based on a model of the vertebrate Cerebellum for velocity stabilization, with additional drift correction. We have also developed a resolution adaptive retinal slip algorithm that is robust to motion blur.
We evaluated the performance quantitatively using another high speed robot to generate repeatable sequences of large and fast movements that a gaze stabilization system can attempt to counteract. Thanks to the high-accuracy repeatability, we can make a fair comparison of algorithms for gaze stabilization. We show that the resulting system can reduce camera image motion to about one pixel per frame on average even when the platform is rotated at 200 degrees per second. As a practical application, we also demonstrate how the common task of face detection benefits from active gaze stabilization.
@inproceedings{diva2:789200,
author = {Lesmana, Martin and Landgren, Axel and Forss\'{e}n, Per-Erik and Pai, Dinesh K.},
title = {{Active Gaze Stabilization}},
booktitle = {Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing},
year = {2014},
pages = {81:1--81:8},
publisher = {ACM Digital Library},
}
@inproceedings{diva2:787561,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Online Learning and Mode Switching for Autonomous Driving from Demonstration}},
booktitle = {Proceedings of SSBA, Swedish Symposium on Image Analysis, 2014},
year = {2014},
}
Robust scale estimation is a challenging problem in visual object tracking. Most existing methods fail to handle large scale variations in complex image sequences. This paper presents a novel approach for robust scale estimation in a tracking-by-detection framework. The proposed approach works by learning discriminative correlation filters based on a scale pyramid representation. We learn separate filters for translation and scale estimation, and show that this improves the performance compared to an exhaustive scale search. Our scale estimation approach is generic as it can be incorporated into any tracking method with no inherent scale estimation.
Experiments are performed on 28 benchmark sequences with significant scale variations. Our results show that the proposed approach significantly improves the performance by 18.8 % in median distance precision compared to our baseline. Finally, we provide both quantitative and qualitative comparisons of our approach with state-of-the-art trackers in the literature. The proposed method is shown to outperform the best existing tracker by 16.6 % in median distance precision, while operating in real time.
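The correlation filters themselves have a simple closed form in the Fourier domain; below is a hedged single-channel MOSSE-style sketch of that closed form. The paper learns separate multi-channel filters for translation and for a one-dimensional scale pyramid, which this simplified version does not reproduce:

import numpy as np

def learn_dcf(train_patch, desired_response, lam=1e-2):
    # Closed-form single-channel DCF: Hbar = (G * conj(F)) / (F * conj(F) + lambda).
    F = np.fft.fft2(train_patch.astype(float))
    G = np.fft.fft2(desired_response)              # e.g. a narrow Gaussian peak
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def apply_dcf(Hbar, test_patch):
    # Correlation response; its argmax gives the translation (or the scale index for a 1D scale filter).
    Z = np.fft.fft2(test_patch.astype(float))
    response = np.real(np.fft.ifft2(Z * Hbar))
    return np.unravel_index(np.argmax(response), response.shape)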
@inproceedings{diva2:785778,
author = {Danelljan, Martin and Häger, Gustav and Khan, Fahad and Felsberg, Michael},
title = {{Accurate Scale Estimation for Robust Visual Tracking}},
booktitle = {Proceedings of the British Machine Vision Conference 2014},
year = {2014},
publisher = {BMVA Press},
}
District heating pipes are known to degenerate with time and in some cities the pipes have been used for several decades. Due to bad insulation or cracks, energy or media leakages might appear. This paper presents a complete system for large-scale monitoring of district heating networks, including methods for detection, classification and temporal characterization of (potential) leakages. The system analyses thermal infrared images acquired by an aircraft-mounted camera, detecting the areas for which the pixel intensity is higher than normal. Unfortunately, the system also finds many false detections, i.e., warm areas that are not caused by media or energy leakages. Thus, in order to reduce the number of false detections we describe a machine learning method to classify the detections. The results, based on data from three district heating networks show that we can remove more than half of the false detections. Moreover, we also propose a method to characterize leakages over time, that is, repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss.
@inproceedings{diva2:776415,
author = {Berg, Amanda and Ahlberg, Jörgen},
title = {{Classification and temporal analysis of district heating leakages in thermal images}},
booktitle = {Proceedings of The 14th International Symposium on District Heating and Cooling},
year = {2014},
}
We address the problem of reducing the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps. First, we use a building segmentation scheme in order to remove detections on buildings. Second, we extract features from the detections and use a Random forest classifier on the remaining detections. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system.
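The second step above, a feature-based Random Forest classifier that prunes the remaining false alarms, maps directly onto standard tooling. The sketch below uses scikit-learn; the feature names and the 0.5 decision threshold are assumptions for illustration, not values from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def prune_false_alarms(X_train, y_train, X_new, threshold=0.5):
    """Keep only detections that a Random Forest believes are true leakages.

    X_train : per-detection features (e.g. region area, mean/max temperature, contrast)
    y_train : labels, 1 = true leakage, 0 = false alarm
    X_new   : features of new, unlabelled detections
    """
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    keep = clf.predict_proba(X_new)[:, 1] > threshold
    return np.flatnonzero(keep)   # indices of detections to keep
```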
@inproceedings{diva2:776248,
author = {Berg, Amanda and Ahlberg, Jörgen},
title = {{Classification of leakage detections acquired by airborne thermography of district heating networks}},
booktitle = {2014 8th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS)},
year = {2014},
series = {IAPR Workshop on Pattern Recognition in Remote Sensing},
pages = {1--4},
publisher = {IEEE},
}
Estimating the position of a 3-dimensional world point given its 2-dimensional projections in a set of images is a key component in numerous computer vision systems. There are several methods dealing with this problem, ranging from sub-optimal linear least-squares triangulation in two views, to finding the world point that minimizes the L2 reprojection error in three views. The latter yields the statistically optimal estimate under the assumption of Gaussian noise. In this paper we present a solution to optimal triangulation in three views. The standard approach to the three-view triangulation problem is to find a closed-form solution. In contrast, we propose a new method based on an iterative scheme. The method is rigorously tested on both synthetic and real image data with corresponding ground truth, on a midrange desktop PC and a Raspberry Pi, a low-end mobile platform. We are able to improve the precision achieved by the closed-form solvers and reach a speed-up of two orders of magnitude compared to the current state-of-the-art solver. In numbers, this amounts to around 300K triangulations per second on the PC and 30K triangulations per second on the Raspberry Pi.
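For reference, the sub-optimal linear least-squares triangulation mentioned as a baseline above can be written in a few lines. This is a standard DLT-style sketch for an arbitrary number of views, not the iterative three-view solver proposed in the paper.

```python
import numpy as np

def triangulate_linear(projections, points_2d):
    """Linear least-squares (DLT) triangulation of one 3D point.

    projections : list of 3x4 camera matrices P_i
    points_2d   : list of (x, y) observations, one per camera
    Each observation contributes two rows, x*P[2] - P[0] and y*P[2] - P[1].
    """
    A = []
    for P, (x, y) in zip(projections, points_2d):
        A.append(x * P[2] - P[0])
        A.append(y * P[2] - P[1])
    A = np.asarray(A)
    # Homogeneous solution: right singular vector of the smallest singular value.
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]
```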
@inproceedings{diva2:756974,
author = {Hedborg, Johan and Robinson, Andreas and Felsberg, Michael},
title = {{Robust Three-View Triangulation Done Fast}},
booktitle = {Proceedings: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2014},
year = {2014},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops},
pages = {152--157},
publisher = {IEEE},
}
While autonomous driving systems accumulate more and more sensors as well as highly specialized visual features and engineered solutions, the human visual system provides evidence that visual input and simple low-level image features are sufficient for successful driving. In this paper we propose extensions (non-linear update and coherence weighting) to one of the simplest biologically inspired learning schemes, Hebbian learning. We show that this is sufficient for online learning of visual autonomous driving, where the system learns to directly map low-level image features to control signals. After the initial training period, the system seamlessly continues autonomously. This extended Hebbian algorithm, qHebb, has constant bounds on time and memory complexity for training and evaluation, independent of the number of training samples presented to the system. Further, the proposed algorithm compares favorably to state-of-the-art engineered batch learning algorithms.
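The plain Hebbian associator that qHebb builds on can be sketched as an outer-product update from feature vectors to control signals. This minimal baseline sketch does not reproduce the non-linear update or the coherence weighting that the paper contributes.

```python
import numpy as np

class HebbianMapping:
    """Minimal linear Hebbian associator from image features to control signals."""

    def __init__(self, n_features, n_controls, learning_rate=0.01):
        self.W = np.zeros((n_controls, n_features))
        self.lr = learning_rate

    def train(self, features, control):
        # Plain Hebbian outer-product update: strengthen co-active units.
        self.W += self.lr * np.outer(control, features)

    def predict(self, features):
        return self.W @ features
```

Note that both `train` and `predict` cost a fixed amount of time and memory per call, regardless of how many samples have been seen, which is the complexity property the abstract refers to.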
@inproceedings{diva2:750039,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Biologically Inspired Online Learning of Visual Autonomous Driving}},
booktitle = {Proceedings British Machine Vision Conference 2014},
year = {2014},
pages = {137--156},
publisher = {BMVA Press},
}
We propose an algorithm that can capture sharp, low-noise images in low-light conditions on a hand-held smartphone. We make use of the recent ability to acquire bursts of high resolution images on high-end models such as the iPhone5s. Frames are aligned, or stacked, using rolling shutter correction, based on motion estimated from the built-in gyro sensors and image feature tracking. After stacking, the images may be combined, using e.g. averaging to produce a sharp, low-noise photo. We have tested the algorithm on a variety of different scenes, using several different smartphones. We compare our method to denoising, direct stacking, as well as a global-shutter based stacking, with favourable results.
@inproceedings{diva2:729193,
author = {Ringaby, Erik and Forss\'{e}n, Per-Erik},
title = {{A Virtual Tripod for Hand-held Video Stacking on Smartphones}},
booktitle = {2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL PHOTOGRAPHY (ICCP)},
year = {2014},
series = {IEEE International Conference on Computational Photography},
publisher = {IEEE},
}
Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features, when combined with luminance, have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient, and possess a certain amount of photometric invariance while maintaining high discriminative power.
This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.
@inproceedings{diva2:711538,
author = {Danelljan, Martin and Shahbaz Khan, Fahad and Felsberg, Michael and van de Weijer, Joost},
title = {{Adaptive Color Attributes for Real-Time Visual Tracking}},
booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014},
year = {2014},
series = {IEEE Conference on Computer Vision and Pattern Recognition. Proceedings},
pages = {1090--1097},
publisher = {IEEE Computer Society},
}
In this work we propose a novel non-linear diffusion filtering approach for images based on their channel representation. To derive the diffusion update scheme we formulate a novel energy functional using a soft-histogram representation of image pixel neighborhoods obtained from the channel encoding. The resulting Euler-Lagrange equation yields a non-linear robust diffusion scheme with additional weighting terms stemming from the channel representation which steer the diffusion process. We apply this novel energy formulation to image reconstruction problems, showing good performance in the presence of mixtures of Gaussian and impulse-like noise, e.g. missing data. In denoising experiments of common scalar-valued images our approach performs competitive compared to other diffusion schemes as well as state-of-the-art denoising methods for the considered noise types.
@inproceedings{diva2:690190,
author = {Heinemann, Christian and Åström, Freddie and Baravdish, George and Krajsek, Kai and Felsberg, Michael and Scharr, Hanno},
title = {{Using Channel Representations in Regularization Terms:
A Case Study on Image Diffusion}},
booktitle = {Proceedings of the 9th International Conference on Computer Vision Theory and Applications},
year = {2014},
pages = {48--55},
publisher = {SciTePress},
}
Visual tracking has attracted significant attention in the last few decades. The recent surge in the number of publications on tracking-related problems has made it almost impossible to follow the developments in the field. One of the reasons is that there is a lack of commonly accepted annotated datasets and standardized evaluation protocols that would allow objective comparison of different tracking methods. To address this issue, the Visual Object Tracking (VOT) workshop was organized in conjunction with ICCV2013. Researchers from academia as well as industry were invited to participate in the first VOT2013 challenge, which aimed at single-object visual trackers that do not apply pre-learned models of object appearance (model-free). Presented here is the VOT2013 benchmark dataset for evaluation of single-object visual trackers as well as the results obtained by the trackers competing in the challenge. In contrast to related attempts in tracker benchmarking, the dataset is labeled per-frame by visual attributes that indicate occlusion, illumination change, motion change, size change and camera motion, offering a more systematic comparison of the trackers. Furthermore, we have designed an automated system for performing and evaluating the experiments. We present the evaluation protocol of the VOT2013 challenge and the results of a comparison of 27 trackers on the benchmark dataset. The dataset, the evaluation tools and the tracker rankings are publicly available from the challenge website.
@inproceedings{diva2:1082694,
author = {Kristan, Matej and Pflugfelder, Roman and Leonardis, Ales and Matas, Jiri and Porikli, Fatih and Cehovin, Luka and Nebehay, Georg and Fernandez, Gustavo and Vojir, Tomas and Gatt, Adam and Khajenezhad, Ahmad and Salahledin, Ahmed and Soltani-Farani, Ali and Zarezade, Ali and Petrosino, Alfredo and Milton, Anthony and Bozorgtabar, Behzad and Li, Bo and Seng Chan, Chee and Heng, CherKeng and Ward, Dale and Kearney, David and Monekosso, Dorothy and Can Karaimer, Hakki and Rabiee, Hamid R. and Zhu, Jianke and Gao, Jin and Xiao, Jingjing and Zhang, Junge and Xing, Junliang and Huang, Kaiqi and Lebeda, Karel and Cao, Lijun and Edoardo Maresca, Mario and Kuan Lim, Mei and ELHelw, Mohamed and Felsberg, Michael and Remagnino, Paolo and Bowden, Richard and Goecke, Roland and Stolkin, Rustam and YueYing Lim, Samantha and Maher, Sara and Poullot, Sebastien and Wong, Sebastien and Satoh, Shinichi and Chen, Weihua and Hu, Weiming and Zhang, Xiaoqin and Li, Yang and Niu, ZhiHeng},
title = {{The Visual Object Tracking VOT2013 challenge results}},
booktitle = {2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW)},
year = {2013},
pages = {98--111},
publisher = {IEEE},
}
@inproceedings{diva2:787560,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Integrating Learning and Optimization for Active Vision Inverse Kinematics}},
booktitle = {Proceedings of SSBA, Swedish Symposium on Image Analysis, 2013},
year = {2013},
}
Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.
@inproceedings{diva2:707470,
author = {Khan, Rahat and Van de Weijer, Joost and Khan, Fahad Shahbaz and Muselet, Damien and Ducottet, Christophe and Barat, Cecile},
title = {{Discriminative Color Descriptors}},
booktitle = {Computer Vision and Pattern Recognition (CVPR), 2013},
year = {2013},
series = {IEEE Conference on Computer Vision and Pattern Recognition. Proceedings},
pages = {2866--2873},
publisher = {IEEE Computer Society},
}
State-of-the-art texture descriptors typically operate on greyscale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for the texture categorisation task.
In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition, we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on challenging scene and 10-class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance, and that late fusion is the best strategy to combine color and texture. Selecting the best color descriptor with the optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on the scene and texture datasets.
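Late fusion, as used above, means training one classifier per cue and combining their outputs, rather than concatenating the descriptors before training (early fusion). A minimal scikit-learn sketch of this strategy follows; the descriptor matrices and the equal weighting are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np
from sklearn.svm import SVC

def train_late_fusion(X_color, X_texture, y):
    """Train one classifier per cue; fusion happens on their scores, not their features."""
    clf_color = SVC(kernel="rbf", probability=True).fit(X_color, y)
    clf_texture = SVC(kernel="rbf", probability=True).fit(X_texture, y)
    return clf_color, clf_texture

def predict_late_fusion(clf_color, clf_texture, X_color, X_texture, w=0.5):
    """Weighted sum of per-cue class probabilities; the argmax gives the fused label."""
    p = (w * clf_color.predict_proba(X_color)
         + (1 - w) * clf_texture.predict_proba(X_texture))
    return clf_color.classes_[np.argmax(p, axis=1)]
```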
@inproceedings{diva2:707460,
author = {Khan, Fahad Shahbaz and Van de Weijer, Joost and Ali, Sadiq and Felsberg, Michael},
title = {{Evaluating the Impact of Color on Texture Recognition}},
booktitle = {Computer Analysis of Images and Patterns},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {8047},
pages = {154--162},
publisher = {Springer Berlin/Heidelberg},
}
Visual tracking of objects under varying lighting conditions and changes of the object appearance, such as articulation and change of aspect, is a challenging problem. Due to its robustness and speed, distribution field tracking is among the state-of-the-art approaches for tracking objects with constant size in grayscale sequences. According to the theory of averaged shifted histograms, distribution fields are an approximation of kernel density estimates. Channel representations are another, more efficient approximation, and are used in the present paper to derive an enhanced computational scheme for tracking. This enhanced distribution field tracking method outperforms several state-of-the-art methods on the VOT2013 challenge, which evaluates accuracy, robustness, and speed.
@inproceedings{diva2:662687,
author = {Felsberg, Michael},
title = {{Enhanced Distribution Field Tracking using Channel Representations}},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW), 2013},
year = {2013},
pages = {121--128},
publisher = {IEEE conference proceedings},
}
For navigation of unmanned aerial vehicles (UAVs), attitude estimation is essential. We present a method for attitude estimation (pitch and roll angle) from aerial fisheye images through horizon detection. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some prior knowledge of the vehicle altitude and attitude. We exploit this prior to make the attitude estimation more robust by letting the edge pixel votes be weighted based on the probability distributions for the altitude and pitch and roll angles. The method does not require any sky/ground segmentation as most horizon detection methods do. Our method has been evaluated on aerial fisheye images from the internet. The horizon is robustly detected in all tested images. The deviation in the attitude estimate between our automated horizon detection and a manual detection is less than 1 degree.
@inproceedings{diva2:651774,
author = {Grelsson, Bertil and Felsberg, Michael},
title = {{Probabilistic Hough Voting for Attitude Estimation from Aerial Fisheye Images}},
booktitle = {Image Analysis},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {7944},
pages = {478--488},
publisher = {Springer Berlin/Heidelberg},
}
The development of vehicles that perceive their environment, in particular those using computer vision, indispensably requires large databases of sensor recordings obtained from real cars driven in realistic traffic situations. These datasets should be time-stamped to enable synchronization of sensor data from different sources. Furthermore, full surround environment perception requires high frame rates of synchronized omnidirectional video data to prevent information loss at any speed.
This paper describes an experimental setup and software environment for recording such synchronized multi-sensor data streams and storing them in a new open source format. The dataset consists of sequences recorded in various environments from a car equipped with an omnidirectional multi-camera, height sensors, an IMU, a velocity sensor, and a GPS. The software environment for reading these data sets will be provided to the public, together with a collection of long multi-sensor and multi-camera data streams stored in the developed format.
@inproceedings{diva2:623885,
author = {Koschorrek, Philipp and Piccini, Tommaso and Öberg, Per and Felsberg, Michael and Nielsen, Lars and Mester, Rudolf},
title = {{A multi-sensor traffic scene dataset with omnidirectional video}},
booktitle = {2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW)},
year = {2013},
series = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops},
pages = {727--734},
publisher = {IEEE conference proceedings},
}
We analyze the consequences of instabilities and fluctuations, such as camera shaking and illumination/exposure changes, on typical surveillance video material and devise a systematic way to compensate for these changes as much as possible. The phase correlation method plays a decisive role in the proposed scheme, since it is inherently insensitive to gain and offset changes, as well as to different linear degradations (due to time-variant motion blur) in subsequent images. We show that the listed variations can be compensated effectively, and the image data can be equilibrated significantly before a temporal change detection and/or a background-based detection is performed. We verify the usefulness of the method by comparative tests with and without stabilization, using the changedetection.net benchmark and several state-of-the-art detection methods.
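Phase correlation, the key component mentioned above, estimates the dominant translation between two frames from the normalized cross-power spectrum; its insensitivity to gain and offset comes from keeping only the phase. A minimal numpy sketch follows (integer-pixel shifts only, with the sign convention depending on which frame is treated as the reference).

```python
import numpy as np

def phase_correlation_shift(frame_a, frame_b, eps=1e-9):
    """Estimate the dominant integer-pixel translation between two greyscale frames."""
    Fa, Fb = np.fft.fft2(frame_a), np.fft.fft2(frame_b)
    cross_power = Fa * np.conj(Fb)
    cross_power /= np.abs(cross_power) + eps      # keep only the phase information
    corr = np.real(np.fft.ifft2(cross_power))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts larger than half the image size to negative displacements.
    if dy > frame_a.shape[0] // 2:
        dy -= frame_a.shape[0]
    if dx > frame_a.shape[1] // 2:
        dx -= frame_a.shape[1]
    return dy, dx
```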
@inproceedings{diva2:623605,
author = {Eisenbach, Jens and Mertz, Matthias and Conrad, Christian and Mester, Rudolf},
title = {{Reducing Camera Vibrations and Photometric Changes in Surveillance Video}},
booktitle = {10th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), August 27-30, Krakow, Poland},
year = {2013},
pages = {69--74},
publisher = {IEEE},
}
We propose and evaluate a versatile scheme for image pre-segmentation that generates a partition of the image into a selectable number of patches ('superpixels'), under the constraint of obtaining maximum homogeneity of the 'texture' inside of each patch, and maximum accordance of the contours with both the image content as well as a Gibbs-Markov random field model. In contrast to current state-of-the-art approaches to superpixel segmentation, 'homogeneity' does not limit itself to smooth region-internal signals and high feature value similarity between neighboring pixels, but is applicable also to highly textured scenes. The energy functional that is to be maximized for this purpose has only a very small number of design parameters, depending on the particular statistical model used for the images.
The capability of the resulting partitions to deform according to the image content can be controlled by a single parameter. We show by means of an extensive comparative experimental evaluation that the compactness-controlled contour-relaxed superpixels method outperforms the state-of-the-art superpixel algorithms with respect to boundary recall and undersegmentation error, while being faster or on a par with respect to runtime.
@inproceedings{diva2:623602,
author = {Conrad, Christian and Mertz, Matthias and Mester, Rudolf},
title = {{Contour-relaxed Superpixels}},
booktitle = {EMMCVPR 2013. 9th International Conference Energy Minimization Methods in Computer Vision and Pattern Recognition, August 19-21, Lund, Sweden},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {8081},
pages = {280--293},
publisher = {Springer Berlin/Heidelberg},
}
In this work we present an approach to automatically learn pixel correspondences between pairs of cameras. We build on the method of Temporal Coincidence Analysis (TCA) and extend it from the pure temporal (i.e. single-pixel) to the spatiotemporal domain. Our approach is based on learning a statistical model for local spatiotemporal image patches, determining rare and expressive events from this model, and matching these events across multiple views. Accumulating multi-image coincidences of such events over time makes it possible to learn the desired geometric and photometric relations. The presented method also works for strongly different viewpoints and camera settings, including substantial rotation and translation. The only assumption made is that the relative orientation of pairs of cameras may be arbitrary, but fixed, and that the observed scene shows visual activity. We show that the proposed method outperforms the single-pixel approach to TCA both in terms of learning speed and accuracy.
@inproceedings{diva2:615201,
author = {Mester, Rudolf and Conrad, Christian},
title = {{Learning Multi-View Correspondences via Subspace-Based Temporal Coincidences}},
booktitle = {Proceeding Scandinavian Conference on Image Analysis, 2013},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {7944},
pages = {456--467},
publisher = {Springer Berlin/Heidelberg},
}
We introduce a method to combine the color channels of an image into a scalar-valued image. Linear combinations of the RGB channels are constructed using the Fisher-Trace-Information (FTI), defined as the trace of the Fisher information matrix of the Weibull distribution, as a cost function. The FTI characterizes the local geometry of the Weibull manifold independent of the parametrization of the distribution. We show that minimizing the FTI leads to contrast-enhanced images, suitable for segmentation processes. The Riemann structure of the manifold of Weibull distributions is used to design optimization methods for finding optimal RGB weight vectors. Using a threshold procedure we find good solutions even for images with limited content variation. Experiments show how the method adapts to images with widely varying visual content. Using these image-dependent de-colorizations, one can obtain substantially improved segmentation results compared to a mapping with pre-defined coefficients.
@inproceedings{diva2:607078,
author = {Lenz, Reiner and Zografos, Vasileios},
title = {{Fisher Information and the Combination of RGB channels}},
booktitle = {4th International Workshop, CCIW 2013, Chiba, Japan, March 3-5, 2013. Proceedings},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {7786},
pages = {250--264},
publisher = {Springer Berlin/Heidelberg},
}
In this work we derive a novel density driven diffusion scheme for image enhancement. Our approach, called D3, is a semi-local method that uses an initial structure-preserving oversegmentation step of the input image. Because of this, each segment will approximately conform to a homogeneous region in the image, allowing us to easily estimate parameters of the underlying stochastic process thus achieving adaptive non-linear filtering. Our method is capable of producing competitive results when compared to state-of-the-art methods such as non-local means, BM3D and tensor driven diffusion on both color and grayscale images.
@inproceedings{diva2:611186,
author = {Åström, Freddie and Zografos, Vasileios and Felsberg, Michael},
title = {{Density Driven Diffusion}},
booktitle = {18th Scandinavian Conferences on Image Analysis, 2013},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {7944},
pages = {718--730},
}
The assessment of image denoising results depends on the respective application area, i.e. image compression, still-image acquisition, and medical images require entirely different behavior of the applied denoising method. In this paper we propose a novel, nonlinear diffusion scheme that is derived from a linear diffusion process in a value space determined by the application. We show that application-driven linear diffusion in the transformed space compares favorably with existing nonlinear diffusion techniques.
@inproceedings{diva2:608779,
author = {Åström, Freddie and Felsberg, Michael and Baravdish, George and Lundström, Claes},
title = {{Targeted Iterative Filtering}},
booktitle = {Fourth International Conference on Scale Space and Variational Methods in Computer Vision (SSVM 2013), 2-6 June 2013, Schloss Seggau, Graz region, Austria},
year = {2013},
series = {Lecture Notes in Computer Science},
volume = {7893},
pages = {1--11},
publisher = {Springer Berlin/Heidelberg},
}
Robust estimation of the relative pose between two cameras is a fundamental part of Structure and Motion methods. For calibrated cameras, the five-point method together with a robust estimator such as RANSAC gives the best result in most cases. The current state-of-the-art method for solving the relative pose problem from five points is due to Nistér [9], because it is faster than other methods and, in the RANSAC scheme, one can improve precision by increasing the number of iterations. In this paper, we propose a new iterative method based on Powell's Dog Leg algorithm. The new method has the same precision and is approximately twice as fast as Nistér's algorithm. The proposed method is easily extended to more than five points while retaining an efficient error metric, which also makes it well suited as a refinement step. The proposed algorithm is systematically evaluated on three types of datasets with known ground truth.
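The RANSAC scheme around a minimal five-point solver has a simple generic structure, sketched below. Both helpers, `solve_relative_pose` (the minimal solver, returning one or more essential-matrix hypotheses) and `sampson_error` (a per-correspondence residual), are hypothetical placeholders; only the hypothesise-and-verify loop is shown, not the solver contributed by the paper.

```python
import numpy as np

def ransac_relative_pose(pts1, pts2, solve_relative_pose, sampson_error,
                         n_iters=500, threshold=1e-3, rng=None):
    """Generic RANSAC loop around a minimal (five-point) relative pose solver.

    pts1, pts2          : Nx2 arrays of calibrated (normalized) image points
    solve_relative_pose : hypothetical minimal solver, 5 correspondences -> essential matrices
    sampson_error       : hypothetical per-point error given an essential matrix
    """
    rng = np.random.default_rng() if rng is None else rng
    best_E, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(pts1), size=5, replace=False)
        for E in solve_relative_pose(pts1[sample], pts2[sample]):  # may yield several solutions
            inliers = sampson_error(E, pts1, pts2) < threshold
            if inliers.sum() > best_inliers.sum():
                best_E, best_inliers = E, inliers
    return best_E, best_inliers
```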
@inproceedings{diva2:612068,
author = {Hedborg, Johan and Felsberg, Michael},
title = {{Fast Iterative Five point Relative Pose Estimation}},
booktitle = {IEEE Workshop on Robot Vision (WoRV 2013), January 15-17, 2013, Clearwater, FL, USA},
year = {2013},
pages = {60--67},
publisher = {IEEE conference proceedings},
}
We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.
@inproceedings{diva2:610663,
author = {Zografos, Vasileios and Ellis, Liam and Mester, Rudolf},
title = {{Discriminative Subspace Clustering}},
booktitle = {26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), June 23-28, 2013, Portland, Oregon, USA},
year = {2013},
series = {2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)},
}
A method for online global pose estimation of aerial images by alignment with a georeferenced 3D model is presented. Motion stereo is used to reconstruct a dense local height patch from an image pair. The global pose is inferred from the 3D transform between the local height patch and the model. For efficiency, the sought 3D similarity transform is found by least-squares minimizations of three 2D subproblems. The method does not require any landmarks or reference points in the 3D model, but an approximate initialization of the global pose, in our case provided by onboard navigation sensors, is assumed. Real aerial images from helicopter and aircraft flights are used to evaluate the method. The results show that the accuracy of the position and orientation estimates is significantly improved compared to the initialization, and our method is more robust than competing methods on similar datasets. The proposed matching error, computed between the transformed patch and the map, clearly indicates whether a reliable pose estimate has been obtained.
@inproceedings{diva2:607988,
author = {Grelsson, Bertil and Felsberg, Michael and Isaksson, Folke},
title = {{Efficient 7D Aerial Pose Estimation}},
booktitle = {2013 IEEE Workshop on Robot Vision (WORV)},
year = {2013},
pages = {88--95},
publisher = {IEEE},
}
An online method for rapidly learning the inverse kinematics of a redundant robotic arm is presented, addressing the special requirements of active vision for visual inspection tasks. The system is initialized with a model covering a small area around the starting position, which is then incrementally extended by exploration. The number of motions during this process is minimized by only exploring configurations required for successful completion of the task at hand. The explored area is automatically extended online and on demand. To achieve this, state-of-the-art methods for learning and numerical optimization are combined in a tight implementation where parts of the learned model, the Jacobians, are used during optimization, resulting in significant synergy effects. In a series of standard experiments, we show that the integrated method performs better than using both methods sequentially.
@inproceedings{diva2:606285,
author = {Öfjäll, Kristoffer and Michael, Felsberg},
title = {{Rapid Explorative Direct Inverse Kinematics Learning of Relevant Locations for Active Vision}},
booktitle = {IEEE Workshop on Robot Vision(WORV) 2013},
year = {2013},
pages = {14--19},
publisher = {IEEE conference proceedings},
}
Many RGB-D sensors, e.g. the Microsoft Kinect, use rolling shutter cameras. Such cameras produce geometrically distorted images when the sensor is moving. To mitigate these rolling shutter distortions we propose a method that uses an attached gyroscope to rectify the depth scans. We also present a simple scheme to calibrate the relative pose and time synchronization between the gyro and a rolling shutter RGB-D sensor. We examine the effectiveness of our rectification scheme by coupling it with the Kinect Fusion algorithm. By comparing Kinect Fusion models obtained from raw sensor scans and from rectified scans, we demonstrate improvement for three classes of sensor motion: panning motions cause slant distortions, tilt motions cause vertically elongated or compressed objects, and wobble causes a loss of detail, compared to the reconstructions using rectified depth scans. As our method relies on gyroscope readings, the amount of computation required is negligible compared to the cost of running Kinect Fusion.
@inproceedings{diva2:603474,
author = {Ovr\'{e}n, Hannes and Forss\'{e}n, Per-Erik and Törnqvist, David},
title = {{Why Would I Want a Gyroscope on my RGB-D Sensor?}},
booktitle = {Proceedings of 2013 IEEE Workshop on Robot Vision (WORV)},
year = {2013},
pages = {68--75},
publisher = {IEEE},
}
This paper presents an autonomous robotic system that incorporates novel Computer Vision, Machine Learning and Data Mining algorithms in order to learn to navigate and discover important visual entities. This is achieved within a Learning from Demonstration (LfD) framework, where policies are derived from example state-to-action mappings. For autonomous navigation, a mapping is learnt from holistic image features (GIST) onto control parameters using Random Forest regression. Additionally, visual entities (road signs, e.g. the STOP sign) that are strongly associated with autonomously discovered modes of action (e.g. stopping behaviour) are discovered through a novel Percept-Action Mining methodology. The resulting sign detector is learnt without any supervision (no image labeling or bounding box annotations are used). The complete system is demonstrated on a fully autonomous robotic platform, featuring a single camera mounted on a standard remote control car. The robot carries a PC laptop that performs all the processing on board and in real-time.
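The navigation component above is, at its core, a regression from a holistic image descriptor to control parameters learned from demonstration data. A minimal scikit-learn sketch follows; the `gist_features` function and the two-dimensional control vector are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_navigation_policy(images, controls, gist_features):
    """Learn a state-to-action mapping from demonstrations (Learning from Demonstration).

    images        : list of demonstration camera frames
    controls      : array of shape (n_frames, 2), e.g. (steering, throttle) per frame
    gist_features : hypothetical function, image -> 1D holistic descriptor
    """
    X = np.stack([gist_features(img) for img in images])
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, controls)
    return model

# At run time, the policy maps the current camera frame directly to control outputs:
#   steering, throttle = model.predict(gist_features(frame)[None, :])[0]
```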
@inproceedings{diva2:575736,
author = {Ellis, Liam and Pugeault, Nicolas and Öfjäll, Kristoffer and Hedborg, Johan and Bowden, Richard and Felsberg, Michael},
title = {{Autonomous Navigation and Sign Detector Learning}},
booktitle = {IEEE Workshop on Robot Vision(WORV) 2013},
year = {2013},
pages = {144--151},
publisher = {IEEE},
}
Face tracking is an extensively studied field. Nevertheless, it is still a challenge to make a robust and efficient face tracker, especially on mobile devices. This extended abstract briefly describes our implementation of a high-performance multi-platform face and facial feature tracking system. The main characteristics of our approach are that the tracker is fully automatic and works with the majority of faces without any manual initialization. It is robust, resistant to rapid changes in pose and facial expressions, does not suffer from drifting, and has a modest computational cost. The tracker runs in real-time on mobile devices.
@inproceedings{diva2:845459,
author = {Marku\v{s}, Nenad and Frljak, Miroslav and Pandži\'{c}, Igor and Ahlberg, Jörgen and Forchheimer, Robert},
title = {{High-performance face tracking}},
booktitle = {ACM 3rd International Symposium on Facial Analysis and Animation},
year = {2012},
}
The labyrinth game is a simple yet challenging platform, not only for humans but also for control algorithms and systems. The game is easy to understand but still very hard to master. From a system point of view, the ball behavior is in general easy to model, but close to the obstacles there are severe non-linearities. Additionally, the far-from-flat surface on which the ball rolls provides for changing dynamics depending on the ball position.
The general dynamics of the system can easily be handled by traditional automatic control methods. Taking the obstacles and uneven surface into account would, however, require very detailed models of the system. A simple deterministic control algorithm is therefore combined with a learning control method. The simple control method provides initial training data. As the learning method is trained, the system can learn from the results of its own actions, and the performance improves well beyond that of the initial controller.
A vision system and image analysis are used to estimate the ball position, while a combination of a PID controller and a learning controller based on LWPR is used to learn to steer the ball through the maze.
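The deterministic controller used to bootstrap the learning can be as simple as a textbook PID loop per tilt axis. The sketch below is a generic PID implementation with illustrative gains; it is not the paper's exact controller or tuning.

```python
class PID:
    """Textbook PID controller for one tilt axis of the labyrinth platform."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement, dt):
        error = setpoint - measurement            # e.g. desired minus estimated ball position
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# One controller per tilt axis; the gains below are illustrative only.
pid_x, pid_y = PID(0.8, 0.05, 0.2), PID(0.8, 0.05, 0.2)
```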
@inproceedings{diva2:750037,
author = {Öfjäll, Kristoffer and Felsberg, Michael},
title = {{Combining Vision, Machine Learning and Automatic Control to Play the Labyrinth Game}},
booktitle = {Proceedings of SSBA, Swedish Symposium on Image Analysis, 2012},
year = {2012},
}
In recent years, advanced video sensors have become common in driver assistance, coping with the highly dynamic lighting conditions by nonlinear exposure adjustments. However, many computer vision algorithms are still highly sensitive to the resulting sudden brightness changes. We present a method that is able to estimate the relative intensity transfer function (RITF) between images in a sequence, even for moving cameras. The corresponding compensation of the input images can improve the performance of further vision tasks significantly, here demonstrated by results from optical flow. Our method identifies corresponding intensity values from areas in the images where no apparent motion is present. The RITF is then estimated from that data and regularized based on its curvature. Finally, built-in tests reliably flag image pairs with adverse conditions where no compensation could be performed.
@inproceedings{diva2:665610,
author = {Dederscheck, David and Muller, T. and Mester, Rudolf},
title = {{Illumination invariance for driving scene optical flow using comparagram preselection}},
booktitle = {IEEE Intelligent Vehicles Symposium (IV), Proceedings},
year = {2012},
series = {IEEE Intelligent Vehicles Symposium, Proceedings},
volume = {4},
pages = {742--747},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
Complementary information, when combined in the right way, is capable of improving clustering and segmentation problems. In this paper, we show how it is possible to enhance motion segmentation accuracy with a very simple and inexpensive combination of complementary information, which comes from the column and row spaces of the same measurement matrix. We test our approach on the Hopkins155 dataset where it outperforms all other state-of-the-art methods.
@inproceedings{diva2:572612,
author = {Zografos, Vasileios},
title = {{Enhancing motion segmentation by combination of complementary affinities}},
booktitle = {Proceedings of the 21st Internationa Conference on Pattern Recognition},
year = {2012},
pages = {2198--2201},
}
Combining the channels of a multi-band image with the help of a pixelwise weighted sum is one of the basic operations in color and multispectral image processing. A typical example is the conversion of RGB images to intensity images. Usually the weights are given by some standard values or chosen heuristically. This takes into account neither the statistical nature of the image source nor the intended further processing of the scalar image. In this paper we present a framework in which we specify the statistical properties of the input data with the help of a representative collection of image patches. On the output side we specify the intended processing of the scalar image with the help of a filter kernel with zero-mean filter coefficients. Given the image patches and the filter kernel, we use the Fisher information of the manifold of two-parameter Weibull distributions to introduce the trace of the Fisher information matrix as a cost function on the space of weight vectors of unit length. We illustrate the properties of the method with the help of a database of scanned leaves and some color images from the internet. For the green leaves we find that the result of the mapping is similar to standard mappings like Matlab's RGB2Gray weights. We then change the color of the leaf using a global shift in the HSV representation of the original image and show how the proposed mapping adapts to this color change. This is also confirmed with other natural images, where the new mapping reveals much more subtle details in the processed image. In the last experiment we show that the mapping emphasizes visually salient points in the image, whereas the standard mapping only emphasizes global intensity changes. The proposed approach to RGB filter design thus provides a new methodology based only on the properties of the image statistics and the intended post-processing. It adapts to color changes of the input images and, due to its foundation in the statistics of extreme-value distributions, it is suitable for detecting salient regions in an image.
@inproceedings{diva2:529514,
author = {Lenz, Reiner and Zografos, Vasileios},
title = {{RGB Filter design using the properties of the weibull manifold}},
booktitle = {CGIV 2012 Sixth European Conference on Colour in Graphics, Imaging, and Vision},
year = {2012},
pages = {200--205},
address = {Springfield, VA},
}
State-of-the-art object detectors typically use shape information as a low-level feature representation to capture the local structure of an object. This paper shows that early fusion of shape and color, as is popular in image classification, leads to a significant drop in performance for object detection. Moreover, such approaches also yield suboptimal results for object categories with varying importance of color and shape. In this paper we propose the use of color attributes as an explicit color representation for object detection. Color attributes are compact, computationally efficient, and when combined with traditional shape features provide state-of-the-art results for object detection. Our method is tested on the PASCAL VOC 2007 and 2009 datasets and results clearly show that our method improves over state-of-the-art techniques despite its simplicity. We also introduce a new dataset consisting of cartoon character images in which color plays a pivotal role. On this dataset, our approach yields a significant gain of 14% in mean AP over conventional state-of-the-art methods.
@inproceedings{diva2:600948,
author = {Khan, Fahad Shahbaz and Anwer, Rao Muhammad and van de Weijer, Joost and Bagdanov, Andrew D. and Vanrell, Maria and Lopez, Antonio M.},
title = {{Color Attributes for Object Detection}},
booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2012},
year = {2012},
pages = {3306--3313},
publisher = {IEEE},
}
This work addresses the problem of fast, online segmentation of moving objects in video. We pose this as a discriminative online semi-supervised appearance learning task, where supervising labels are autonomously generated by a motion segmentation algorithm. The computational complexity of the approach is significantly reduced by performing learning and classification on oversegmented image regions (superpixels), rather than per pixel. In addition, we further exploit the sparse trajectories from the motion segmentation to obtain a simple model that encodes the spatial properties and location of objects at each frame. Fusing these complementary cues produces good object segmentations at very low computational cost. In contrast to previous work, the proposed approach (1) performs segmentation on-the-fly (allowing for applications where data arrives sequentially), (2) has no prior model of object types or 'objectness', and (3) operates at significantly reduced computational cost. The approach and its ability to learn, disambiguate and segment the moving objects in the scene is evaluated on a number of benchmark video sequences.
@inproceedings{diva2:575721,
author = {Ellis, Liam and Zografos, Vasileios},
title = {{Online Learning for Fast Segmentation of Moving Objects}},
booktitle = {ACCV 2012},
year = {2012},
series = {Lecture Notes in Computer Science},
volume = {7725},
pages = {52--65},
publisher = {Springer Berlin/Heidelberg},
}
This paper describes a method for generation of dense stereo ground-truth using a consumer depth sensor such as the Microsoft Kinect. Such ground-truth allows adaptation of stereo algorithms to a specific setting. The method uses a novel residual weighting based on error propagation from image plane measurements to 3D. We use this ground-truth in wide-angle stereo learning by automatically tuning a novel extension of the best-first-propagation (BFP) dense correspondence algorithm. We extend BFP by adding a coarse-to-fine scheme, and a structure measure that limits propagation along linear structures and flat areas. The tuned correspondence algorithm is evaluated in terms of accuracy, robustness, and ability to generalise. Both the tuning cost function, and the evaluation are designed to balance the accuracy-robustness trade-off inherent in patch-based methods such as BFP.
@inproceedings{diva2:551483,
author = {Wallenberg, Marcus and Forss\'{e}n, Per-Erik},
title = {{Teaching Stereo Perception to YOUR Robot}},
booktitle = {Proceedings of 23rd British Machine Vision Conference},
year = {2012},
pages = {1--12},
publisher = {University of Surrey, UK},
}
We investigate the case when a partial differential equation (PDE) can be considered as an Euler-Lagrange (E-L) equation of an energy functional consisting of a data term and a smoothness term. We show the necessary conditions for a PDE to be the E-L equation of a corresponding functional. This energy functional is applied to a color image denoising problem, and it is shown that the method compares favorably to current state-of-the-art color image denoising techniques.
@inproceedings{diva2:543914,
author = {Åström, Freddie and Baravdish, George and Felsberg, Michael},
title = {{On Tensor-Based PDEs and their Corresponding Variational Formulations with Application to Color Image Denoising}},
booktitle = {ECCV 2012: 12th European Conference on Computer Vision, 7-12 October, Firenze, Italy},
year = {2012},
series = {Lecture Notes in Computer Science},
volume = {7574},
pages = {215--228},
publisher = {Springer Berlin/Heidelberg},
}
@inproceedings{diva2:535824,
author = {Magnusson, Maria and Dahlqvist Leinhard, Olof and van Ettinger-Veenstra, Helene and Lundberg, Peter},
title = {{FMRI Using 3D PRESTO-CAN - A Novel Method Based on Golden Angle Hybrid Radial-Cartesian Sampling of K-Space}},
booktitle = {ISMRM, Melbourne, Australia, 5-11 May, 2012},
year = {2012},
}
This paper introduces a bundle adjustment (BA) method that obtains accurate structure and motion from rolling shutter (RS) video sequences: RSBA. When a classical BA algorithm processes a rolling shutter video, the resultant camera trajectory is brittle, and complete failures are not uncommon. We exploit the temporal continuity of the camera motion to define residuals of image point trajectories with respect to the camera trajectory. We compare the camera trajectories from RSBA to those from classical BA, and from classical BA on rectified videos. The comparisons are done on real video sequences from an iPhone 4, with ground truth obtained from a global shutter camera, rigidly mounted to the iPhone 4. Compared to classical BA, the rolling shutter model requires just six extra parameters. It also degrades the sparsity of the system Jacobian slightly, but as we demonstrate, the increase in computation time is moderate. Decisive advantages are that RSBA succeeds in cases where competing methods diverge, and consistently produces more accurate results.
@inproceedings{diva2:517591,
author = {Hedborg, Johan and Forss\'{e}n, Per-Erik and Felsberg, Michael and Ringaby, Erik},
title = {{Rolling Shutter Bundle Adjustment}},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012},
year = {2012},
series = {Computer Vision and Pattern Recognition},
pages = {1434--1441},
publisher = {IEEE Computer Society; 1999},
}
In this paper, we present a model-based video coding method that uses input from colour and depth cameras, such as the Microsoft Kinect. The model-based approach uses a 3D representation of the scene, enabling several other applications besides video playback. Some of these applications are stereoscopic viewing, object insertion for augmented reality, and free viewpoint viewing. The video encoding step uses computer vision to estimate the camera motion. The scene geometry is represented by keyframes, which are encoded as 3D quads using a quadtree, allowing good compression rates. Camera motion in-between keyframes is approximated to be linear. The relative camera positions at keyframes and the scene geometry are then compressed and transmitted to the decoder. Our experiments demonstrate that the model-based approach delivers a high level of detail at competitively low bitrates.
@inproceedings{diva2:525249,
author = {Sandberg, David and Forss\'{e}n, Per-Erik and Ogniewski, Jens},
title = {{Model-Based Video Coding using Colour and Depth Cameras}},
booktitle = {Digital Image Computing},
year = {2011},
pages = {158--163},
publisher = {IEEE},
}
Structured light range sensors, such as the Microsoft Kinect, have recently become popular as perception devices for computer vision and robotic systems. These sensors use CMOS imaging chips with electronic rolling shutters (ERS). When using such a sensor on a moving platform, both the image, and the depth map, will exhibit geometric distortions. We introduce an algorithm that can suppress such distortions, by rectifying the 3D point clouds from the range sensor. This is done by first estimating the time continuous 3D camera trajectory, and then transforming the 3D points to where they would have been, if the camera had been stationary. To ensure that image and range data are synchronous, the camera trajectory is computed from KLT tracks on the structured-light frames, after suppressing the structured-light pattern. We evaluate our rectification, by measuring angles between the visible sides of a cube, before and after rectification. We also measure how much better the 3D point clouds can be aligned after rectification. The obtained improvement is also related to the actual rotational velocity, measured using a MEMS gyroscope.
@inproceedings{diva2:525244,
author = {Ringaby, Erik and Forss\'{e}n, Per-Erik},
title = {{Scan Rectification for Structured Light Range Sensors with Rolling Shutters}},
booktitle = {IEEE International Conference on Computer Vision},
year = {2011},
series = {International Conference on Computer Vision (ICCV)},
pages = {1575--1582},
address = {Barcelona Spain},
}
We present a system that rectifies and stabilizes video sequences on mobile devices with rolling-shutter cameras. The system corrects for rolling-shutter distortions using measurements from accelerometer and gyroscope sensors, and a 3D rotational distortion model. In order to obtain a stabilized video, and at the same time keep most content in view, we propose an adaptive low-pass filter algorithm to obtain the output camera trajectory. The accuracy of the orientation estimates has been evaluated experimentally using ground truth data from a motion capture system. We have conducted a user study, where the output from our system, implemented in iOS, has been compared to that of three other applications, as well as to the uncorrected video. The study shows that users prefer our sensor-based system.
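The central idea of the stabilization above, computing a smoothed output camera trajectory and warping each frame toward it, can be illustrated with a basic first-order low-pass filter. This is a deliberately simplified sketch: it filters per-axis orientation angles directly and uses a fixed smoothing factor, whereas the paper uses an adaptive filter and a full 3D rotational model.

```python
import numpy as np

def smooth_trajectory(orientations, alpha=0.1):
    """First-order low-pass filter of a per-frame orientation sequence.

    orientations : array (n_frames, 3) of small per-axis angles in radians
    alpha        : fixed smoothing factor; the paper adapts this to keep content in view
    Simplification: a proper treatment would filter on the rotation manifold.
    """
    smoothed = np.empty_like(orientations)
    smoothed[0] = orientations[0]
    for t in range(1, len(orientations)):
        smoothed[t] = (1 - alpha) * smoothed[t - 1] + alpha * orientations[t]
    return smoothed

# The stabilizing warp for frame t is then driven by the difference between the
# measured orientation and smoothed[t] (the virtual, stabilized camera).
```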
@inproceedings{diva2:525241,
author = {Hanning, Gustav and Forslöw, Nicklas and Forss\'{e}n, Per-Erik and Ringaby, Erik and Törnqvist, David and Callmer, Jonas},
title = {{Stabilizing Cell Phone Video using Inertial Measurement Sensors}},
booktitle = {The Second IEEE International Workshop on Mobile Vision},
year = {2011},
pages = {1--8},
address = {Barcelona Spain},
}
The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion.
@inproceedings{diva2:505440,
author = {Hedborg, Johan and Ringaby, Erik and Forss\'{e}n, Per-Erik and Felsberg, Michael},
title = {{Structure and Motion Estimation from Rolling Shutter Video}},
booktitle = {IEEE International Conference onComputer Vision Workshops (ICCV Workshops), 2011},
year = {2011},
pages = {17--23},
publisher = {IEEE Xplore},
}
Quantitative tissue classification using dual-energy CT has the potential to improve accuracy in radiation therapy dose planning, as it provides more information about the material composition of scanned objects than the currently used methods based on single-energy CT. One problem that hinders successful application of both single- and dual-energy CT is the presence of beam hardening and scatter artifacts in reconstructed data. Current pre- and post-correction methods used for image reconstruction often bias CT numbers and thus limit their applicability for quantitative tissue classification. Here we demonstrate simulation studies with a novel iterative algorithm that decomposes every soft tissue voxel into three base materials: water, protein and adipose. The results demonstrate that beam hardening artifacts can effectively be removed and accurate estimation of the mass fractions of all base materials can be achieved. In the future, the algorithm may be developed further to include segmentation of soft and bone tissue and subsequent bone decomposition, extension from 2-D to 3-D, and scatter correction.
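At its simplest, the per-voxel three-material decomposition described above can be posed as a small linear system: the attenuation measured at the two tube energies is modelled as a mass-fraction-weighted mix of the base materials, with the fractions summing to one. The sketch below shows only this idealized step; the base-material values are illustrative placeholders, and the iterative beam-hardening handling that the paper actually contributes is not included.

```python
import numpy as np

def decompose_voxel(mu_low, mu_high, base_mu):
    """Solve for (water, protein, adipose) fractions of one voxel.

    mu_low, mu_high : measured linear attenuation at the low/high tube energies
    base_mu         : 3x2 array, attenuation of each base material at the two energies
                      (illustrative placeholder values, not taken from the paper)
    Model: measured attenuation is a fraction-weighted mix, and the fractions sum to 1.
    """
    A = np.array([
        [base_mu[0, 0], base_mu[1, 0], base_mu[2, 0]],   # low-energy equation
        [base_mu[0, 1], base_mu[1, 1], base_mu[2, 1]],   # high-energy equation
        [1.0,           1.0,           1.0          ],   # fractions sum to one
    ])
    b = np.array([mu_low, mu_high, 1.0])
    return np.linalg.solve(A, b)   # -> fractions (water, protein, adipose)
```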
@inproceedings{diva2:506005,
author = {Magnusson, Maria and Malusek, Alexandr and Muhammad, Arif and Alm Carlsson, Gudrun},
title = {{Iterative Reconstruction for Quantitative Tissue Decomposition in Dual-Energy CT}},
booktitle = {Proceedings of the 17th Scandinavian Conference, SCIA 2011, Ystad, Sweden, May 2011.},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6688},
pages = {479--488},
publisher = {Springer Berlin/Heidelberg},
}
@inproceedings{diva2:506294,
author = {Magnusson, Maria and Malusek, Alexandr and Muhammad, Arif and Alm Carlsson, Gudrun},
title = {{Determination of Quantitative Tissue Composition by Iterative Reconstruction on 3D DECT Volumes}},
booktitle = {Proc 11:th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine, Potsdam, Germany},
year = {2011},
}
@inproceedings{diva2:475908,
author = {Magnusson, Maria and Dahlqvist Leinhard, Olof and Lundberg, Peter},
title = {{A 3D-Plus-Time Radial-Cartesian Hybrid Sampling of K-Space With High Temporal Resolution and Maintained Image Quality for MRI and FMRI}},
booktitle = {ISMRM, Montreal 2011},
year = {2011},
}
@inproceedings{diva2:475370,
author = {Ahlman, Gustav and Magnusson, Maria and Dahlqvist Leinhard, Olof and Lundberg, Peter},
title = {{Increased temporal resolution in radial-Cartesian sampling of k-space by implementation of parallel imaging}},
booktitle = {ESMRMB 2011, 28th Annual Scientific Meeting, 6-8 October 2011, Leipzig, Germany},
year = {2011},
publisher = {Springer},
}
@inproceedings{diva2:475363,
author = {Karlsson, Anette and Magnusson, Maria and Dahlqvist Leinhard, Olof and Lundberg, Peter},
title = {{Successful Motion Correction in Reconstruction of Radial MRI}},
booktitle = {ESMRMB, Leipzig 2011},
year = {2011},
}
We use the theory of group representations to construct very fast image descriptors that split the vector space of local RGB distributions into small group-invariant subspaces. These descriptors are group-theoretical generalizations of the Fourier transform and can be computed with algorithms similar to the FFT. Because of their computational efficiency they are especially suitable for retrieval, recognition and classification in very large image datasets. We also show that the statistical properties of these descriptors are governed by the principles of Extreme Value Theory (EVT). This enables us to work directly with parametric probability distribution models, which offer much lower dimensionality and higher resolution and flexibility. We explore the connection to EVT and analyse the characteristics of these descriptors from a probabilistic viewpoint with the help of large image databases.
@inproceedings{diva2:463658,
author = {Zografos, Vasileios and Lenz, Reiner},
title = {{Spatio-chromatic image content descriptors and their analysis using Extreme Value Theory}},
booktitle = {Image analysis},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6688},
pages = {579--591},
publisher = {Springer Berlin/Heidelberg},
}
We introduce a simple and efficient procedure for the segmentation of rigidly moving objects imaged under an affine camera model. For this purpose we revisit the theory of "linear combination of views" (LCV), proposed by Ullman and Basri [20], which states that the set of 2D views of an object undergoing 3D rigid transformations is embedded in a low-dimensional linear subspace that is spanned by a small number of basis views. Our work shows that one may use this theory for motion segmentation and cluster the trajectories of 3D objects using only two 2D basis views. We therefore propose a practical motion segmentation method, built around LCV, that is very simple to implement and use, and in addition is very fast, making it well suited for real-time SfM and tracking applications. We have experimented on real image sequences, where we show good segmentation results, comparable to the state of the art in the literature. If computational complexity is also considered, our proposed method is one of the best performers in combined speed and accuracy.
@inproceedings{diva2:463659,
author = {Zografos, Vasileios and Nordberg, Klas},
title = {{Fast and accurate motion segmentation using linear combination of views}},
booktitle = {BMVC 2011},
year = {2011},
pages = {12.1--12.11},
}
Segmentation is an important preprocessing step in many applications. Purely colour-based segmentation is often problematic. For this reason, we here investigate fusion of depth and colour information, to facilitate robust segmentation of single images. We evaluate segmentation results on data collected using the Microsoft Kinect peripheral for Xbox 360, using superparamagnetic clustering. We also propose a method for aligning and encoding colour and depth information from the Kinect device. As we show in the paper, the fusion of depth and colour information produces more semantically relevant segments for scene analysis than either depth or colour separately.
@inproceedings{diva2:441482,
author = {Wallenberg, Marcus and Felsberg, Michael and Forss\'{e}n, Per-Erik and Dellen, Babette},
title = {{Leaf Segmentation using the Kinect}},
booktitle = {Proceedings of SSBA 2011 Symposium on Image Analysis},
year = {2011},
}
Segmentation is an important preprocessing step in many applications. Compared to colour segmentation, fusion of colour and depth greatly improves the segmentation result. Such a fusion is easy to do by stacking measurements in different value dimensions, but there are better ways. In this paper we perform fusion using the channel representation, and demonstrate how a state-of-the-art segmentation algorithm can be modified to use channel values as inputs. We evaluate segmentation results on data collected using the Microsoft Kinect peripheral for Xbox 360, using the superparamagnetic clustering algorithm. Our experiments show that depth gradients are more useful than depth values for segmentation, and that channel coding both colour and depth gradients makes tuned parameter settings generalise better to novel images.
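For readers unfamiliar with channel representations, the sketch below encodes scalar values (for example a colour component or a depth gradient) with cos-squared kernels, one common kernel choice in the channel-coding literature. The kernel, its width and the channel layout are illustrative assumptions and may differ from the exact encoding used in the paper.

```python
import numpy as np

def channel_encode(x, centers):
    """Encode scalar values x into channel vectors using cos^2 kernels with
    a support of three channel spacings (a common, but assumed, choice)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    spacing = centers[1] - centers[0]
    width = 3.0 * spacing
    d = np.abs(x - centers[None, :])
    resp = np.cos(np.pi * d / width) ** 2
    resp[d >= width / 2.0] = 0.0          # each value activates about 3 channels
    return resp

centers = np.linspace(0.0, 1.0, 8)        # channels spanning a unit feature range
print(np.round(channel_encode([0.37, 0.90], centers), 3))
```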
@inproceedings{diva2:441260,
author = {Wallenberg, Marcus and Felsberg, Michael and Forss\'{e}n, Per-Erik and Dellen, Babette},
title = {{Channel Coding for Joint Colour and Depth Segmentation}},
booktitle = {Proceedings of Pattern Recognition 33rd DAGM Symposium, Frankfurt/Main, Germany, August 31 - September 2},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6835},
pages = {306--315},
publisher = {Springer},
}
This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground-truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.
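The rectification idea can be sketched as follows: interpolate a camera rotation for the acquisition time of each image row and warp points to a common reference orientation. The sketch below uses SciPy's Slerp as a simple stand-in for the paper's spline-parametrised rotation curve, and the intrinsics and knot rotations are made-up placeholders.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# One knot rotation per frame (hypothetical values), placed in time at the
# last row of each frame; Slerp interpolates rotations for intermediate rows.
frame_times = np.array([0.0, 1.0, 2.0])
knots = Rotation.from_rotvec([[0.000, 0.000, 0.000],
                              [0.010, 0.002, 0.000],
                              [0.020, 0.001, -0.003]])
curve = Slerp(frame_times, knots)

def rectify_point(x, K, t_row, t_ref=0.0):
    """Map a point observed at row time t_row to the reference orientation:
    x' ~ K R(t_ref) R(t_row)^T K^-1 x (pure-rotation rolling-shutter model)."""
    H = K @ curve(t_ref).as_matrix() @ curve(t_row).as_matrix().T @ np.linalg.inv(K)
    xh = H @ np.array([x[0], x[1], 1.0])
    return xh[:2] / xh[2]

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])  # assumed intrinsics
print(rectify_point((100.0, 50.0), K, t_row=0.3))
```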
@inproceedings{diva2:441252,
author = {Ringaby, Erik and Forss\'{e}n, Per-Erik},
title = {{Rectifying rolling shutter video from hand-held devices}},
booktitle = {Proceedings of SSBA'11 Symposium on Image Analysis},
year = {2011},
}
A new approach to track bicycles from imagery sensor data is proposed. It is based on detecting ellipsoids in the images and treating these pair-wise using a dynamic bicycle model. One important application area is in automotive collision avoidance systems, where no dedicated systems for bicyclists yet exist and where very few theoretical studies have been published.
Possible conflicts can be predicted from the position and velocity state in the model, but also from the steering wheel articulation and roll angle that indicate yaw changes before the velocity vector changes. An algorithm is proposed which consists of an ellipsoid detection and estimation algorithm and a particle filter.
A simulation study of three critical single target scenarios is presented, and the algorithm is shown to produce excellent state estimates. An experiment using a stationary camera and the particle filter for state estimation is performed and has shown encouraging results.
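The state estimation step can be illustrated with a generic bootstrap particle filter. The sketch below uses a plain constant-velocity state and position measurements (such as those produced by the ellipse detector), whereas the bicycle model in the paper additionally carries steering and roll angles; all noise levels here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter(measurements, n=1000, dt=0.1, q=0.5, r=1.0):
    """Bootstrap particle filter over a constant-velocity state [x, y, vx, vy];
    the loop shows only the generic predict / weight / resample steps."""
    parts = rng.normal(0.0, 5.0, size=(n, 4))
    estimates = []
    for z in measurements:
        parts[:, :2] += dt * parts[:, 2:]                 # predict: constant-velocity motion
        parts += rng.normal(0.0, q, size=parts.shape)     # add process noise
        d2 = np.sum((parts[:, :2] - z) ** 2, axis=1)      # weight by measurement likelihood
        w = np.exp(-0.5 * d2 / r ** 2)
        w /= w.sum()
        estimates.append(w @ parts)                       # weighted-mean state estimate
        parts = parts[rng.choice(n, size=n, p=w)]         # resample
    return np.array(estimates)

zs = np.cumsum(rng.normal([0.5, 0.2], 0.3, size=(50, 2)), axis=0)
print(particle_filter(zs)[-1])
```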
@inproceedings{diva2:430910,
author = {Ardeshiri, Tohid and Larsson, Fredrik and Gustafsson, Fredrik and Schön, Thomas B. and Felsberg, Michael},
title = {{Bicycle Tracking Using Ellipse Extraction}},
booktitle = {Proceedings of the 14th International Conference on Information Fusion, 2011},
year = {2011},
pages = {1--8},
publisher = {IEEE},
}
Convolution kernels are a commonly used tool in computer vision. These kernels are often specified by an ideal frequency response, and the actual filter coefficients are obtained by minimizing some weighted distance with respect to the ideal filter. State-of-the-art approaches usually replace the continuous frequency response by a discrete Fourier spectrum with a number of samples that is large compared to the kernel size, depending on the smoothness of the ideal filter and the weight function. The number of samples in the Fourier domain grows exponentially with the dimensionality and becomes a bottleneck concerning memory requirements.
In this paper we propose a method that avoids the discretization of the frequency space and makes filter optimization feasible in higher dimensions than the standard approach. The result no longer depends on the choice of the sampling grid and remains exact even if the weighting function is singular at the origin. The resulting improper integrals are efficiently computed using Gauss-Jacobi quadrature.
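To make the quadrature step concrete, the sketch below uses SciPy's Gauss-Jacobi routine, which integrates against weights of the form (1-x)^alpha (1+x)^beta even when they are singular at an endpoint. The weight shown is purely illustrative and not the paper's filter-design weighting.

```python
import numpy as np
from scipy.special import roots_jacobi

# Gauss-Jacobi nodes/weights for f(x) * (1 - x)^(-1/2) on [-1, 1]; the
# singular factor is built into the rule, so f is only evaluated at
# interior nodes and the singularity never has to be sampled.
alpha, beta = -0.5, 0.0
nodes, weights = roots_jacobi(20, alpha, beta)

# Sanity check: with f == 1 the rule returns the integral of the weight
# itself, which is 2*sqrt(2) for (1 - x)^(-1/2) over [-1, 1].
print(np.sum(weights), 2.0 * np.sqrt(2.0))

# A smooth integrand is handled with the same nodes and weights.
print(np.sum(weights * np.cos(nodes)))
```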
@inproceedings{diva2:429706,
author = {Krebs, Andreas and Wiklund, Johan and Felsberg, Michael},
title = {{Optimization of Quadrature Filters Based on the Numerical Integration of Improper Integrals}},
booktitle = {Pattern Recognition},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6835},
pages = {91--100},
publisher = {Springer Berlin/Heidelberg},
}
Traffic sign recognition is important for the development of driver assistance systems and fully autonomous vehicles. Even though GPS navigation systems work well most of the time, there will always be situations when they fail. In these cases, robust vision based systems are required. Traffic signs are designed to have distinct colored fields separated by sharp boundaries. We propose to use locally segmented contours combined with an implicit star-shaped object model as prototypes for the different sign classes. The contours are described by Fourier descriptors. Matching of a query image to the sign prototype database is done by exhaustive search. This is done efficiently by using the correlation based matching scheme for Fourier descriptors and a fast cascaded matching scheme for enforcing the spatial requirements. We demonstrate state-of-the-art performance on a publicly available database.
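A minimal sketch of contour Fourier descriptors is given below: the contour is read as a complex signal, transformed with the FFT, and normalised for translation and scale. The normalisation details follow common practice and may differ from the exact matching scheme used in the paper.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=16):
    """Fourier descriptors of a closed 2-D contour (N x 2 array of points)."""
    z = contour[:, 0] + 1j * contour[:, 1]     # contour as a complex signal
    Z = np.fft.fft(z)
    Z[0] = 0.0                                 # drop DC term: translation invariance
    Z = Z / (np.abs(Z[1]) + 1e-12)             # normalise by first harmonic: scale invariance
    return np.concatenate([Z[1:n_coeffs // 2 + 1], Z[-n_coeffs // 2:]])

# Example: a sampled circle concentrates its energy in the first harmonic.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
print(np.round(np.abs(fourier_descriptors(circle)), 3))
```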
@inproceedings{diva2:428290,
author = {Larsson, Fredrik and Felsberg, Michael},
title = {{Using Fourier Descriptors and Spatial Models for Traffic Sign Recognition}},
booktitle = {Image Analysis},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6688},
pages = {238--249},
publisher = {Springer Berlin/Heidelberg},
}
Techniques from the theory of partial differential equations are often used to design filter methods that are locally adapted to the image structure. These techniques are usually applied to gray-value images. The extension to color images is non-trivial, and the choice of an appropriate color space is crucial. The RGB color space is often used, although it is known that the space of human color perception is best described in terms of non-Euclidean geometry, which is fundamentally different from the structure of the RGB space. Instead of the standard RGB space, we use a simple color transformation based on the theory of finite groups. It is shown that this transformation reduces the color artifacts originating from the diffusion processes on RGB images. The developed algorithm is evaluated on a set of real-world images, and it is shown that our approach exhibits fewer color artifacts compared to state-of-the-art techniques. In addition, our approach preserves image details over a larger number of iterations.
@inproceedings{diva2:424137,
author = {Åström, Freddie and Felsberg, Michael and Lenz, Reiner},
title = {{Color Persistent Anisotropic Diffusion of Images}},
booktitle = {Image Analysis},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6688},
pages = {262--272},
publisher = {Springer},
address = {Heidelberg},
}
This work employs data mining algorithms to discover visual entities that are strongly associated with autonomously discovered modes of action in an embodied agent. Mappings are learnt from these perceptual entities onto the agent's action space. In general, low dimensional action spaces are better suited to unsupervised learning than high dimensional percept spaces, allowing structure to be discovered in the action space and used to organise the perceptual space. Local feature configurations that are strongly associated with a particular ‘type’ of action (and not all other action types) are considered likely to be relevant in eliciting that action type. By learning mappings from these relevant features onto the action space, the system is able to respond in real time to novel visual stimuli. The proposed approach is demonstrated on an autonomous navigation task, and the system is shown to identify the visual entities relevant to the task and to generate appropriate responses.
@inproceedings{diva2:385379,
author = {Ellis, Liam and Felsberg, Michael and Bowden, Richard},
title = {{Affordance mining: Forming perception through action}},
booktitle = {Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)},
year = {2011},
series = {Lecture Notes in Computer Science},
volume = {6495},
pages = {525--538},
publisher = {Springer},
}
We present in this paper a research platform for development and evaluation of embodied visual object recognition strategies. The platform uses a stereoscopic peripheral-foveal camera system and a fast pan-tilt unit to perform saliency-based visual search. This is combined with a classification framework based on the bag-of-features paradigm with the aim of targeting, classifying and recognising objects. Interaction with the system is done via typed commands and speech synthesis. We also report the current classification performance of the system.
@inproceedings{diva2:441485,
author = {Wallenberg, Marcus and Forss\'{e}n, Per-Erik},
title = {{A Research Platform for Embodied Visual Object Recognition}},
booktitle = {Proceedings of SSBA 2010 Symposium on Image Analysis},
year = {2010},
series = {Centre for Image Analysis Report Series},
volume = {34},
pages = {137--140},
}
We study the problem of registering a sequence of scan lines (a strip) from an airborne push-broom imager to another sequence partly covering the same area. Such a registration has to compensate for deformations caused by attitude and speed changes in the aircraft. The registration is challenging, as both strips contain such deformations. Our algorithm estimates the 3D rotation of the camera for each scan line, by parametrising it as a linear spline with a number of knots evenly distributed in one of the strips. The rotations are estimated from correspondences between strips of the same area. Once the rotations are known, they can be compensated for, and each line of pixels can be transformed such that the ground traces of the two strips are registered with respect to each other.
@inproceedings{diva2:441244,
author = {Ringaby, Erik and Ahlberg, Jörgen and Forss\'{e}n, Per-Erik and Wadströmer, Niclas},
title = {{Co-alignment of Aerial Push-broom Strips using Trajectory Smoothness Constraints}},
booktitle = {SSBA10, Symposium on Image Analysis 11-12 March, Uppsala},
year = {2010},
pages = {63--66},
publisher = {Swedish Society for automated image analysis},
}
This paper presents a method for rectifying video sequences from rolling shutter (RS) cameras. In contrast to previous RS rectification attempts we model distortions as being caused by the 3D motion of the camera. The camera motion is parametrised as a continuous curve, with knots at the last row of each frame. Curve parameters are solved for using non-linear least squares over inter-frame correspondences obtained from a KLT tracker. We have generated synthetic RS sequences with associated ground-truth to allow controlled evaluation. Using these sequences, we demonstrate that our algorithm improves over two previously published methods. The RS dataset is available on the web to allow comparison with other methods.
@inproceedings{diva2:440573,
author = {Forss\'{e}n, Per-Erik and Ringaby, Erik},
title = {{Rectifying rolling shutter video from hand-held devices}},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010},
year = {2010},
pages = {507--514},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
}
We have performed a field trial with an airborne push-broom hyperspectral sensor, making several flights over the same area and with known changes (e.g., moved vehicles) between the flights. Each flight results in a sequence of scan lines forming an image strip, and in order to detect changes between two flights, the two resulting image strips must be geometrically aligned and radiometrically corrected. The focus of this paper is the geometrical alignment, and we propose an image- and gyro-based method for geometric co-alignment (registration) of two image strips. The method is particularly useful when the sensor is not stabilized, thus reducing the need for expensive mechanical stabilization. The method works in several steps, including gyro-based rectification, global alignment using SIFT matching, and a local alignment using KLT tracking. Experimental results are shown but not quantified, as ground truth is, by the nature of the trial, lacking.
@inproceedings{diva2:440482,
author = {Ringaby, Erik and Ahlberg, Jörgen and Wadströmer, Niclas and Forss\'{e}n, Per-Erik},
title = {{Co-aligning Aerial Hyperspectral Push-broom Strips for Change Detection}},
booktitle = {Proc. SPIE 7835, Electro-Optical Remote Sensing, Photonic Technologies, and Applications IV},
year = {2010},
series = {Proceedings Spie},
volume = {7835},
pages = {Art.nr. 7835B-36--},
publisher = {SPIE - International Society for Optical Engineering},
}
We have suggested a novel method, PRESTO-CAN, comprising radial sampling, filtering and reconstruction of k-space data for 3D-plus-time resolved MRI. The angular increment of the profiles was based on the golden ratio, but the number of angular positions N was locked to be a prime number, which guaranteed fixed angular positions. The time resolution increased dramatically when the profiles were partly removed from the k-space using the hourglass filter. We aim to utilize the MRI data for fMRI, where the echo times are long, TE ≈ 37-40 ms. This will result in field inhomogeneities and phase variations in the reconstructed images. Therefore, a new calibration and correction procedure was developed. We show that we are able to reconstruct images of the human brain with an image quality in line with what can be obtained by conventional Cartesian sampling. The pulse sequence for PRESTO-CAN was implemented by modifying an existing PRESTO sequence for Cartesian sampling. The effort involved was relatively small, and a great advantage is that we are able to use standard procedures for speeding up data acquisition, i.e. parallel imaging with SENSE.
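The golden-ratio profile ordering can be sketched as below. Snapping the golden-angle increments to N fixed angular positions, with N prime, is one simple way to realise the fixed-angle property described above; the paper's exact scheme may differ, and all parameter values are illustrative.

```python
import numpy as np

# Golden-angle increment for radial k-space profiles (in the spirit of
# Winkelmann et al.), about 111.246 degrees.
golden_angle = 180.0 / ((1.0 + np.sqrt(5.0)) / 2.0)

def profile_angles(n_profiles, n_positions=101):       # 101 is prime
    """Acquisition order of profile angles: successive profiles advance by the
    golden angle and are then snapped to one of N fixed angular positions."""
    raw = (np.arange(n_profiles) * golden_angle) % 180.0
    step = 180.0 / n_positions
    return (np.round(raw / step) * step) % 180.0

print(np.round(profile_angles(8), 2))
```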
@inproceedings{diva2:386056,
author = {Magnusson, Maria and Dahlqvist Leinhard, Olof and Brynolfsson, Patrik and Thyr, Per and Lundberg, Peter},
title = {{3D Magnetic Resonance Imaging of the Human Brain - Novel Radial Sampling, Filtering and Reconstruction}},
booktitle = {Proc of the 12th IASTED International Conference on Signal and Image Processing (SIP 2010), August 23 - 25, 2010, Lahaina, Maui, USA},
year = {2010},
series = {ACTA Press},
pages = {Track: 710-042--(8 pages)},
publisher = {ACTA Press},
address = {Calgary, AB, Canada},
}
We present a method for segmenting an arbitrary number of moving objects in image sequences using the geometry of 6 points in 2D to infer motion consistency. The method has been evaluated on the Hopkins155 database and surpasses current state-of-the-art methods such as SSC, both in terms of overall performance on two and three motions and in terms of maximum errors. The method works by finding initial clusters in the spatial domain, and then classifying each remaining point as belonging to the cluster that minimizes a motion consistency score. In contrast to most other motion segmentation methods that are based on an affine camera model, the proposed method is fully projective.
@inproceedings{diva2:376722,
author = {Zografos, Vasileios and Nordberg, Klas and Ellis, Liam},
title = {{Sparse motion segmentation using multiple six-point consistencies.}},
booktitle = {The 2nd International Workshop on Video Event Categorization, Tagging and Retrieval (VECTaR 2010)},
year = {2010},
series = {Lecture Notes in Computer Science},
volume = {6468},
pages = {338--348},
}
We propose a method for segmenting an arbitrary number of moving objects using the geometry of 6 points in 2D images to infer motion consistency. This geometry allows us to determine whether or not observations of 6 points over several frames are consistent with a rigid 3D motion. The matching between observations of the 6 points and an estimated model of their configuration in 3D space is quantified in terms of a geometric error derived from distances between the points and 6 corresponding lines in the image. This leads to a simple motion inconsistency score, based on the geometric errors of the 6 points, that in the ideal case should be zero when the motion of the points can be explained by a rigid 3D motion. Initial point clusters are determined in the spatial domain and merged in the motion trajectory domain based on this score. Each point is then assigned to the cluster which gives the lowest score. Our algorithm has been tested with real image sequences from the Hopkins155 database with very good results, competing with the state-of-the-art methods, particularly for degenerate motion sequences. In contrast to the motion segmentation methods based on multi-body factorization, which assume an affine camera model, the proposed method allows the mapping from 3D space to the 2D image to be fully projective.
@inproceedings{diva2:376712,
author = {Nordberg, Klas and Zografos, Vasileios},
title = {{Multibody motion segmentation using the geometry of 6 points in 2D images.}},
booktitle = {International Conference on Pattern Recognition},
year = {2010},
series = {International Conference on Pattern Recognition},
pages = {1783--1787},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
}
@inproceedings{diva2:358056,
author = {Felsberg, Michael and Larsson, Fredrik},
title = {{Learning object tracking in image sequences}},
booktitle = {International Conference on Cognitive Systems},
year = {2010},
}
A common computer vision task is navigation and mapping. Many indoor navigation tasks require depth knowledge of flat, unstructured surfaces (walls, floor, ceiling). With passive illumination only, this is an ill-posed problem. Inspired by small children using a torchlight, we use a spotlight for active illumination. Using our torchlight approach, depth and orientation estimation of unstructured, flat surfaces boils down to estimation of ellipse parameters. The extraction of ellipses is very robust and requires little computational effort.
@inproceedings{diva2:358053,
author = {Felsberg, Michael and Larsson, Fredrik and Wang, Han and Ynnerman, Anders and Schön, Thomas},
title = {{Torchlight Navigation}},
booktitle = {Proceedings of the 20th International Conference on Pattern Recognition},
year = {2010},
series = {International Conference on Pattern Recognition},
pages = {302--306},
}
Feature hierarchies are essential to many visual object recognition systems and are well motivated by observations in biological systems. The present paper proposes an algorithm to incrementally compute feature hierarchies. The features are represented as estimated densities, using a variant of local soft histograms. The kernel functions used for this estimation, in conjunction with their unitary extension, establish a tight frame, and results from framelet theory apply. Traversing the feature hierarchy requires resampling of the spatial and the feature bins. For the resampling, we derive a multi-resolution scheme for quadratic spline kernels and an optimization algorithm for the upsampling. We complement the theoretical results with illustrative experiments and considerations of convergence rate and computational efficiency.
@inproceedings{diva2:358047,
author = {Felsberg, Michael},
title = {{Incremental computation of feature hierarchies}},
booktitle = {Pattern Recognition},
year = {2010},
series = {Lecture Notes in Computer Science},
volume = {6376},
pages = {523--532},
publisher = {Springer Berlin/Heidelberg},
}
A common computer vision task is navigation and mapping. Many indoor navigation tasks require depth knowledge of flat, unstructured surfaces (walls, floor, ceiling). With passive illumination only, this is an ill-posed problem. Inspired by small children using a torchlight, we use a spotlight for active illumination. Using our torchlight approach, depth and orientation estimation of unstructured, flat surfaces boils down to estimation of ellipse parameters. The extraction of ellipses is very robust and requires little computational effort.
@inproceedings{diva2:342959,
author = {Felsberg, Michael and Larsson, Fredrik and Han, Wang and Ynnerman, Anders and Schön, Thomas},
title = {{Torch Guided Navigation}},
booktitle = {Proceedings of the 2010 SSBA Symposium},
year = {2010},
pages = {8--9},
}
@inproceedings{diva2:342954,
author = {Felsberg, Michael},
title = {{Efficient Computation of Feature Hierarchies using Framelets}},
booktitle = {Inverse Problems and Applications},
year = {2010},
}
@inproceedings{diva2:342953,
author = {Wiklund, Johan and Nordberg, Klas and Felsberg, Michael},
title = {{Software architecture and middleware for artificial cognitive systems}},
booktitle = {International Conference on Cognitive Systems},
year = {2010},
}
@inproceedings{diva2:342951,
author = {Hedborg, Johan and Felsberg, Michael},
title = {{Fast and Robust Relative Pose Estimation for Forward and Sideways Motions}},
booktitle = {SSBA},
year = {2010},
}
In this position paper, we seek to extend the layered perception-action paradigm for on-line learning such that it includes an explicit symbolic processing capability. By incorporating symbolic processing at the apex of the perception-action hierarchy in this way, we ensure that abstract symbol manipulation is fully grounded, without the necessity of specifying an explicit representational framework. In order to carry out this novel interfacing between symbolic and sub-symbolic processing, it is necessary to embed fuzzy first-order logic theorem proving within a variational framework. The online learning resulting from the corresponding Euler-Lagrange equations establishes an extended adaptability compared to the standard subsumption architecture. We discuss an application of this approach within the field of advanced driver assistance systems, demonstrating that a closed-form solution to the Euler-Lagrange optimization problem is obtainable for simple cases.
@inproceedings{diva2:342948,
author = {Felsberg, Michael and Shaukat, Affan and Windridge, David},
title = {{Online Learning in Perception-Action Systems}},
booktitle = {ECCV 2010 Workshop on Vision for Cognitive Tasks},
year = {2010},
}
Commonly, surveillance operators are today monitoring a large number of CCTV screens, trying to solve the complex cognitive tasks of analyzing crowd behavior and detecting threats and other abnormal behavior. Information overload is a rule rather than an exception. Moreover, CCTV footage lacks important indicators revealing certain threats, and can also in other respects be complemented by data from other sensors. This article presents an approach to automatically interpret sensor data and estimate behaviors of groups of people in order to provide the operator with relevant warnings. We use data from distributed heterogeneous sensors (visual cameras and a thermal infrared camera), and process the sensor data using detection algorithms. The extracted features are fed into a hidden Markov model in order to model normal behavior and detect deviations. We also discuss the use of radars for weapon detection.
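The deviation-detection step can be illustrated with the standard scaled forward algorithm for a hidden Markov model: feature sequences whose likelihood under the model of normal behaviour is low are flagged as deviations. The small discrete HMM below has made-up parameters and is not the trained model from the article.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm.
    pi: (S,) initial probabilities, A: (S, S) transitions,
    B: (S, K) emission probabilities over K discrete feature symbols."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

pi = np.array([0.7, 0.3])                     # illustrative parameters only
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
print(forward_loglik([0, 0, 1, 2, 2], pi, A, B))   # low values flag deviations
```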
@inproceedings{diva2:846261,
author = {Andersson, Maria and Rydell, Joakim and Ahlberg, Jörgen},
title = {{Estimation of crowd behaviour using sensor networks and sensor fusion}},
booktitle = {12th International Conference on Information Fusion (FUSION)},
year = {2009},
pages = {396--403},
publisher = {IEEE conference proceedings},
}
A single-camera gaze tracker has been created, based on previous implementations by Shih and Liu [5] and Hennessey et al. [3]. The method used is based on controlled infrared illumination. The implemented system has been evaluated on both synthetic and real image data and found to be capable of estimating gaze point with an accuracy of approximately 1° visual angle.
@inproceedings{diva2:441486,
author = {Wallenberg, Marcus},
title = {{A Simple Single-Camera Gaze Tracker using Infrared Illumination}},
booktitle = {Proceedings of SSBA 2009 Symposium on Image Analysis},
year = {2009},
pages = {53--56},
}
We compare the performance of two real-time multimedia communication systems for quality versus end-to-end delay. We develop an analytical framework for comparison when the systems use a deterministic time-varying channel. Moreover, we assess their performance for the Gilbert-Elliott channel model which alternates between a good and a bad state with time durations that are exponentially distributed. The goal of the paper is to select the best system with low average distortion while obeying a real-time constraint.
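A minimal simulation of the Gilbert-Elliott channel described above is sketched below, with exponentially distributed dwell times in the good and bad states. The rates and mean dwell times are illustrative placeholders, and the paper's analytical framework is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def gilbert_elliott(duration, rate_good, rate_bad, mean_good=1.0, mean_bad=0.2):
    """Two-state channel with exponentially distributed dwell times; returns
    the switching times and the channel rate in effect after each switch."""
    t, good = 0.0, True
    times, rates = [0.0], [rate_good]
    while t < duration:
        t += rng.exponential(mean_good if good else mean_bad)
        good = not good
        times.append(t)
        rates.append(rate_good if good else rate_bad)
    return np.array(times), np.array(rates)

times, rates = gilbert_elliott(10.0, rate_good=2.0e6, rate_bad=2.0e5)
avg_rate = np.sum(np.diff(times) * rates[:-1]) / times[-1]
print(f"time-averaged channel rate: {avg_rate:.3g} bit/s")
```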
@inproceedings{diva2:415217,
author = {Muhammad, Ajmal and Johansson, Peter and Forchheimer, Robert},
title = {{Effect of Buffer Placement on Performance When Communicating Over a Rate-Variable Channel}},
booktitle = {ICSNC 2009},
year = {2009},
publisher = {IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA},
}
Tensor valued data are frequently used in medical imaging. For a 3-dimensional second order tensor, such data imply at least six degrees of freedom for each voxel. The operator's ability to perceive this information is of utmost importance and in many cases a limiting factor for the interpretation of the data. In this paper we propose a decomposition of such tensor fields using the T-flash tensor glyphs that intuitively conveys important tensor features to a human observer. A Matlab implementation for visualization of single tensors is described in detail, and a VTK/ITK implementation for visualization of tensor fields has been developed as a Medical Studio component.
@inproceedings{diva2:355247,
author = {Wiklund, Johan and Nicolas, Vincent and Alface, Patrice R. and Andersson, Mats and Knutsson, Hans},
title = {{T-flash: Tensor Visualization in Medical Studio}},
booktitle = {Tensors in Image Processing and Computer Vision},
year = {2009},
series = {Advances in Pattern Recognition},
pages = {455--466},
publisher = {Springer London},
}
In this work we examine in detail the use of optimisation algorithms on deformable template matching problems. We start with the examination of simple, direct-search methods and move on to more complicated evolutionary approaches. Our goal is twofold: first, evaluate a number of methods examined under different template matching settings and introduce the use of certain, novel evolutionary optimisation algorithms to computer vision, and second, explore and analyse any additional advantages of using a hybrid approach over existing methods. We show that in computer vision tasks, evolutionary strategies provide very good choices for optimisation. Our experiments have also indicated that we can improve the convergence speed and results of existing algorithms by using a hybrid approach.
@inproceedings{diva2:280074,
author = {Zografos, Vasileios},
title = {{Comparison of Optimisation Algorithms for Deformable Template Matching}},
booktitle = {Advances in Visual Computing},
year = {2009},
series = {Lecture notes in computer science},
volume = {5876},
pages = {1097--1108},
publisher = {Springer},
address = {Berlin},
}
We present a new method for matching a region between an input and a query image, based on the P-channel representation of pixel-based image features such as grayscale and color information, local gradient orientation and local spatial coordinates. We introduce the concept of integral P-channels, which combines the concepts of P-channels and integral images. Using integral images, the P-channel representation of a given region is extracted with a few arithmetic operations. This enables a fast nearest-neighbor search over all possible target regions. We present extensive experimental results and show that our approach compares favorably to existing methods for region matching such as histograms or region covariance.
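The integral-image machinery behind the constant-time region extraction can be sketched as follows. In the paper one such summed-area table would be kept per P-channel plane; the sketch below shows the basic box-sum trick on a single plane.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row and left column, so any
    rectangular sum needs only four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1], computed in O(1) from the integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(20.0).reshape(4, 5)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 4), img[1:3, 1:4].sum())   # identical results
```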
@inproceedings{diva2:342995,
author = {Pagani, Alain and Stricker, Didier and Felsberg, Michael},
title = {{Integral P-channels for fast and robust region matching}},
booktitle = {2009 16th IEEE International Conference on Image Processing (ICIP)},
year = {2009},
pages = {213--216},
}
Linear scale-space theory is the fundamental building block for many approaches to image processing like pyramids or scale-selection. However, linear smoothing does not preserve image structures very well and thus non-linear techniques are mostly applied for image enhancement. A different perspective is given in the framework of channel-smoothing, where the feature domain is not considered as a linear space, but it is decomposed into local basis functions. One major drawback is the larger memory requirement for this type of representation, which is avoided if the channel representation is subsampled in the spatial domain. This general type of feature representation is called channel-coded feature map (CCFM) in the literature and a special case using linear channels is the SIFT descriptor. For computing CCFMs the spatial resolution and the feature resolution need to be selected.
In this paper, we focus on the spatio-featural scale-space from a scale-selection perspective. We propose a coupled scheme for selecting the spatial and the featural scales. The scheme is based on an analysis of lower bounds for the product of uncertainties, which is summarized in a theorem about a spatio-featural uncertainty relation. As a practical application of the derived theory, we reconstruct images from CCFMs with resolutions according to our theory. The results are very similar to the results of non-linear evolution schemes, but our algorithm has the fundamental advantage of being non-iterative. Any level of smoothing can be achieved with about the same computational effort.
@inproceedings{diva2:342992,
author = {Felsberg, Michael},
title = {{Spatio-featural scale-space}},
booktitle = {Swedish Symposium on Image Analysis - SSBA'2009, 18-20 March, Halmstad, Sweden},
year = {2009},
}
Fourier descriptors (FDs) are a classical but still popular method for contour matching. The key idea is to apply the Fourier transform to a periodic representation of the contour, which results in a shape descriptor in the frequency domain. Fourier descriptors have mostly been used to compare object silhouettes and object contours; we instead use this well-established machinery to describe local regions to be used in an object recognition framework. We extract local regions using the Maximally Stable Extremal Regions (MSER) detector and represent the external contour by FDs. Many approaches to matching FDs are based on the magnitude of each FD component, thus ignoring the information contained in the phase. Keeping the phase information requires us to take into account the global rotation of the contour and shifting of the contour samples. We show that the sum-of-squared differences of FDs can be computed without explicitly de-rotating the contours. We compare our correlation based matching against affine-invariant Fourier descriptors (AFDs) and demonstrate that our correlation based approach outperforms AFDs on real world data.
@inproceedings{diva2:276794,
author = {Larsson, Fredrik and Felsberg, Michael and Forss\'{e}n, Per-Erik},
title = {{Patch Contour Matching by Correlating Fourier Descriptors}},
booktitle = {Digital Image Computing: Techniques and Applications (DICTA)},
year = {2009},
pages = {40--46},
publisher = {IEEE Computer Society},
}
This paper describes a system that efficiently uses the KLT tracker together with a calibrated 5-point solver for structure-from-motion (SfM). Our system uses a GPU to perform tracking, and the CPU for SfM.
In this setup, it is advantageous to run the tracker both forwards and backwards in time, to detect incorrectly tracked points. We introduce a modification to the point selection inside the RANSAC step of the 5-point solver, and demonstrate how this speeds up the algorithm. Our evaluations are done using both real camera sequences and data from a state-of-the-art rendering engine with associated ground-truth.
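The forward-backward consistency idea mentioned above can be sketched with OpenCV's pyramidal Lucas-Kanade tracker: points are tracked forwards and then backwards, and points whose round-trip error exceeds a threshold are rejected. The threshold and the synthetic test images are illustrative choices rather than the paper's setup.

```python
import numpy as np
import cv2

def track_with_fb_check(prev_img, next_img, pts, fb_thresh=1.0):
    """Track points forwards then backwards with pyramidal LK and keep only
    points whose forward-backward error is below fb_thresh pixels."""
    pts = pts.astype(np.float32).reshape(-1, 1, 2)
    fwd, st1, _ = cv2.calcOpticalFlowPyrLK(prev_img, next_img, pts, None)
    back, st2, _ = cv2.calcOpticalFlowPyrLK(next_img, prev_img, fwd, None)
    fb_err = np.linalg.norm(pts - back, axis=2).ravel()
    good = (st1.ravel() == 1) & (st2.ravel() == 1) & (fb_err < fb_thresh)
    return fwd.reshape(-1, 2), good

# Synthetic textured frame pair related by a pure 3-pixel horizontal shift.
ys, xs = np.mgrid[0:120, 0:160].astype(np.float32)
prev = (127 + 60 * np.sin(xs / 7.0) * np.cos(ys / 5.0)).astype(np.uint8)
nxt = np.roll(prev, 3, axis=1)
new_pts, good = track_with_fb_check(prev, nxt, np.array([[80.0, 60.0], [40.0, 30.0]]))
print(new_pts[good])   # x-coordinates should come out roughly 3 pixels larger
```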
@inproceedings{diva2:271773,
author = {Hedborg, Johan and Forss\'{e}n, Per-Erik},
title = {{Fast and Accurate Ego-Motion Estimation}},
booktitle = {Swedish Symposium on Image Analysis - SSBA'2009, March 18-20, Halmstad, Sweden},
year = {2009},
}
Graphics processors have progressed rapidly in recent years, largely because of the demands of computer games for speed and realistic rendering. Because of the graphics processor's special architecture, it is much faster at solving parallel problems than a conventional processor. Due to its increasing generality, it is possible to use it for other tasks than those it was originally designed for.
Even though graphics processors have been programmable for some time, it has been quite difficult to learn how to use them. CUDA (Compute Unified Device Architecture) enables the programmer to use C code, with a few extensions, to program NVIDIA's graphics processors and completely skip the traditional programming models. This paper investigates whether the graphics processor can be used for calculations without knowledge of how the hardware mechanisms work. An image processing algorithm calculating the optical flow has been implemented. The results show that it is rather easy to implement programs using CUDA, but some knowledge of how the graphics processor works is required to achieve high performance.
@inproceedings{diva2:271770,
author = {Ringaby, Erik},
title = {{Optical Flow Computation on CUDA}},
booktitle = {SSBA},
year = {2009},
pages = {81--84},
}
Radial sampling of k-space is known to simultaneously provide both high spatial and high temporal resolution. Recently, an optimal radial profile time order based on the Golden Ratio was presented in [1]. We have adopted and modified the idea, with a focus on higher temporal resolution without sacrificing any image quality.
[1] Winkelmann et al.: An optimal radial profile order based on the golden ratio for time-resolved MRI, IEEE Trans. Med. Im., Vol. 26, No. 1, 2007.
@inproceedings{diva2:271769,
author = {Magnusson, Maria and Dahlqvist Leinhard, Olof and Brynolfsson, Patrik and Lundberg, Peter},
title = {{Improved temporal resolution in radial k-space sampling using an hourglass filter}},
booktitle = {ISMRM 17th Scientific Meeting \& Exhibition},
year = {2009},
address = {Honolulu, Hawaii, USA},
}
Radial sampling of k-space is known to simultaneously provide both high spatial and high temporal resolution. Recently, an optimal radial profile time order based on the Golden Ratio was presented in [1]. We have adopted and modified the idea, with a focus on higher temporal resolution without sacrificing any image quality.
[1] Winkelmann et al.: An optimal radial profile order based on the golden ratio for time-resolved MRI, IEEE Trans. Med. Im., Vol. 26, No. 1, 2007.
@inproceedings{diva2:271766,
author = {Magnusson, Maria and Dahlqvist Leinhard, Olof and Brynolfsson, Patrik and Lundberg, Peter},
title = {{Radial k-space sampling: step response using different filtering techniques}},
booktitle = {ISMRM Workshop on Data sampling and Image Reconstruction},
year = {2009},
address = {The Enchantment Resort, Sedona, Arizona, USA},
}
This paper describes a system for structure-and-motion estimation for real-time navigation and obstacle avoidance. We demonstrate a technique to increase the efficiency of the 5-point solution to the relative pose problem. This is achieved by a novel sampling scheme, where we add a distance constraint on the sampled points inside the RANSAC loop before calculating the 5-point solution. Our setup uses the KLT tracker to establish point correspondences across time in live video. We also demonstrate how an early outlier rejection in the tracker improves performance in scenes with plenty of occlusions. This outlier rejection scheme is well suited to implementation on graphics hardware. We evaluate the proposed algorithms using real camera sequences with fine-tuned bundle adjusted data as ground truth. To strengthen our results we also evaluate using sequences generated by state-of-the-art rendering software. On average we are able to reduce the number of RANSAC iterations by half and thereby double the speed.
@inproceedings{diva2:271764,
author = {Hedborg, Johan and Forss\'{e}n, Per-Erik and Felsberg, Michael},
title = {{Fast and Accurate Structure and Motion Estimation}},
booktitle = {International Symposium on Visual Computing},
year = {2009},
series = {Lecture Notes in Computer Science},
volume = {5875},
pages = {211--222},
publisher = {Springer-Verlag},
address = {Berlin Heidelberg},
}
The paper describes a minimal set of 18 parameters that can represent any trifocal tensor consistent with the internal constraints. 9 parameters describe three orthogonal matrices and 9 parameters describe 10 elements of a sparse tensor T' with 17 elements in well-defined positions equal to zero. Any valid trifocal tensor is then given as some specific T' transformed by the orthogonal matrices in the respective image domain. The paper also describes a simple approach for estimating the three orthogonal matrices in the case of a general 3 x 3 x 3 tensor, i.e., when the internal constraints are not satisfied. This can be used to accomplish a least squares approximation of a general tensor to a tensor that satisfies the internal constraints. This type of constraint enforcement, in turn, can be used to obtain an improved estimate of the trifocal tensor based on the normalized linear algorithm, with the constraint enforcement as a final step. This makes the algorithm more similar to the corresponding algorithm for estimation of the fundamental matrix. An experiment on synthetic data shows that the constraint enforcement of the trifocal tensor produces a significantly better result than without enforcement, expressed by the positions of the epipoles, given that the constraint enforcement is made in normalized image coordinates.
@inproceedings{diva2:271736,
author = {Nordberg, Klas},
title = {{A minimal parameterization of the trifocal tensor}},
booktitle = {IEEE Computer Science Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2009},
pages = {1224--1230},
}
This work presents a novel object tracking approach, where the motion model is learned from sets of frame-wise detections with unknown associations. We employ a higher-order Markov model on position space instead of a first-order Markov model on a high-dimensional state-space of object dynamics. Compared to the latter, our approach allows the use of marginal rather than joint distributions, which results in a significant reduction of computational complexity. Densities are represented using a grid-based approach, where the rectangular windows are replaced with estimated smooth Parzen windows sampled at the grid points. This method performs as accurately as particle filter methods, with the additional advantage that the prediction and update steps can be learned from empirical data. Our method is compared against standard techniques on image sequences obtained from an RC car following scenario. We show that our approach performs best in most of the sequences. Other potential applications are surveillance from cheap or uncalibrated cameras and image sequence analysis.
@inproceedings{diva2:342945,
author = {Felsberg, Michael and Larsson, Fredrik},
title = {{Learning Higher-Order Markov Models for Object Tracking in Image Sequences}},
booktitle = {Proceedings of the 5th International Symposium on Advances in Visual Computing: Part II},
year = {2009},
series = {Lecture Notes in Computer Science},
volume = {5876},
pages = {184--195},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
}
@inproceedings{diva2:271399,
author = {Larsson, Fredrik and Forss\'{e}n, Per-Erik and Felsberg, Michael},
title = {{Using Fourier descriptors for local region matching}},
booktitle = {SSBA},
year = {2009},
}
Recent years have seen advances in the estimation of full 6 degree-of-freedom object pose from a single 2D image. These advances have often been presented as a result of, or together with, a new local image descriptor. This paper examines how the performance for such a system varies with choice of local descriptor. This is done by comparing the performance of a full 6 degree-of-freedom pose estimation system for fourteen types of local descriptors. The evaluation is done on a database with photos of complex objects with simple and complex backgrounds and varying lighting conditions. From the experiments we can conclude that duplet features, that use pairs of interest points, improve pose estimation accuracy, and that affine covariant features do not work well in current pose estimation frameworks. The data sets and their ground truth is available on the web to allow future comparison with novel algorithms.
@inproceedings{diva2:265756,
author = {Viksten, Fredrik and Forss\'{e}n, Per-Erik and Johansson, Björn and Moe, Anders},
title = {{Comparison of Local Image Descriptors for Full 6 Degree-of-Freedom Pose Estimation}},
booktitle = {IEEE ICRA, 2009},
year = {2009},
pages = {2779--2786},
publisher = {IEEE Robotics and Automation Society},
address = {Kobe},
}
Linear scale-space theory is the fundamental building block for many approaches to image processing like pyramids or scale-selection. However, linear smoothing does not preserve image structures very well and thus non-linear techniques are mostly applied for image enhancement. A different perspective is given in the framework of channel-smoothing, where the feature domain is not considered as a linear space, but it is decomposed into local basis functions. One major drawback is the larger memory requirement for this type of representation, which is avoided if the channel representation is subsampled in the spatial domain. This general type of feature representation is called channel-coded feature map (CCFM) in the literature and a special case using linear channels is the SIFT descriptor. For computing CCFMs the spatial resolution and the feature resolution need to be selected. In this paper, we focus on the spatio-featural scale-space from a scale-selection perspective. We propose a coupled scheme for selecting the spatial and the featural scales. The scheme is based on an analysis of lower bounds for the product of uncertainties, which is summarized in a theorem about a spatio-featural uncertainty relation. As a practical application of the derived theory, we reconstruct images from CCFMs with resolutions according to our theory. The results are very similar to the results of non-linear evolution schemes, but our algorithm has the fundamental advantage of being non-iterative. Any level of smoothing can be achieved with about the same computational effort.
@inproceedings{diva2:216715,
author = {Felsberg, Michael},
title = {{Spatio-featural scale-space}},
booktitle = {Scale Space and Variational Methods in Computer Vision},
year = {2009},
series = {Lecture Notes in Computer Science},
volume = {5567},
pages = {808--819},
publisher = {Springer Berlin/Heidelberg},
}
The on-going EU funded project Prometheus (FP7-214901) aims at establishing a general framework which links fundamental sensing tasks to automated cognition processes enabling interpretation and short-term prediction of individual and collective human behaviours in unrestricted environments as well as complex human interactions. To achieve the aforementioned goals, the Prometheus consortium works on the following core scientific and technological objectives:
1. sensor modeling and information fusion from multiple, heterogeneous perceptual modalities;
2. modeling, localization, and tracking of multiple people;
3. modeling, recognition, and short-term prediction of continuous complex human behavior.
@inproceedings{diva2:846265,
author = {Ahlberg, Jörgen and Arsic, Dejan and Ganchev, Todor and Linderhed, Anna and Menezes, Paolo and Ntalampiras, Stavros and Olma, Tadeusz and Potamitis, Ilyas and Ros, Julien},
title = {{Prometheus: Prediction and interpretation of human behaviour based on probabilistic structures and heterogeneous sensors}},
booktitle = {European Conference on Artificial Intelligence (ECAI)},
year = {2008},
publisher = {European Coordinating Committee for Artificial Intelligence (ECCAI)},
}
Good data sets for evaluation of computer vision algorithms are important for the continued progress of the field. There exist good evaluation sets for many applications, but there are others for which good evaluation sets are harder to come by. One such example is feature tracking, where there is an obvious difficulty in the collection of data. Good evaluation data is important both for comparisons of different algorithms, and to detect weaknesses in a specific method. All image data is a result of light interacting with its environment. These interactions are so well modelled in rendering software that sometimes not even the sharpest human eye can tell the difference between reality and simulation. In this paper we thus propose to use a high quality rendering system to create evaluation data for sparse point correspondence trackers.
@inproceedings{diva2:343534,
author = {Hedborg, Johan and Forss\'{e}n, Per-Erik},
title = {{Synthetic Ground Truth for Feature Trackers}},
booktitle = {Swedish Symposium on Image Analysis 2008},
year = {2008},
}
Triangulation of a 3D point from two or more views can be solved in several ways depending on how perturbations in the image coordinates are dealt with. A common approach is optimal triangulation, which minimizes the total L2 reprojection error in the images, corresponding to finding a maximum likelihood estimate of the 3D point assuming independent Gaussian noise in the image spaces. Computational approaches for optimal triangulation have been published for the stereo case and, recently, also for the three-view case. In short, they solve an independent optimization problem for each 3D point, using relatively complex computations such as finding roots of high order polynomials or matrix decompositions. This paper discusses three-view triangulation and reports the following results: (1) the 3D point can be computed as a multi-linear mapping (tensor) applied to the homogeneous image coordinates, (2) the set of triangulation tensors forms a 7-dimensional space determined by the camera matrices, (3) given a set of corresponding 3D/2D calibration points, the 3D residual L1 errors can be optimized over the elements in the 7-dimensional space, (4) using the resulting tensor as initial value, the error can be further reduced by tuning the tensor in a two-step iterative process, (5) the 3D residual L1 error for a set of evaluation points which lie close to the calibration set is comparable to the three-view optimal method. In summary, three-view triangulation can be done by first performing an optimization of the triangulation tensor and, once this is done, triangulation can be made with a 3D residual error at the same level as the optimal method, but at a much lower computational cost. This makes the proposed method attractive for real-time three-view triangulation of large data sets, provided that the necessary calibration process can be performed.
@inproceedings{diva2:271746,
author = {Nordberg, Klas},
title = {{Efficient Three-view Triangulation Based on 3D Optimization}},
booktitle = {Proceedings of the British Machine Vision Conference 2008},
year = {2008},
pages = {19.1--19.10},
publisher = {BMVA Press},
}
Point-of-interest detection is a way of reducing the amount of data that needs to be processed in a certain application and is widely used in 2D image analysis. In 2D image analysis, point-of-interest detection is usually related to extraction of local descriptors for object recognition, classification, registration or pose estimation. In analysis of range data however, some local descriptors have been published in the last decade or so, but most of them do not mention any kind of point-of-interest detection. We here show how to use an extended Harris detector on range data and discuss variants of the Harris measure. All described variants of the Harris detector for 3D should also be usable in medical image analysis, but we focus on the range data case. We do present a performance evaluation of the described variants of the Harris detector on range data.
@inproceedings{diva2:265790,
author = {Viksten, Fredrik and Nordberg, Klas and Kalms, Mikael},
title = {{Point-of-Interest Detection for Range Data}},
booktitle = {International Conference on Pattern Recognition (ICPR)},
year = {2008},
series = {Pattern Recognition},
pages = {1--4},
publisher = {IEEE},
}
@inproceedings{diva2:265771,
author = {Larsson, Fredrik and Jonsson, Erik and Felsberg, Michael},
title = {{Learning Floppy Robot Control}},
booktitle = {SSBA,2008},
year = {2008},
pages = {39--42},
}
This paper studies the sequential object recognition problem faced by a mobile robot searching for specific objects within a cluttered environment. In contrast to current state-of-the-art object recognition solutions which are evaluated on databases of static images, the system described in this paper employs an active strategy based on identifying potential objects using an attention mechanism and planning to obtain images of these objects from numerous viewpoints. We demonstrate the use of a bag-of-features technique for ranking potential objects, and show that this measure outperforms geometric matching for invariance across viewpoints. Our system implements informed visual search by prioritising map locations and re-examining promising locations first. Experimental results demonstrate that our system is a highly competent object recognition system that is capable of locating numerous challenging objects amongst distractors.
@inproceedings{diva2:265763,
author = {Forss\'{e}n, Per-Erik and Meger, David and Lai, Kevin and Helmer, Scott and Little, James J. and Lowe, David G.},
title = {{Informed Visual Search: Combining Attention and Object Recognition}},
booktitle = {Proceedings - IEEE International Conference on Robotics and Automation},
year = {2008},
series = {Robotics and Automation},
pages = {935--942},
publisher = {IEEE Robotics and Automation Society},
address = {Pasadena},
}
This paper presents a method for triangulation of 3D points given their projections in two images. Recent results show that the triangulation mapping can be represented as a linear operator K applied to the outer product of corresponding homogeneous image coordinates, leading to a triangulation of very low computational complexity. K can be determined from the camera matrices, together with a so-called blind plane, but we show here that it can be further refined by a process similar to Gold Standard methods for camera matrix estimation. In particular it is demonstrated that K can be adjusted to minimize the Euclidean L2 residual 3D error, bringing it down to the same level as the optimal triangulation by Hartley and Sturm. The resulting K optimally fits a set of 2D+2D+3D data where the error is measured in the 3D space. Assuming that this calibration set is representative for a particular application, where later only the 2D points are known, this K can be used for triangulation of 3D points in an optimal way, which in addition is very efficient since the optimization need only be made once for the point set. The refinement of K is made by iteratively reducing errors in the 3D and 2D domains, respectively. Experiments on real data suggest that very few iterations are needed to accomplish useful results.
@inproceedings{diva2:265774,
author = {Nordberg, Klas},
title = {{Efficient Triangulation Based on 3D Euclidean Optimization}},
booktitle = {International Conference on Pattern Recognition (ICPR)},
year = {2008},
series = {IEEE Computer Society},
pages = {1--4},
publisher = {IEEE},
}
@inproceedings{diva2:265776,
author = {Nordberg, Klas},
title = {{Learning based on subspace voting}},
booktitle = {Swedish Symposium on Image Analysis (SSBA)},
year = {2008},
}
In pattern recognition, computer vision, and image processing, many approaches are based on second order operators. Well-known examples are second order networks, the 3D structure tensor for motion estimation, and the Harris corner detector. Quadratic operators form a subset of second order operators. It is less well known that every second order operator can be written as a weighted quadratic operator. The contribution of this paper is to propose an algorithm for converting an arbitrary second order operator into a quadratic operator. We apply the method to several examples from image processing and machine learning. The advantages of the alternative implementation by quadratic operators are two-fold: the underlying linear operators allow new insights into the theory of the respective second order operators, and replacing second order networks with sums of squares of linear networks significantly reduces the computational burden when the trained network is in the operation phase.
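The conversion can be made concrete with a small numerical check: a quadratic form x^T A x equals a weighted sum of squared linear filter responses, obtained from the eigendecomposition of the symmetric part of A. The operator below is a random stand-in rather than one of the trained networks or operators discussed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
A = rng.normal(size=(n, n))          # stand-in second order operator
A_sym = 0.5 * (A + A.T)              # the antisymmetric part contributes nothing
lam, V = np.linalg.eigh(A_sym)       # eigenvectors act as linear filters

x = rng.normal(size=n)               # stand-in signal patch
direct = x @ A @ x
as_sum_of_squares = np.sum(lam * (V.T @ x) ** 2)
print(direct, as_sum_of_squares)     # equal up to floating-point error
```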
@inproceedings{diva2:265738,
author = {Felsberg, Michael},
title = {{On Second Order Operators and Quadratic Operators}},
booktitle = {Proceedings - International Conference on Pattern Recognition},
year = {2008},
pages = {1--4},
publisher = {IEEE},
}
@inproceedings{diva2:265737,
author = {Felsberg, Michael and Larsson, Fredrik},
title = {{Learning Bayesian tracking for motion estimation}},
booktitle = {ECCV Workshop: Machine Learning for Vision-based Motion Analysis},
year = {2008},
}
@inproceedings{diva2:265741,
author = {Felsberg, Michael and Granlund, Gösta},
title = {{Fusing Dynamic Percepts and Symbols in Cognitive Systems}},
booktitle = {International Conference on Cognitive Systems},
year = {2008},
}
In this paper we present a novel numerical approximation scheme for anisotropic diffusion which is at the same time a special case of iterated adaptive filtering. By assuming a sufficiently smooth diffusion tensor field, we simplify the divergence term and obtain an evolution equation that is computed from a scalar product of the diffusion tensor and the Hessian. We further propose a set of filters to approximate the Hessian on a minimized spatial support. On standard benchmarks, the resulting method performs on average nearly as well as the best known denoising methods from the literature, although it is significantly faster and easier to implement. In a GPU implementation, video real-time performance is achieved for moderate noise levels.
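Under the smooth-tensor assumption, one evolution step reduces to the inner product of the diffusion tensor with the Hessian of the image. The sketch below implements this with plain central differences and a constant tensor; the paper instead uses optimised filters with minimal spatial support and a spatially adapted tensor, so this is only a schematic illustration.

```python
import numpy as np

def diffusion_step(u, D, dt=0.2):
    """One step of u += dt * <D, Hess(u)> with central differences; boundary
    rows/columns are left untouched for simplicity."""
    uxx = np.zeros_like(u); uyy = np.zeros_like(u); uxy = np.zeros_like(u)
    uxx[:, 1:-1] = u[:, 2:] - 2.0 * u[:, 1:-1] + u[:, :-2]
    uyy[1:-1, :] = u[2:, :] - 2.0 * u[1:-1, :] + u[:-2, :]
    uxy[1:-1, 1:-1] = 0.25 * (u[2:, 2:] - u[2:, :-2] - u[:-2, 2:] + u[:-2, :-2])
    return u + dt * (D[0, 0] * uxx + 2.0 * D[0, 1] * uxy + D[1, 1] * uyy)

rng = np.random.default_rng(0)
u = rng.normal(size=(64, 64))                  # noisy stand-in image
D = np.array([[1.0, 0.3], [0.3, 0.2]])         # illustrative anisotropic tensor
for _ in range(10):
    u = diffusion_step(u, D)
```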
@inproceedings{diva2:265740,
author = {Felsberg, Michael},
title = {{On the Relation Between Anisotropic Diffusion and Iterated Adaptive Filtering}},
booktitle = {Pattern Recognition},
year = {2008},
series = {Lecture Notes in Computer Science},
volume = {5096},
pages = {436--445},
publisher = {Springer Berlin/Heidelberg},
}
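As a rough illustration of the evolution equation mentioned in the abstract above, one explicit update step can be written as a pointwise scalar product between the diffusion tensor field and the image Hessian. The Hessian below is approximated with np.gradient rather than the minimized-support filters proposed in the paper, so this is a sketch of the idea, not the published scheme.

```python
import numpy as np

def diffusion_step(u, D, tau=0.1):
    """One explicit update u <- u + tau * <D, Hess(u)>.

    u : (H, W) image,  D : (H, W, 2, 2) smooth diffusion tensor field.
    """
    uy, ux = np.gradient(u)                # first derivatives (axis 0 = y, axis 1 = x)
    uyy, uyx = np.gradient(uy)             # second derivatives of uy
    uxy, uxx = np.gradient(ux)             # second derivatives of ux
    scalar_product = (D[..., 0, 0] * uxx + D[..., 1, 1] * uyy
                      + 0.5 * (D[..., 0, 1] + D[..., 1, 0]) * (uxy + uyx))
    return u + tau * scalar_product

# Isotropic special case (D = identity everywhere) reduces to linear diffusion.
img = np.random.rand(64, 64)
D_iso = np.broadcast_to(np.eye(2), img.shape + (2, 2)).copy()
smoothed = diffusion_step(img, D_iso)
```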
In this paper we address parallel-beam 2D computed tomography reconstruction. The proposed method belongs to the field of analytic reconstruction methods and is compared to several methods known in the field, among others the two-step Hilbert-transform method. In contrast to the latter, the derivative data is multiplied with an orientation vector and the Hilbert transform is replaced with the Riesz transform. Experimental results show that the new method is superior to established ones concerning aliasing, noise, and DC errors.
@inproceedings{diva2:262426,
author = {Felsberg, Michael},
title = {{A Novel two-step Method for CT Reconstruction}},
booktitle = {Bildverarbeitung für die Medizin},
year = {2008},
series = {Informatik aktuell},
pages = {303--307},
publisher = {Springer},
address = {Heidelberg},
}
A new projection operator is presented and evaluated. This operator has been designed to suppress aliasing artifacts due to (i) false high frequencies contained in the footprint function, and (ii) high frequencies caused by a divergent beam geometry. It is easy to implement and allows for efficient computer implementations. Instead of sampling the footprint as done in most projection operators, the footprint is integrated. This integration suppresses false high frequencies, i.e., frequency components that cause aliasing, and approximately takes into account the finite size of focus and detector. Two-dimensional parallel beam experiments are presented. These experiments confirm that artifacts due to false high frequencies can be suppressed by the proposed technique. In order to investigate the advantages for divergent beam geometries, current experiments must be complemented with cone-beam experiments.
@inproceedings{diva2:273870,
author = {Sunnegårdh, Johan and Danielsson, Per-Erik},
title = {{A new anti-aliased projection operator for iterative CT reconstruction}},
booktitle = {Proceedings of the Ninth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine, Lindau, Germany, July 9-13, 2007},
year = {2007},
}
We present a novel local descriptor for range data that can describe one or more planes or lines in a local region. It is possible to recover the geometry of the described local region and extract the size, position and orientation of each local plane or line-like structure from the descriptor. This gives the descriptor a property that other popular local descriptors for range data, such as spin images or point signatures, do not have. The estimation of the descriptor is dependent on estimation of surface normals but does not depend on the specific normal estimation method used. It is shown that it is possible to extract how many planar surface regions the descriptor represents and that this could be used as a point-of-interest detector.
@inproceedings{diva2:273829,
author = {Viksten, Fredrik and Nordberg, Klas},
title = {{A Geometry-Based Local Descriptor for Range Data}},
booktitle = {Proceedings of the 9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications},
year = {2007},
pages = {210--217},
publisher = {ACM},
}
@inproceedings{diva2:265766,
author = {Forss\'{e}n, Per-Erik},
title = {{Learning Saccadic Gaze Control via Motion Prediction}},
booktitle = {IEEE Canadian CRV,2007},
year = {2007},
publisher = {IEEE Computer Society},
address = {Montreal},
}
@inproceedings{diva2:265765,
author = {Helmer, Scott and Meger, David and Forss\'{e}n, Per-Erik and Southey, Tristram and McCann, Sancho and Fazli, Pooyan and Little, James J. and Lowe, David G.},
title = {{The UBC Semantic Robot Vision System}},
booktitle = {AAAI,2007},
year = {2007},
publisher = {AAAI Press},
address = {Vancouver},
}
@inproceedings{diva2:265757,
author = {Forss\'{e}n, Per-Erik and Lowe, David G.},
title = {{Shape Descriptors for Maximally Stable Extremal Regions}},
booktitle = {IEEE ICCV,2007},
year = {2007},
publisher = {IEEE Computer Society},
address = {Rio de Janeiro, Brazil},
}
@inproceedings{diva2:265758,
author = {Forss\'{e}n, Per-Erik and Lowe, David G.},
title = {{Maximally Stable Colour Regions for Recognition and Matching}},
booktitle = {IEEE CVPR,2007},
year = {2007},
publisher = {IEEE Computer Society},
address = {Minneapolis, USA},
}
@inproceedings{diva2:265780,
author = {Nordberg, Klas},
title = {{Point matching constraints in two and three views}},
booktitle = {Symposium of the German Association for Pattern Recognition (DAGM)},
year = {2007},
series = {LNCS},
volume = {4713},
publisher = {Springer},
address = {Berlin / Heidelberg},
}
A novel and computationally simple method is presented for triangulation of 3D points corresponding to the image coordinates in a pair of stereo images. The image points are described in terms of homogeneous coordinates which are jointly represented as the outer product of these homogeneous coordinates. This paper derives a linear transformation which maps the joint representation directly to the homogeneous representation of the corresponding 3D point in the scene. Compared to other triangulation methods this approach gives similar reconstruction error but is numerically faster, since it only requires linear operations. The proposed method is projective invariant in the same way as the optimal method of Hartley and Sturm. The method has a "blind plane"; a plane through the camera focal points which cannot be reconstructed by this method. For "forward-looking" camera configurations, however, the blind plane can be placed outside the visible scene and does not constitute a problem.
@inproceedings{diva2:265778,
author = {Nordberg, Klas},
title = {{A linear mapping for stereo triangulation}},
booktitle = {Scandiavian Conference on Image Analysis (SCIA)},
year = {2007},
series = {LNCS},
volume = {4522},
publisher = {Springer},
address = {Berlin / Heidelberg},
}
A single-view matching constraint is described which represents a necessary condition which 6 points in an image must satisfy if they are the images of 6 known 3D points under an arbitrary projective transformation. Similar to the well-known matching constraints for two or more views, represented by fundamental matrices or trifocal tensors, single-view matching constraints are represented by tensors, and when multiplied with the homogeneous image coordinates the result vanishes when the condition is satisfied. More precisely, they are represented by 6-th order tensors on ℝ³ which can be computed in a simple manner from the camera projection matrix and the 6 3D points. The single-view matching constraints can be used for finding correspondences between detected 2D feature points and known 3D points, e.g., on an object, which are observed from arbitrary views. Consequently, this type of constraint can be said to be a representation of 3D shape (in the form of a point set) which is invariant to projective transformations when projected onto a 2D image.
@inproceedings{diva2:265779,
author = {Nordberg, Klas},
title = {{Single-View Matching Constraints}},
booktitle = {Advances in Visual Computing},
year = {2007},
series = {Lecture Notes in Computer Science},
volume = {4842},
pages = {397--406},
publisher = {Springer},
address = {Berlin/Heidelberg},
}
The paper presents a method for projection generation through a 2-D pixel image or a 3-D voxel volume. During the design of the method, we have strived to apply knowledge from signal processing theory. Introductory experiments, where the projection generation method was used in an iterative CT reconstruction loop, indicate that the method is sound. Our hope is that the method could be applied in many different connections, where one task is to compute projections through a 2-D pixel image or a 3-D voxel volume. In the future we plan to do more experiments, both in 2-D and 3-D, which hopefully further demonstrate the usefulness of the method.
@inproceedings{diva2:263517,
author = {Magnusson, Maria},
title = {{Projection generation through voxel volumes considering signal processing theory}},
booktitle = {Fully 3D 2007, Ninth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine,2007},
year = {2007},
}
@inproceedings{diva2:262474,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Accurate Interpolation in Appearance-Based Pose Estimation}},
booktitle = {Svenska Sällskapet för Automatiserad Bildanalys SSBA Symposium,2007},
year = {2007},
pages = {13--16},
}
The motion field from image sequences of a dynamic 3D scene is in general piecewise continuous. Since two neighbouring regions may have completely different motions, motion estimation at the discontinuities is problematic. In particular spatial averaging of motion vectors is inappropriate at such positions. We avoid this problem by channel encoding brightness change constraint equations (BCCE) for each spatial position into a channel matrix. By spatial averaging of this channel representation and subsequently decoding we are able to estimate all significantly different motions occurring at the discontinuity, as well as their covariances. This paper extends and improves this multiple motion estimation scheme by locally selecting the appropriate scale for the spatial averaging.
@inproceedings{diva2:262437,
author = {Forss\'{e}n, Per-Erik and Spies, Hagen},
title = {{Multiple Motion Estimation using Channel Matrices}},
booktitle = {International Workshop on Complex Motion IWCM,2004},
year = {2007},
pages = {54--},
publisher = {Springer},
series = {LNCS},
volume = {3417},
}
@inproceedings{diva2:262431,
author = {Felsberg, Michael},
title = {{Extending Graph-Cut to Continuous Value Domain Minimization}},
booktitle = {SSBA,2007},
year = {2007},
}
@inproceedings{diva2:261669,
author = {Larsson, Fredrik and Jonsson, Erik and Felsberg, Michael},
title = {{Visual Servoing Based on Learned Inverse Kinematics}},
booktitle = {SSBA,2007},
year = {2007},
pages = {21--24},
}
One major goal of the COSPAL project is to develop an artificial cognitive system architecture with the capability of exploratory learning. Exploratory learning is a strategy that allows generalization to be applied on a conceptual level, resulting in an extension of competences. Whereas classical learning methods aim at best possible generalization, i.e., concluding from a number of samples of a problem class to the problem class itself, exploration aims at applying acquired competences to a new problem class. Incremental or online learning is an inherent requirement to perform exploratory learning.
Exploratory learning requires new theoretic tools and new algorithms. In the COSPAL project, we mainly investigate reinforcement-type learning methods for exploratory learning and in this paper we focus on its algorithmic aspect. Learning is performed in terms of four nested loops, where the outermost loop reflects the user-reinforcement-feedback loop, the intermediate two loops switch between different solution modes at the symbolic and sub-symbolic levels, respectively, and the innermost loop executes the acquired competences in terms of perception-action cycles. We present a system diagram which explains this process in more detail.
We discuss the learning strategy in terms of learning scenarios provided by the user. This interaction between user ('teacher') and system is a major difference to most existing systems where the system designer places his world model into the system. We believe that this is the key to extendable robust system behavior and successful interaction of humans and artificial cognitive systems.
We furthermore address the issue of bootstrapping the system, and, in particular, the visual recognition module. We give some more in-depth details about our recognition method and how feedback from higher levels is implemented. The described system is however work in progress and no final results are available yet. The preliminary results that we have achieved so far clearly point towards a successful proof of the architecture concept.
@inproceedings{diva2:260360,
author = {Felsberg, Michael and Wiklund, Johan and Jonsson, Erik and Moe, Anders and Granlund, Gösta},
title = {{Exploratory Learning Structure in Artificial Cognitive Systems}},
booktitle = {International Cognitive Vision Workshop},
year = {2007},
publisher = {eCollections},
address = {Bielefeld},
}
@inproceedings{diva2:260357,
author = {Larsson, Fredrik and Jonsson, Erik and Felsberg, Michael},
title = {{Visual Servoing for Floppy Robots using LWPR}},
booktitle = {RoboMat,2007},
year = {2007},
}
This paper explores the possibility of using a single low-resolution FIR camera for detection of pedestrians in the near zone in front of a vehicle. A low resolution sensor reduces the cost of the system, as well as the amount of data that needs to be processed in each frame.
We present a system that makes use of hot-spots and image positions of a near constant bearing to detect potential pedestrians. These detections provide seeds for an energy minimization algorithm that fits a pedestrian model to the detection. Since false alarms are hard to tolerate, the pedestrian model is then tracked, and the distance-to-collision (DTC) is measured by integrating size change measurements at sub-pixel accuracy, and the car velocity. The system should only engage braking for detections on a collision course, with a reliably measured DTC.
Preliminary experiments on a number of recorded near collision sequences indicate that our method may be useful for ranges up to about 10m using an 80x60 sensor, and somewhat more using a 160x120 sensor. We also analyze the robustness of the evaluated algorithm with respect to dead pixels, a potential problem for low-resolution sensors.
@inproceedings{diva2:260359,
author = {Källhammer, Jan-Erik and Eriksson, Dick and Granlund, Gösta and Felsberg, Michael and Moe, Anders and Johansson, Björn and Wiklund, Johan and Forss\'{e}n, Per-Erik},
title = {{Near Zone Pedestrian Detection using a Low-Resolution FIR Sensor}},
booktitle = {Intelligent Vehicles Symposium, 2007 IEEE},
year = {2007},
series = {Intelligent Vehicles Symposium},
publisher = {IEEE},
address = {Istanbul, Turkey},
}
@inproceedings{diva2:259101,
author = {Felsberg, Michael},
title = {{Extending Graph-Cut to Continuous Value Domain Minimization}},
booktitle = {Canadian Conference on Computer and Robot Vision,2007},
year = {2007},
pages = {274--},
publisher = {IEEE},
address = {Los Alamitos, CA, USA},
}
The estimation of a patch position in an image is a long established but still relevant topic with many applications, e.g. pose estimation and tracking in image sequences. In most systems the position estimate needs to be fused with other estimates, and hence, covariance information is required to weight the different estimates in the right way. In this paper we address the issue with covariance estimation in the case of sum of absolute difference (SAD) block matching. First, we derive the theory for covariance estimation in the case of SAD matching. Second, we evaluate the suggested method in a virtual 3D patch tracking scenario in order to verify the performance in real-world scenarios.
@inproceedings{diva2:259100,
author = {Skoglund, Johan and Felsberg, Michael},
title = {{Covariance estimation for SAD block matching}},
booktitle = {Image Analysis},
year = {2007},
series = {Lecture Notes in Computer Science},
volume = {4522},
pages = {374--382},
publisher = {Springer Berlin/Heidelberg},
}
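The following sketch pairs plain SAD block matching with a generic uncertainty estimate obtained by inverting the finite-difference curvature of the cost surface at its minimum. This curvature-based covariance is a common stand-in and is not the dedicated derivation given in the paper; it only illustrates how a position estimate and a weighting covariance can be produced together.

```python
import numpy as np

def sad_match_with_covariance(patch, region, search=8):
    """Match 'patch' inside 'region' (sized patch + 2*search per dimension).

    Returns the integer displacement of the best match and a 2x2 covariance
    proxy from the curvature of the SAD cost surface (sharper minimum ->
    smaller covariance). Assumes the minimum is not on the search border.
    """
    ph, pw = patch.shape
    size = 2 * search + 1
    costs = np.empty((size, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            win = region[search + dy: search + dy + ph,
                         search + dx: search + dx + pw]
            costs[dy + search, dx + search] = np.abs(win - patch).sum()
    iy, ix = np.unravel_index(np.argmin(costs), costs.shape)
    iy = int(np.clip(iy, 1, size - 2))     # keep finite differences in range
    ix = int(np.clip(ix, 1, size - 2))
    cyy = costs[iy + 1, ix] - 2 * costs[iy, ix] + costs[iy - 1, ix]
    cxx = costs[iy, ix + 1] - 2 * costs[iy, ix] + costs[iy, ix - 1]
    cxy = (costs[iy + 1, ix + 1] - costs[iy + 1, ix - 1]
           - costs[iy - 1, ix + 1] + costs[iy - 1, ix - 1]) / 4.0
    curvature = np.array([[cyy, cxy], [cxy, cxx]])
    cov = np.linalg.inv(curvature + 1e-9 * np.eye(2))
    return (iy - search, ix - search), cov

# Toy usage: the patch is taken from the region at a known offset of (3, -2).
rng = np.random.default_rng(1)
full = rng.random((48, 48))
patch = full[19:35, 14:30]
region = full[8:40, 8:40]
print(sad_match_with_covariance(patch, region))
```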
One problem in appearance-based pose estimation is the need for many training examples, i.e. images of the object in a large number of known poses. Some invariance can be obtained by considering translations, rotations and scale changes in the image plane, but the remaining degrees of freedom are often handled simply by sampling the pose space densely enough. This work presents a method for accurate interpolation between training views using local linear models. As a view representation local soft orientation histograms are used. The derivative of this representation with respect to the image plane transformations is computed, and a Gauss-Newton optimization is used to optimize all pose parameters simultaneously, resulting in an accurate estimate.
@inproceedings{diva2:259102,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Accurate Interpolation in Appearance-Based Pose Estimation}},
booktitle = {Image Analysis},
year = {2007},
series = {Lecture Notes in Computer Science},
volume = {4522},
pages = {1--10},
publisher = {Springer Berlin/Heidelberg},
}
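The simultaneous optimization of all pose parameters mentioned above is a standard Gauss-Newton scheme. The generic sketch below shows the update rule on a toy curve-fitting problem; in the paper the residual is the difference between the query view representation and the interpolated local linear model of training views, and the Jacobian involves derivatives of the soft orientation histograms, neither of which is reproduced here.

```python
import numpy as np

def gauss_newton(residual, jacobian, p0, iterations=10):
    """Minimize ||residual(p)||^2 with Gauss-Newton updates."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iterations):
        r = residual(p)                          # residual vector at current parameters
        J = jacobian(p)                          # Jacobian of the residual
        p = p - np.linalg.solve(J.T @ J, J.T @ r)
    return p

# Toy usage: fit y = a * exp(b * x) to noisy samples.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.5 * x) + 0.01 * np.random.randn(50)
res = lambda p: p[0] * np.exp(p[1] * x) - y
jac = lambda p: np.stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)], axis=1)
print(gauss_newton(res, jac, [1.0, -1.0]))       # converges towards (2.0, -1.5)
```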
Although not often expressed as a modeling problem, neither projection nor back-projection can be designed without certain insights into the physics of CT. However, most of this insight is left aside, since it is generally believed that only the most simplified models can be included in the innermost time-consuming loop in projection and back-projection. We propose that any linear projection procedure should model three functions: The irradiation function, the footprint/basis function, and the gantry rotation function. We demonstrate how a moderately advanced modeling of these three functions can be brought together in an interpolation procedure and yield a surprisingly efficient inner loop interpolation. To this end we (i) carefully select a locus of the interpolation path through image and projection data spaces and (ii) execute multiple convolution as integration by parts implemented by table look-up.
@inproceedings{diva2:246112,
author = {Danielsson, Per-Erik and Sunnegårdh, Johan},
title = {{Advanced linear modeling and interpolation in CT-reconstruction}},
booktitle = {Proceedings of the Ninth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine, Lindau, Germany, July 9-13, 2007},
year = {2007},
}
In this paper we propose a new approach to real-time view-based object recognition and scene registration. Object recognition is an important sub-task in many applications, such as robotics, retrieval, and surveillance. Scene registration is particularly useful for identifying camera views in databases or video sequences. All of these applications require a fast recognition process and the possibility to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm extracts a number of basic, intensity invariant image features, encodes them into P-channels, and compares the query P-channels to a set of prototype P-channels in a database. The algorithm is applied in a cross-validation experiment on the COIL database, resulting in nearly ideal ROC curves. Furthermore, results from scene registration with a fish-eye camera are presented.
@inproceedings{diva2:241583,
author = {Felsberg, Michael and Hedborg, Johan},
title = {{Real-Time Visual Recognition of Objects and Scenes Using P-Channel Matching}},
booktitle = {Proceedings 15th Scandinavian Conference on Image Analysis},
year = {2007},
series = {Lecture Notes in Computer Science},
volume = {4522},
pages = {908--917},
publisher = {Springer},
address = {Berlin, Heidelberg},
}
The GPU is the main processing unit on a graphics card. A modern GPU typically provides more than ten times the computational power of an ordinary PC processor. This is a result of the high demands for speed and image quality in computer games. This paper investigates the possibility of exploiting this computational power for tracking points in image sequences. Tracking points is used in many computer vision tasks, such as tracking moving objects, structure from motion, face tracking, etc. The algorithm was successfully implemented on the GPU and a large speed-up was achieved.
@inproceedings{diva2:241567,
author = {Hedborg, Johan and Skoglund, Johan and Felsberg, Michael},
title = {{KLT Tracking Implementation on the GPU}},
booktitle = {Proceedings SSBA 2007},
year = {2007},
}
@inproceedings{diva2:265750,
author = {Felsberg, Michael},
title = {{Optical flow estimation from monogenic phase.}},
booktitle = {International Workshop on Complex Motion,2004},
year = {2006},
publisher = {Springer},
}
@inproceedings{diva2:265748,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Soft Histograms for Belief Propagation}},
booktitle = {ECCV Workhop of the Representation and Use of Prior Knowledge in Vision,2006},
year = {2006},
}
@inproceedings{diva2:265747,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Correspondence-Free Associative Learning}},
booktitle = {ICPR,2006},
year = {2006},
}
@inproceedings{diva2:265746,
author = {Felsberg, Michael and Granlund, Gösta},
title = {{P-Channels:
Robust Multivariate M-Estimation of Large Datasets}},
booktitle = {ICPR,2006},
year = {2006},
}
@inproceedings{diva2:265745,
author = {Skoglund, Johan and Felsberg, Michael},
title = {{Evaluation of Subpixel Tracking Algorithms}},
booktitle = {International Symposium on Visual Computing,2006},
year = {2006},
pages = {375--},
}
In order to insert a virtual object into a TV image, the graphics system needs to know precisely how the camera is moving, so that the virtual object can be rendered in the correct place in every frame. Nowadays this can be achieved relatively easily in postproduction, or in a studio equipped with a special tracking system. However, for live shooting on location, or in a studio that is not specially equipped, installing such a system can be difficult or uneconomic. To overcome these limitations, the MATRIS project is developing a real-time system for measuring the movement of a camera. The system uses image analysis to track naturally occurring features in the scene, and data from an inertial sensor. No additional sensors, special markers, or camera mounts are required. This paper gives an overview of the system and presents some results.
@inproceedings{diva2:259695,
author = {Chandaria, Jigna and Thomas, Graham and Bartczak, Bogumil and Koeser, Kevin and Koch, Reinhard and Becker, Mario and Bleser, Gabriele and Stricker, Didier and Wohlleber, Cedric and Felsberg, Michael and Gustafsson, Fredrik and Hol, Jeroen and Schön, Thomas and Skoglund, Johan and Slycke, Per and Smeitz, Sebastiaan},
title = {{Real-Time Camera Tracking in the MATRIS Project}},
booktitle = {Proceedings of the 2006 International Broadcasting Convention},
year = {2006},
}
In this paper we make use of the idea that a robot can autonomously discover objects and learn their appearances by poking and prodding at interesting parts of a scene. In order to make the resultant object recognition ability more robust, and discriminative, we replace earlier used colour histogram features with an invariant texture-patch method. The texture patches are extracted in a similarity invariant frame which is constructed from short colour contour segments. We demonstrate the robustness of our invariant frames with a repeatability test under general homography transformations of a planar scene. Through the repeatability test, we find that defining the frame using ellipse segments instead of lines, where this is appropriate, improves repeatability. We also apply the developed features to autonomous learning of object appearances, and show how the learned objects can be recognised under out-of-plane rotation and scale changes.
@inproceedings{diva2:258028,
author = {Forss\'{e}n, Per-Erik and Moe, Anders},
title = {{Autonomous Learning of Object Appearances using Colour Contour Frames}},
booktitle = {3rd Canadian Conference on Computer and Robot Vision, CRV06, Qu\'{e}bec City, Qu\'{e}bec, Canada},
year = {2006},
pages = {3--3},
publisher = {IEEE Computer Society},
address = {Québec, Canada},
}
We have developed a system which integrates the information output from several pose estimation algorithms and from several views of the scene. It is tested in a real setup with a robotic manipulator. It is shown that integrating pose estimates from several algorithms increases the overall performance of the pose estimation accuracy as well as the robustness as compared to using only a single algorithm. It is shown that increased robustness can be achieved by using pose estimation algorithms based on complementary features, so called algorithmic multi-cue integration (AMC). Furthermore it is also shown that increased accuracy can be achieved by integrating pose estimation results from different views of the scene, so-called temporal multi-cue integration (TMC). Temporal multi-cue integration is the most interesting aspect of this paper.
@inproceedings{diva2:258029,
author = {Viksten, Fredrik and Söderberg, Robert and Nordberg, Klas and Perwass, Christian},
title = {{Increasing Pose Estimation Performance using Multi-cue Integration}},
booktitle = {IEEE International Conference on Robotic and Automation (ICRA)},
year = {2006},
series = {Robotics and Automation},
pages = {3760--3767},
publisher = {IEEE},
}
@inproceedings{diva2:257166,
author = {Forss\'{e}n, Per-Erik and Johansson, Björn and Granlund, Gösta},
title = {{Channel Associative Networks for Multiple Valued Mappings}},
booktitle = {2nd International Cognitive Vision Workshop},
year = {2006},
pages = {4--11},
}
@inproceedings{diva2:245985,
author = {Johansson, Björn and Wiklund, Johan and Granlund, Gösta},
title = {{Goals and status within the IVSS project}},
booktitle = {Seminar on "Cognitive vision in traffic analyses"},
year = {2006},
}
Traditionally, quadrature filters and derivatives have been considered as alternative approaches to low-level image analysis. In this paper we show that there actually exist close connections: We define the quadrature-based boundary tensor and the derivative-based gradient energy tensor which exhibit very similar behavior. We analyse the reason for this and determine how to minimize the difference. These insights lead to a simple and very efficient integrated feature detection algorithm.
@inproceedings{diva2:269099,
author = {Köthe, Ullrich and Felsberg, Michael},
title = {{Riesz-transforms versus derivatives:
On the relationship between the boundary tensor and the energy tensor}},
booktitle = {Scale Space and PDE Methods in Computer Vision},
year = {2005},
series = {Lecture Notes in Computer Science},
volume = {3459},
pages = {179--191},
}
In this paper we propose a new operator which combines advantages of monogenic scale-space and Gaussian scale-space, of the monogenic signal and the structure tensor. The gradient energy tensor (GET) defined in this paper is based on Gaussian derivatives up to third order using different scales. These filters are commonly available, separable, and have an optimal uncertainty. The response of this new operator can be used like the monogenic signal to estimate the local amplitude, the local phase, and the local orientation of an image, but it also allows measuring the coherence of image regions as in the case of the structure tensor. Both theoretically and in experiments the new approach compares favourably with existing methods.
@inproceedings{diva2:269100,
author = {Felsberg, Michael and Köthe, Ullrich},
title = {{GET:
The connection between monogenic scale-space and Gaussian derivatives}},
booktitle = {Scale Space and PDE Methods in Computer Vision},
year = {2005},
series = {Lecture Notes in Computer Science},
volume = {3459},
pages = {192--203},
}
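The GET operator above is assembled from Gaussian derivative responses up to third order at different scales. The sketch below only shows how such responses can be obtained with separable filters (here via scipy.ndimage); combining them into the tensor itself follows the formulas in the paper and is not reproduced.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_derivatives(image, sigma, max_order=3):
    """Return Gaussian derivative responses with total order <= max_order.

    Keys give the per-axis derivative orders, e.g. '10' is the first
    derivative along y (axis 0) and zeroth along x (axis 1).
    """
    responses = {}
    for oy in range(max_order + 1):
        for ox in range(max_order + 1 - oy):
            responses[f"{oy}{ox}"] = gaussian_filter(image, sigma, order=(oy, ox))
    return responses

img = np.random.rand(128, 128)
derivs_fine = gaussian_derivatives(img, sigma=1.0)     # finer scale
derivs_coarse = gaussian_derivatives(img, sigma=2.0)   # coarser scale
```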
The channel representation allows the construction of soft histograms, where peaks can be detected with a much higher accuracy than in regular hard-binned histograms. This is critical in e.g. reducing the number of bins of generalized Hough transform methods. When applying the maximum entropy method to the channel representation, a minimum-information reconstruction of the underlying continuous probability distribution is obtained. The maximum entropy reconstruction is compared to simpler linear methods in some simulated situations. Experimental results show that mode estimation of the maximum entropy reconstruction outperforms the linear methods in terms of quantization error and discrimination threshold. Finding the maximum entropy reconstruction is however computationally more expensive.
@inproceedings{diva2:269086,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Reconstruction of probability density functions from channel representations}},
booktitle = {Scandinavian Conference on Image Analysis},
year = {2005},
series = {Lecture Notes in Computer Science},
volume = {3540},
pages = {491--500},
}
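A minimal illustration of the soft histograms discussed above: scalar samples are encoded with overlapping channel basis functions and accumulated, after which a peak can be located with sub-bin accuracy. The cos^2 kernel is one standard channel basis, and the quadratic peak interpolation below is a simple linear-style decoding; the maximum-entropy reconstruction studied in the paper is more involved and is not shown.

```python
import numpy as np

def channel_encode(samples, n_channels, lo, hi):
    """Encode scalar samples into a soft histogram with overlapping cos^2 channels."""
    centers = np.linspace(lo, hi, n_channels)
    spacing = centers[1] - centers[0]
    d = (samples[:, None] - centers[None, :]) / spacing    # distance in channel units
    resp = np.where(np.abs(d) < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)
    return resp.sum(axis=0), centers

def peak_estimate(hist, centers):
    """Sub-bin peak location via quadratic interpolation around the maximum bin."""
    k = int(np.clip(np.argmax(hist), 1, len(hist) - 2))
    den = hist[k - 1] - 2 * hist[k] + hist[k + 1]
    offset = 0.5 * (hist[k - 1] - hist[k + 1]) / den if den != 0 else 0.0
    return centers[k] + offset * (centers[1] - centers[0])

samples = np.concatenate([np.random.normal(2.0, 0.1, 500),    # dominant mode near 2.0
                          np.random.uniform(0.0, 10.0, 100)]) # uniform outliers
hist, centers = channel_encode(samples, n_channels=16, lo=0.0, hi=10.0)
print(peak_estimate(hist, centers))
```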
This paper describes a novel compact representation of local features called the tensor doublet. The representation generates a four dimensional feature vector which is significantly less complex than other approaches, such as Lowe's 128 dimensional feature vector. Despite its low dimensionality, we demonstrate here that the tensor doublet can be used for pose estimation, where the system is trained for an object and evaluated on images with cluttered background and occlusion.
@inproceedings{diva2:269094,
author = {Söderberg, Robert and Nordberg, Klas and Granlund, Gösta},
title = {{An Invariant and Compact Representation for Unrestricted Pose Estimation}},
booktitle = {Second Iberian Conference Pattern Recognition and Image Analysis (IbPRIA)},
year = {2005},
series = {LNCS},
volume = {3522},
publisher = {Springer},
address = {Berlin / Heidelberg},
}
In this paper, we combine the well-established technique of Wiener filtering with an efficient method for robust smoothing: channel smoothing. The main parameters to choose in channel smoothing are the number of channels and the averaging filter. Whereas the number of channels has a natural lower bound given by the noise level and should for the sake of speed be as small as possible, the averaging filter is a less obvious choice. Based on the linear behavior of channel smoothing for inlier noise, we derive a Wiener filter applicable for averaging the channels of an image. We show in some experiments that our method compares favorable with established methods.
@inproceedings{diva2:269038,
author = {Felsberg, Michael},
title = {{Wiener channel smoothing:
Robust Wiener filtering of images}},
booktitle = {Pattern Recognition},
year = {2005},
series = {Lecture Notes in Computer Science},
volume = {3663},
pages = {468--475},
}
In this paper we briefly review a not so well known quadratic, phase invariant image processing operator, the energy operator, and describe its tensor-valued generalization, the energy tensor. We present relations to the real-valued and the complex valued energy operators and discuss properties of the three operators. We then focus on the discrete implementation for estimating the tensor based on Teager’s algorithm and frame theory. The kernels of the real-valued and the tensor-valued operators are formally derived. In a simple experiment we compare the energy tensor to other operators for orientation estimation. The paper is concluded with a short outlook to future work.
@inproceedings{diva2:269039,
author = {Felsberg, Michael and Jonsson, Erik},
title = {{Energy Tensors:
Quadratic, Phase Invariant Image Operators}},
booktitle = {Pattern Recognition},
year = {2005},
series = {Lecture Notes in Computer Science},
volume = {3663},
pages = {493--500},
publisher = {Springer Berlin/Heidelberg},
}
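For reference, the energy operator reviewed above is easiest to see in its simplest discrete 1D (Teager) form, psi[x](n) = x(n)^2 - x(n-1)x(n+1): for a sinusoid the response is approximately the squared amplitude times the squared frequency, i.e. independent of the phase, which is the property the tensor-valued generalization carries over to images.

```python
import numpy as np

def teager_energy(x):
    """Discrete 1D Teager energy operator: x(n)^2 - x(n-1)*x(n+1)."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

n = np.arange(200)
signal = 3.0 * np.cos(0.2 * n + 1.0)      # amplitude 3, frequency 0.2 rad/sample
print(teager_energy(signal)[50:55])        # roughly constant, about 9 * sin(0.2)^2
```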
Contemporary reconstruction for helical cone-beam CT is mostly based on non-exact algorithms, which produce more or less unacceptable artifacts for cone angles above a certain limit. We report on attempts to extend the applicability of these algorithms to higher cone angles by suppressing artifacts by means of iterative post-processing. The iterative loop includes a ramp-filtering step before back-projection, which promotes fast convergence. The scheme has been applied to the original PI-method as well as to Siemens' AMPR and WFBP methods. Using ordered subsets in the iterative loop for WFBP, we achieved almost spotless images in one single iteration for cone angles of ±9 degrees.
@inproceedings{diva2:263520,
author = {Sunnegårdh, Johan and Danielsson, Per-Erik and Magnusson, Maria},
title = {{Iterative Improvement of Non-Exact Reconstruction in Cone-Beam CT}},
booktitle = {Fully 3D 2005, Eighth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine,2005},
year = {2005},
}
This largely tutorial treatise presents a Fourier based model for 2D-projection, the latter being a most important ingredient in any iterative reconstruction method. For sampled images the model requires an assumed basis function, which implicitly defines the necessary window and interpolation functions. We unravel the basis and window functions for some projection techniques described as procedures. Circular symmetric basis functions make it simple to find interpolation coefficients but require well tuned interpolation functions to avoid aliasing. We find it unnecessary to distinguish between voxel and ray driven projection. These two techniques concern only the innermost loop and both can be applied to any interpolation function, and to projection and back-projection alike.
@inproceedings{diva2:263519,
author = {Danielsson, Per-Erik and Magnusson, Maria and Sunnegårdh, Johan},
title = {{Basis and window functions in CT}},
booktitle = {Fully 3D 2005, Eighth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine,2005},
year = {2005},
}
Contemporary analytical reconstruction methods for helical cone-beam CT have to be designed to handle the Long Object Problem. Normally, a moderate amount of over-scanning is sufficient for reconstruction of a certain Region-of-interest (ROI). Unfortunately, for iterative methods, it seems that the useful ROI will diminish for every iteration step. The remedies proposed here are twofold. Firstly, we use careful extrapolation and masking of projection data. Secondly, we generate and utilize projection data from incompletely reconstructed volume parts, which is rather counter-intuitive and contradictory to our initial assumptions. The results seem very encouraging. Even voxels close to the boundary in the original ROI are as well enhanced by the iterative loop as the middle part.
@inproceedings{diva2:263518,
author = {Magnusson, Maria and Danielsson, Per-Erik and Sunnegårdh, Johan},
title = {{Handling of Long Objects in Iterative Reconstruction from Helical Cone-Beam Projections}},
booktitle = {Fully 3D 2005, Eighth International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine,2005},
year = {2005},
}
In this paper we discuss the benefits of writing code for a specific processor and exploiting all its capabilities. We show that in some situations it is possible to significantly reduce the time consumption by using SSE2, a Single Instruction Multiple Data (SIMD) extension available in new Pentium processors. The speed of the Harris operator is used for evaluation. All experiments are run on a Pentium 4 and the results are compared between ordinary C-code and code using SSE2. The purpose is not only to achieve a significant speed-up of the code, but also to benefit from SSE2 code with the least possible programming effort.
@inproceedings{diva2:258134,
author = {Skoglund, Johan and Felsberg, Michael},
title = {{Fast Image Processing Using SSE2}},
booktitle = {Fast Image Processing Using SSE2,2005},
year = {2005},
}
A robust mean value is often a good alternative to the standard mean value when dealing with data containing many outliers. An efficient method for samples of one-dimensional features and the truncated quadratic error norm is presented and compared to the method of channel averaging (soft histograms).
@inproceedings{diva2:258133,
author = {Jonsson, Erik and Felsberg, Michael},
title = {{Efficient Robust Mean Value Computation of 1D Features}},
booktitle = {Efficient Robust Mean Value Computation of 1D Features,2005},
year = {2005},
}
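The objective behind the robust mean above can be illustrated with a simple fixed-point iteration: under the truncated quadratic error norm, samples farther than a threshold from the current estimate get zero influence, and the estimate is recomputed from the remaining inliers. The paper presents a more efficient scheme for 1D features and compares it with channel averaging; this sketch only shows the objective being minimized.

```python
import numpy as np

def truncated_quadratic_mean(samples, threshold, iterations=20):
    """Fixed-point iteration for the mean under a truncated quadratic norm."""
    samples = np.asarray(samples, dtype=float)
    estimate = np.median(samples)                         # robust starting point
    for _ in range(iterations):
        inliers = samples[np.abs(samples - estimate) < threshold]
        if inliers.size == 0:
            break
        new_estimate = inliers.mean()
        if np.isclose(new_estimate, estimate):
            break
        estimate = new_estimate
    return estimate

data = np.concatenate([np.random.normal(5.0, 0.2, 200),   # inliers around 5.0
                       np.random.uniform(0.0, 50.0, 80)]) # many outliers
print(truncated_quadratic_mean(data, threshold=1.0))
```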
@inproceedings{diva2:258130,
author = {Moe, Anders},
title = {{Local Single-Patch Features for Pose Estimation Using the Log-Polar Transform}},
booktitle = {Local Single-Patch Features for Pose Estimation Using the Log-Polar Transform,2005},
year = {2005},
}
The channel representation allows the construction of soft histograms, where peaks can be detected with a much higher accuracy than in regular hard-binned histograms. This is critical in e.g. reducing the number of bins of generalized Hough transform methods. When applying the maximum entropy method to the channel representation, a minimum-information reconstruction of the underlying continuous probability distribution is obtained. The maximum entropy reconstruction is compared to simpler linear methods in some simulated situations. Experimental results show that mode estimation of the maximum entropy reconstruction outperforms the linear methods in terms of quantization error and discrimination threshold. Finding the maximum entropy reconstruction is however computationally more expensive.
@inproceedings{diva2:258129,
author = {Felsberg, Michael and Jonsson, Erik},
title = {{Reconstruction of Probability Density Functions from Channel Representations}},
booktitle = {Reconstruction of Probability Density Functions from Channel Representations,2005},
year = {2005},
}
To program a robot to solve a simple shape-sorter puzzle is trivial. To devise a Cognitive System Architecture, which allows the system to find out by itself how to go about a solution, is less than trivial. The development of such an architecture is one of the aims of the COSPAL project, leading to new techniques in vision based Artificial Cognitive Systems, which allow the development of robust systems for real dynamic environments. The systems developed under the project itself remain however in simplified scenarios, like the shape-sorter problem described in the present paper. The key property of the described system is its robustness. Since we apply association strategies of local features, the system behaves robustly under a wide range of distortions, such as occlusion, colour and intensity changes. The segmentation step which is applied in many systems known from the literature is replaced with local associations and view-based hypothesis validation. The hypotheses used in our system are based on the anticipated state of the visual percepts. This state replaces explicit modeling of shapes. The current state is chosen by a voting system and verified against the true visual percepts. The anticipated state is obtained from the association to the manipulator actions, where reinforcement learning replaces the explicit calculation of actions. These three differences to classical schemes allow the design of a much more generic and flexible system with a high level of robustness. On the technical side, the channel representation of information and associative learning in terms of the channel learning architecture are essential ingredients for the system. It is the properties of locality, smoothness, and non-negativity which make these techniques suitable for this kind of application. The paper gives brief descriptions of how different system parts have been implemented and shows some examples from our tests.
@inproceedings{diva2:258036,
author = {Felsberg, Michael and Forss\'{e}n, Per-Erik and Moe, Anders and Granlund, Gösta},
title = {{A COSPAL Subsystem:
Solving a Shape-Sorter Puzzle}},
booktitle = {AAAI Fall Symposium: From Reactive to Anticipatory Cognitive Embedded Systems, FS-05-05},
year = {2005},
pages = {65--69},
publisher = {AAAI Press},
}
@inproceedings{diva2:257167,
author = {Johansson, Björn and Moe, Anders},
title = {{Patch-Duplets for Object Recognition and Pose Estimation}},
booktitle = {2nd Canadian Conference on Computer and Robot Vision,2005},
year = {2005},
}
This paper describes a method for extracting point features from an image, corresponding to corners and crossings of lines. The method is based on a local estimation of a 6 x 6 tensor which describes the parameters of a pair of line segments. By considering the rank of the tensor, it is possible to find points of interest. These points can then be further analyzed to provide detailed information about the configuration of the segments. The proposed method is intended for features which can be used for estimation of position and pose of 3D objects, e.g., for the purpose of grasping.
@inproceedings{diva2:250334,
author = {Nordberg, Klas and Söderberg, Robert},
title = {{Detection and representation of complex local features}},
booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
year = {2005},
pages = {257--260},
}
In this paper we address one of the standard problems of image processing and computer vision: The detection of points of interest (POI). We propose two new approaches for improving the detection results. First, we define an energy tensor which can be considered as a phase invariant extension of the structure tensor. Second, we use the channel representation for robustly clustering the POI information from the first step resulting in sub-pixel accuracy for the localisation of POI. We compare our method to several related approaches on a theoretical level and show a brief experimental comparison to the Harris detector.
@inproceedings{diva2:269177,
author = {Felsberg, Michael and Granlund, Gösta},
title = {{POI detection using channel clustering and the 2D energy tensor}},
booktitle = {Proceedings of Pattern Recognition, 26th DAGM Symposium},
year = {2004},
series = {Lecture Notes in Computer Science},
volume = {3175},
pages = {103--110},
publisher = {SpringerLink},
}
@inproceedings{diva2:262435,
author = {Krüger, Norbert and Felsberg, Michael and Wörgötter, Florentin},
title = {{Processing Multi-modal Primitives from Image Sequences}},
booktitle = {EIS2004,2004},
year = {2004},
}
@inproceedings{diva2:262432,
author = {Kalkan, Sinan and Calow, D. and Felsberg, Michael and Wörgötter, Florentin and Lappe, M. and Krüger, Norbert},
title = {{Optic Flow Statistics and Intrinsic Dimensionality}},
booktitle = {BICS2004,2004},
year = {2004},
}
Epipolar geometry describes the geometric relationship between two cameras depicting the same scene. For un-calibrated cameras epipolar geometry is compactly described by the fundamental matrix. Estimation of the fundamental matrix is trivial if we have a set of corresponding points in the two images. Corresponding points are often found using e.g. the Harris interest point detector, but there are several advantages to using richer features instead. In this paper we will use blob features. Blobs are homogeneous regions which are compactly described by their colour, area, centroid and inertia matrix. Using blobs to establish correspondences is fast, and the extra information besides position allows us to reject false matches more accurately.
@inproceedings{diva2:258143,
author = {Forss\'{e}n, Per-Erik and Moe, Anders},
title = {{Blobs in Epipolar Geometry}},
booktitle = {Blobs in Epipolar Geometry,2004},
year = {2004},
pages = {82--85},
}
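Once blob correspondences of the kind described above are available, their centroids can be fed into a standard fundamental-matrix estimator. The sketch below uses OpenCV's RANSAC-based estimator on hypothetical centroid arrays; extracting and matching the blobs themselves (colour, area, centroid, inertia matrix) is the contribution of the paper and is not shown.

```python
import numpy as np
import cv2

# Hypothetical matched blob centroids in the two images (N x 2, in pixels).
rng = np.random.default_rng(0)
pts1 = (rng.random((30, 2)) * 640.0).astype(np.float32)
pts2 = pts1 + np.float32([5.0, 2.0]) + rng.normal(0, 0.5, (30, 2)).astype(np.float32)

# Positional arguments after the points: method, RANSAC reprojection
# threshold (pixels), confidence.
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
print("Fundamental matrix:\n", F)
print("Inliers:", int(inlier_mask.sum()), "of", len(pts1))
```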
The paper describes a method for extracting point features from an image, corresponding to corners and crossings of lines. The method is based on a fourth order tensor representation which can describe the parameters of a local pair of line segments. By considering the rank of the tensor, it is possible to find points which correspond to corners, crossings or junctions. These points can then be further analyzed to provide detailed information about the configuration of the segments. The proposed method is intended for features which can be used for estimation of position and pose of 3D objects, e.g., for the purpose of grasping.
@inproceedings{diva2:258144,
author = {Nordberg, Klas and Söderberg, Robert},
title = {{Detection and estimation of features for estimation of position}},
booktitle = {Swedish Symposium on Image Analysis (SSBA)},
year = {2004},
pages = {74--77},
}
A novel method for estimating a second order scene tensor is described and results using that method on a synthetic image sequence are shown. It is shown that the tensors can be used to represent basic geometrical entities. A short discussion on what work needs to be done to extend the tensorial description herein into a framework for pose estimation is found at the end of the report.
@inproceedings{diva2:258145,
author = {Nordberg, Klas and Viksten, Fredrik},
title = {{Estimation of a tensor based representation for geometrical 3D primitives based on motion stereo}},
booktitle = {Swedish Symposium on Image Analysis (SSBA)},
year = {2004},
pages = {13--16},
}
This paper presents a novel representation for 3D shapes in terms of planar surface patches and their boundaries. The representation is based on a tensor formalism similar to the usual orientation tensor but extends this concept by using projective spaces and a fourth order tensor, even though the practical computations can be made in normal matrix algebra. This paper also discusses the possibility of estimating the proposed representation from motion fields which are generated by a calibrated camera moving in the scene. One method based on 3D spatio-temporal orientation tensors is presented and results from this method are included.
@inproceedings{diva2:258137,
author = {Nordberg, Klas and Viksten, Fredrik},
title = {{Motion based estimation and representation of 3D surfaces and boundaries}},
booktitle = {International Workshop on Complex Motion (IWCM)},
year = {2004},
series = {LNCS},
volume = {3417},
publisher = {Springer},
address = {Berlin / Heidelberg},
}
@inproceedings{diva2:257168,
author = {Johansson, Björn and Moe, Anders},
title = {{Patch-Duplets for Object Recognition and Pose Estimation}},
booktitle = {Proceedings SSBA04 Symposium on Image Analysis,2004},
year = {2004},
pages = {78--81},
}
This paper presents the new project Efficient Convolution Operators for Image Processing of Volumes and Volume Sequences. The project is carried out in collaboration with Context Vision AB.
By using sequential filtering on 3D and 4D data, the number of nonzero filter coefficients for a desired filter set can be significantly reduced. A sequential convolution structure in combination with a convolver designed for sparse filters is a powerful tool for filtering of multidimensional signals.
The project mainly concerns the design of filter networks, that approximate a desired filter set, while keeping the computational load as low as possible. This is clearly an optimization problem, but it can be formulated in several different ways due to the complexity.
The project is in an initial state and the paper focuses on experiences from prior work and discusses possible approaches for future progress.
@inproceedings{diva2:242568,
author = {Svensson, Björn and Andersson, Mats and Wiklund, Johan and Knutsson, Hans},
title = {{Issues on filter networks for efficient convolution}},
booktitle = {Proceedings of the Swedish Symposium on Image Analysis (2004)},
year = {2004},
pages = {94--97},
address = {Uppsala},
}
This paper presents a proposal (not entirely new) for combining analytical and algebraic reconstruction techniques. Such a combination bears the promise of improving the image quality of fast but non-exact reconstruction of the filtered backprojection type. The difference between the present proposal and traditional ART is that we compute a full error image with FBP, applied to projection differences, to update the solution in each iteration step. The main road-block seems to be the same that has been an obstacle for many ART-algorithms in CT applications, namely that the forward projections are subjected to aliasing, which tends to override the intended benefits of the updating loop. We present an analysis of this problem and indicate some possible solutions.
@inproceedings{diva2:273856,
author = {Danielsson, Per-Erik and Magnusson Seger, Maria},
title = {{A Proposal for Combining FBP and ART in CT-reconstruction}},
booktitle = {Proceedings of the Seventh International Meeting on Fully Three-dimensional Image Reconstruction in Radiology and Nuclear Medicine, St Malo, France, June 30 - July 4, 2003},
year = {2003},
}
This paper presents a novel two-frame motion estimation algorithm. The first step is to approximate each neighborhood of both frames by quadratic polynomials, which can be done efficiently using the polynomial expansion transform. From observing how an exact polynomial transforms under translation a method to estimate displacement fields from the polynomial expansion coefficients is derived and after a series of refinements leads to a robust algorithm. Evaluation on the Yosemite sequence shows good results.
@inproceedings{diva2:273847,
author = {Farnebäck, Gunnar},
title = {{Two-Frame Motion Estimation Based on Polynomial Expansion}},
booktitle = {SCIA13},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2749},
pages = {363--370},
}
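The polynomial-expansion motion estimation described above is what OpenCV's calcOpticalFlowFarneback implements, so a dense two-frame flow field can be obtained directly; the frames below are synthetic placeholders with a known horizontal shift.

```python
import numpy as np
import cv2

prev_frame = (np.random.rand(240, 320) * 255).astype(np.uint8)
next_frame = np.roll(prev_frame, shift=3, axis=1)    # simulate a 3-pixel horizontal motion

# Positional arguments after the frames: flow, pyr_scale, levels, winsize,
# iterations, poly_n, poly_sigma, flags.
flow = cv2.calcOpticalFlowFarneback(prev_frame, next_frame, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
print("median horizontal displacement:", np.median(flow[..., 0]))
```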
This paper presents experiments on using integer arithmetic with the channel representation. Integer arithmetic reduces memory requirements and makes efficient implementations possible using machine-code vector instructions, integer-only CPUs, or dedicated programmable hardware such as FPGAs. We demonstrate the effects of discretisation on a non-iterative robust estimation technique called channel smoothing, but the results are also valid for other applications.
@inproceedings{diva2:273827,
author = {Forss\'{e}n, Per-Erik},
title = {{Channel Smoothing using Integer Arithmetic}},
booktitle = {Proceedings SSAB03 Symposium on Image Analysis},
year = {2003},
}
In this paper we present a method to implement the monogenic scale space on a bounded domain and show some applications. The monogenic scale space is a vector valued scale space based on the Poisson scale space, which establishes a sophisticated alternative to the Gaussian scale space. The features of the monogenic scale space, including local amplitude, local phase, local orientation, local frequency, and phase congruency, are much easier to interpret in terms of image features evolving through scale than in the Gaussian case. Furthermore, applying results from harmonic analysis, relations between the features are obtained which improve the understanding of image analysis. As applications, we present a very simple but still accurate approach to image reconstruction from local amplitude and local phase and a method for extracting the evolution of lines and edges through scale.
@inproceedings{diva2:269478,
author = {Felsberg, Michael and Duits, R and Florack, L},
title = {{The monogenic scale space on a bounded domain and its applications}},
booktitle = {Scale Space '03, eds. Griffin, L. D. and Lillholm, M.},
year = {2003},
}
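For context, the features listed above derive from the monogenic signal, whose two Riesz-transform components are conveniently computed in the Fourier domain. The sketch below uses a plain periodic FFT; the careful treatment on a bounded domain is exactly the contribution of the paper and is not reproduced here.

```python
import numpy as np

def monogenic_signal(f):
    """Return local amplitude, phase and orientation of an image f."""
    F = np.fft.fft2(f)
    u = np.fft.fftfreq(f.shape[0])[:, None]
    v = np.fft.fftfreq(f.shape[1])[None, :]
    radius = np.sqrt(u ** 2 + v ** 2)
    radius[0, 0] = 1.0                                   # avoid division by zero at DC
    r1 = np.real(np.fft.ifft2(-1j * u / radius * F))     # first Riesz component
    r2 = np.real(np.fft.ifft2(-1j * v / radius * F))     # second Riesz component
    amplitude = np.sqrt(f ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), f)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation

amp, phase, ori = monogenic_signal(np.random.rand(64, 64))
```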
We present a two-dimensional information representation, where small but overlapping Gaussian kernels are used to encode the data in a matrix. Apart from points we apply this to constraints that restrict the solution to a linear subspace. A localised decoding scheme accurately extracts multiple solutions together with an estimate of the covariances. We employ the method in optical flow computations to determine multiple velocities occurring at motion discontinuities.
@inproceedings{diva2:269470,
author = {Spies, Hagen and Forss\'{e}n, Per-Erik},
title = {{Two-dimensional channel representation for multiple velocities}},
booktitle = {Proceedings of the 13th Scandinavian Conference of Image Analysis, SCIA 2003},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2749},
pages = {356--362},
publisher = {SpringerLink},
address = {Berlin, Heidelberg},
}
This paper presents a method for detection of homogeneous regions in grey-scale images, representing them as blobs. In order to be fast, and not to favour one scale over others, the method uses a scale pyramid. In contrast to most multi-scale methods this one is non-linear, since it employs robust estimation rather than averaging to move through scale-space. This has the advantage that adjacent and partially overlapping clusters only affect each other's shape, not each other's values. It even allows blobs within blobs, to provide a pyramid blob structure of the image.
@inproceedings{diva2:269468,
author = {Forss\'{e}n, Per-Erik and Granlund, Gösta},
title = {{Robust multi-scale extraction of blob features}},
booktitle = {Proceedings or the 13th Scandinavian Conference, SCIA 2003},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2749/2003},
pages = {769--769},
publisher = {Springer Berlin/Heidelberg},
address = {Berlin, Heidelberg},
}
@inproceedings{diva2:265769,
author = {Scharr, Hanno and Felsberg, Michael and Forss\'{e}n, Per-Erik},
title = {{Noise Adaptive Channel Smoothing of Low-Dose Images}},
booktitle = {Computer Vision for the Nano-Scale Workshop accompanying CVPR 2003,2003},
year = {2003},
publisher = {IEEE Computer Society},
address = {Madison},
}
@inproceedings{diva2:257169,
author = {Spies, Hagen and Johansson, Björn},
title = {{Directional Channel Representation for Multiple Line-Endings and Intensity Levels}},
booktitle = {Proceedings of IEEE International Conference on Image Processing,2003},
year = {2003},
pages = {265--268},
}
Channel smoothing is an alternative to diffusion filtering for robust estimation of image features. Its main advantages are speed, stability with respect to parameter changes, and a simple implementation. However, channel smoothing becomes unstable in certain situations, typically for elongated, periodic patterns such as fingerprints. As for diffusion filtering, an anisotropic extension is required in these cases. In this paper we introduce a new method for anisotropic channel smoothing which is comparable to coherence enhancing diffusion, but faster and easier to implement. Anisotropic channel smoothing implements an orientation adaptive non-linear filtering scheme as a special case of adaptive channel filtering. The smoothing algorithm is applied to several fingerprint images and the results are compared to those of coherence enhancing diffusion.
@inproceedings{diva2:246056,
author = {Felsberg, Michael and Granlund, Gösta},
title = {{Anisotropic Channel Filtering}},
booktitle = {SCIA},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2749},
pages = {755--762},
}
In this paper we address the problem of appropriately representing the intrinsic dimensionality of image neighborhoods. This dimensionality describes the degrees of freedom of a local image patch and it gives rise to some of the most often applied corner and edge detectors. It is common to categorize the intrinsic dimensionality (iD) into three distinct cases: i0D, i1D, and i2D. Real images however contain combinations of all three dimensionalities, which have to be taken into account by a continuous representation. Based on considerations of the structure tensor, we derive a cone-shaped iD-space which leads to a probabilistic point of view to the estimation of intrinsic dimensionality.
@inproceedings{diva2:246047,
author = {Felsberg, Michael and Kruger, Norbert},
title = {{A Probabilistic Definition of Intrinsic Dimensionality for Images}},
booktitle = {25. DAGM Symposium Mustererkennung, Magdeburg eds Michaelis, B. and Krell, G.},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2781},
pages = {140--147},
}
@inproceedings{diva2:241579,
author = {Kruger, Norbert and Felsberg, Michael},
title = {{A continuous Formulation of intrinsic Dimension}},
booktitle = {British Machine Vision Conference},
year = {2003},
}
The paper makes a short presentation of three existing methods for estimation of orientation tensors: the so-called structure tensor, quadrature filter based techniques, and techniques based on approximating a local polynomial model. All three methods can be used for estimating an orientation tensor which in the 3D case can be used for motion estimation. The methods are based on rather different approaches in terms of the underlying signal models. However, they produce more or less similar results which indicates that there should be a common framework for estimation of the tensors. Such a framework is proposed, in terms of a second order mapping from signal to tensor with additional conditions on the mapping. It is also shown that the three methods in principle fall into this framework.
@inproceedings{diva2:241568,
author = {Nordberg, Klas and Farnebäck, Gunnar},
title = {{A Framework for Estimation of Orientation and Velocity}},
booktitle = {International Conference on Image Processing (ICIP)},
year = {2003},
}
We consider alpha scale spaces, a parameterized class (alpha is an element of (0, 1]) of scale space representations beyond the well-established Gaussian scale space, which are generated by the alpha-th power of the minus Laplace operator on a bounded domain using the Neumann boundary condition. The Neumann boundary condition ensures that there is no grey-value flux through the boundary. Thereby no artificial grey-values from outside the image affect the evolution process, which is the case for the alpha scale spaces on an unbounded domain. Moreover, the connection between the alpha scale spaces, which is not trivial in the unbounded domain case, becomes straightforward: the generator of the Gaussian semigroup extends to a compact, self-adjoint operator on the Hilbert space L-2(Omega) and therefore it has a complete countable set of eigenfunctions. Taking the alpha-th power of the Gaussian generator simply boils down to taking the alpha-th power of the corresponding eigenvalues. Consequently, all alpha scale spaces have exactly the same eigenmodes and can be implemented simultaneously as scale dependent Fourier series. The only difference between them is the (relative) contribution of each eigenmode to the evolution process. By introducing the notion of (non-dimensional) relative scale in each alpha scale space, we are able to compare the various alpha scale spaces. The case alpha = 0.5, where the generator equals the square root of the minus Laplace operator, leads to Poisson scale space, which is at least as interesting as Gaussian scale space and can be extended to a (Clifford) analytic scale space.
@inproceedings{diva2:241564,
author = {Duits, Remco and Felsberg, Michael and Florack, Luc and Platel, Bram},
title = {{$\alpha$ Scale Spaces on a Bounded Domain}},
booktitle = {Scale Space Methods in Computer Vision},
year = {2003},
series = {Lecture Notes in Computer Science},
volume = {2695},
pages = {502--518},
}
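The bounded-domain construction described above lends itself directly to an implementation as a scale-dependent cosine series: with Neumann boundary conditions the eigenmodes are cosines, so each DCT coefficient is damped by exp(-t * lambda^alpha). A minimal sketch under those assumptions, using the discrete Neumann Laplacian eigenvalues (alpha = 1 gives Gaussian, alpha = 0.5 Poisson scale space):

```python
import numpy as np
from scipy.fft import dctn, idctn

def alpha_scale_space(img, t, alpha=0.5):
    """Evolve an image in the alpha scale space on a bounded domain with
    Neumann boundary conditions: damp each cosine eigenmode of the
    (discrete) Laplacian by exp(-t * lambda**alpha)."""
    f = np.asarray(img, dtype=float)
    M, N = f.shape
    k = np.arange(M)[:, None]
    l = np.arange(N)[None, :]
    # eigenvalues of the discrete Neumann Laplacian under the DCT-II
    lam = 4 * np.sin(np.pi * k / (2 * M)) ** 2 + 4 * np.sin(np.pi * l / (2 * N)) ** 2
    coeffs = dctn(f, type=2, norm='ortho')
    coeffs *= np.exp(-t * lam ** alpha)      # alpha-th power of the generator
    return idctn(coeffs, type=2, norm='ortho')
```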
We present a computational framework that extends classical image velocity estimation to include more general parameters of dynamic brightness changes. The introduced method allows for an extraction of these parameters, ranging from models of linear illumination changes over diffusion and decay constants to expansion rates. We illustrate the benefit of such an extension on a real image sequence with illumination changes. We also introduce a new depth estimation technique termed depth from diffusion and apply it to some real examples.
@inproceedings{diva2:302471,
author = {Spies, Hagen and Dierig, Tobias and Garbe, Christoph S.},
title = {{Local Models for Dynamic Processes in Image Sequences}},
booktitle = {Workshop Dynamic Perception},
year = {2002},
pages = {59--64},
address = {Bochum},
}
Disparity estimation is a fundamental problem of computer vision. Besides other approaches, disparity estimation from phase information is a quite widespread technique. In the present paper, we have considered the influence of the involved quadrature filters and we have replaced them with filters based on the monogenic signal. The implemented algorithm makes use of a scale-pyramid and applies channel encoding for the representation and fusion of the estimated data. The performed experiments show a significant improvement of the results.
@inproceedings{diva2:273836,
author = {Felsberg, Michael},
title = {{Disparity from monogenic phase}},
booktitle = {DAGM Symposium Mustererkennung, Zurich},
year = {2002},
series = {Lecture Notes in Computer Science},
volume = {2449},
pages = {248--256},
publisher = {Springer, Heidelberg},
}
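A minimal sketch of computing the monogenic signal and its phase via the Riesz transform in the Fourier domain. It assumes the input has already been bandpass filtered, and it omits the scale pyramid and the channel-based fusion used in the paper; disparity would then be derived from phase differences between the two views, which is not shown.

```python
import numpy as np

def monogenic_phase(img):
    """Monogenic signal of a bandpass-filtered image via the Riesz transform
    in the Fourier domain; returns local amplitude, phase and orientation."""
    f = np.asarray(img, dtype=float)
    F = np.fft.fft2(f)
    u = np.fft.fftfreq(f.shape[0])[:, None]
    v = np.fft.fftfreq(f.shape[1])[None, :]
    rad = np.hypot(u, v)
    rad[0, 0] = 1.0                          # avoid division by zero at DC
    r1 = np.real(np.fft.ifft2(F * (-1j * u / rad)))   # first Riesz component
    r2 = np.real(np.fft.ifft2(F * (-1j * v / rad)))   # second Riesz component
    amp = np.sqrt(f ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.hypot(r1, r2), f)           # monogenic (local) phase
    orient = np.arctan2(r2, r1)                        # local orientation
    return amp, phase, orient
```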
This paper presents an overview of the basic and applied research carried out by the Computer Vision Laboratory, Linköping University, in the WITAS UAV Project. This work includes customizing and redesigning vision methods to fit the particular needs and restrictions imposed by the UAV platform, e.g., for low-level vision, motion estimation, navigation, and tracking. It also includes a new learning structure for association of perception-action activations, and a runtime system for implementation and execution of vision algorithms. The paper also contains a brief introduction to the WITAS UAV Project.
@inproceedings{diva2:262461,
author = {Nordberg, Klas and Doherty, Patrick and Farnebäck, Gunnar and Forss\'{e}n, Per-Erik and Granlund, Gösta and Moe, Anders and Wiklund, Johan},
title = {{Vision for a UAV helicopter}},
booktitle = {International Conference on Intelligent Robots and Systems (IROS), Workshop on Aerial Robotics},
year = {2002},
}
@inproceedings{diva2:257170,
author = {Johansson, Björn and Farnebäck, Gunnar},
title = {{A Theoretical Comparison of Different Orientation Tensors}},
booktitle = {Proceedings SSAB02 Symposium on Image Analysis, 2002},
year = {2002},
pages = {69--73},
}
Recently, substantial research has been devoted to Unmanned Aerial Vehicles (UAVs). One of a UAV's most demanding subsystems is vision. The vision subsystem must dynamically combine different algorithms as the UAV's goals and surroundings change. To fully utilize the available hardware, a run-time system must be able to vary the quality and the size of the regions the algorithms are applied to, as the number of image processing tasks changes. To allow this, the run-time system and the underlying computational model must be integrated. In this paper we present a computational model suitable for integration with a run-time system. The computational model is called Image Processing Data Flow Graph (IP-DFG). IP-DFG has been developed for modeling of complex image processing algorithms. IP-DFG is based on data flow graphs, but has been extended with hierarchy and new rules for token consumption, which make the computational model more flexible and more suitable for human interaction. In this paper we also show that IP-DFGs are suitable for modelling expressions, including data dependent decisions and iterations, which are common in complex image processing algorithms.
@inproceedings{diva2:246033,
author = {Andersson, Per and Kuchcinski, Krzysztof and Nordberg, Klas and Doherty, Patrick},
title = {{Integrating a computational model and a run time system for image processing on a UAV}},
booktitle = {Euromicro Symposium on Digital System Design (DSD)},
year = {2002},
pages = {102--109},
}
Quadrature filters are a well known method in low-level computer vision for estimating certain properties of the signal, such as local amplitude and local phase. However, 2D quadrature filters suffer from not being rotation invariant. Furthermore, they do not allow the detection of truly 2D features such as corners and junctions unless they are combined to form the structure tensor. The present paper deals with a new 2D generalization of quadrature filters which is rotation invariant and allows the analysis of intrinsically 2D signals. Hence, the new approach can be considered as the union of properties of quadrature filters and of the structure tensor. The proposed method first estimates the local orientation of the signal, which is then used for steering some basis filter responses. Certain linear combinations of these filter responses are derived which allow the estimation of the local isotropy and two perpendicular phases of the signal. The phase model is based on the assumption of an angular band-limitation in the signal. As an application, a simple and efficient point-of-interest operator is presented and compared to the Plessey detector.
@inproceedings{diva2:246027,
author = {Felsberg, Michael and Sommer, Gerald},
title = {{Image Features Based on a New Approach to 2D Rotation Invariant Quadrature Filters}},
booktitle = {Computer Vision - ECCV 2002 eds A. Heyden and G. Sparr and M. Nielsen and P. Johansen},
year = {2002},
series = {Lecture Notes in Computer Science},
volume = {2350},
pages = {369--383},
}
This paper describes how a world model for successive recognition can be learned using associative learning. The learned world model consists of a linear mapping that successively updates a high-dimensional system state using performed actions and observed percepts. The actions of the system are learned by rewarding actions that are good at resolving state ambiguities. As a demonstration, the system is used to resolve the localisation problem in a labyrinth.
@inproceedings{diva2:245959,
author = {Forssen, Per-Erik},
title = {{Successive Recognition using Local State Models}},
booktitle = {Proceedings SSAB02 Symposium on Image Analysis},
year = {2002},
pages = {9--12},
}
We introduce a compact coding of image information which explicitly separates geometric information (orientation) and structural information (phase and color). We investigate the importance of these factors for stereo matching on a large data set. From these investigations we can conclude that it is their combination that gives the best results. Concrete weights for their relative importance are measured.
@inproceedings{diva2:242254,
author = {Kruger, Norbert and Felsberg, Michael and Gebken, Christian and Pörksen, Martin},
title = {{An Explicit and Compact Coding of Geometric and Structural Information Applied to Stereo Processing}},
booktitle = {Vision, Modeling, and Visualization},
year = {2002},
}
A common wish in non-destructive testing is to investigate a large object with a small interesting detail inside. Due to practical circumstances, the projections may sometimes be truncated. According to the theory of tomography, it is then impossible to reconstruct the object. However, sometimes it is possible to obtain an approximate result. It turns out that the key point is how to implement the ramp filter. The quality of the result depends on the object itself. We show one good experiment on real data, linear cone-beam tomography for logs. We also show experiments on the Shepp-Logan phantom, well-known from medical CT, and discuss the varying reconstruction quality.
@inproceedings{diva2:241588,
author = {Magnusson Seger, Maria},
title = {{Rampfilter implementation on truncated projection data. Application to 3D linear tomography for logs.}},
booktitle = {Proceedings SSAB02 Symposium on Image Analysis},
year = {2002},
pages = {33--36},
}
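The key step discussed above, the ramp filter, is easily written down in its textbook frequency-domain form; the paper's point is precisely that the choice of implementation matters for truncated projections, which the naive version below does not address.

```python
import numpy as np

def ramp_filter(projections):
    """Apply the ramp filter |omega| to each projection row in the Fourier
    domain (the standard filtered-backprojection filtering step)."""
    p = np.asarray(projections, dtype=float)
    ramp = np.abs(np.fft.fftfreq(p.shape[-1]))         # |omega|
    return np.real(np.fft.ifft(np.fft.fft(p, axis=-1) * ramp, axis=-1))
```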
One important problem within the WITAS project is detection of moving objects in aerial images. This paper presents an original method to estimate the displacement between two frames, based on multiscale local polynomial expansions of the images. When the displacement field has been computed, a plane + parallax approach is used to separate moving objects from the camera egomotion.
@inproceedings{diva2:241572,
author = {Farnebäck, Gunnar and Nordberg, Klas},
title = {{Motion Detection in the WITAS Project}},
booktitle = {Swedish Symposium on Image Analysis (SSBA)},
year = {2002},
pages = {99--102},
}
This paper advocates the use of overlapping bins in histogram creation. It is shown how conventional histogram creation has an inherent quantisation that causes errors much like those in sampling with insufficient band limitation. The use of overlapping bins is shown to be the deterministic equivalent to dithering. Two applications of soft histograms are shown: improved peak localisation in an estimated probability density function (PDF) without requiring more samples, and accurate estimation of image rotation.
@inproceedings{diva2:273868,
author = {Forssen, Per-Erik},
title = {{Image Analysis using Soft Histograms}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {2001},
pages = {109--112},
}
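A minimal sketch of a soft histogram with overlapping bins, here using linear (triangular) bin profiles so that each sample is shared between its two nearest bin centres; the bin shape and range handling are assumptions of the sketch.

```python
import numpy as np

def soft_histogram(samples, n_bins, lo, hi):
    """Histogram with overlapping (linearly interpolating) bins: each sample
    is shared between its two nearest bin centres with weights summing to one."""
    x = np.asarray(samples, dtype=float)
    centres = np.linspace(lo, hi, n_bins)
    width = centres[1] - centres[0]
    pos = np.clip((x - lo) / width, 0, n_bins - 1)
    left = np.floor(pos).astype(int)
    right = np.minimum(left + 1, n_bins - 1)
    w = pos - left                                    # weight of the right bin
    hist = np.zeros(n_bins)
    np.add.at(hist, left, 1.0 - w)
    np.add.at(hist, right, w)
    return centres, hist
```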
This paper presents a novel disparity estimation algorithm based on local polynomial expansion of the images in a stereo pair. Being a spin-off from work on two-frame motion estimation, it is primarily intended as a proof of concept for some of the underlying ideas. It may, however, be useful on its own as well, since it is very simple and fast. The accuracy still remains to be determined.
@inproceedings{diva2:246104,
author = {Farnebäck, Gunnar},
title = {{Disparity Estimation from Local Polynomial Expansion}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {2001},
pages = {77--80},
}
This paper shows how canonical correlation can be used to learn a detector for corner orientation invariant to corner angle and intensity. Pairs of images with the same corner orientation but different angle and intensity are used as training samples. Three different image representations are examined: intensity values, products between intensity values, and local orientation. The last representation gives a well behaved result that is easy to decode into the corner orientation. To reduce dimensionality, parameters from a polynomial model fitted to the different representations are also considered. This reduction did not affect the performance of the system.
@inproceedings{diva2:246077,
author = {Johansson, Björn and Borga, Magnus and Knutsson, Hans},
title = {{Learning Corner Orientation Using Canonical Correlation}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {2001},
pages = {89--92},
}
In [Farnebäck00] we presented a new velocity estimation algorithm, using orientation tensors and parametric motion models to provide both fast and accurate results. One of the tradeoffs between accuracy and speed was that no attempt was made to obtain regions of coherent motion when estimating the parametric models. In this paper we show how this can be improved by performing a simultaneous segmentation of the motion field. The resulting algorithm is slower than the previous one, but more accurate. This is shown by evaluation on the well-known Yosemite sequence, where the previous algorithm already showed an accuracy substantially better than that of earlier published methods. This result has now been improved further.
@inproceedings{diva2:246017,
author = {Farnebäck, Gunnar},
title = {{Very High Accuracy Velocity Estimation using Orientation Tensors, Parametric Motion, and Simultaneous Segmentation of the Motion Field}},
booktitle = {Proceedings of the Eighth IEEE International Conference on Computer Vision},
year = {2001},
pages = {171--177},
}
Essentially all Computer Vision strategies require initial computation of orientation structure or motion estimation. Although much work has been invested in this subfield, methods have so far been very computationally demanding and/or not very robust. In this paper we present a novel method for computation of orientation tensors for signals of any dimensionality. The method is based on local weighted least squares approximations of the signal by second degree polynomials. It is shown how this can be implemented very efficiently by means of separable convolutions and that the method gives very accurate orientation estimates. We also introduce the new concept of orientation functionals, of which orientation tensors form a subclass. Finally we demonstrate the critical importance of using a proper weighting function in the local projection of the signal onto polynomials.
@inproceedings{diva2:273875,
author = {Farnebäck, Gunnar},
title = {{Orientation Estimation Based on Weighted Projection onto Quadratic Polynomials}},
booktitle = {Vision, Modeling, and Visualization 2000: proceedings},
year = {2000},
pages = {89--96},
}
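The core idea, a weighted least-squares projection of each neighbourhood onto the basis {1, x, y, x², y², xy} followed by forming a tensor from the quadratic and linear coefficients, can be illustrated with a naive per-pixel loop. The separable-convolution implementation that makes the method efficient is not reproduced here, and the window size, Gaussian applicability, and the gamma weighting of the linear part are assumed values for the sketch.

```python
import numpy as np

def orientation_tensor_polyexp(img, window=9, sigma=2.0, gamma=0.1):
    """Orientation tensor from a local weighted fit f(x) ~ x^T A x + b^T x + c,
    with T = A A^T + gamma * b b^T (naive per-pixel loop)."""
    r = window // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    w = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2)).ravel()    # applicability
    B = np.stack([np.ones(window * window), xs.ravel(), ys.ravel(),
                  xs.ravel() ** 2, ys.ravel() ** 2, (xs * ys).ravel()], axis=1)
    G = np.linalg.solve(B.T @ (B * w[:, None]), (B * w[:, None]).T)  # samples -> coeffs
    f = np.pad(np.asarray(img, dtype=float), r, mode='reflect')
    H, W = img.shape
    T = np.zeros((H, W, 2, 2))
    for i in range(H):
        for j in range(W):
            c0, bx, by, axx, ayy, axy = G @ f[i:i + window, j:j + window].ravel()
            A = np.array([[axx, axy / 2], [axy / 2, ayy]])
            b = np.array([bx, by])
            T[i, j] = A @ A.T + gamma * np.outer(b, b)
    return T
```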
We present a novel method that finds edges between certain image features, e.g. gray-levels, and disregards edges between other features. The method uses a channel representation of the features and performs normalized convolution using the channel values as certainties. This means that areas with certain features can be disregarded by the edge filter. The method provides an important new tool for finding tissue specific edges in medical images, as demonstrated by an MR-image example.
@inproceedings{diva2:273861,
author = {Borga, Magnus and Malmgren, Helge and Knutsson, Hans},
title = {{FSED - Feature Selective Edge Detection}},
booktitle = {ICPR15},
year = {2000},
pages = {229--232 vol.1},
publisher = {IEEE},
}
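A simplified sketch of the mechanism: normalized averaging (normalized convolution with a constant basis) driven by a per-pixel certainty map, followed by a gradient. The paper uses channel values as certainties; a single certainty map is assumed here for brevity.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def feature_selective_edges(img, certainty, sigma=1.5, eps=1e-9):
    """Edges computed only from pixels carrying the selected feature:
    normalized averaging with per-pixel certainties, then a gradient."""
    f = np.asarray(img, dtype=float)
    c = np.asarray(certainty, dtype=float)         # 1 = keep, 0 = disregard
    smoothed = gaussian_filter(f * c, sigma) / (gaussian_filter(c, sigma) + eps)
    gx, gy = sobel(smoothed, axis=1), sobel(smoothed, axis=0)
    return np.hypot(gx, gy) * c                    # suppress edges outside the feature
```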
This article describes an essential step towards what is called a view centered representation of the low-level structure in an image. Instead of representing low-level structure (lines and edges) in one compact feature map, we will separate structural information into several feature maps, each signifying features at a characteristic phase, in a specific scale. By characteristic phase we mean the phases 0, pi, and +/-pi/2, corresponding to bright and dark lines, and edges between different intensity levels, or colours. A lateral inhibition mechanism selects the strongest feature within each local region of each scale represented. The scale representation is limited to maps one octave apart, but can be interpolated to provide a continuous representation. The resultant image representation is sparse, and thus well suited for further processing, such as pattern detection.
@inproceedings{diva2:269833,
author = {Forss\'{e}n, Per-Erik and Granlund, Gösta},
title = {{Sparse feature maps in a scale hierarchy}},
booktitle = {Algebraic Frames for the Perception-Action Cycle, Proceedings Second International Workshop, AFPAC 2000},
year = {2000},
series = {Lecture Notes in Computer Science},
volume = {1888},
pages = {186--196},
publisher = {Springer Berlin/Heidelberg},
address = {Berlin, Heidelberg},
}
The purpose of this paper is to provide a broad overview of the WITAS Unmanned Aerial Vehicle Project. The WITAS UAV project is an ambitious, long-term basic research project with the goal of developing technologies and functionalities necessary for the successful deployment of a fully autonomous UAV operating over diverse geographical terrain containing road and traffic networks. The project is multi-disciplinary in nature, requiring many different research competences, and covering a broad spectrum of basic research issues, many of which relate to current topics in artificial intelligence. A number of topics considered are knowledge representation issues, active vision systems and their integration with deliberative/reactive architectures, helicopter modeling and control, ground operator dialogue systems, actual physical platforms, and a number of simulation techniques.
@inproceedings{diva2:262443,
author = {Doherty, Patrick and Granlund, Gösta and Kuchcinski, Krzysztof and Sandewall, Erik Johan and Nordberg, Klas and Skarman, Erik and Wiklund, Johan},
title = {{The WITAS unmanned aerial vehicle project}},
booktitle = {Proceedings of the 14th European Conference on Artificial Intelligence (ECAI)},
year = {2000},
pages = {747--755},
publisher = {IOS Press},
address = {Amsterdam},
}
Perceptual experiments indicate that corners and curvature are very important features in the process of recognition. This paper presents a new method to detect rotational symmetries, which describe complex curvature such as corner, circle, star, and spiral patterns. It works in two steps: 1) it extracts local orientation from a gray-scale or color image; and 2) it applies normalized convolution on the orientation image with rotational symmetry filters as basis functions. These symmetries can serve as feature points at a high abstraction level for use in hierarchical matching structures for 3D estimation, object recognition, image database retrieval, etc.
@inproceedings{diva2:257171,
author = {Johansson, Björn and Knutsson, Hans and Granlund, Gösta},
title = {{Detecting Rotational Symmetries using Normalized Convolution}},
booktitle = {Proceedings of the 15th International Conference on Pattern Recognition, 2000},
year = {2000},
pages = {496--500 vol.3},
publisher = {IEEE},
}
This paper presents a general strategy for automated generation of efficient representations in vision. The approach is highly task oriented and what constitutes the relevant information is defined by a set of examples. The examples are pairs of situations that are dependent through the chosen feature but are otherwise independent. Particularly important concepts in the work are mutual information and canonical correlation. It is shown how visual operators and representations can be generated from examples for a number of features, e.g. local orientation, disparity and motion. Interesting similarities to biological vision functions are observed. The results clearly demonstrate the potential of combining advanced filtering techniques and learning strategies based on canonical correlation analysis (CCA).
@inproceedings{diva2:250400,
author = {Knutsson, Hans and Andersson, Mats and Borga, Magnus and Wiklund, Johan},
title = {{Automated generation of representations in vision}},
booktitle = {International Conference on Pattern Recognition ICPR, 2000},
year = {2000},
pages = {59--66 vol.3},
publisher = {IEEE},
address = {Barcelona, Spain},
}
The WITAS Unmanned Aerial Vehicle Project is a long term basic research project located at Linköping University (LIU), Sweden. The project is multi-disciplinary in nature and involves cooperation with different departments at LIU, and a number of other universities in Europe, the USA, and South America. In addition to academic cooperation, the project involves collaboration with a number of private companies supplying products and expertise related to simulation tools and models, and the hardware and sensory platforms used for actual flight experimentation with the UAV. Currently, the project is in its second phase with an intended duration from 2000-2003.
This paper will begin with a brief overview of the project, but will focus primarily on the computer vision related issues associated with interpreting the operational environment which consists of traffic and road networks and vehicular patterns associated with these networks.
@inproceedings{diva2:246122,
author = {Granlund, Gösta and Nordberg, Klas and Wiklund, Johan and Doherty, Patrick and Skarman, Erik and Sandewall, Erik},
title = {{WITAS: An Intelligent Autonomous Aircraft Using Active Vision}},
booktitle = {Proceedings of the UAV 2000 International Technical Conference and Exhibition (UAV)},
year = {2000},
publisher = {Euro UVS},
address = {Paris, France},
}
This contest involved the running and evaluation of computer vision and pattern recognition techniques on different data sets with known ground truth. The contest included three areas: binary shape recognition, symbol recognition, and image flow estimation. A package was made available for each area. Each package contained either real images with manual ground truth or programs to generate data sets of ideal as well as noisy images with known ground truth. They also contained programs to evaluate the results of an algorithm according to the given ground truth. These evaluation criteria included the generation of confusion matrices, computation of the misdetection and false alarm rates, and other performance measures suitable for the problems. The paper summarizes the data generation for each area and experimental results for a total of six participating algorithms.
@inproceedings{diva2:246037,
author = {Aksoy, Selim and Ming, Ye and Schauf, Michael L. and Song, Mingzhou and Wang, Yalin and Haralick, Robert M. and Parker, Jim R. and Pivovarov, Juraj and Royko, Dominik and Sun, Changming and Farnebäck, Gunnar},
title = {{Algorithm Performance Contest}},
booktitle = {Proceedings. 15th International Conference on Pattern Recognition, 2000},
year = {2000},
series = {Pattern Recognition},
volume = {4},
pages = {870--876},
publisher = {IEEE},
}
Motion estimation in image sequences is an important step in many computer vision and image processing applications. Several methods for solving this problem have been proposed, but very few manage to achieve a high level of accuracy without sacrificing processing speed. This paper presents a novel motion estimation algorithm, which gives excellent results on both counts. The algorithm starts by computing 3D orientation tensors from the image sequence. These are combined under the constraints of a parametric motion model to produce velocity estimates. Evaluated on the well-known Yosemite sequence, the algorithm shows an accuracy which is substantially better than for previously published methods. Computationally the algorithm is simple and can be implemented by means of separable convolutions, which also makes it fast.
@inproceedings{diva2:241589,
author = {Farnebäck, Gunnar},
title = {{Fast and Accurate Motion Estimation using Orientation Tensors and Parametric Motion Models}},
booktitle = {ICPR15},
year = {2000},
pages = {135--139 vol.1},
publisher = {IEEE},
}
Frame representations (e.g. wavelets) and subspace projections are important tools in many image processing applications. A unified framework for frames and subspace bases, as well as bases and subspace frames, is developed for finite dimensional vector spaces. Dual (subspace) bases and frames are constructed and the theory is generalized to weighted norms and seminorms. It is demonstrated how the framework applies to the cubic facet model, to normalized convolution, and to projection onto second degree polynomials.
@inproceedings{diva2:273877,
author = {Farnebäck, Gunnar},
title = {{A Unified Framework for Bases, Frames, Subspace Bases, and Subspace Frames}},
booktitle = {Proceedings of the 11th Scandinavian Conference on Image Analysis},
year = {1999},
pages = {341--349},
}
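For the finite-dimensional case, the frame/dual-frame construction underlying this framework can be verified in a few lines; the example below uses an unweighted norm and a redundant frame of R^2, whereas the paper also covers subspace frames and weighted norms and seminorms, not shown here.

```python
import numpy as np

# A redundant frame of R^2: three frame vectors as columns.
F = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

S = F @ F.T                        # frame operator
F_dual = np.linalg.solve(S, F)     # dual frame vectors as columns

x = np.array([3.0, -2.0])
coeffs = F.T @ x                   # analysis with the frame
x_rec = F_dual @ coeffs            # synthesis with the dual frame
assert np.allclose(x, x_rec)       # perfect reconstruction
```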
Digital subtraction angiography, whether based on traditional X-ray or MR, suffers from patient motion artifacts. Until now, the usual remedy has been to pixel shift by hand, or in some cases to perform a global pixel shift semi-automatically. This is time consuming, and cannot handle rotations or locally varying deformations over the image. We have developed a fully automatic algorithm that provides for motion compensation in the presence of large local deformations. Our motion compensation is very accurate for ordinary motions, including large rotations and deformations. It does not matter if the motions are irregular over time. For most images, it takes about a second per image to get adequate accuracy. The method is based on using the phase from filter banks of quadrature filters tuned in different directions and frequencies. Unlike traditional methods for optical flow and correlation, our method is more accurate and less susceptible to disturbing changes in the image, e.g. a moving contrast bolus. The implications for common practice are that radiologists' time can be significantly reduced in ordinary peripheral angiographies and that the number of retakes due to large or local motion artifacts will be much reduced.
@inproceedings{diva2:273850,
author = {Hemmendorff, Magnus and Knutsson, Hans and Andersson, Mats T. and Kronander, Torbjörn},
title = {{Motion compensated digital subtraction angiography}},
booktitle = {Proceedings of SPIE's International Symposium on Medical Imaging, vol 3661, 1999},
year = {1999},
}
This paper presents a general strategy for designing efficient visual operators. The approach is highly task oriented and what constitutes the relevant information is defined by a set of examples. The examples are pairs of images displaying a strong dependence in the chosen feature but are otherwise independent. Particularly important concepts in the work are mutual information and canonical correlation. Visual operators learned from examples are presented, e.g. local shift invariant orientation operators and image content invariant disparity operators. Interesting similarities to biological vision functions are observed.
@inproceedings{diva2:246039,
author = {Knutsson, Hans and Borga, Magnus},
title = {{Learning Visual Operators from Examples: A New Paradigm in Image Processing}},
booktitle = {Proceedings of the 10th International Conference on Image Analysis and Processing (ICIAP'99)},
year = {1999},
}
This paper presents a new and efficient approach for optimization and implementation of filter banks, e.g. velocity channels, orientation channels and scale spaces. The multi-layered structure of a filter network enables a powerful decomposition of complex filters into simple filter components, and the intermediary results may contribute to several output nodes. Compared to a direct implementation, a filter network uses only a fraction of the coefficients to provide the same result. The optimization procedure is recursive and all filters on each level are optimized simultaneously. The individual filters of the network, in general, contain very few non-zero coefficients, but there are no restrictions on the spatial position of the coefficients; they may e.g. be concentrated on a line or be sparsely scattered. An efficient implementation of a quadrature filter hierarchy for generic purposes using sparse filter components is presented.
@inproceedings{diva2:242258,
author = {Andersson, Mats and Wiklund, Johan and Knutsson, Hans},
title = {{Filter Networks}},
booktitle = {Proceedings of Signal and Image Processing (SIP'99)},
year = {1999},
pages = {213--217},
publisher = {IASTED},
address = {Nassau, Bahamas},
}
This paper presents a general approach for obtaining optimal filters as well as filter sequences. A filter is termed optimal when it minimizes a chosen distance measure with respect to an ideal filter. The method allows specification of the metric via simultaneous weighting functions in multiple domains, e.g. the spatio-temporal space and the Fourier space. It is shown how convolution kernels for efficient spatio-temporal filtering can be implemented in practical situations. The method is based on applying a set of jointly optimized filter kernels in sequence. The optimization of sequential filters is performed using a novel recursive optimization technique. A number of optimization examples are given that demonstrate the role of key parameters such as: number of kernel coefficients, number of filters in sequence, spatio-temporal and Fourier space metrics. In multidimensional filtering applications the method potentially outperforms both standard convolution and FFT based approaches by two-digit factors.
@inproceedings{diva2:242180,
author = {Knutsson, Hans and Andersson, Mats and Wiklund, Johan},
title = {{Multiple Space Filter Design}},
booktitle = {Proceedings of the SSAB symposium on image analysis},
year = {1999},
}
In this paper we present a system which integrates computer vision and decision-making in an autonomous airborne vehicle that performs traffic surveillance tasks. The main factors that make the integration of vision and decision-making a challenging problem are: the qualitatively different kinds of information at the decision-making and vision levels, the need for integration of dynamically acquired information with a priori knowledge, e.g. GIS information, and the need for close feedback and guidance of the vision module by the decision-making module. Given the complex interaction between the vision module and the decision-making module, we propose the adoption of an intermediate structure, called the Scene Information Manager, and describe its structure and functionalities.
@inproceedings{diva2:241594,
author = {Coradeschi, Silvia and Karlsson, Lars and Nordberg, Klas},
title = {{Integration of vision and decision-making in an autonomous airborne vehicle for traffic surveillance}},
booktitle = {Proceedings of the International Conference on Vision Systems '99},
year = {1999},
}
There is no indication that it will ever be possible to find some simple trick that miraculously solves most problems in vision. It turns out that the processing system must be able to implement a model structure, the complexity of which is directly related to the structural complexity of the problem under consideration in the external world. It has become increasingly apparent that Vision cannot be treated in isolation from the response generation, because a very high degree of integration is required between different levels of percepts and corresponding response primitives. The response to be produced at a given instance is as much dependent upon the state of the system, as the percepts impinging upon the system. In addition, it has become apparent that many classical aspects of perception, such as geometry, probably do not belong to the percept domain of a Vision system, but to the response domain. This article will focus on what are considered crucial problems in Vision for robotics for the future, rather than on the classical solutions today. It will discuss hierarchical architectures for combination of percept and response primitives. It will discuss the concept of combined percept–response invariances as important structural elements for Vision. It will be maintained that learning is essential to obtain the necessary flexibility and adaptivity. In consequence, it will be argued that invariances for the purpose of Vision are not abstractly geometrical, but derived from the percept–response interaction with the environment. The issue of information representation becomes extremely important in distributed structures of the types foreseen, where uncertainty of information has to be stated for update of models and associated data. The question of object representation is central to the paper. Equivalence is established between the representations of response, geometry and time. Finally an integrated percept–response structure is proposed for flexible response control.
@inproceedings{diva2:241582,
author = {Granlund, Gösta},
title = {{Does Vision Inevitably Have to be Active?}},
booktitle = {Proceedings of the 11th Scandinavian Conference on Image Analysis},
year = {1999},
}
This paper presents a general approach for obtaining optimal filters as well as filter sequences. A filter is termed optimal when it minimizes a chosen distance measure with respect to an ideal filter. The method allows specification of the metric via simultaneous weighting functions in multiple domains, e.g. the spatio-temporal space and the Fourier space. Metric classes suitable for optimization of localized filters for multidimensional signal processing are suggested and discussed.
It is shown how convolution kernels for efficient spatio-temporal filtering can be implemented in practical situations. The method is based on applying a set of jointly optimized filter kernels in sequence. The optimization of sequential filters is performed using a novel recursive optimization technique. A number of optimization examples are given that demonstrate the role of key parameters such as: number of kernel coefficients, number of filters in sequence, spatio-temporal and Fourier space metrics.
The sequential filtering method enables filtering using only a small fraction of the number of filter coefficients required using conventional filtering. In multidimensional filtering applications the method potentially outperforms both standard convolution and FFT based approaches by two-digit factors.
@inproceedings{diva2:241578,
author = {Knutsson, Hans and Andersson, Mats and Wiklund, Johan},
title = {{Advanced Filter Design}},
booktitle = {Proceedings of the 11th Scandinavian Conference on Image Analysis},
year = {1999},
pages = {185--193},
publisher = {SCIA},
}
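The weighted least-squares core of such a design, a single kernel optimized against an ideal frequency response with simultaneous weights in the Fourier and spatial domains, can be sketched as below. The recursive optimization of sequences of sparse kernels, which is the paper's main contribution, is not shown, and the weighting functions and parameters are illustrative assumptions.

```python
import numpy as np

def design_filter(n_taps, ideal_response, n_freq=256,
                  fourier_weight=None, spatial_weight=None):
    """Weighted least-squares FIR design: minimize
    ||Wf (F b - r)||^2 + ||Ws b||^2 over the kernel b."""
    omega = np.linspace(-np.pi, np.pi, n_freq)
    taps = np.arange(n_taps) - (n_taps - 1) / 2
    F = np.exp(-1j * np.outer(omega, taps))                 # DTFT matrix
    r = ideal_response(omega)
    Wf = np.ones(n_freq) if fourier_weight is None else fourier_weight(omega)
    Ws = np.zeros(n_taps) if spatial_weight is None else spatial_weight(taps)
    A = (F.conj().T * Wf ** 2) @ F + np.diag(Ws ** 2)       # normal equations
    rhs = (F.conj().T * Wf ** 2) @ r
    return np.real(np.linalg.solve(A, rhs))                 # real for even ideal responses

# e.g. a low-pass kernel with extra weight on low frequencies
lowpass = design_filter(11, lambda w: (np.abs(w) < np.pi / 4).astype(float),
                        fourier_weight=lambda w: 1.0 / (np.abs(w) + 0.1))
```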
This paper introduces a novel algorithm for extracting the optical flow obtained from a translating camera in a static scene. Occlusion between objects is incorporated as a natural component in a scene reconstruction strategy by first evaluating and reconstructing the foreground and then excluding its influence on the partly occluded objects behind.
@inproceedings{diva2:273869,
author = {Ulvklo, Morgan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Adaptive Reconstruction Using Multiple Views}},
booktitle = {Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation},
year = {1998},
pages = {47--52},
}
This paper presents our general strategy for designing learning machines as well as a number of particular designs. The search for methods allowing a sufficient level of adaptivity is based on two main principles: 1. simple adaptive local models and 2. adaptive model distribution. Particularly important concepts in our work are mutual information and canonical correlation. Examples are given of learning feature descriptors, modeling disparity, synthesis of a global 3-mode model, and a setup for reinforcement learning of online video coder parameter control.
@inproceedings{diva2:273852,
author = {Knutsson, Hans and Borga, Magnus and Landelius, Tomas},
title = {{Learning Multidimensional Signal Processing}},
booktitle = {Proceedings of the 14th International Conference on Pattern Recognition, vol 2},
year = {1998},
pages = {1416--1420},
}
This paper introduces a signal processing strategy for depth segmentation and scene reconstruction that incorporates occlusion as a natural component. The work aims to maximize the use of connectivity in the temporal domain as much as possible under the condition that the scene is static and that the camera motion is known. An object behind the foreground is reconstructed using the fact that different parts of the object have been seen in different images in the sequence. One of the main ideas in this paper is the use of a spatio-temporal certainty volume c(x) with the same dimension as the input spatio-temporal volume s(x), and then using c(x) as a 'blackboard' for rejecting already segmented image structures. The segmentation starts with searching for image structures in the foreground, eliminating their occluding influence, and then proceeding. Normalized convolution, which is a Weighted Least Mean Square technique for filtering data with varying spatial reliability, is used for all filtering. High spatial resolution near object borders is achieved and only neighboring structures with similar depth support each other.
@inproceedings{diva2:273841,
author = {Ulvklo, Morgan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Depth Segmentation and Occluded Scene Reconstruction using Ego-motion}},
booktitle = {Proceedings of the SPIE Conference on Visual Information Processing},
year = {1998},
pages = {112--123},
}
WITAS will be engaged in goal-directed basic research in the area of intelligent autonomous vehicles and other autonomous systems. In this paper an overview of the project is given together with a presentation of our research interests in the project. The current status of our part in the project is also given.
@inproceedings{diva2:273831,
author = {Andersson, Thord and Granlund, Gösta H. and Farnebäck, Gunnar and Nordberg, Klas and Wiklund, Johan},
title = {{WITAS Project at Computer Vision Laboratory; A status report (Jan 1998)}},
booktitle = {Proceedings of the SSAB symposium on image analysis},
year = {1998},
pages = {113--116},
}
@inproceedings{diva2:246093,
author = {Lenz, Reiner and Granlund, Gösta},
title = {{If I had a fisheye I would not need SO(1,n) or, Is hyperbolic geometry useful in image processing?}},
booktitle = {Proceedings from the SSAB Symposium on Image Analysis},
year = {1998},
}
This paper presents a novel algorithm that uses CCA and phase analysis to detect the disparity in stereo images. The algorithm adapts filters in each local neighbourhood of the image in a way which maximizes the correlation between the filtered images. The adapted filters are then analyzed to find the disparity. This is done by a simple phase analysis of the scalar product of the filters. The algorithm can even handle cases where the images have different scales. The algorithm can also handle depth discontinuities and give multiple depth estimates for semi-transparent images.
@inproceedings{diva2:245992,
author = {Borga, Magnus and Knutsson, Hans},
title = {{An Adaptive Stereo Algorithm Based on Canonical Correlation Analysis}},
booktitle = {Proceedings of the Second IEEE International Conference on Intelligent Processing Systems},
year = {1998},
pages = {177--182},
}
This paper addresses the problem of motion-based segmentation of image sequences. One motion estimation algorithm and two segmentation algorithms are presented. The motion estimation is based on 3D orientation tensors and the algorithm can be used to estimate a large class of motion models, including the affine model that is used in the segmentation. The segmentation algorithms are based on a competitive region growing approach.
@inproceedings{diva2:273874,
author = {Farnebäck, Gunnar},
title = {{Motion-based Segmentation of Image Sequences using Orientation Tensors}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1997},
pages = {31--35},
}
@inproceedings{diva2:246082,
author = {Westin, Carl-Fredrik and Bhalerao, A. and Knutsson, Hans and Kikinis, Ron},
title = {{Using Local 3D Structure for Segmentation of Bone from Computer Tomography Images}},
booktitle = {Proceedings of IEEE CVPR 1997},
year = {1997},
}
It has become increasingly apparent that perception cannot be treated in isolation from the response generation, firstly because a very high degree of integration is required between different levels of percepts and corresponding response primitives. Secondly, it turns out that the response to be produced at a given instance is as much dependent upon the state of the system as upon the percepts impinging upon the system. The state of the system is in consequence the combination of the responses produced and the percepts associated with these responses. Thirdly, it has become apparent that many classical aspects of perception, such as geometry, probably do not belong to the percept domain of a Vision system, but to the response domain. There are not yet solutions available to all of these problems. In consequence, this overview will focus on what are considered crucial problems for the future, rather than on the solutions available today. It will discuss hierarchical architectures for combination of percept and response primitives, and the concept of combined percept-response invariances as important structural elements for Vision. It will be maintained that learning is essential to obtain the necessary flexibility and adaptivity. In consequence, it will be argued that invariances for the purpose of vision are not geometrical but derived from the percept-response interaction with the environment. The issue of information representation becomes extremely important in distributed structures of the types foreseen, where uncertainty of information has to be stated for update of models and associated data.
@inproceedings{diva2:245978,
author = {Granlund, Gösta H.},
title = {{From Multidimensional Signals to the Generation of Responses}},
booktitle = {Algebraic Frames for the Perception-Action Cycle, eds G. Sommer and J. J. Koenderink},
year = {1997},
series = {Lecture Notes in Computer Science},
volume = {1315},
pages = {29--53},
publisher = {Springer-Verlag},
}
@inproceedings{diva2:241593,
author = {Coradeschi, Silvia and Nordberg, Klas and Karlsson, Lars},
title = {{Integration of vision and reasoning in an airborne autonomous vehicle for traffic surveillance}},
booktitle = {Knowledge Based Computer Vision, Seminar-Report 196},
year = {1997},
}
@inproceedings{diva2:241573,
author = {Nordberg, Klas and Bergvall, Mathias and Granlund, Gösta H.},
title = {{Building Object Models from Range Data}},
booktitle = {Robotikdagar 97},
year = {1997},
}
@inproceedings{diva2:241561,
author = {Granlund, Gösta H.},
title = {{From signal to response: Issues in representation and computation}},
booktitle = {Proceedings of TFTS'97, The 2nd IEEE UK Symposium on Applications of Time-frequency and Time-scale Methods},
year = {1997},
}
@inproceedings{diva2:246109,
author = {Westin, Carl-Fredrik and Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta},
title = {{Attention Control for Robot Vision}},
booktitle = {CVPR},
year = {1996},
pages = {726--733},
publisher = {IEEE Computer Society Press},
}
@inproceedings{diva2:246060,
author = {Granlund, Gösta H.},
title = {{Operations and Representations for Multidimensional Information}},
booktitle = {Proceedings of RecPad'96, The 8th Portuguese Conference on Pattern Recognition},
year = {1996},
}
@inproceedings{diva2:246052,
author = {Granlund, Gösta H.},
title = {{Response Generation and Learning Crucial Issues in Machine Vision}},
booktitle = {Machine Perception Applications. Proc. of the IAPR TC-8 Workshop in Machine Perception Applications, Technical University, Graz, Austria, 2--3 September, 1996, eds A. Pinz and W. Pölzleitner},
year = {1996},
pages = {155--184},
}
@inproceedings{diva2:246023,
author = {Nordberg, Klas and Granlund, Gösta},
title = {{Equivariance and Invariance -- An Approach Based on Lie Groups}},
booktitle = {ICIP},
year = {1996},
}
A recursive method for separation of spherically separable quadrature filters into simple kernels with mainly one dimensional extent has been worked out. The resulting filter responses are mapped to a non-biased tensor representation where the local tensor constitutes a robust estimate of both the shape and the orientation (velocity) of the neighbourhood. The performance of this General Sequential Filter concept has exceeded the authors' most optimistic expectations. A qualitative evaluation results in no detectable loss in accuracy when compared to conventional FIR (Finite Impulse Response) filters, but the computation is performed 20-30 times faster. The magnitude of the attained speed-up implies that complex spatio-temporal analysis can be performed using standard hardware, such as a powerful workstation, in close to real time. Due to the soft implementation of the convolver and the tree structure of the sequential filtering approach, the processing is simple to optimize for most standard hardware. The method used in the examples was implemented in AVS (Application Visualization System) using modules written in C.
@inproceedings{diva2:273873,
author = {Knutsson, Hans and Andersson, Mats},
title = {{Optimization of Sequential Filters}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1995},
pages = {87--90},
}
@inproceedings{diva2:246075,
author = {Ulvklo, Morgan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Texture Gradient in Sparse Texture Fields}},
booktitle = {SCIA9},
year = {1995},
pages = {885--894},
}
A scheme for performing generalized convolutions is presented. A flexible convolver, which runs on standard workstations, has been implemented. It is designed for maximum throughput and flexibility. The implementation incorporates spatio-temporal convolutions with configurable vector combinations. It can handle general multi-linear operations, i.e. tensor operations on multidimensional data of any order. The input data and the kernel coefficients can be of arbitrary vector length. The convolver is configurable for IIR filters in the time dimension. Other features of the implemented convolver are scattered kernel data, region of interest and subsampling. The implementation is done as a C-library and a graphical user interface in AVS (Application Visualization System).
@inproceedings{diva2:246061,
author = {Wiklund, Johan and Knutsson, Hans},
title = {{A Generalized Convolver}},
booktitle = {SCIA9},
year = {1995},
}
@inproceedings{diva2:246038,
author = {Granlund, Gösta},
title = {{Biological vision: a source of challenges and ideas}},
booktitle = {DSAGM, Dansk Selskab for Genkendelse af Mønstre},
year = {1995},
}
@inproceedings{diva2:246030,
author = {Landelius, Tomas and Knutsson, Hans},
title = {{Behaviorism and Reinforcement Learning}},
booktitle = {Proceedings, 2nd Swedish Conference on Connectionism},
year = {1995},
pages = {259--270},
}
@inproceedings{diva2:242175,
author = {Karlholm, Jörgen and Westelius, Carl-Johan and Knutsson, Hans},
title = {{Object Tracking Based on the Orientation Tensor Concept}},
booktitle = {SCIA9, Uppsala},
year = {1995},
}
@inproceedings{diva2:515077,
author = {Nordberg, Klas and Granlund, Gösta and Knutsson, Hans},
title = {{Representation and learning of invariance}},
booktitle = {Proceedings of the IEEE International Conference on Image Processing (ICIP-94)},
year = {1994},
pages = {585--589},
}
@inproceedings{diva2:273855,
author = {Knutsson, Hans and Westin, Carl-Fredrik and Granlund, Gösta H.},
title = {{Local Multiscale Frequency and Bandwidth Estimation}},
booktitle = {ICIP},
year = {1994},
pages = {36--40},
}
In this paper it is shown how estimates of local structure and orientation can be obtained using a set of spherically separable quadrature filters. The method is applicable to signals of any dimensionality, the only requirement being that the filter set spans the corresponding orientation space. The estimates produced are second order tensors, the size of the tensors corresponding to the dimensionality of the input signal. A central part of the algorithm is an operation termed Tensor Whitening, reminiscent of classical whitening procedures. This operation compensates exactly for any biases introduced by non-uniform filter orientation distributions and/or non-uniform filter output certainties. Examples of processing of 2D-images, 3D-volumes and 2D-image sequences are given. Sensitivity to noise and missing filter outputs is analyzed in different situations. Estimation accuracy as a function of filter orientation distributions is studied. The studies provide evidence that the algorithm is robust and preferable to other algorithms in a wide range of situations.
@inproceedings{diva2:245986,
author = {Knutsson, Hans and Andersson, Magnus},
title = {{Robust N-Dimensional Orientation Estimation using Quadrature Filters and Tensor Whitening}},
booktitle = {ICASSP},
year = {1994},
}
This paper establishes an algebraic relation between two methods recently reported: normalized convolution and normalized differential convolution. These are general methods for filtering incomplete or uncertain data and are based on the separation of both data and operator into a signal part and a certainty part. General filtering can be performed without preprocessing input data with an interpolation step. The methods allow both data and operators to be scalars, vectors or tensors of higher order. Normalized differential convolution has been used in a wide range of applications. Examples are estimation of gradients in irregularly sampled data, estimation of differential invariants in sparse image flow fields, and image edge effect reduction. It was previously shown that normalized convolution produces a description of the neighbourhood which is optimal in a least square sense. The algebraic relation to normalized differential convolution presented in this paper proves that the latter method is optimal in the same sense as well.
@inproceedings{diva2:245982,
author = {Westin, Carl-Fredrik and Nordberg, Klas and Knutsson, Hans},
title = {{On the Equivalence of Normalized Convolution and Normalized Differential Convolution}},
booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing, 1994},
volume = {5},
year = {1994},
pages = {457--460},
}
@inproceedings{diva2:241575,
author = {Haglund, Leif and Fleet, David},
title = {{Stable Estimation of Image Orientation}},
booktitle = {Proceedings of the IEEE-ICIP},
year = {1994},
pages = {68--72},
}
@inproceedings{diva2:273865,
author = {Knutsson, Hans and Westin, Carl-Fredrik},
title = {{Normalized and Differential Convolution: Methods for Interpolation and Filtering of Incomplete and Uncertain Data}},
booktitle = {CVPR},
year = {1993},
pages = {515--523},
publisher = {IEEE},
}
In this paper learning is considered to be the bootstrapping procedure where fragmented past experience of what to do when performing well is used for the generation of new responses, adding more information to the system about the environment. The gained knowledge is represented by a behavior probability density function which is decomposed into a number of normal distributions using a binary tree. This tree structure is built by storing highly reinforced stimulus-response combinations, decisions, and calculating their mean decision vector and covariance matrix. Thereafter the decision space is divided, through the mean vector, into two halves along the direction of maximal data variation. The mean vector and the covariance matrix are stored in the tree node and the procedure is repeated recursively for each of the two halves of the decision space, forming a binary tree with mean vectors and covariance matrices in its nodes. The tree is the system's guide to response generation. Given a stimulus, the system searches for decisions likely to give a high reinforcement. This is accomplished by treating the sum of the normal distributions in the leaves, using their mean vectors and covariance matrices as the distribution parameters, as a distribution describing the system's behavior. A response is generated by fixating the stimulus in this sum of normal distributions and using the resulting distribution, which turns out to be a new sum of normal distributions, for random generation of the response. This procedure also makes it possible for the system to have several equally plausible responses to one stimulus when this is appropriate. Not applying maximum likelihood principles leads to a more explorative system behavior, avoiding local minima traps.
@inproceedings{diva2:273860,
author = {Landelius, Tomas and Knutsson, Hans},
title = {{The Learning Tree, A New Concept in Learning}},
booktitle = {Proceedings of the 2nd International Conference on Adaptive and Learning Systems},
year = {1993},
series = {SPIE},
volume = {1962},
}
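A sketch of the tree-building step described above: each node stores the mean and covariance of the decisions that reached it and splits them through the mean along the direction of maximal variance. The stopping criteria are assumptions of the sketch, and the response-generation step (conditioning the resulting mixture of Gaussians on a stimulus and sampling) is not shown.

```python
import numpy as np

def build_learning_tree(decisions, min_leaf=20, depth=0, max_depth=6):
    """Store mean and covariance of the decision vectors in each node and
    split the data through the mean along the direction of maximal variance."""
    node = {"mean": decisions.mean(axis=0),
            "cov": np.cov(decisions, rowvar=False)}
    if depth == max_depth or len(decisions) < 2 * min_leaf:
        return node
    _, eigvecs = np.linalg.eigh(node["cov"])
    direction = eigvecs[:, -1]                       # principal direction
    side = (decisions - node["mean"]) @ direction > 0
    if side.sum() < min_leaf or (~side).sum() < min_leaf:
        return node
    node["children"] = (
        build_learning_tree(decisions[~side], min_leaf, depth + 1, max_depth),
        build_learning_tree(decisions[side], min_leaf, depth + 1, max_depth),
    )
    return node
```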
@inproceedings{diva2:273835,
author = {Andersson, Mats and Knutsson, Hans},
title = {{Controllable 3-D Filters}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1993},
}
@inproceedings{diva2:273833,
author = {Bårman, Håkan and Granlund, Gösta H.},
title = {{Using Simple Local Fourier Domain Models for Computer-Aided Analysis of Mammograms}},
booktitle = {SCIA8},
year = {1993},
pages = {479--486},
publisher = {NOBIM, Norwegian Society for Image Processing and Pattern Recognition},
}
@inproceedings{diva2:273830,
author = {Landelius, Tomas and Haglund, Leif and Knutsson, Hans},
title = {{Depth and Velocity from Orientation Tensor Fields}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1993},
}
@inproceedings{diva2:246099,
author = {Granlund, Gösta H.},
title = {{Image Sequence Analysis}},
booktitle = {Mustererkennung 1993, Mustererkennung im Dienste der Gesundheit eds S.J. Pöppl and H. Handels},
year = {1993},
pages = {1--18},
}
The tensor representation has proven a successful tool as a means of describing local multi-dimensional orientation. In this respect, the tensor representation is a map from the local orientation to a second order tensor. This paper investigates how variations of the orientation are mapped to variations of the tensor, thereby giving an explicit equivariance relation. The results may be used in order to design tensor based algorithms for extraction of image features defined in terms of local variations of the orientation, e.g. multi-dimensional curvature or circular symmetries. It is assumed that the variation of the local orientation can be described in terms of an orthogonal transformation group. Under this assumption a corresponding orthogonal transformation group, acting on the tensor, is constructed. Several correspondences between the two groups are demonstrated.
@inproceedings{diva2:246080,
author = {Nordberg, Klas and Knutsson, Hans and Granlund, Gösta},
title = {{On the Equivariance of the Orientation and the Tensor Field Representation}},
booktitle = {SCIA8},
year = {1993},
pages = {57--63},
publisher = {NOBIM, Norwegian Society for Image Processing and Pattern Recognition},
}
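A minimal numerical illustration of this kind of equivariance relation, under the simplifying assumption that the tensor map is the rank-one outer product T(x) = x x^T (a sketch, not the general construction of the paper): a rotation acting on the orientation corresponds to conjugation acting on the tensor.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

T = np.outer(x, x)
lhs = np.outer(R @ x, R @ x)   # tensor of the rotated orientation
rhs = R @ T @ R.T              # rotated tensor
assert np.allclose(lhs, rhs)   # the two group actions correspond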
@inproceedings{diva2:246069,
author = {Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Hierarchical Gaze Control Using a Multi-resolution Image Sensor}},
booktitle = {Proceedings from Robotics Workshop},
year = {1993},
}
A hierarchical representation of the input-output transition function in a learning system is suggested. The choice between representing the knowledge in a learning system as a discrete set of input-output pairs or as a continuous input-output transition function is discussed. The conclusion is that both representations can be efficient, but at different levels of abstraction. The difference between strategies and actions is defined. An algorithm for using adaptive critic methods in a two-level reinforcement learning system is presented. Simulations of a one-dimensional hierarchical reinforcement learning system are presented.
@inproceedings{diva2:246059,
author = {Borga, Magnus},
title = {{Hierarchical Reinforcement Learning}},
booktitle = {ICANN'93 eds S. Gielen and B. Kappen},
year = {1993},
}
@inproceedings{diva2:246034,
author = {Knutsson, Hans and Westin, Carl-Fredrik and Westelius, Carl-Johan},
title = {{Filtering of Uncertain Irregularly Sampled Multidimensional Data}},
booktitle = {Twenty-seventh Asilomar Conf. on Signals, Systems \& Computers},
year = {1993},
pages = {1301--1309},
}
This paper presents a scale and orientation adaptive filtering strategy for images. The size, shape and orientation of the filter are signal controlled and thus locally adapted to each neighbourhood according to an estimated model. On each scale the filter is constructed as a linear weighting of fixed oriented bandpass filters having the same shape but different orientations. The resulting filter is interpolated from all scale levels and spans more than 6 octaves. It is possible to reconstruct an enhanced original image from the filtered images. The performance of the reconstruction algorithm displays two desirable but normally contradictory features, namely edge enhancement and an improvement of the signal-to-noise ratio. The adaptive filtering method has been tested on both real data and synthesized test data. The results are very good on a wide variety of images, from moderate down to low signal-to-noise ratios, even below 0 dB.
@inproceedings{diva2:246019,
author = {Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{Scale and Orientation Adaptive Filtering}},
booktitle = {SCIA8},
year = {1993},
}
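A rough illustration of the filter-synthesis idea follows (a sketch under assumed names; the actual quadrature filters, weighting scheme and scale interpolation of the paper are not reproduced): the adaptive output is formed per pixel as a weighted sum of fixed oriented bandpass filter responses, with weights steered by a locally estimated orientation.

import numpy as np

def orientation_weights(theta_map, K):
    """Weights for K filters oriented at angles k*pi/K, peaked at the locally
    estimated orientation theta_map (H, W). The cos^2 interpolation is an
    assumed choice, not taken from the paper."""
    angles = np.arange(K) * np.pi / K
    w = np.cos(theta_map[None] - angles[:, None, None]) ** 2
    return w / w.sum(axis=0, keepdims=True)

def adaptive_combination(oriented_responses, weights):
    """oriented_responses: (K, H, W) outputs of K fixed oriented bandpass filters.
    weights:               (K, H, W) signal-controlled weights per pixel.
    Returns the locally adapted filter output (H, W)."""
    return np.sum(oriented_responses * weights, axis=0)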
@inproceedings{diva2:245990,
author = {Knutsson, Hans and Westin, Carl-Fredrik},
title = {{Robust Estimation from Sparse Feature Fields}},
booktitle = {Proceedings of EC--US Workshop},
year = {1993},
}
@inproceedings{diva2:242235,
author = {Bårman, Håkan and Granlund, Gösta H.},
title = {{Hierarchical Feature Extraction for Computer- Aided Analysis of Mammograms}},
booktitle = {BIOMEDICAL IMAGE PROCESSING IV AND BIOMEDICAL VISUALIZATION},
year = {1993},
}
@inproceedings{diva2:241569,
author = {Bårman, Håkan and Granlund, Gösta H.},
title = {{Computer-Aided Analysis of Mammograms}},
booktitle = {Proceedings Nordic symposium on PACS, Digital Radiology and Telemedicine},
year = {1993},
pages = {76--},
}
@inproceedings{diva2:241562,
author = {Granlund, Gösta H.},
title = {{Issues in Robot Vision}},
booktitle = {British Machine Vision Conference 1993},
year = {1993},
pages = {1--14},
}
@inproceedings{diva2:274007,
author = {Wilson, Roland and Knutsson, Hans},
title = {{Seeing Things -- Disagreements on the necessary properties of a system that `Recognizes'}},
booktitle = {Workshop on Vision},
year = {1992},
pages = {177--189},
}
@inproceedings{diva2:274004,
author = {Knutsson, Hans},
title = {{The meaninglessness of `Sit-and-stare' -- How Vision-Action-Understanding is inseparable}},
booktitle = {Workshop on Vision},
year = {1992},
pages = {9--20},
}
@inproceedings{diva2:273867,
author = {Nordberg, Klas and Knutsson, Hans},
title = {{Some New Ideas in Signal Representation}},
booktitle = {Proceedings of ECCV--92},
year = {1992},
series = {Lecture Notes in Computer Science},
volume = {588},
publisher = {Springer--Verlag},
}
@inproceedings{diva2:273853,
author = {Wiklund, Johan and Westelius, Carl-Johan and Knutsson, Hans},
title = {{Hierarchical Phase Based Disparity Estimation}},
booktitle = {Proceedings of 2nd Singapore International Conference on Image Processing},
year = {1992},
publisher = {World Scientific Publishing},
address = {Singapore, River Edge, NJ},
}
@inproceedings{diva2:273848,
author = {Calway, Andrew and Knutsson, Hans and Wilson, Roland},
title = {{Multiresolution Frequency Domain Algorithm for Fast Image Registration}},
booktitle = {Proc. 3rd Int. Conf. on Visual Search},
year = {1992},
}
@inproceedings{diva2:273845,
author = {Knutsson, Hans and Haglund, Leif and Bårman, Håkan and Granlund, Gösta H.},
title = {{A Framework for Anisotropic Adaptive Filtering and Analysis of Image Sequences and Volumes}},
booktitle = {Proceedings ICASSP-92},
year = {1992},
publisher = {IEEE},
}
@inproceedings{diva2:273832,
author = {Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Preattentive Gaze Control for Robot Vision}},
booktitle = {Proceedings of Third International Conference on Visual Search},
year = {1992},
publisher = {Taylor and Francis},
}
@inproceedings{diva2:246090,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Extraction of Local Symmetries Using Tensor Field Filtering}},
booktitle = {Proceedings of 2nd Singapore International Conference on Image Processing},
year = {1992},
pages = {371--375},
}
@inproceedings{diva2:246066,
author = {Knutsson, Hans and Bårman, Håkan and Haglund, Leif},
title = {{Robust Orientation Estimation in 2D, 3D and 4D Using Tensors}},
booktitle = {Proceedings of Second International Conference on Automation, Robotics and Computer Vision, ICARCV'92},
year = {1992},
}
@inproceedings{diva2:245991,
author = {Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{On Scale and Orientation Adaptive Filtering}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1992},
}
@inproceedings{diva2:245974,
author = {Calway, Andrew and Knutsson, Hans and Wilson, Roland},
title = {{Multiresolution Estimation of 2-d Disparity Using a Frequency Domain Approach}},
booktitle = {Proc. British Machine Vision Conf.},
year = {1992},
}
@inproceedings{diva2:245968,
author = {Nordberg, Klas and Knutsson, Hans and Granlund, Gösta},
title = {{Signal Representation using Operators}},
booktitle = {Proceedings of EUSIPCO--92},
year = {1992},
}
@inproceedings{diva2:241584,
author = {Knutsson, Hans and Haglund, Leif and Granlund, Gösta},
title = {{Adaptive Filtering of Image Sequences and Volumes}},
booktitle = {Proceedings of International Conference on Automation, Robotics and Computer Vision},
year = {1992},
}
@inproceedings{diva2:273876,
author = {Calway, Andrew and Wilson, Roland},
title = {{The Multiresolution Fourier Transform and its Application to Image Analysis}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1991},
}
@inproceedings{diva2:273858,
author = {Haglund, Leif and Bårman, Håkan and Knutsson, Hans},
title = {{Estimation of Velocity and Acceleration in Time Sequences}},
booktitle = {Proceedings of the 7th Scandinavian Conference on Image Analysis},
year = {1991},
pages = {1033--1041},
}
@inproceedings{diva2:273844,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{The Möbius Strip Parameterization for Line Segmentation}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1991},
}
@inproceedings{diva2:273843,
author = {Nordberg, Klas and Knutsson, Hans},
title = {{Some new ideas in Signal Representation}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1991},
}
@inproceedings{diva2:246120,
author = {Bårman, Håkan and Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{Estimation of Velocity, Acceleration and Disparity in Time Sequences}},
booktitle = {Proceedings of IEEE Workshop on Visual Motion},
year = {1991},
pages = {44--51},
publisher = {IEEE Computer Society Press},
}
@inproceedings{diva2:246085,
author = {Bigun, Josef and Granlund, Gösta H. and Wiklund, Johan},
title = {{Multidimensional orientation:
texture analysis and optical flow}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1991},
pages = {110--113},
}
@inproceedings{diva2:245988,
author = {Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Focus of attention control}},
booktitle = {Proceedings of the 7th Scandinavian Conference on Image Analysis},
year = {1991},
pages = {667--674},
}
@inproceedings{diva2:245972,
author = {Knutsson, Hans and Haglund, Leif and Bårman, Håkan},
title = {{A Tensor Based Approach to Structure Analysis and Enhancement in 2D, 3D and 4D}},
booktitle = {Workshop Program, Seventh Workshop on Multidimensional Signal Processing},
year = {1991},
publisher = {IEEE Signal Processing Society},
}
@inproceedings{diva2:242250,
author = {Bårman, Håkan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Using Principal Direction Estimates for Shape and Acceleration Description}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1991},
}
@inproceedings{diva2:273866,
author = {Bårman, Håkan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Tensor Field Filtering and Curvature Estimation}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1990},
pages = {175--178},
}
@inproceedings{diva2:246058,
author = {Granlund, Gösta H.},
title = {{Processing and Analysis of Multidimensional Information Using Adaptive Models}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1990},
pages = {19--34},
}
@inproceedings{diva2:245987,
author = {Granlund, Gösta H. and Knutsson, Hans},
title = {{Compact Associative Representation of Visual Information}},
booktitle = {Proceedings of The 10th International Conference on Pattern Recognition},
year = {1990},
}
@inproceedings{diva2:241580,
author = {Knutsson, Hans and Granlund, Gösta H. and Bårman, Håkan},
title = {{A Note on Estimation of 4D Orientation}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1990},
pages = {192--195},
}
@inproceedings{diva2:241574,
author = {Knutsson, Hans and Haglund, Leif and Granlund, Gösta H.},
title = {{A New Approach to Image Enhancement Using Tensor Fields}},
booktitle = {Proceedings of the PROART Workshop on Vision},
year = {1990},
pages = {111--115},
}
@inproceedings{diva2:241565,
author = {Knutsson, Hans and Haglund, Leif and Granlund, Gösta H.},
title = {{Tensor Field Controlled Image Sequence Enhancement}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1990},
pages = {163--167},
}
@inproceedings{diva2:241560,
author = {Westelius, Carl-Johan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Model Projection in a Feature Hierarchy}},
booktitle = {Proceedings of the SSAB Symposium on Image Analysis},
year = {1990},
pages = {244--247},
}
The fundamental problem of finding a suitable representation of the orientation of 3D surfaces is considered. A representation is regarded suitable if it meets three basic requirements: Uniqueness, Uniformity and Polar separability. A suitable tensor representation is given.
At the heart of the problem lies the fact that orientation can only be defined modulo 180°, i.e. the fact that a 180° rotation of a line or a plane amounts to no change at all. For this reason, representing a plane using its normal vector leads to ambiguity, and such a representation is consequently not suitable. The ambiguity can be eliminated by establishing a mapping between R3 and a higher-dimensional tensor space.
The uniqueness requirement implies a mapping that maps all pairs of 3D vectors x and -x onto the same tensor T. Uniformity implies that the mapping implicitly carries a definition of distance between 3D planes (and lines) that is rotation invariant and monotone with the angle between the planes. Polar separability means that the norm of the representing tensor T is rotation invariant. One way to describe the mapping is that it maps a 3D sphere into 6D in such a way that the surface is uniformly stretched and all pairs of antipodal points map onto the same tensor.
It is demonstrated that the above mapping can be realized by sampling the 3D space using a specified class of symmetrically distributed quadrature filters. It is shown that 6 quadrature filters are necessary to realize the desired mapping, the orientations of the filters given by lines through the vertices of an icosahedron. The desired tensor representation can be obtained by simply performing a weighted summation of the quadrature filter outputs. This situation is indeed satisfying as it implies a simple implementation of the theory and that requirements on computational capacity can be kept within reasonable limits.
Noisy neighborhoods and/or linear combinations of tensors produced by the mapping will in general result in a tensor that has no direct counterpart in R3. In an adaptive hierarchical signal processing system, where information is flowing both up (increasing the level of abstraction) and down (for adaptivity and guidance), it is necessary that a meaningful inverse exists for each level-altering operation. It is shown that the point in R3 that corresponds to the best approximation of a given tensor is given by the largest eigenvalue times the corresponding eigenvector of the tensor.
@inproceedings{diva2:274015,
author = {Knutsson, Hans},
title = {{Representing Local Structure Using Tensors}},
booktitle = {Proceedings of the 6th Scandinavian Conference on Image Analysis},
year = {1989},
series = {LiTH-ISY-I},
volume = {1019},
pages = {244--251},
publisher = {Linköping University Electronic Press},
address = {Linköping},
}
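The stated inverse mapping, that the 3D orientation best approximating a (possibly noisy) tensor is the largest eigenvalue times the corresponding eigenvector, can be sketched in a few lines of numpy (illustrative only, not the author's code):

import numpy as np

def best_orientation(T):
    """T: symmetric 3x3 orientation tensor. Returns lambda_max * e_max."""
    vals, vecs = np.linalg.eigh(T)       # eigenvalues in ascending order
    return vals[-1] * vecs[:, -1]

# Example: a rank-one tensor built from x (x and -x map onto the same tensor)
x = np.array([1.0, 2.0, -0.5])
T = np.outer(x, x) / np.linalg.norm(x)
v = best_orientation(T)                  # parallel to x, up to sign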
A methodology for spatio-temporal filtering of image sequences is under development at the Computer Vision Laboratory, Linköping University. In recent years, scale analysis has been found to be a necessary tool in the analysis of stationary images. It is our belief that a combination of spatio-temporal filtering and scale analysis is required to obtain satisfactory results on image sequences. A growing need and the availability of more powerful computers are the most important reasons for this development. The objectives and proposed methods are discussed in relation to known properties of mammalian vision.
@inproceedings{diva2:275342,
author = {Wiklund, Johan and Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{Time Sequence Analysis Using Multi-Resolution Spatio-Temporal Filters}},
booktitle = {Time-Varying Image Processing and Moving Object Recognition, 2},
year = {1989},
pages = {258--265},
publisher = {Elsevier Science Publishers},
address = {Amsterdam},
}
@inproceedings{diva2:274025,
author = {Bårman, Håkan and Knutsson, Hans and Granlund, Gösta H.},
title = {{A Filtering Strategy for Orientation and Curvature Description}},
booktitle = {The 6th Scandinavian Conference on Image Analysis},
year = {1989},
pages = {886--889},
}
We have in the preceding sections studied the use of magnitude representation for feature variables. There are several indications that such a representation may be used in biological visual systems.
The natural introduction of a nonlinearity may be most useful for many purposes. This has been studied for the implementation of penalty function operations. Such operations show great promise as they can be made very specific based on their zero-crossing property.
There is a great deal of indication that inhibition or penalty mechanisms are very important in neural systems. It has e.g. been found that in the cerebellar structure almost all synapses are inhibitory. This could indicate that inhibitory or penalty matching is a primary mechanism in biological vision systems.
@inproceedings{diva2:274022,
author = {Granlund, Gösta H.},
title = {{Magnitude Representation of Features in Image Analysis}},
booktitle = {Proceedings of the 6th Scandinavian Conference on Image Analysis : Oulu, June 19-22, 1989},
year = {1989},
pages = {212--219},
publisher = {Pattern Recognition Society of Finland},
}
@inproceedings{diva2:274017,
author = {Westelius, Carl-Johan and Westin, Carl-Fredrik},
title = {{A Colour Representation for Scale-spaces}},
booktitle = {The 6th Scandinavian Conference on Image Analysis},
year = {1989},
pages = {890--893},
}
@inproceedings{diva2:274012,
author = {Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{On Phase Representation of Image Information}},
booktitle = {The 6th Scandinavian Conference on Image Analysis},
year = {1989},
pages = {1082--1089},
}
@inproceedings{diva2:274010,
author = {Andersson, Mats and Knutsson, Hans and Granlund, Gösta H.},
title = {{Implementation of Image Processing Operations from Analogue Convolver Responses}},
booktitle = {Proceedings of the SSAB Conference on Image Analysis},
year = {1989},
pages = {67--74},
}
@inproceedings{diva2:274009,
author = {Haglund, Leif and Knutsson, Hans and Granlund, Gösta H.},
title = {{Scale Analysis Using Phase Representation}},
booktitle = {The 6th Scandinavian Conference on Image Analysis},
year = {1989},
pages = {1118--1125},
}
@inproceedings{diva2:274005,
author = {Westelius, Carl-Johan and Westin, Carl-Fredrik},
title = {{Representation of colour in image processing}},
booktitle = {Proceedings of the SSAB Conference on Image Analysis},
year = {1989},
}
@inproceedings{diva2:273872,
author = {Knutsson, Hans and Granlund, Gösta H.},
title = {{Spatio-Temporal Analysis Using Tensors}},
booktitle = {Sixth Multidimensional Signal Processing Workshop},
year = {1989},
}
@inproceedings{diva2:246064,
author = {Bårman, Håkan and Granlund, Gösta H. and Knutsson, Hans},
title = {{A new approach to curvature estimation and description}},
booktitle = {3rd International Conference on Image Processing and its Applications},
year = {1989},
pages = {54--58},
}
@inproceedings{diva2:245965,
author = {Wilson, Roland and Knutsson, Hans},
title = {{A Multiresolution Stereopsis Algorithm Based on the Gabor Representation}},
booktitle = {3rd International Conference on Image Processing and Its Applications},
year = {1989},
pages = {19--22},
}
@inproceedings{diva2:241598,
author = {Granlund, Gösta H.},
title = {{Processing and Analysis of Multidimensional Information Using Adaptive Models}},
booktitle = {Proceedings of the SSAB Conference on Image Analysis},
year = {1989},
pages = {37--44},
}
@inproceedings{diva2:273862,
author = {Granlund, Gösta H.},
title = {{Processing and Analysis of Multi-Dimensional Information Using Adaptive Models}},
booktitle = {Proceedings from SSAB Symposium on Picture Processing},
year = {1988},
}
@inproceedings{diva2:273842,
author = {Bigun, Josef and Granlund, Gösta H.},
title = {{Optical Flow Based on the Inertia Matrix of the Frequency Domain}},
booktitle = {Proceedings from SSAB Symposium on Picture Processing},
year = {1988},
pages = {132--135},
}
A method for modeling symmetries of the neighborhoods in gray-value images is derived. It is based on the form of the iso-gray-value curves. For every neighborhood a complex number is obtained through a convolution of a complex-valued image with a complex-valued filter. The magnitude of the complex number is the degree of symmetry with respect to the a priori chosen harmonic function pair. The degree of symmetry has a clear definition which is based on the error in the Fourier domain. The argument of the complex number is the angle representing the relative dominance of one of the pair of harmonic functions compared to the other.
@inproceedings{diva2:273828,
author = {Bigun, Josef},
title = {{Recognition of Local Symmetries in Gray Value Images by Harmonic Functions}},
booktitle = {Proceedings of the 9th International Conference on Pattern Recognition, Vol. 1},
year = {1988},
pages = {345--347},
}
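As a loose illustration of the mechanism described above (a sketch under assumptions: the choice of a gradient-based complex "derivative image" and a generic complex kernel are mine, not the paper's specific filters), one complex number per neighborhood is obtained by a complex-valued convolution; its magnitude is read as the degree of symmetry and its argument as the dominance angle.

import numpy as np
from scipy.signal import convolve2d

def symmetry_response(image, complex_kernel):
    """Returns (degree-of-symmetry map, angle map) for a gray-value image."""
    gy, gx = np.gradient(image.astype(float))
    z = gx + 1j * gy                                  # complex-valued derivative image
    resp = convolve2d(z, complex_kernel, mode="same") # complex kernel, complex result
    return np.abs(resp), np.angle(resp)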
@inproceedings{diva2:246118,
author = {Bårman, Håkan and Granlund, Gösta H.},
title = {{Corner Detection Using Local Symmetry}},
booktitle = {Proceedings from SSAB Symposium on Picture Processing},
year = {1988},
}
The symmetries in a neighbourhood of a gray value image are modelled by conjugate harmonic function pairs. These are shown to be a suitable curvilinear coordinate pair, in which the model represents a neighbourhood. In this representation the image parts, which are symmetric with respect to the chosen function pair, have iso-gray value curves which are simple lines or parallel line patterns. The detection is modelled in the special Fourier domain corresponding to the new variables by minimizing an error function. It is shown that the minimization process or detection of these patterns can be carried out for the whole image entirely in the spatial domain by convolutions. What will be defined as the partial derivative image is the image which takes part in the convolution. The convolution kernel is complex valued, as are the partial derivative image and the result. The magnitudes of the result are shown to correspond to a well-defined certainty measure, while the orientation is the least-squares estimate of an orientation in the Fourier transform corresponding to the harmonic coordinates. Applications to four symmetries are given. These are circular, linear, hyperbolic and parabolic symmetries. Experimental results are presented.
@inproceedings{diva2:246070,
author = {Bigun, Josef},
title = {{Pattern Recognition by detection of local symmetries}},
booktitle = {Pattern Recognition and Artificial Intelligence},
year = {1988},
pages = {75--90},
}
The problem of optimal detection of orientation in arbitrary neighborhoods is solved in the least squares sense. It is shown that this corresponds to fitting an axis in the Fourier domain of the n-dimensional neighborhood, the solution of which is given by a well-known matrix eigenvalue problem. The eigenvalues are the variance or inertia with respect to the axes given by their respective eigenvectors. The orientation is taken as the axis given by the least eigenvalue. Moreover, it is shown that the necessary computations can be pursued in the spatial domain without doing a Fourier transformation. An implementation for 2-D is presented. Two certainty measures are given corresponding to the orientation estimate. These are the relative or the absolute distances between the two eigenvalues, revealing whether the fitted axis is much better than an axis orthogonal to it. The result of the implementation is verified by experiments which confirm accurate orientation estimation and a reliable certainty measure in the presence of additive noise at high as well as low levels.
@inproceedings{diva2:274026,
author = {Bigun, Josef and Granlund, Gösta H.},
title = {{Optimal Orientation Detection of Linear Symmetry}},
booktitle = {Proceedings of the IEEE First International Conference on Computer Vision},
year = {1987},
pages = {433--438},
}
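A minimal 2-D sketch of the idea (an assumption-laden illustration, not the published implementation): build a local inertia matrix from image gradients entirely in the spatial domain, read the orientation from its eigenstructure, and use the relative eigenvalue distance as a certainty measure.

import numpy as np
from scipy.ndimage import gaussian_filter

def orientation_and_certainty(image, sigma=2.0):
    gy, gx = np.gradient(image.astype(float))
    # local averages of outer products of the gradient (the inertia matrix per pixel)
    Jxx = gaussian_filter(gx * gx, sigma)
    Jxy = gaussian_filter(gx * gy, sigma)
    Jyy = gaussian_filter(gy * gy, sigma)
    tr = Jxx + Jyy
    diff = np.sqrt((Jxx - Jyy) ** 2 + 4 * Jxy ** 2)
    lam_max, lam_min = (tr + diff) / 2, (tr - diff) / 2
    # axis of the largest eigenvalue, i.e. dominant gradient direction;
    # the local line/edge orientation is orthogonal to it
    theta = 0.5 * np.arctan2(2 * Jxy, Jxx - Jyy)
    certainty = (lam_max - lam_min) / (tr + 1e-12)   # relative eigenvalue distance
    return theta, certainty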
@inproceedings{diva2:274013,
author = {Bigun, Josef},
title = {{Some Mathematical Tools of Computers for Vision Purposes}},
booktitle = {Proceedings of the 7th Nordic Conference on Teaching of Mathematics at Technical Universities},
year = {1987},
}
@inproceedings{diva2:273854,
author = {Wiklund, Johan and Granlund, Gösta H.},
title = {{Image Sequence Analysis for Object Tracking.}},
booktitle = {Proc. of The 5th Scandinavian Conference on Image Analysis},
year = {1987},
pages = {641--648},
}
@inproceedings{diva2:242185,
author = {Granlund, Gösta H.},
title = {{Imprecision of Measurements in Computer Vision Handled by Fuzzy Set Theory}},
booktitle = {5th IEEE-ASSP and EURASIP Workshop on Multidimensional Signal Processing},
year = {1987},
}
@inproceedings{diva2:241576,
author = {Knutsson, Hans},
title = {{A Tensor Representation of 3-D Structures}},
booktitle = {5th IEEE-ASSP and EURASIP Workshop on Multidimensional Signal Processing},
year = {1987},
}
@inproceedings{diva2:274020,
author = {Wiklund, Johan and Granlund, Gösta H.},
title = {{Tracking of Multiple Moving Objects}},
booktitle = {Proceedings of the Second International Workshop on Time-Varying Image Processing and Moving Object Recognition},
year = {1986},
pages = {241--250},
publisher = {Elsevier Science Publishers B.V.},
address = {Amsterdam},
}
A definition of central symmetry for local neighborhoods of 2-D images is given. A complete ON-set of centrally symmetric basis functions is proposed. The local neighborhoods are expanded in this basis. The behavior of the coefficient spectrum obtained by this expansion is proposed as the foundation of central symmetry parameters of the neighborhoods. Specifically, examination of two such behaviors is proposed: point concentration and line concentration of the energy spectrum. Moreover, the study of these types of behaviors of the spectrum is shown to be possible in the spatial domain.
@inproceedings{diva2:246021,
author = {Bigun, Josef and Granlund, Gösta H.},
title = {{Central Symmetry Modelling}},
booktitle = {Proceedings of EUSIPCO-86, Third European Signal Processing Conference},
year = {1986},
pages = {883--886},
}
@inproceedings{diva2:241595,
author = {Knutsson, Hans},
title = {{Representing and Estimating 3-D Orientation Using Quadrature Filters}},
booktitle = {Conference Publication No. 265, Second Int. Conf. on Image Processing and Its Applications},
year = {1986},
pages = {87--91},
}
@inproceedings{diva2:274016,
author = {Knutsson, Hans},
title = {{Producing a Continuous and Distance Preserving 5-D Vector Representation of 3-D Orientation}},
booktitle = {IEEE Computer Society Workshop on Computer Architecture for Pattern Analysis and Image Database Management - CAPAIDM},
year = {1985},
pages = {175--182},
}
@inproceedings{diva2:246055,
author = {Granlund, Gösta H. and Arvidsson, Jan},
title = {{Computer Architectures for Image Processing.}},
booktitle = {Proceedings of The 4th Scandinavian Conference on Image Analysis},
year = {1985},
}
@inproceedings{diva2:273851,
author = {Knutsson, Hans and Granlund, Gösta H.},
title = {{Texture Analysis Using Two-Dimensional Quadrature Filters}},
booktitle = {IEEE Computer Society Workshop on Computer Architecture for Pattern Analysis and Image Database Management - CAPAIDM},
year = {1983},
}
@inproceedings{diva2:245962,
author = {Granlund, Gösta H.},
title = {{Hierarchical Image Processing}},
booktitle = {Proceedings of SPIE Technical Conference},
year = {1983},
}
@inproceedings{diva2:246102,
author = {Granlund, Gösta H. and Knutsson, Hans},
title = {{Hierarchical Processing of Structural Information in Artificial Intelligence}},
booktitle = {Proceedings of 1982 IEEE Conference on Acoustics, Speech and Signal Processing},
year = {1982},
}
@inproceedings{diva2:241591,
author = {Hedlund, Martin and Granlund, Gösta H. and Knutsson, Hans},
title = {{A Consistency Operation for Line and Curve Enhancement}},
booktitle = {The Computer Society Conference on PR\&IP},
year = {1982},
}
@inproceedings{diva2:241590,
author = {Granlund, Gösta H. and Arvidsson, Jan and Knutsson, Hans},
title = {{GOP, A Paradigm in Hierarchical Image Processing}},
booktitle = {Proceedings of The First IEEE Computer Society International Symposium on Medical Imaging and Image Interpretation, ISMI II'82},
year = {1982},
}
@inproceedings{diva2:241581,
author = {Wilson, Roland and Knutsson, Hans and Granlund, Gösta H.},
title = {{The Operational Definition of the Position of Line and Edge}},
booktitle = {The 6th International Conference on Pattern Recognition},
year = {1982},
}
@inproceedings{diva2:241570,
author = {Wilson, Roland and Knutsson, Hans and Granlund, Gösta H.},
title = {{Image Coding Using a Predictor Controlled by Image Content}},
booktitle = {Proceedings of 1982 IEEE Conference on Acoustics, Speech and Signal Processing},
year = {1982},
}
The related problems of enhancing and restoring noisy images have received a considerable amount of attention in recent years. Restoration methods have generally been based on minimum mean-squared error operations, such as Wiener filtering or recursive filtering. The rather vague title of enhancement has been given to a wide variety of more or less ad-hoc methods, such as median filtering, which have nonetheless been found useful. In most cases, however, the aim is the same: an improvement of the subjective quality of the image.
@inproceedings{diva2:273846,
author = {Knutsson, Hans and Wilson, Roland and Granlund, Gösta H.},
title = {{Anisotropic Filtering Controlled by Image Content}},
booktitle = {Proceedings of the 2nd Scandinavian Conference on Image Analysis},
year = {1981},
series = {IEEE Acoustics, Speech, and Signal Processing Newsletter},
volume = {Vol. 50, issue 1},
pages = {146--151},
}
@inproceedings{diva2:241596,
author = {Hedlund, Martin and Granlund, Gösta H. and Knutsson, Hans},
title = {{Image Filtering and Relaxation Procedures using Hierarchical Models}},
booktitle = {Proceedings of the 2nd Scandinavian Conference on Image Analysis},
year = {1981},
}
@inproceedings{diva2:241592,
author = {Knutsson, Hans and Wilson, Roland and Granlund, Gösta H.},
title = {{Anisotropic Filtering Operations for Image Enhancement and their Relation to the Visual System}},
booktitle = {Proceedings of IEEE Computer Society Conference on Pattern Recognition and Image Processing},
year = {1981},
}
@inproceedings{diva2:241585,
author = {Knutsson, Hans and Wilson, Roland and Granlund, Gösta H.},
title = {{Content-Dependent Anisotropic Filtering of Images}},
booktitle = {Proceedings of International Conference on Digital Signal Processing},
year = {1981},
}
@inproceedings{diva2:274006,
author = {Granlund, Gösta},
title = {{Description of texture using the general operator approach}},
booktitle = {5th International Conference on Pattern Recognition},
year = {1980},
pages = {776--779},
}
@inproceedings{diva2:246116,
author = {Knutsson, Hans and Post, B. von and Granlund, Gösta H.},
title = {{Optimization of Arithmetic Neighborhood Operations for Image Processing}},
booktitle = {Proceedings of the First Scandinavian Conference on Image Analysis},
year = {1980},
}
@inproceedings{diva2:242232,
author = {Knutsson, Hans and Granlund, Gösta H.},
title = {{Fourier Domain Design of Line and Edge Detectors}},
booktitle = {Proceedings of the 5th International Conference on Pattern Recognition},
year = {1980},
}
@inproceedings{diva2:246088,
author = {Jilken, L. and Bäcklund, J. and Knutsson, Hans},
title = {{Automatic Fatigue Threshold Value Testing}},
booktitle = {Conf. on Mechanisms of Deformation and Fracture},
year = {1978},
}
@inproceedings{diva2:246097,
author = {Granlund, Gösta H.},
title = {{Pattern Processing Using Multilevel Systems}},
booktitle = {Proceedings of the Eigth Annual Allerton Conference on Circuit and System Theory},
year = {1970},
pages = {445--453},
}
Conference proceedings
This volume constitutes the refereed proceedings of the 21st Scandinavian Conference on Image Analysis, SCIA 2019, held in Norrköping, Sweden, in June 2019.
The 40 revised papers presented were carefully reviewed and selected from 63 submissions. The contributions are structured in topical sections on Deep convolutional neural networks; Feature extraction and image analysis; Matching, tracking and geometry; and Medical and biomedical image analysis.
@proceedings{diva2:1387551,
title = {{Image Analysis}},
year = {2019},
editor = {Felsberg, Michael and Forss\'{e}n, Per-Erik and Sintorn, Ida-Maria and Unger, Jonas},
series = {Image Processing, Computer Vision, Pattern Recognition, and Graphics},
volume = {11482},
publisher = {Springer},
}
The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017.
The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.
@proceedings{diva2:1366020,
title = {{Computer Analysis of Images and Patterns:
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part I}},
year = {2017},
editor = {Felsberg, Michael and Heyden, Anders and Krüger, Norbert},
series = {Lecture Notes in Computer Science},
volume = {10424},
publisher = {Springer},
address = {Cham},
}
The two volume set LNCS 10424 and 10425 constitutes the refereed proceedings of the 17th International Conference on Computer Analysis of Images and Patterns, CAIP 2017, held in Ystad, Sweden, in August 2017. The 72 papers presented were carefully reviewed and selected from 144 submissions. The papers are organized in the following topical sections: Vision for Robotics; Motion and Tracking; Segmentation; Image/Video Indexing and Retrieval; Shape Representation and Analysis; Biomedical Image Analysis; Biometrics; Machine Learning; Image Restoration; and Poster Sessions.
@proceedings{diva2:1185567,
title = {{Computer Analysis of Images and Patterns:
17th International Conference, CAIP 2017, Ystad, Sweden, August 22-24, 2017, Proceedings, Part II}},
year = {2017},
editor = {Felsberg, Michael and Heyden, Anders and Krüger, Norbert},
series = {Lecture Notes in Computer Science},
volume = {10425},
publisher = {Springer},
}
On behalf of the Organizing Committee, it is my honor and privilege to present the scientific program of the 22nd International Conference on Pattern Recognition. ICPR 2014 is hosted by the Swedish Society for Automated Image Analysis (SSBA) and supported by the universities of Linköping, Lund and Uppsala.
ICPR 2014 has five scientific tracks: Computer Vision; Pattern Recognition and Machine Learning; Image, Speech, Signal and Video Processing; Document Analysis, Biometrics and Pattern Recognition Applications; and Biomedical Image Analysis. For each track there is an Invited Speaker who will share their deep knowledge and experience with us. Perhaps the most apparent novelty of this ICPR is the change from four-page to six-page papers, which is significantly more than a 50% increase in actual content, disregarding the title, abstract and reference list. Our hope and belief is that this has improved the possibility for the reviewers to make well-justified evaluations of the manuscripts, improved the readability of the final papers and, as a consequence, improved the general quality of the accepted papers.
The organization of ICPR 2014 would not have been possible without the generous contributions of our major partners, the City of Stockholm, SSBA, eSSENCE and SeRC. The financial contributions of our other partners and exhibitors, as well as the technical co-sponsorship of the IEEE Computer Society, are also gratefully acknowledged, as are the support and advice from IAPR and the ICPR Liaison Committee. I also want to express my sincere gratitude to the Program and Publication Chairs, the Track Chairs, Area Chairs and all reviewers for their great efforts in putting this scientific program together. And, perhaps most of all, I want to thank all the contributing authors who filled it with content of the highest scientific quality. Finally, I would like to express my gratitude to all attendees. Without your presence, there simply wouldn't be any conference.
@proceedings{diva2:850266,
title = {{Proceedings. 22nd International Conference on Pattern Recognition ICPR 2014, 24-28 August 2014, Stockholm, Sweden}},
year = {2014},
editor = {Heyden, Anders and Laurendeau, Denis and Felsberg, Michael and Borga, Magnus},
series = {Conference on Pattern Recognition (CPR)},
volume = {1-6},
publisher = {IEEE conference proceedings},
}
This book constitutes the refereed proceedings of the 33rd Symposium of the German Association for Pattern Recognition, DAGM 2011, held in Frankfurt/Main, Germany, in August/September 2011. The 20 revised full papers and 22 revised poster papers were carefully reviewed and selected from 98 submissions. The papers are organized in topical sections on object recognition, adverse vision conditions challenge, shape and matching, segmentation and early vision, robot vision, machine learning, and motion. The volume also includes the Young Researchers' Forum, a section where a carefully jury-selected ensemble of young researchers present their Master's thesis work.
@proceedings{diva2:850263,
title = {{Pattern Recognition:
33rd DAGM Symposium, Frankfurt/Main, Germany, August 31 - September 2, 2011, Proceedings}},
year = {2011},
editor = {Mester, Rudolf and Felsberg, Michael},
series = {Lecture Notes in Computer Science},
volume = {6835},
publisher = {Springer},
}
Theses
A mobile robot, instructed by a human operator, acts in an environment with many other objects. For an autonomous robot, however, human instructions should be minimal and limited to high-level directives, such as the ultimate task or destination. In order to increase the level of autonomy, it has become a foremost objective to mimic human vision using neural networks that take a stream of images as input and learn a specific computer vision task from large amounts of data. In this thesis, we explore several different models for surround sensing, each of which contributes to a better understanding of the environment.
As its first contribution, this thesis presents an object tracking method for video sequences, which is a crucial component in a perception system. This method predicts a fine-grained mask to separate the pixels corresponding to the target from those corresponding to the background. Rather than tracking location and size, the method tracks the pixels initially assigned to the target in this so-called video object segmentation. For subsequent time steps, the goal is to learn how the target looks using features from a neural network. We named our method A-GAME, based on its generative modeling of a deep feature space that separates target and background appearances.
In the second contribution of this thesis, we detect, track, and segment all objects from a set of predefined object classes. This information increases the robot's capability to perceive its surroundings. We experiment with a graph neural network to weigh all new detections and existing tracks. This model outperforms prior works by separating visually and semantically similar objects frame by frame.
The third contribution investigates one limitation of anchor-based detectors, which classify pre-defined bounding boxes as either negative or positive and thus provide a limited set of handled object shapes. One idea is to learn an alternative instance representation. We experiment with a neural network that predicts the distance to the nearest object contour in different directions from each pixel. The network then computes an approximated signed distance function containing the respective instance information.
Last, this thesis studies a concept within model validation. We observed that overfitting can increase performance on benchmarks. However, this is of little value for sensing systems in practice, since measurements, such as lengths or angles, are quantities that describe the environment. The fourth contribution of this thesis is an extended validation technique for camera calibration. This technique uses a statistical model for each error difference between an observed value and a corresponding prediction of the projective model. We compute a test over the differences and detect if the projective model is incorrect.
@phdthesis{diva2:1745714,
author = {Brissman, Emil},
title = {{Learning to Analyze Visual Data Streams for Environment Perception}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2283}},
year = {2023},
address = {Sweden},
}
As technology continues to advance, the interest in relieving humans of tedious or dangerous tasks through automation increases. Some of the tasks that have received increasing attention are autonomous driving, disaster relief, and forestry inspection. Developing and deploying an autonomous robotic system in this type of unconstrained environment, in a safe way, is highly challenging. The system requires precise control and high-level decision making, both of which require a robust and reliable perception system to understand the surroundings correctly.
The main purpose of perception is to extract meaningful information from the environment, be it in the form of 3D maps, dense classification of the types of objects and surfaces, or high-level information about the position and direction of moving objects. Depending on the limitations and application of the system, various types of sensors can be used: lidars, to collect sparse depth information; cameras, to collect dense information for different parts of the visual spectrum, often the red-green-blue (RGB) bands; Inertial Measurement Units (IMUs), to estimate the ego motion; microphones, to interact and respond to humans; GPS receivers, to get global position information; just to mention a few.
This thesis investigates some of the necessities to approach the requirements of this type of system, specifically focusing on data-driven approaches, that is, machine learning, which in recent years has been shown time and again to be the main competitor for high-performance perception tasks. Although precision requirements might be high in industrial production plants, the environment there is relatively controlled and the task is fixed. Instead, this thesis studies some of the aspects necessary for complex, unconstrained environments, primarily outdoors and potentially near humans or other systems. The term in the wild refers exactly to the unconstrained nature of these environments, where the system can easily encounter something previously unseen and where the system might interact with unknowing humans. Some examples of such environments are city traffic, disaster relief scenarios, and dense forests.
This thesis will mainly focus on the following three key aspects necessary to handle the types of tasks and situations that could occur in the wild: 1) generalizing to a new environment, 2) adapting to new tasks and requirements, and 3) modeling uncertainty in the perception system.
First, a robotic system should be able to generalize to new environments and still function reliably. Papers B and G address this by using an intermediate representation to allow the system to handle much more diverse types of environment than otherwise possible. Paper B also investigates how robust the proposed autonomous driving system was to incorrect predictions, which is one of the likely results of changing the environment.
Second, a robot should be sufficiently adaptive to allow it to learn new tasks without forgetting the previous ones. Paper E proposed a way to allow incrementally adding new semantic classes to a trained model without access to the previous training data. The approach is based on utilizing the uncertainty in the predictions to model the unknown classes, marked as background.
Finally, the perception system will always be partially flawed, either because of the lack of modeling capabilities or because of ambiguities in the sensor data. To properly take this into account, it is fundamental that the system has the ability to estimate the certainty in the predictions. Paper F proposed a method for predicting the uncertainty in the model predictions when interpolating sparse data. Paper G addresses the ambiguities that exist when estimating the 3D pose of a human from a single camera image.
@phdthesis{diva2:1740415,
author = {Holmquist, Karl},
title = {{Data-Driven Robot Perception in the Wild}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2293}},
year = {2023},
address = {Sweden},
}
Connected and autonomous vehicles (CAVs) are an emerging trend in the transport sector and their impact on transportation, the economy, society and the environment will be tremendous. Much like the automobile shaped the way humans travelled, lived and worked during the 20th century, CAVs have yet again the potential to affect and reform all of these areas. Besides the imminent technological challenges of making CAVs a market-ready reality, a plethora of ethical, social and legal questions will have to be addressed along the way. Knowledge of and interaction with the surrounding infrastructure and other actors in the system will be essential for CAVs in order to pave the way for progressive solutions to urgent sustainability and mobility issues in transportation.
Road networks, i.e. the networks of roads and intersections, are the core infrastructure on which CAVs will operate. Thus, having detailed knowledge about them is key for CAVs in order to take the right decisions on both short-term actions that will affect individual traffic users in immediate situations and long-term actions that will affect entire transportation systems in the long run. Machine learning is nowadays a popular choice to extract and conglomerate knowledge from large amounts of data – and large amounts of data can be obtained about road networks. However, classical machine learning models are incapable of harnessing the graph-structured nature of road networks sufficiently.
Graph neural networks (GNNs) are machine learning models of growing popularity that can explicitly leverage the complex topological structure of node dependencies in graphs, such as the ones observed in road networks. Road networks are sparse graphs that reside in a Euclidean space, and are therefore different from the typical graphs studied in the literature. Also, crowd-sourced road network graphs often have incomplete attributes and generally lack the fine-grained level of detail in their encoded information that would be required for CAVs. Identifying the best representation of road network graphs and complementing their lacking detail with auxiliary data is therefore an important research direction.
This thesis therefore addresses data-driven classification in road networks from two directions: A) the general approach of learning on spatial graphs of road networks with GNNs, and B) complementing road network graphs with auxiliary data. Specifically, this thesis and the included papers address the exemplary task of road classification and make the following contributions to the field:
Paper A analyses how GNNs can be applied to road networks and how the networks are best represented. Different aggregator functions are compared in terms of final classification performance. A novel aggregator and a neighbourhood sampling method are introduced, and the line graph transformation is identified as a suitable representation of road network graphs for GNNs.
Paper B complements the road network graphs with mobility data from millions of GPS trajectories and introduces an equitemporal node spacing to create road segments of equal travel time. It further introduces remote sensing vision data as a potent complement to overcome shortcomings of the graph-based representation of road networks. Simple hand-crafted low-level vision features are used in this work. However, both the equitemporal node spacing and the simple vision features clearly improve classification performance.
Finally, Paper C consolidates the complement of remote sensing data to the road network graphs. Through a general visual feature encoding by state-of-the-art pretrained vision backbones that are carefully fine-tuned to the remote sensing domain, a further performance boost on the road classification task is achieved.
@phdthesis{diva2:1647474,
author = {Stromann, Oliver},
title = {{Data-Driven Classification in Road Networks}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Licentiate Thesis No. 1933}},
year = {2022},
address = {Sweden},
}
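The line-graph representation mentioned for Paper A can be pictured with a toy example (hypothetical intersection names; networkx only; this is an illustration of the general transformation, not the thesis code): road segments, i.e. edges of the road graph, become nodes of the line graph, which is the granularity at which road classification is performed.

import networkx as nx

roads = nx.Graph()
roads.add_edges_from([("A", "B"), ("B", "C"), ("B", "D")])  # intersections A-D
L = nx.line_graph(roads)
# Each node of L is a road segment; two segments are adjacent in L if they
# share an intersection, which is the neighborhood a GNN aggregates over.
print(L.number_of_nodes(), L.number_of_edges())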
Vision is the primary means by which we know where we are, what is nearby, and how we are moving. The corresponding computer vision task is the simultaneous mapping of the surroundings and localization of the camera. This task goes by many names, of which this thesis uses Visual Odometry, a name that implies the images are sequential and emphasizes the accuracy of the pose and the real-time requirements. This field has seen substantial improvements over the past decade, and visual odometry is used extensively in robotics for localization, navigation and obstacle detection.
The main purpose of this thesis is the study and advancement of visual odometry systems, and makes several contributions. The first of which is a high performance stereo visual odometry system, which through geometrically supported tracking achieved top rank on the KITTI odometry benchmark.
The second is a state-of-the-art perspective-three-point solver. Such solvers find the pose of a camera given the projections of three known 3D points and are a core part of many visual odometry systems. By reformulating the underlying problem we avoided a problematic quartic polynomial. As a result we achieved substantially higher computational performance and numerical accuracy.
The third is a system which generalizes stereo visual odometry to the simultaneous estimation of multiple independently moving objects. The main contribution is a real-time system which allows the identification of generic moving rigid objects and the prediction of their trajectories in real time, with applications to robotic navigation in dynamic environments.
The fourth is an improved spline-based continuous pose trajectory estimation framework, which simplifies the integration of general dynamic models. The framework is used to show that visual odometry systems based on continuous pose trajectories are both practical and able to operate in real time.
The visual odometry pipeline is considered from both a theoretical and a practical perspective. The systems described have been tested both on benchmarks and real vehicles. This thesis places the published work into context, highlighting key insights and practical observations.
@phdthesis{diva2:1635583,
author = {Persson, Mikael},
title = {{Visual Odometry in Principle and Practice}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2201}},
year = {2022},
address = {Sweden},
}
Autonomous robots act in a dynamic world where both the robots and other objects may move. The surround sensing systems of such robots therefore work with dynamic input data and need to estimate both the current state of the environment and its dynamics. One of the key elements in obtaining a high-level understanding of the environment is to track dynamic objects. This enables the system to understand what the objects are doing; to predict where they will be in the future; and, in the future, to better estimate where they are. In this thesis, I focus on input from visual cameras, i.e. images. Images have, with the advent of neural networks, become a cornerstone in sensing systems. Image-processing neural networks are optimized to perform a specific computer vision task -- such as recognizing cats and dogs -- on vast datasets of annotated examples. This is usually referred to as offline training and, given a well-designed neural network, enough high-quality data, and a suitable offline training formulation, the neural network is expected to become adept at the specific task.
This thesis starts with a study of object tracking. The tracking is based on the visual appearance of the object, achieved via discriminative convolution filters (DCFs). The first contribution of this thesis is to decompose the filter into multiple subfilters. This serves to increase the robustness during object deformations or rotations. Moreover, it provides a more fine-grained representation of the object state as the subfilters are expected to roughly track object parts. In the second contribution, a neural network is trained directly for object tracking. In order to obtain a fine-grained representation of the object state, it is represented as a segmentation. The main challenge lies in the design of a neural network able to tackle this task. While the common neural networks excel at recognizing patterns seen during offline training, they struggle to store novel patterns in order to later recognize them. To overcome this limitation, a novel appearance learning mechanism is proposed. The mechanism extends the state-of-the-art and is shown to generalize remarkably well to novel data. In the third contribution, the method is used together with a novel fusion strategy and failure detection criterion to semi-automatically annotate visual and thermal videos.
Sensing systems need not only track objects, but also detect them. The fourth contribution of this thesis strives to tackle joint detection, tracking, and segmentation of all objects from a predefined set of object classes. The challenge here lies not only in the neural network design, but also in the design of the offline training formulation. The final approach, a recurrent graph neural network, outperforms prior works that have a runtime of the same order of magnitude.
Last, this thesis studies dynamic learning of novel visual concepts. It is observed that the learning mechanism used for object tracking essentially learns the appearance of the tracked object. It is natural to ask whether this appearance learning could be extended beyond individual objects to entire semantic classes, enabling the system to learn new concepts based on just a few training examples. Such an ability is desirable in autonomous systems as it removes the need to manually annotate thousands of examples of each class that needs recognition. Instead, the system is trained to efficiently learn to recognize new classes. In the fifth contribution, we propose a novel learning mechanism based on Gaussian process regression. With this mechanism, our neural network outperforms the state-of-the-art, and the performance gap is especially large when multiple training examples are given.
To summarize, this thesis studies and makes several contributions to learning systems that parse dynamic visuals and that dynamically learn visual appearances or concepts.
@phdthesis{diva2:1616651,
author = {Johnander, Joakim},
title = {{Dynamic Visual Learning}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2196}},
year = {2022},
address = {Sweden},
}
In computer vision, the aim is to model and extract high-level information from visual sensor measurements such as images, videos and 3D points. Since visual data is often high-dimensional, noisy and irregular, achieving robust data modeling is challenging. This thesis presents works that address challenges within a number of different computer vision problems.
First, the thesis addresses the problem of phase unwrapping for multi-frequency amplitude modulated time-of-flight (ToF) ranging. ToF is used in depth cameras, which have many applications in 3D reconstruction and gesture recognition. While amplitude modulation in time-of-flight ranging can provide accurate measurements for the depth, it also causes depth ambiguities. This thesis presents a method to resolve the ambiguities by estimating the likelihoods of different hypotheses for the depth values. This is achieved by performing kernel density estimation over the hypotheses in a spatial neighborhood of each pixel in the depth image. The depth hypothesis with the highest estimated likelihood can then be selected as the output depth. This approach yields improvements in the quality of the depth images and extends the effective range in both indoor and outdoor environments.
Next, point set registration is investigated, which is the problem of aligning point sets from overlapping depth images or 3D models. Robust registration is fundamental to many vision tasks, such as multi-view 3D reconstruction and object pose estimation for robotics. The thesis presents a method for handling density variations in the measured point sets. This is achieved by modeling a latent distribution representing the underlying structure of the scene. Both the model of the scene and the registration parameters are inferred in an Expectation-Maximization based framework. Secondly, the thesis introduces a method for integrating features from deep neural networks into the registration model. It is shown that the deep features improve registration performance in terms of accuracy and robustness. Additionally, improved feature representations are generated by training the deep neural network end-to-end by minimizing registration errors produced by our registration model.
Further, an approach for 3D point set segmentation is presented. As scene models are often represented using 3D point measurements, segmentation of these is important for general scene understanding. Learning models for segmentation requires a significant amount of annotated data, which is expensive and time-consuming to acquire. The approach presented in the thesis circumvents this by projecting the points into virtual camera views and rendering 2D images. The method can then exploit accurate convolutional neural networks for image segmentation and map the segmentation predictions back to the 3D points. This also allows for transfer learning using available annotated image data, thereby reducing the need for 3D annotations.
Finally, the thesis explores the problem of video object segmentation (VOS), where the task is to track and segment target objects in each frame of a video sequence. Accurate VOS requires a robust model of the target that can adapt to different scenarios and objects. This needs to be achieved using only a single labeled reference frame as training data for each video sequence. To address the challenges in VOS, the thesis introduces a parametric target model, optimized to predict a target label derived from the mask annotation. The target model is integrated into a deep neural network, where its predictions guide a decoder module to produce target segmentation masks. The deep network is trained on labeled video data to output accurate segmentation masks for each frame. Further, it is shown that by training the entire network model in an end-to-end manner, it can learn a representation of the target that provides increased segmentation accuracy.
@phdthesis{diva2:1559711,
author = {Järemo Lawin, Felix},
title = {{Learning Representations for Segmentation and Registration}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2151}},
year = {2021},
address = {Sweden},
}
Early computer vision algorithms operated on dense 2D images captured using conventional monocular or color sensors. Those sensors are passive by nature, provide limited scene representations based on reflected light, and are only able to operate under adequate lighting conditions. These limitations hindered the development of many computer vision algorithms that require some knowledge of the scene structure under varying conditions. The emergence of active sensors such as Time-of-Flight (ToF) cameras contributed to mitigating these limitations; however, they gave rise to many novel challenges, such as data sparsity that stems from multi-path interference, and occlusion.
Many approaches have been proposed to alleviate these challenges by enhancing the acquisition process of ToF cameras or by post-processing their output. Nonetheless, these approaches are sensor and model specific, requiring an individual tuning for each sensor. Alternatively, learning-based approaches, i.e., machine learning, are an attractive solution to these problems by learning a mapping from the original sensor output to a refined version of it. Convolutional Neural Networks (CNNs) are one example of powerful machine learning approaches and they have demonstrated a remarkable success on many computer vision tasks. Unfortunately, CNNs naturally operate on dense data and cannot efficiently handle sparse data from ToF sensors.
In this thesis, we propose a novel variation of CNNs denoted as the Normalized Convolutional Neural Networks that can directly handle sparse data very efficiently. First, we formulate a differentiable normalized convolution layer that takes in sparse data and a confidence map as input. The confidence map provides information about valid and missing pixels to the normalized convolution layer, where the missing values are interpolated from their valid vicinity. Afterwards, we propose a confidence propagation criterion that allows building cascades of normalized convolution layers similar to the standard CNNs. We evaluated our approach on the task of unguided scene depth completion and achieved state-of-the-art results using an exceptionally small network.
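A single normalized-convolution step can be written compactly: the data, weighted by its confidence, is convolved with a non-negative applicability kernel and normalized by the convolved confidence. The sketch below (plain NumPy/SciPy, binomial applicability kernel, simple propagated confidence) is only a minimal illustration of this idea; the trainable, cascaded layers in the thesis differ.

import numpy as np
from scipy.signal import convolve2d

def normalized_convolution(sparse, confidence, kernel, eps=1e-8):
    # One normalized-convolution step: missing values are interpolated from
    # their valid neighborhood, weighted by the confidence map.
    # sparse:     (H, W) data with zeros (or anything) at missing pixels
    # confidence: (H, W) map, e.g. 1 where a measurement exists, 0 elsewhere
    # kernel:     (k, k) non-negative applicability function
    num = convolve2d(sparse * confidence, kernel, mode="same")
    den = convolve2d(confidence, kernel, mode="same")
    dense = num / (den + eps)
    # A simple propagated confidence: normalized sum of confidences under the kernel.
    new_conf = den / kernel.sum()
    return dense, new_conf

# Example: 5x5 binomial applicability on a random, very sparse depth map.
k = np.outer([1, 4, 6, 4, 1], [1, 4, 6, 4, 1]).astype(float)
depth = np.random.rand(64, 64)
conf = (np.random.rand(64, 64) > 0.95).astype(float)  # roughly 5% valid pixels
dense, conf_out = normalized_convolution(depth * conf, conf, k)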
As a second contribution, we investigated the fusion of a normalized convolution network with standard CNNs employing RGB images. We study different fusion schemes, and we provide a thorough analysis for different components of the network. By employing our best fusion strategy, we achieve state-of-the-art results on guided depth completion using a remarkably small network.
Thirdly, to provide a statistical interpretation for confidences, we derive a probabilistic framework for the normalized convolutional neural networks. This framework estimates the input confidence in a self-supervised manner and propagates it to provide a statistically valid output confidence. When compared against existing approaches for uncertainty estimation in CNNs such as Bayesian Deep Learning, our probabilistic framework provides a higher quality measure of uncertainty at a significantly lower computational cost.
Finally, we attempt to employ our framework in a common task in CNNs, namely upsampling. We formulate the upsampling problem as a sparse problem, and we employ the normalized convolutional neural networks to solve it. In comparison to existing approaches, our proposed upsampler is structure-aware while being light-weight. We test our upsampler with various optical flow estimation networks, and we show that it consistently improves the results. When integrated with a recent optical flow network, it sets a new state-of-the-art on the most challenging optical flow dataset.
@phdthesis{diva2:1547851,
author = {Eldesokey, Abdelrahman},
title = {{Uncertainty-Aware Convolutional Neural Networks for Vision Tasks on Sparse Data}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2123}},
year = {2021},
address = {Sweden},
}
In less than ten years, deep neural networks have evolved into all-encompassing tools in multiple areas of science and engineering, due to their almost unreasonable effectiveness in modeling complex real-world relationships. In computer vision in particular, they have taken tasks such as object recognition, that were previously considered very difficult, and transformed them into everyday practical tools. However, neural networks have to be trained with supercomputers on massive datasets for hours or days, and this limits their ability to adjust to changing conditions.
This thesis explores discriminative correlation filters, originally intended for tracking large objects in video, so-called visual object tracking. Unlike neural networks, these filters are small and can be quickly adapted to changes, with minimal data and computing power. At the same time, they can take advantage of the computing infrastructure developed for neural networks and operate within them.
The main contributions in this thesis demonstrate the versatility and adaptability of correlation filters for various problems, while complementing the capabilities of deep neural networks. In the first problem, it is shown that when adapted to track small regions and points, they outperform the widely used Lucas-Kanade method, both in terms of robustness and precision.
In the second problem, the correlation filters take on a completely new task. Here, they are used to tell different places apart, in a 16 by 16 kilometer region of ocean near land. Given only a horizon profile - the coastline silhouette of islands and islets as seen from an ocean vessel - it is demonstrated that discriminative correlation filters can effectively distinguish between locations.
In the third problem, it is shown how correlation filters can be applied to video object segmentation. This is the task of classifying individual pixels as belonging either to a target or the background, given a segmentation mask provided with the first video frame as the only guidance. It is also shown that discriminative correlation filters and deep neural networks complement each other; where the neural network processes the input video in a content-agnostic way, the filters adapt to specific target objects. The joint function is a real-time video object segmentation method.
Finally, the segmentation method is extended beyond binary target/background classification to additionally consider distracting objects. This addresses the fundamental difficulty of coping with objects of similar appearance.
@phdthesis{diva2:1545394,
author = {Robinson, Andreas},
title = {{Discriminative correlation filters in robot vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2146}},
year = {2021},
address = {Sweden},
}
In the last decade, developments in hardware, sensors and software have made it possible to create increasingly autonomous systems. These systems can be as simple as limited driver-assistance software for lane-following in cars, or limited collision-warning systems for otherwise manually piloted drones. On the other end of the spectrum exist fully autonomous cars, boats or helicopters. With increasing abilities to function autonomously, the demands to operate with minimal human supervision in unstructured environments increase accordingly.
Common to most, if not all, autonomous systems is that they require an accurate model of the surrounding world. While a large number of sensors useful for creating such models is currently available, cameras are among the most versatile. From a sensing perspective, cameras have several advantages over other sensors: they require no external infrastructure, are relatively cheap, and can be used to extract information such as the relative positions of other objects and their movements over time, to create accurate maps, and to locate the autonomous system within these maps.
Using cameras to produce a model of the surroundings requires solving a number of technical problems. Often these problems have a basis in recognizing that an object or region of interest is the same over time or in novel viewpoints. In visual tracking this type of recognition is required to follow an object of interest through a sequence of images. In geometric problems it is often a requirement to recognize corresponding image regions in order to perform 3D reconstruction or localization.
The first set of contributions in this thesis is related to the improvement of a class of on-line learned visual object trackers based on discriminative correlation filters. In visual tracking, estimation of the object's size is important for reliable tracking; the first contribution in this part of the thesis investigates this problem. The performance of discriminative correlation filters is highly dependent on what feature representation is used by the filter. The second tracking contribution investigates the performance impact of different features derived from a deep neural network.
A second set of contributions relates to the evaluation of visual object trackers. The first of these is the visual object tracking challenge, a yearly comparison of state-of-the-art visual tracking algorithms. A second contribution is an investigation into the possible issues when using bounding-box representations for ground-truth data.
In real-world settings, tracking typically occurs over longer time sequences than is common in benchmarking datasets. In such settings it is common that the model updates of many tracking algorithms cause the tracker to fail silently. For this reason it is important to have an estimate of the tracker's performance even in cases when no ground-truth annotations exist. The first of the final three contributions investigates this problem in a robotics setting, by fusing information from a pre-trained object detector in a state-estimation framework. An additional contribution describes how to dynamically re-weight the data used for the appearance model of a tracker. A final contribution investigates how to obtain an estimate of how certain detections are in a setting where geometrical limitations can be imposed on the search region. The proposed solution learns to accurately predict stereo disparities along with accurate assessments of each prediction's certainty.
@phdthesis{diva2:1545918,
author = {Häger, Gustav},
title = {{Learning visual perception for autonomous systems}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2138}},
year = {2021},
address = {Sweden},
}
Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.
This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations of processing required due to the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All these are research areas that have been and currently are subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications.
The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution also related to benchmarking is a novel, recursive, method for semi-automatic annotation of multi-modal video sequences. Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames, and difficult parts in need of additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations.
The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation-to-latent-space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution is the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously, and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heat leakages for the purpose of false alarm reduction.
Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function, the method becomes robust to image pair misalignments. The method exploits the human visual system's lower acuity for color differences than for luminance, separating the loss into a luminance and a chrominance part.
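The luminance/chrominance separation can be illustrated with a small sketch that converts RGB to Y'CbCr (standard BT.601 matrix) and computes separate L1 terms. The relative weighting of the two terms is an assumption for the example, not the loss used in the thesis.

import numpy as np

# BT.601 RGB -> Y'CbCr conversion matrix (RGB values assumed in [0, 1]).
_M = np.array([[ 0.299,  0.587,  0.114],
               [-0.169, -0.331,  0.500],
               [ 0.500, -0.419, -0.081]])

def rgb_to_ycbcr(img):
    ycc = img @ _M.T
    ycc[..., 1:] += 0.5  # shift chroma channels to [0, 1]
    return ycc

def lum_chrom_loss(pred_rgb, target_rgb, w_lum=1.0, w_chrom=0.5):
    # L1 loss split into a luminance term and a chrominance term.
    p, t = rgb_to_ycbcr(pred_rgb), rgb_to_ycbcr(target_rgb)
    lum = np.abs(p[..., 0] - t[..., 0]).mean()
    chrom = np.abs(p[..., 1:] - t[..., 1:]).mean()
    return w_lum * lum + w_chrom * chrom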
@phdthesis{diva2:1365154,
author = {Berg, Amanda},
title = {{Learning to Analyze what is Beyond the Visible Spectrum}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 2024}},
year = {2019},
address = {Sweden},
}
Over the last decade, the usage of unmanned systems such as Unmanned Aerial Vehicles (UAVs), Unmanned Surface Vessels (USVs) and Unmanned Ground Vehicles (UGVs) has increased drastically, and there is still a rapid growth. Today, unmanned systems are being deployed in many daily operations, e.g. for deliveries in remote areas, to increase efficiency of agriculture, and for environmental monitoring at sea. For safety reasons, unmanned systems are often the preferred choice for surveillance missions in hazardous environments, e.g. for detection of nuclear radiation, and in disaster areas after earthquakes, hurricanes, or during forest fires. For safe navigation of the unmanned systems during their missions, continuous and accurate global localization and attitude estimation is mandatory.
Over the years, many vision-based methods for position estimation have been developed, primarily for urban areas. In contrast, this thesis is mainly focused on vision-based methods for accurate position and attitude estimates in natural environments, i.e. beyond the urban areas. Vision-based methods possess several characteristics that make them appealing as global position and attitude sensors. First, vision sensors can be realized and tailored for most unmanned vehicle applications. Second, geo-referenced terrain models can be generated worldwide from satellite imagery and can be stored onboard the vehicles. In natural environments, where the availability of geo-referenced images in general is low, registration of image information with terrain models is the natural choice for position and attitude estimation. This is the problem area that I addressed in the contributions of this thesis.
The first contribution is a method for full 6DoF (degrees of freedom) pose estimation from aerial images. A dense local height map is computed using structure from motion. The global pose is inferred from the 3D similarity transform between the local height map and a digital elevation model. Aligning height information is assumed to be more robust to season variations than feature-based matching.
The second contribution is a method for accurate attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. The method allows prior knowledge of the attitude angles to be exploited to make the initial attitude estimates more robust. The estimates are then refined through registration with the geometrically expected horizon line from a digital elevation model. To the best of our knowledge, it is the first method where the ray refraction in the atmosphere is taken into account, which enables highly accurate attitude estimates.
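To make the edge-detection-plus-Hough-voting idea concrete, here is a plain OpenCV sketch that finds the dominant straight line in a grayscale image (the file name is hypothetical). It ignores the fisheye geometry, the probabilistic attitude prior, and the atmospheric refraction that the actual method accounts for.

import cv2
import numpy as np

def dominant_line(gray):
    # Return (rho, theta) of the strongest straight line found by Hough voting.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)
    if lines is None:
        return None
    rho, theta = lines[0][0]  # first returned line (highest-voted in OpenCV)
    return rho, theta

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical example image
if img is not None:
    print(dominant_line(img))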
The third contribution is a method for position estimation based on horizon detection in an omnidirectional panoramic image around a surface vessel. Two convolutional neural networks (CNNs) are designed and trained to estimate the camera orientation and to segment the horizon line in the image. The MOSSE correlation filter, normally used in visual object tracking, is adapted to horizon line registration with geometric data from a digital elevation model. Comprehensive field trials conducted in the archipelago demonstrate the GPS-level accuracy of the method, and that the method can be trained on images from one region and then applied to images from a previously unvisited test area.
The CNNs in the third contribution apply the typical scheme of convolutions, activations, and pooling. The fourth contribution focuses on the activations and suggests a new formulation to tune and optimize a piecewise linear activation function during training of CNNs. Improved classification results from experiments when tuning the activation function led to the introduction of a new activation function, the Shifted Exponential Linear Unit (ShELU).
@phdthesis{diva2:1303454,
author = {Grelsson, Bertil},
title = {{Vision-based Localization and Attitude Estimation Methods in Natural Environments}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1977}},
year = {2019},
address = {Sweden},
}
Using images to reconstruct the world in three dimensions is a classical computer vision task. Some examples of applications where this is useful are autonomous mapping and navigation, urban planning, and special effects in movies. One common approach to 3D reconstruction is ”structure from motion” where a scene is imaged multiple times from different positions, e.g. by moving the camera. However, in a twist of irony, many structure from motion methods work best when the camera is stationary while the image is captured. This is because the motion of the camera can cause distortions in the image that lead to worse image measurements, and thus a worse reconstruction. One such distortion common to all cameras is motion blur, while another is connected to the use of an electronic rolling shutter. Instead of capturing all pixels of the image at once, a camera with a rolling shutter captures the image row by row. If the camera is moving while the image is captured the rolling shutter causes non-rigid distortions in the image that, unless handled, can severely impact the reconstruction quality.
This thesis studies methods to robustly perform 3D reconstruction in the case of a moving camera. To do so, the proposed methods make use of an inertial measurement unit (IMU). The IMU measures the angular velocities and linear accelerations of the camera, and these can be used to estimate the trajectory of the camera over time. Knowledge of the camera motion can then be used to correct for the distortions caused by the rolling shutter. Another benefit of an IMU is that it can provide measurements also in situations when a camera can not, e.g. because of excessive motion blur, or absence of scene structure.
To use a camera together with an IMU, the camera-IMU system must be jointly calibrated. The relationship between their respective coordinate frames need to be established, and their timings need to be synchronized. This thesis shows how to automatically perform this calibration and synchronization, without requiring e.g. calibration objects or special motion patterns.
In standard structure from motion, the camera trajectory is modeled as discrete poses, with one pose per image. Switching instead to a formulation with a continuous-time camera trajectory provides a natural way to handle rolling shutter distortions, and also to incorporate inertial measurements. To model the continuous-time trajectory, many authors have used splines. The ability of a spline-based trajectory to model the real motion depends on the density of its spline knots. Choosing too smooth a spline results in approximation errors. This thesis proposes a method to estimate the spline approximation error, and use it to better balance camera and IMU measurements, when used in a sensor fusion framework. Also proposed is a way to automatically decide how dense the spline needs to be to achieve a good reconstruction.
Another approach to reconstruct a 3D scene is to use a camera that directly measures depth. Some depth cameras, like the well-known Microsoft Kinect, are susceptible to the same rolling shutter effects as normal cameras. This thesis quantifies the effect of the rolling shutter distortion on 3D reconstruction, depending on the amount of motion. It is also shown that a better 3D model is obtained if the depth images are corrected using inertial measurements.
@phdthesis{diva2:1220622,
author = {Ovr\'{e}n, Hannes},
title = {{Continuous Models for Cameras and Inertial Sensors}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1951}},
year = {2018},
address = {Sweden},
}
Visual tracking is one of the fundamental problems in computer vision. Its numerous applications include robotics, autonomous driving, augmented reality and 3D reconstruction. In essence, visual tracking can be described as the problem of estimating the trajectory of a target in a sequence of images. The target can be any image region or object of interest. While humans excel at this task, requiring little effort to perform accurate and robust visual tracking, it has proven difficult to automate. It has therefore remained one of the most active research topics in computer vision.
In its most general form, no prior knowledge about the object of interest or environment is given, except for the initial target location. This general form of tracking is known as generic visual tracking. The unconstrained nature of this problem makes it particularly difficult, yet applicable to a wider range of scenarios. As no prior knowledge is given, the tracker must learn an appearance model of the target on-the-fly. Cast as a machine learning problem, it imposes several major challenges which are addressed in this thesis.
The main purpose of this thesis is the study and advancement of the, so called, Discriminative Correlation Filter (DCF) framework, as it has shown to be particularly suitable for the tracking application. By utilizing properties of the Fourier transform, a correlation filter is discriminatively learned by efficiently minimizing a least-squares objective. The resulting filter is then applied to a new image in order to estimate the target location.
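The core DCF learning step alluded to above can be written in a few lines: with a desired Gaussian response, the least-squares filter has a closed-form solution in the Fourier domain (the classic single-channel MOSSE formulation). This sketch is only the starting point; the contributions summarized below (spatial regularization, sample management, continuous-space formulation) extend it substantially.

import numpy as np

def learn_dcf(patch, sigma=2.0, lam=1e-2):
    # Closed-form single-channel correlation filter (MOSSE-style).
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Desired Gaussian response centered on the target.
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    G, F = np.fft.fft2(g), np.fft.fft2(patch)
    H_conj = (G * np.conj(F)) / (F * np.conj(F) + lam)  # filter in the Fourier domain
    return H_conj

def detect(H_conj, patch):
    # Correlate the filter with a new patch and return the response peak location.
    resp = np.real(np.fft.ifft2(H_conj * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(resp), resp.shape)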
This thesis contributes to the advancement of the DCF methodology in several aspects. The main contribution regards the learning of the appearance model: First, the problem of updating the appearance model with new training samples is covered. Efficient update rules and numerical solvers are investigated for this task. Second, the periodic assumption induced by the circular convolution in DCF is countered by proposing a spatial regularization component. Third, an adaptive model of the training set is proposed to alleviate the impact of corrupted or mislabeled training samples. Fourth, a continuous-space formulation of the DCF is introduced, enabling the fusion of multiresolution features and sub-pixel accurate predictions. Finally, the problems of computational complexity and overfitting are addressed by investigating dimensionality reduction techniques.
As a second contribution, different feature representations for tracking are investigated. A particular focus is put on the analysis of color features, which had been largely overlooked in prior tracking research. This thesis also studies the use of deep features in DCF-based tracking. While many vision problems have greatly benefited from the advent of deep learning, it has proven difficult to harvest the power of such representations for tracking. In this thesis it is shown that both shallow and deep layers contribute positively. Furthermore, the problem of fusing their complementary properties is investigated.
The final major contribution of this thesis regards the prediction of the target scale. In many applications, it is essential to track the scale, or size, of the target since it is strongly related to the relative distance. A thorough analysis of how to integrate scale estimation into the DCF framework is performed. A one-dimensional scale filter is proposed, enabling efficient and accurate scale estimation.
@phdthesis{diva2:1201230,
author = {Danelljan, Martin},
title = {{Learning Convolution Operators for Visual Tracking}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1926}},
year = {2018},
address = {Sweden},
}
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. This is especially challenging due to the high dimensionality of image data. In cases where end-to-end learning from pixels to output is needed, mechanisms designed to make inputs tractable are often necessary for less computationally capable embodied systems.
Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. Therefore, the way in which attention mechanisms should be introduced into feature extraction and estimation algorithms must be carefully considered when constructing a recognition system.
This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, problem-specific feature selection, efficient estimator training and attentional modulation in convolutional neural networks. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations. Also, in order to separate a specific object from other parts of a scene, it is often necessary to combine multiple cues such as colour and depth in order to obtain satisfactory results. Therefore, a method for combining these using channel coding has been evaluated. In order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. Feature selection and efficient discriminant sampling for decision tree-based estimators have also been implemented. Finally, attentional multi-layer modulation of convolutional neural networks for recognition in cluttered scenes has been evaluated. Several of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied.
@phdthesis{diva2:1049161,
author = {Wallenberg, Marcus},
title = {{Embodied Visual Object Recognition}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1811}},
year = {2017},
address = {Sweden},
}
Driver assistance systems in modern cars now show clear steps towards autonomous driving, and improvements are presented at a steady pace. The total number of sensors has also decreased compared to the vehicles of the initial DARPA challenge, which more resembled a pile of sensors with a car underneath. Still, anyone driving a tele-operated toy using a video link is a demonstration that a single camera provides enough information about the surrounding world.
Most lane assist systems are developed for highway use and depend on visible lane markers. However, lane markers may not be visible due to snow or wear, and there are roads without lane markers. With a slightly different approach, autonomous road following can be obtained on almost any kind of road. Using realtime online machine learning, a human driver can demonstrate driving on a road type unknown to the system and after some training, the system can seamlessly take over. The demonstrator system presented in this work has shown capability of learning to follow different types of roads as well as learning to follow a person. The system is based solely on vision, mapping camera images directly to control signals.
Such systems need the ability to handle multiple-hypothesis outputs as there may be several plausible options in similar situations. If there is an obstacle in the middle of the road, the obstacle can be avoided by going on either side. However the average action, going straight ahead, is not a viable option. Similarly, at an intersection, the system should follow one road, not the average of all roads.
To this end, an online machine learning framework is presented where inputs and outputs are represented using the channel representation. The learning system is structurally simple and computationally light, based on neuropsychological ideas presented by Donald Hebb over 60 years ago. Nonetheless, the system has shown a capability to learn advanced tasks. Furthermore, the structure of the system permits a statistical interpretation where a non-parametric representation of the joint distribution of input and output is generated. Prediction generates the conditional distribution of the output, given the input.
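For readers unfamiliar with the channel representation, the sketch below encodes a scalar with overlapping cos^2 basis functions, which is one common choice of channel basis; the number of channels and the value range are arbitrary example parameters.

import numpy as np

def channel_encode(x, n_channels=8, lo=0.0, hi=1.0):
    # Encode a scalar x in [lo, hi] as a vector of overlapping cos^2 channels.
    # Basis: B(d) = cos^2(pi*d/3) for |d| < 1.5, zero otherwise (unit channel spacing).
    centers = np.arange(1, n_channels + 1)
    xc = 1.0 + (x - lo) / (hi - lo) * (n_channels - 1)  # map x to channel coordinates
    d = xc - centers
    return np.where(np.abs(d) < 1.5, np.cos(np.pi * d / 3.0) ** 2, 0.0)

print(channel_encode(0.37))  # soft, sparse encoding: at most three non-zero channels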
The statistical interpretation motivates the introduction of priors. In cases with multiple options, such as at intersections, a prior can select one mode in the multimodal distribution of possible actions. In addition to the ability to learn from demonstration, a possibility for immediate reinforcement feedback is presented. This allows for a system where the teacher can choose the most appropriate way of training the system, at any time and at her own discretion.
The theoretical contributions include a deeper analysis of the channel representation. A geometrical analysis illustrates the cause of decoding bias commonly present in neurologically inspired representations, and measures to counteract it. Confidence values are analyzed and interpreted as evidence and coherence. Further, the use of the truncated cosine basis function is motivated.
Finally, a selection of applications is presented, such as autonomous road following by online learning and head pose estimation. A method founded on the same basic principles is used for visual tracking, where the probabilistic representation of target pixel values allows for changes in target appearance.
@phdthesis{diva2:916645,
author = {Öfjäll, Kristoffer},
title = {{Adaptive Supervision Online Learning for Vision Based Autonomous Systems}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1749}},
year = {2016},
address = {Sweden},
}
Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.
This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been and currently are subject to extensive research. Indications of their popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems.
A common belief is that detection and tracking in thermal infrared imagery is identical to detection and tracking in grayscale visual imagery. This thesis argues that the preceding allegation is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis.
Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g. template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated. That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term, single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.
@phdthesis{diva2:918038,
author = {Berg, Amanda},
title = {{Detection and Tracking in Thermal Infrared Imagery}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1744}},
year = {2016},
address = {Sweden},
}
This dissertation addresses the problem of adaptive image filtering.
Although the topic has a long history in the image processing community, researchers continuously present novel methods to obtain ever better image restoration results.
With an expanding market for individuals who wish to share their everyday life on social media, imaging devices such as compact cameras and smartphones are important factors. Naturally, every producer of imaging equipment desires to exploit cheap camera components while supplying high-quality images. One step in this pipeline is to use sophisticated imaging software, including, e.g., noise reduction, to reduce manufacturing costs while maintaining image quality.
This thesis is based on traditional formulations such as isotropic and tensor-based anisotropic diffusion for image denoising. The difference from mainstream denoising methods is that this thesis explores the effects of introducing contextual information as prior knowledge for image denoising into the filtering schemes. To achieve this, the adaptive filtering theory is formulated from an energy minimization standpoint. The core contribution of this work is the introduction of a novel tensor-based functional which unifies and generalises standard diffusion methods. Additionally, the explicit Euler-Lagrange equation is derived which, if solved, yields the stationary point for the minimization problem. Several aspects of the functional are presented in detail, including, but not limited to, tensor symmetry constraints and convexity. Also, the classical problem of finding a variational formulation to a given tensor-based partial differential equation is studied.
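As background for the diffusion formulations discussed here, the sketch below implements a standard Perona-Malik style isotropic nonlinear diffusion step (periodic boundaries and an exponential edge-stopping function chosen for the example). The tensor-based functional introduced in the thesis generalizes this scheme and is not reproduced here.

import numpy as np

def nonlinear_diffusion(img, n_iter=50, kappa=0.1, dt=0.2):
    # Perona-Malik style diffusion: smooth homogeneous regions, preserve edges.
    u = img.astype(float).copy()
    g = lambda d: np.exp(-(d / kappa) ** 2)  # edge-stopping function g(|grad|)
    for _ in range(n_iter):
        # Differences towards the four neighbours (np.roll gives periodic boundaries).
        dn = np.roll(u, -1, 0) - u
        ds = np.roll(u, 1, 0) - u
        de = np.roll(u, -1, 1) - u
        dw = np.roll(u, 1, 1) - u
        u += dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u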
The presented framework is applied to problem formulations that include non-linear domain transformations, e.g., visualization of medical images.
Additionally, the framework is also used to exploit locally estimated probability density functions or the channel representation to drive the filtering process.
Furthermore, one of the first truly tensor-based formulations of total variation is presented. The key to the formulation is the gradient energy tensor, which does not require spatial regularization of its tensor components. It is shown empirically in several computer vision applications, such as corner detection and optical flow, that the gradient energy tensor is a viable replacement for the commonly used structure tensor. Moreover, the gradient energy tensor is used in the traditional tensor-based anisotropic diffusion scheme. This approach results in significant improvements in computational speed when the scheme is implemented on a graphical processing unit compared to using the commonly used structure tensor.
@phdthesis{diva2:789680,
author = {Åström, Freddie},
title = {{Variational Tensor-Based Models for Image Diffusion in Non-Linear Domains}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1646}},
year = {2015},
address = {Sweden},
}
In tele-operated robotics applications, the primary information channel from the robot to its human operator is a video stream. For autonomous robotic systems however, a much larger selection of sensors is employed, although the most relevant information for the operation of the robot is still available in a single video stream. The issue lies in autonomously interpreting the visual data and extracting the relevant information, something humans and animals perform strikingly well. On the other hand, humans have great difficulty expressing what they are actually looking for on a low level, suitable for direct implementation on a machine. For instance objects tend to be already detected when the visual information reaches the conscious mind, with almost no clues remaining regarding how the object was identified in the first place. This became apparent already when Seymour Papert gathered a group of summer workers to solve the computer vision problem 48 years ago [35].
Artificial learning systems can overcome this gap between the level of human visual reasoning and low-level machine vision processing. If a human teacher can provide examples of what is to be extracted and if the learning system is able to extract the gist of these examples, the gap is bridged. There are however some special demands on a learning system for it to perform successfully in a visual context. First, low level visual input is often of high dimensionality such that the learning system needs to handle large inputs. Second, visual information is often ambiguous such that the learning system needs to be able to handle multi modal outputs, i.e. multiple hypotheses. Typically, the relations to be learned are non-linear and there is an advantage if data can be processed at video rate, even after presenting many examples to the learning system. In general, there seems to be a lack of such methods.
This thesis presents systems for learning perception-action mappings for robotic systems with visual input. A range of problems are discussed, such as vision based autonomous driving, inverse kinematics of a robotic manipulator and controlling a dynamical system. Operational systems demonstrating solutions to these problems are presented. Two different approaches for providing training data are explored, learning from demonstration (supervised learning) and explorative learning (self-supervised learning). A novel learning method fulfilling the stated demands is presented. The method, qHebb, is based on associative Hebbian learning on data in channel representation. Properties of the method are demonstrated on a vision-based autonomously driving vehicle, where the system learns to directly map low-level image features to control signals. After an initial training period, the system seamlessly continues autonomously. In a quantitative evaluation, the proposed online learning method performed comparably with state of the art batch learning methods.
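A minimal sketch of the general structure, associative Hebbian learning between channel-encoded input and output vectors via an outer-product update of a linkage matrix, is given below. The actual qHebb update rule differs; this is only meant to convey the flavour of the approach.

import numpy as np

class HebbianAssociator:
    # Associative mapping from input channel vectors to output channel vectors.
    def __init__(self, n_in, n_out, lr=0.05):
        self.C = np.zeros((n_out, n_in))  # linkage matrix
        self.lr = lr

    def train(self, a_in, a_out):
        # Hebbian outer-product update: strengthen co-active channel pairs.
        self.C += self.lr * np.outer(a_out, a_in)

    def predict(self, a_in):
        # The product approximates an (unnormalized) conditional output distribution.
        resp = self.C @ a_in
        return resp / (resp.sum() + 1e-12)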
@phdthesis{diva2:750053,
author = {Öfjäll, Kristoffer},
title = {{Online Learning for Robot Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1678}},
year = {2014},
address = {Sweden},
}
Almost all cell-phones and camcorders sold today are equipped with a CMOS (Complementary Metal Oxide Semiconductor) image sensor and there is also a general trend to incorporate CMOS sensors in other types of cameras. The CMOS sensor has many advantages over the more conventional CCD (Charge-Coupled Device) sensor such as lower power consumption, cheaper manufacturing and the potential for on-chip processing. Nearly all CMOS sensors make use of what is called a rolling shutter readout. Unlike a global shutter readout, which images all the pixels at the same time, a rolling-shutter exposes the image row-by-row. If a mechanical shutter is not used this will lead to geometric distortions in the image when either the camera or the objects in the scene are moving. Smaller cameras, like those in cell-phones, do not have mechanical shutters and systems that do have them will not use them when recording video. The result will look wobbly (jello effect), skewed or otherwise strange and this is often not desirable. In addition, many computer vision algorithms assume that the camera used has a global shutter and will break down if the distortions are too severe.
In airborne remote sensing it is common to use push-broom sensors. These sensors exhibit a similar kind of distortion as that of a rolling-shutter camera, due to the motion of the aircraft. If the acquired images are to be registered to maps or other images, the distortions need to be suppressed.
The main contributions in this thesis are the development of the three-dimensional models for rolling-shutter distortion correction. Previous attempts modelled the distortions as taking place in the image plane, and we have shown that our techniques give better results for hand-held camera motions. The basic idea is to estimate the camera motion, not only between frames, but also the motion during frame capture. The motion is estimated using image correspondences and with these a non-linear optimisation problem is formulated and solved. All rows in the rolling-shutter image are imaged at different times, and when the motion is known, each row can be transformed to its rectified position. The same is true when using depth sensors such as the Microsoft Kinect, and the thesis describes how to estimate its 3D motion and how to rectify 3D point clouds.
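The row-wise rectification idea can be sketched as follows: assuming a pure-rotation camera model, known intrinsics K, and an already estimated rotation per image row, each row is warped to the pose of a reference row through the homography K R_ref R_row^T K^-1. The forward-mapping resampling and nearest-neighbour rounding are simplifications for the example, and the motion estimation itself is omitted.

import numpy as np

def rectify_rolling_shutter(img, K, row_rotations, ref_row=0):
    # img:           (H, W) image captured with a rolling shutter
    # K:             (3, 3) camera intrinsics
    # row_rotations: list of (3, 3) camera rotations, one per image row
    # Each pixel is moved to where it would have been imaged at the reference row's pose.
    H, W = img.shape
    Kinv = np.linalg.inv(K)
    out = np.zeros_like(img)
    R_ref = row_rotations[ref_row]
    for y in range(H):
        # Homography taking pixels of row y to the reference pose (pure rotation model).
        Hy = K @ R_ref @ row_rotations[y].T @ Kinv
        xs = np.stack([np.arange(W), np.full(W, y), np.ones(W)])
        p = Hy @ xs
        u = np.round(p[0] / p[2]).astype(int)
        v = np.round(p[1] / p[2]).astype(int)
        ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
        out[v[ok], u[ok]] = img[y, np.arange(W)[ok]]
    return out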
In the thesis it has also been explored how to use similar techniques as for the rolling-shutter case, to correct push-broom images. When a transformation has been found, the images need to be resampled to a regular grid in order to be visualised. This can be done in many ways and different methods have been tested and adapted to the push-broom setup.
In addition to rolling-shutter distortions, hand-held footage often has shaky camera motion. It is possible to do efficient video stabilisation in combination with the rectification using rotation smoothing. Apart from these distortions, motion blur is a big problem for hand-held photography. The images will be blurry due to the camera motion and also noisy if taken in low light conditions. One of the contributions in the thesis is a method which uses gyroscope measurements and feature tracking to combine several images, taken with a smartphone, into one resulting image with less blur and noise. This enables the user to take photos which would have otherwise required a tripod.
@phdthesis{diva2:742702,
author = {Ringaby, Erik},
title = {{Geometric Models for Rolling-shutter and Push-broom Sensors}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1615}},
year = {2014},
address = {Sweden},
}
Over the last decade, the use of unmanned aerial vehicles (UAVs) has increased drastically. Originally, the use of these aircraft was mainly military, but today many civil applications have emerged. UAVs are frequently the preferred choice for surveillance missions in disaster areas, after earthquakes or hurricanes, and in hazardous environments, e.g. for detection of nuclear radiation. The UAVs employed in these missions are often relatively small in size which implies payload restrictions.
For navigation of the UAVs, continuous global pose (position and attitude) estimation is mandatory. Cameras can be fabricated both small in size and light in weight. This makes vision-based methods well suited for pose estimation onboard these vehicles. It is obvious that no single method can be used for pose estimation in all different phases throughout a flight. The image content will be very different on the runway, during ascent, during flight at low or high altitude, above urban or rural areas, etc. In total, a multitude of pose estimation methods is required to handle all these situations. Over the years, a large number of vision-based pose estimation methods for aerial images have been developed. But there are still open research areas within this field, e.g. the use of omnidirectional images for pose estimation is relatively unexplored.
The contributions of this thesis are three vision-based methods for global ego-positioning and/or attitude estimation from aerial images. The first method for full 6DoF (degrees of freedom) pose estimation is based on registration of local height information with a geo-referenced 3D model. A dense local height map is computed using motion stereo. A pose estimate from navigation sensors is used as an initialization. The global pose is inferred from the 3D similarity transform between the local height map and the 3D model. Aligning height information is assumed to be more robust to season variations than feature matching in a single-view based approach.
The second contribution is a method for attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods in the literature that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some knowledge on the probability density for the altitude and the attitude angles. The proposed method allows this prior information to be used to make the attitude estimation more robust.
The third contribution is a further development of method two. It is the very first method presented where the attitude estimates from the detected horizon in omnidirectional images are refined through registration with the geometrically expected horizon from a digital elevation model. It is one of few methods where the ray refraction in the atmosphere is taken into account, which contributes to the highly accurate pose estimates. The attitude errors obtained are about one order of magnitude smaller than for any previous vision-based method for attitude estimation from horizon detection in aerial images.
@phdthesis{diva2:729563,
author = {Grelsson, Bertil},
title = {{Global Pose Estimation from Aerial Images:
Registration with Elevation Models}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1672}},
year = {2014},
address = {Sweden},
}
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, and the implementation of the system itself. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations. Also, in order to separate a specific object from other parts of a scene, it is often necessary to combine multiple cues such as colour and depth in order to obtain satisfactory results. Therefore, a method for combining these using channel coding has been evaluated. Finally, in order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. All of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied.
@phdthesis{diva2:634701,
author = {Wallenberg, Marcus},
title = {{Components of Embodied Visual Object Recognition:
Object Perception and Learning on a Robotic Platform}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1607}},
year = {2013},
address = {Sweden},
}
Image filtering methods are designed to enhance noisy images captured in situations that are problematic for the camera sensor. Such noisy images originate from unfavourable illumination conditions, camera motion, or the desire to use only a low dose of ionising radiation in medical imaging. Therefore, in this thesis work I have investigated the theory of partial differential equations (PDE) to design filtering methods that attempt to remove noise from images. This is achieved by modeling and deriving energy functionals which in turn are minimized to attain a state of minimum energy. This state is obtained by solving the so-called Euler-Lagrange equation. An important theoretical contribution of this work is that conditions are put forward determining when a PDE has a corresponding energy functional. This is in particular described in the case of the structure tensor, a commonly used tensor in computer vision.
A primary component of this thesis work is to model adaptive image filtering such that any modification of the image is structure preserving, yet noise suppressing. In color image filtering this is a particular challenge, since artifacts may be introduced at color discontinuities. For this purpose a non-Euclidean color opponent transformation has been analysed and used to separate the standard RGB color space into uncorrelated components.
A common approach to achieve adaptive image filtering is to select an edge-stopping function from a set of functions that have proven to work well in the past. The purpose of the edge-stopping function is to inhibit smoothing of image features that are desired to be retained, such as lines, edges or other application-dependent characteristics. Thus, a step is taken from ad-hoc filtering based on experience towards application-driven filtering, such that only desired image features are processed. This improves what is characterised as visually relevant features, a topic which this thesis covers, in particular for medical imaging.
The notion of what constitutes relevant features is subjective, and a layman's opinion may differ from a professional's. Therefore, we advocate that any image filtering method should yield an improvement not only in numerical measures, but also a visual improvement as experienced by the respective end-user.
@phdthesis{diva2:622727,
author = {Åström, Freddie},
title = {{A Variational Approach to Image Diffusion in Non-Linear Domains}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1594}},
year = {2013},
address = {Sweden},
}
Almost all cell-phones and camcorders sold today are equipped with a CMOS (Complementary Metal Oxide Semiconductor) image sensor and there is also a general trend to incorporate CMOS sensors in other types of cameras. The sensor has many advantages over the more conventional CCD (Charge-Coupled Device) sensor such as lower power consumption, cheaper manufacturing and the potential for on-chip processing. Almost all CMOS sensors make use of what is called a rolling shutter. Compared to a global shutter, which images all the pixels at the same time, a rolling-shutter camera exposes the image row-by-row. This leads to geometric distortions in the image when either the camera or the objects in the scene are moving. The recorded videos and images will look wobbly (jello effect), skewed or otherwise strange and this is often not desirable. In addition, many computer vision algorithms assume that the camera used has a global shutter, and will break down if the distortions are too severe.
In airborne remote sensing it is common to use push-broom sensors. These sensors exhibit a similar kind of distortion as a rolling-shutter camera, due to the motion of the aircraft. If the acquired images are to be matched with maps or other images, then the distortions need to be suppressed.
The main contributions in this thesis are the development of three-dimensional models for rolling-shutter distortion correction. Previous attempts modelled the distortions as taking place in the image plane, and we have shown that our techniques give better results for hand-held camera motions.
The basic idea is to estimate the camera motion, not only between frames, but also the motion during frame capture. The motion can be estimated using inter-frame image correspondences and with these a non-linear optimisation problem can be formulated and solved. All rows in the rolling-shutter image are imaged at different times, and when the motion is known, each row can be transformed to the rectified position.
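A toy illustration of the per-row resampling idea, assuming a purely horizontal, constant camera motion and a hypothetical row_shift_last parameter (the thesis estimates full 3D camera motion; this sketch only shows how each row can be resampled to a common time instant):

import numpy as np

def rectify_rows(frame, row_shift_last):
    # row_shift_last: how far (pixels, to the right) the scene content drifts
    # between the exposure of the first and the last row of one frame
    rows, cols = frame.shape
    out = np.empty_like(frame, dtype=float)
    xs = np.arange(cols)
    for r in range(rows):
        drift = row_shift_last * r / (rows - 1)       # drift at this row's capture time
        out[r] = np.interp(xs + drift, xs, frame[r])  # resample the row back to the first row's time
    return out

frame = np.tile(np.sin(np.linspace(0, 8 * np.pi, 256)), (128, 1))  # hypothetical distorted frame
rectified = rectify_rows(frame, row_shift_last=5.0)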
In addition to rolling-shutter distortions, hand-held footage often has shaky camera motion. It has been shown how to do efficient video stabilisation, in combination with the rectification, using rotation smoothing.
In the thesis it has been explored how to use similar techniques as for the rolling-shutter case in order to correct push-broom images, and also how to rectify 3D point clouds from e.g. the Kinect depth sensor.
@phdthesis{diva2:526675,
author = {Ringaby, Erik},
title = {{Geometric Computer Vision for Rolling-shutter and Push-broom Sensors}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1535}},
year = {2012},
address = {Sweden},
}
Digital camera equipped cell phones were introduced in Japan in 2001; they quickly became popular and by 2003 outsold the entire stand-alone digital camera market. In 2010 sales passed one billion units and the market is still growing. Another trend is the rising popularity of smartphones, which has led to a rapid development of the processing power on a phone, and many units sold today bear close resemblance to a personal computer. The combination of a powerful processor and a camera which is easily carried in your pocket opens up a large field of interesting computer vision applications.
The core contribution of this thesis is the development of methods that allow an imaging device such as the cell phone camera to estimate its own motion and to capture the observed scene structure. One of the main focuses of this thesis is real-time performance, where the real-time constraint not only results in shorter processing times, but also allows for user interaction.
In computer vision, structure from motion refers to the process of estimating camera motion and 3D structure by exploring the motion in the image plane caused by the moving camera. This thesis presents several methods for estimating camera motion. Given the assumption that a set of images has known camera poses associated with them, we train a system to solve for the camera pose of a new image very fast. For the cases where no a priori information is available, a fast minimal case solver is developed. The solver uses five points in two camera views to estimate the cameras' relative position and orientation. This type of minimal case solver is usually used within a RANSAC framework. In order to increase accuracy and performance, a refinement to the random sampling strategy of RANSAC is proposed. It is shown that the new scheme doubles the performance for the five-point solver used on video data. For larger systems of cameras, a new Bundle Adjustment method is developed which is able to handle video from cell phones.
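A sketch of how such a five-point minimal solver is typically wrapped in RANSAC, here using OpenCV's findEssentialMat and recoverPose on synthetic correspondences (the refined sampling strategy and Bundle Adjustment of the thesis are not included; the camera intrinsics and motion below are made up):

import numpy as np
import cv2

# Synthetic correspondences: random 3D points seen by two calibrated cameras
rng = np.random.default_rng(0)
K = np.array([[700., 0., 320.], [0., 700., 240.], [0., 0., 1.]])
X = rng.uniform([-1., -1., 4.], [1., 1., 8.], size=(200, 3))
R_true, _ = cv2.Rodrigues(np.array([0.0, 0.05, 0.0]))
t_true = np.array([[0.1], [0.0], [0.02]])

def project(points, R, t):
    x = (K @ (R @ points.T + t)).T
    return x[:, :2] / x[:, 2:]

pts1 = project(X, np.eye(3), np.zeros((3, 1)))
pts2 = project(X, R_true, t_true)

# Five-point minimal solver inside RANSAC, then decomposition into relative pose
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)   # t is recovered only up to scale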
Demands for reduction in size, power consumption and price have led to a redesign of the image sensor. As a consequence the sensors have changed from a global shutter to a rolling shutter, where a rolling-shutter image is acquired row by row. Classical structure from motion methods are modeled on the assumption of a global shutter, and a rolling shutter can severely degrade their performance. One of the main contributions of this thesis is a new Bundle Adjustment method for cameras with a rolling shutter. The method accurately models the camera motion during image exposure with an interpolation scheme for both position and orientation.
The developed methods are not restricted to cellphones only, but are rather applicable to any type of mobile platform that is equipped with cameras, such as an autonomous car or a robot. The domestic robot comes in many flavors, everything from vacuum cleaners to service and pet robots. A robot equipped with a camera that is capable of estimating its own motion while sensing its environment, like the human eye, can provide an effective means of navigation for the robot. Many of the presented methods are well suited for robots, where low latency and real-time constraints are crucial in order to allow them to interact with their environment.
@phdthesis{diva2:517601,
author = {Hedborg, Johan},
title = {{Motion and Structure Estimation From Video}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1449}},
year = {2012},
address = {Sweden},
}
Traffic accidents are globally the number one cause of death for people 15-29 years old and are among the top three causes for all age groups 5-44 years. Much of the work within this thesis has been carried out in projects aiming for (cognitive) driver assistance systems and hopefully represents a step towards improving traffic safety.
The main contributions are within the area of Computer Vision, and more specifically, within the areas of shape matching, Bayesian tracking, and visual servoing with the main focus being on shape matching and applications thereof. The different methods have been demonstrated in traffic safety applications, such as bicycle tracking, car tracking, and traffic sign recognition, as well as for pose estimation and robot control.
One of the core contributions is a new method for recognizing closed contours, based on complex correlation of Fourier descriptors. It is shown that keeping the phase of Fourier descriptors is important. Neglecting the phase can result in perfect matches between intrinsically different shapes. Another benefit of keeping the phase is that rotation covariant or invariant matching is achieved in the same way. The only difference is to either consider the magnitude, for rotation invariant matching, or just the real value, for rotation covariant matching, of the complex valued correlation.
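A small sketch of phase-preserving Fourier-descriptor matching by complex correlation, in the spirit of the description above (the normalisation details here are illustrative, not the exact scheme of the thesis):

import numpy as np

def fourier_descriptors(contour):
    # contour: (N, 2) array of boundary points; represent as complex z = x + iy
    z = contour[:, 0] + 1j * contour[:, 1]
    return np.fft.fft(z)

def fd_match(fd_a, fd_b, rotation_invariant=True):
    # Zero the DC component so the match is translation invariant
    a, b = fd_a.copy(), fd_b.copy()
    a[0] = 0.0
    b[0] = 0.0
    # Correlation over all starting-point (index) shifts via the correlation theorem
    corr = np.fft.ifft(a * np.conj(b)) * len(a)
    # A contour rotation only multiplies corr by a unit complex factor:
    # magnitude -> rotation invariant score, real part -> rotation covariant score
    score = np.abs(corr) if rotation_invariant else np.real(corr)
    return score.max() / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy usage: an ellipse matched against a rotated copy with a different start point
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
ellipse = np.c_[2 * np.cos(t), np.sin(t)]
rot = np.pi / 5
R = np.array([[np.cos(rot), -np.sin(rot)], [np.sin(rot), np.cos(rot)]])
rotated = np.roll(ellipse @ R.T, 11, axis=0)
print(fd_match(fourier_descriptors(ellipse), fourier_descriptors(rotated)))  # close to 1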
The shape matching method has further been used in combination with an implicit star-shaped object model for traffic sign recognition. The presented method works fully automatically on query images with no need for regions-of-interests. It is shown that the presented method performs well for traffic signs that contain multiple distinct contours, while some improvement still is needed for signs defined by a single contour. The presented methodology is general enough to be used for arbitrary objects, as long as they can be defined by a number of regions.
Another contribution has been the extension of a framework for learning based Bayesian tracking called channel based tracking. Compared to earlier work, the multi-dimensional case has been reformulated in a sound probabilistic way and the learning algorithm itself has been extended. The framework is evaluated in car tracking scenarios and is shown to give competitive tracking performance, compared to standard approaches, but with the advantage of being fully learnable.
The last contribution has been in the field of (cognitive) robot control. The presented method achieves sufficient accuracy for simple assembly tasks by combining autonomous recognition with visual servoing, based on a learned mapping between percepts and actions. The method demonstrates that limitations of inexpensive hardware, such as web cameras and low-cost robotic arms, can be overcome using powerful algorithms.
All in all, the methods developed and presented in this thesis can all be used for different components in a system guided by visual information, and hopefully represents a step towards improving traffic safety.
@phdthesis{diva2:452207,
author = {Larsson, Fredrik},
title = {{Shape Based Recognition -- Cognitive Vision Systems in Traffic Safety Applications}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1395}},
year = {2011},
address = {Sweden},
}
Autonomous navigation for ground vehicles has many challenges. Autonomous systems must be able to self-localise, avoid obstacles and determine navigable surfaces. This thesis studies several aspects of autonomous navigation with a particular emphasis on vision, motivated by it being a primary component for navigation in many high-level biological organisms. The key problem of self-localisation or pose estimation can be solved through analysis of the changes in appearance of rigid objects observed from different view points. We therefore describe a system for structure and motion estimation for real-time navigation and obstacle avoidance. With the explicit assumption of a calibrated camera, we have studied several schemes for increasing accuracy and speed of the estimation.
The basis of most structure and motion pose estimation algorithms is a good point tracker. However, point tracking is computationally expensive and can occupy a large portion of the CPU resources. In this thesis we show how a point tracker can be implemented efficiently on the graphics processor, which results in faster tracking of points and the CPU being available to carry out additional processing tasks.
In addition we propose a novel view interpolation approach that can be used effectively for pose estimation given previously seen views. In this way, a vehicle will be able to estimate its location by interpolating previously seen data.
Navigation and obstacle avoidance may be carried out efficiently using structure and motion, but only within a limited range from the camera. In order to increase this effective range, additional information needs to be incorporated, more specifically the location of objects in the image. For this, we propose a real-time object recognition method which uses P-channel matching, and which may be used for improving navigation accuracy at distances where structure estimation is unreliable.
@phdthesis{diva2:345040,
author = {Hedborg, Johan},
title = {{Pose Estimation and Structure Analysis of Image Sequences}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1418}},
year = {2009},
address = {Sweden},
}
This thesis deals with three topics; Bayesian tracking, shape matching and visual servoing. These topics are bound together by the goal of visual control of robotic systems. The work leading to this thesis was conducted within two European projects, COSPAL and DIPLECS, both with the stated goal of developing artificial cognitive systems. Thus, the ultimate goal of my research is to contribute to the development of artificial cognitive systems.
The contribution to the field of Bayesian tracking is in the form of a framework called Channel Based Tracking (CBT). CBT has been proven to perform competitively with particle filter based approaches but with the added advantage of not having to specify the observation or system models. CBT uses channel representation and correspondence free learning in order to acquire the observation and system models from unordered sets of observations and states. We demonstrate how this has been used for tracking cars in the presence of clutter and noise.
The shape matching part of this thesis presents a new way to match Fourier Descriptors (FDs). We show that it is possible to take rotation and index shift into account while matching FDs without explicitly de-rotating the contours or neglecting the phase. We also propose to use FDs for matching locally extracted shapes, in contrast to the traditional way of using FDs to match the global outline of an object. We have in this context evaluated our matching scheme against the popular Affine Invariant FDs and shown that our method is clearly superior.
In the visual servoing part we present a visual servoing method that is based on an action precedes perception approach. By applying random actions to a system, e.g. a robotic arm, it is possible to learn a mapping between action space and percept space. In experiments we show that it is possible to achieve high precision positioning of a robotic arm without knowing beforehand what the robotic arm looks like or how it is controlled.
@phdthesis{diva2:278320,
author = {Larsson, Fredrik},
title = {{Methods for Visually Guided Robotic Systems:
Matching, Tracking and Servoing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1416}},
year = {2009},
address = {Sweden},
}
State-of-the-art reconstruction algorithms for medical helical cone-beam Computed Tomography (CT) are of type non-exact Filtered Backprojection (FBP). They are attractive because of their simplicity and low computational cost, but they produce sub-optimal images with respect to artifacts, resolution, and noise. This thesis deals with possibilities to improve the image quality by means of iterative techniques.
The first algorithm, Regularized Iterative Weighted Filtered Backprojection (RIWFBP), is an iterative algorithm employing the non-exact Weighted Filtered Backprojection (WFBP) algorithm [Stierstorfer et al., Phys. Med. Biol. 49, 2209-2218, 2004] in the update step. We have measured and compared artifact reduction as well as resolution and noise properties for RIWFBP and WFBP. The results show that artifacts originating in the non-exactness of the WFBP algorithm are suppressed within five iterations without notable degradation in terms of resolution versus noise. Our experiments also indicate that the number of required iterations can be reduced by employing a technique known as ordered subsets.
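A 2D parallel-beam analogue of such an iterative update, with plain FBP (scikit-image's radon/iradon) standing in for the helical cone-beam WFBP operator, and with weighting, step length and regularisation omitted:

import numpy as np
from skimage.transform import radon, iradon

theta = np.linspace(0., 180., 180, endpoint=False)
phantom = np.zeros((128, 128))
phantom[40:90, 40:90] = 1.0                      # toy object
sino = radon(phantom, theta=theta)               # "measured" projections

x = iradon(sino, theta=theta)                    # initial (non-exact) FBP reconstruction
for _ in range(5):
    residual = sino - radon(x, theta=theta)      # mismatch between data and reprojected image
    x = x + iradon(residual, theta=theta)        # update: FBP applied to the residual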
A small modification of RIWFBP leads to a new algorithm, the Weighted Least Squares Iterative Filtered Backprojection (WLS-IFBP). This algorithm has a slightly lower rate of convergence than RIWFBP, but in return it has the attractive property of converging to a solution of a certain least squares minimization problem. Hereby, theory and algorithms from optimization theory become applicable.
Besides linear regularization, we have examined edge-preserving non-linear regularization. In this case, resolution becomes contrast dependent, a fact that can be utilized for improving high contrast resolution without degrading the signal-to-noise ratio in low contrast regions. Resolution measurements at different contrast levels and anthropomorphic phantom studies confirm this property. Furthermore, an even more pronounced suppression of artifacts is observed.
Iterative reconstruction opens for more realistic modeling of the input data acquisition process than what is possible with FBP. We have examined the possibility to improve the forward projection model by (i) multiple ray models, and (ii) calculating strip integrals instead of line integrals. In both cases, for linear regularization, the experiments indicate a trade off: the resolution is improved at the price of increased noise levels. With non-linear regularization on the other hand, the degraded signal-to-noise ratio in low contrast regions can be avoided.
Huge input data sizes make experiments on real medical CT data very demanding. To alleviate this problem, we have implemented the most time consuming parts of the algorithms on a Graphics Processing Unit (GPU). These implementations are described in some detail, and some specific problems regarding parallelism and memory access are discussed.
@phdthesis{diva2:232734,
author = {Sunnegårdh, Johan},
title = {{Iterative Filtered Backprojection Methods for Helical Cone-Beam CT}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1264}},
year = {2009},
address = {Sweden},
}
This thesis is about channel-coded feature maps applied in view-based object recognition, tracking, and machine learning. A channel-coded feature map is a soft histogram of joint spatial pixel positions and image feature values. Typical useful features include local orientation and color. Using these features, each channel measures the co-occurrence of a certain orientation and color at a certain position in an image or image patch. Channel-coded feature maps can be seen as a generalization of the SIFT descriptor with the options of including more features and replacing the linear interpolation between bins by a more general basis function.
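A sketch of such a channel-coded feature map for a grey-scale patch, using linear interpolation between bins (the SIFT-like special case mentioned above); the patch, bin counts and hard orientation binning are simplifications:

import numpy as np

def soft_bins(values, n_bins, lo, hi):
    # Linear interpolation ('triangular basis') between neighbouring bins
    pos = (values - lo) / (hi - lo) * (n_bins - 1)
    i0 = np.clip(np.floor(pos).astype(int), 0, n_bins - 2)
    w1 = pos - i0
    return i0, 1.0 - w1, w1

def channel_coded_map(patch, n_spatial=4, n_orient=8):
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    rows, cols = patch.shape
    r, c = np.mgrid[0:rows, 0:cols]

    # Soft histogram over (row, column, orientation); orientation is hard-binned for brevity
    H = np.zeros((n_spatial, n_spatial, n_orient))
    ri, rw0, rw1 = soft_bins(r.ravel(), n_spatial, 0, rows - 1)
    ci, cw0, cw1 = soft_bins(c.ravel(), n_spatial, 0, cols - 1)
    oi = (ang.ravel() / (2 * np.pi) * n_orient).astype(int) % n_orient
    m = mag.ravel()
    for dr, rw in ((0, rw0), (1, rw1)):
        for dc, cw in ((0, cw0), (1, cw1)):
            np.add.at(H, (ri + dr, ci + dc, oi), m * rw * cw)
    return H

patch = np.random.rand(32, 32)
print(channel_coded_map(patch).shape)   # (4, 4, 8)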
The general idea of channel coding originates from a model of how information might be represented in the human brain. For example, different neurons tend to be sensitive to different orientations of local structures in the visual input. The sensitivity profiles tend to be smooth such that one neuron is maximally activated by a certain orientation, with a gradually decaying activity as the input is rotated.
This thesis extends previous work on using channel-coding ideas within computer vision and machine learning. By differentiating the channel-coded feature maps with respect to transformations of the underlying image, a method for image registration and tracking is constructed. By using piecewise polynomial basis functions, the channel coding can be computed more efficiently, and a general encoding method for N-dimensional feature spaces is presented.
Furthermore, I argue for using channel-coded feature maps in view-based pose estimation, where a continuous pose parameter is estimated from a query image given a number of training views with known pose. The optimization of position, rotation and scale of the object in the image plane is then included in the optimization problem, leading to a simultaneous tracking and pose estimation algorithm. Apart from objects and poses, the thesis examines the use of channel coding in connection with Bayesian networks. The goal here is to avoid the hard discretizations usually required when Markov random fields are used on intrinsically continuous signals like depth for stereo vision or color values in image restoration.
Channel coding has previously been used to design machine learning algorithms that are robust to outliers, ambiguities, and discontinuities in the training data. This is obtained by finding a linear mapping between channel-coded input and output values. This thesis extends this method with an incremental version and identifies and analyzes a key feature of the method -- that it is able to handle a learning situation where the correspondence structure between the input and output space is not completely known. In contrast to a traditional supervised learning setting, the training examples are groups of unordered input-output points, where the correspondence structure within each group is unknown. This behavior is studied theoretically and the effect of outliers and convergence properties are analyzed.
All presented methods have been evaluated experimentally. The work has been conducted within the cognitive systems research project COSPAL funded by EC FP6, and much of the contents has been put to use in the final COSPAL demonstrator system.
@phdthesis{diva2:17496,
author = {Jonsson, Erik},
title = {{Channel-Coded Feature Maps for Computer Vision and Machine Learning}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 1160}},
year = {2008},
address = {Sweden},
}
Contemporary algorithms employed for reconstruction of 3D volumes from helical cone beam projections are so called non-exact algorithms. This means that the reconstructed volumes contain artifacts irrespective of the detector resolution and number of projection angles employed in the process. In this thesis, three iterative schemes for suppression of these so called cone artifacts are investigated.
The first scheme, iterative weighted filtered backprojection (IWFBP), is based on iterative application of a non-exact algorithm. For this method, artifact reduction, as well as spatial resolution and noise properties are measured. During the first five iterations, cone artifacts are clearly reduced. As a side effect, spatial resolution and noise are increased. To avoid this side effect and improve the convergence properties, a regularization procedure is proposed and evaluated.
In order to reduce the cost of the IWFBP scheme, a second scheme is created by combining IWFBP with the so called ordered subsets technique, which we call OSIWFBP. This method divides the projection data set into subsets, and operates sequentially on each of these in a certain order, hence the name “ordered subsets”. We investigate two different ordering schemes and number of subsets, as well as the possibility to accelerate cone artifact suppression. The main conclusion is that the ordered subsets technique indeed reduces the number of iterations needed, but that it suffers from the drawback of noise amplification.
The third scheme starts by dividing input data into high- and low-frequency data, followed by non-iterative reconstruction of the high-frequency part and IWFBP reconstruction of the low-frequency part. This could open for acceleration by reduction of data in the iterative part. The results show that a suppression of artifacts similar to that of the IWFBP method can be obtained, even if a significant part of high-frequency data is non-iteratively reconstructed.
@phdthesis{diva2:23125,
author = {Sunnegårdh, Johan},
title = {{Combining analytical and iterative reconstruction in helical cone-beam CT}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1301}},
year = {2007},
address = {Sweden},
}
The possibility to use real-time computer vision in video sequences gives many opportunities for a system to interact with the environment. Possible ways for interaction are e.g. augmented reality like in the MATRIS project where the purpose is to add new objects into the video sequence, or surveillance where the purpose is to find abnormal events.
The increase in computer speed over the last years has simplified this process, and it is now possible to use at least some of the more advanced computer vision algorithms that are available. The computational speed of computers is however still a problem; for an efficient real-time system, efficient code and methods are necessary. This thesis deals with both problems: one part is about efficient implementations using single instruction multiple data (SIMD) instructions and one part is about robust tracking.
An efficient real-time system requires efficient implementations of the used computer vision methods. Efficient implementations require knowledge about the CPU and the possibilities it offers. In this thesis, one method called SIMD is explained. SIMD is useful when the same operation is applied to multiple data, which is usually the case in computer vision where the same operation is executed on each pixel.
Following the position of a feature or object in a video sequence is called tracking. Tracking can be used for a number of applications. The application in this thesis is to use tracking for pose estimation. One way to do tracking is to cut out a small region around the feature, creating a patch, and to find the position of this patch in the other frames. To find the position, a measure of the difference between the patch and the image at a given position is used. This thesis thoroughly investigates the sum of absolute difference (SAD) error measure. The investigation involves different ways to improve the robustness and to decrease the average error. One method to estimate the average error, the covariance of the position error, is proposed. An estimate of the average error is needed when different measurements are combined.
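A minimal sketch of SAD-based patch tracking as described above (exhaustive search over a small window; the robustness improvements and error-covariance estimation of the thesis are omitted, and the patch location and search radius are illustrative):

import numpy as np

def sad_track(prev, curr, top_left, size=15, search=8):
    # Patch around the feature in the previous frame
    r, c = top_left
    patch = prev[r:r + size, c:c + size].astype(float)
    best, best_d = np.inf, (0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            if rr < 0 or cc < 0 or rr + size > curr.shape[0] or cc + size > curr.shape[1]:
                continue                                    # candidate outside the image
            cand = curr[rr:rr + size, cc:cc + size].astype(float)
            sad = np.abs(patch - cand).sum()                # sum of absolute differences
            if sad < best:
                best, best_d = sad, (dr, dc)
    return best_d, best

prev = np.random.rand(120, 160)
curr = np.roll(prev, (2, -3), axis=(0, 1))                  # hypothetical motion: 2 px down, 3 px left
print(sad_track(prev, curr, (50, 70)))                      # ((2, -3), ~0)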
Finally, a system for camera pose estimation is presented. The computer vision part of this system is based on the results in this thesis. The presentation also contains a discussion about the results of this system.
@phdthesis{diva2:22906,
author = {Skoglund, Johan},
title = {{Robust Real-Time Estimation of Region Displacements in Video Sequences}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1296}},
year = {2007},
address = {Sweden},
}
This thesis presents work done within the EC-funded project VISATEC. Due to the different directions of the VISATEC project this thesis has a few different threads.
A novel representation scheme for medium level vision features is presented and applied to range sensor data and to image sequences. Some estimation procedures for this representation have been implemented and tested. The representation is tensor based and uses higher order tensors in a projective space. The tensor can hold information on several local structures including their relative position and orientation. This information can also be extracted from the tensor.
A number of well-known techniques are combined in a novel way to be able to perform object pose estimation under changes of the object in position, scale and rotation from a single 2D image. The local feature used is a patch which is resampled in a log-polar pattern. A number of local features are matched to a database and the k nearest neighbors vote on object state parameters. The most probable object states are found through mean-shift clustering.
A system using multi-cue integration as a means of reaching a higher level of system-level robustness and a higher level of accuracy is developed and evaluated in an industrial-like setting. The system is based around a robotic manipulator arm with an attached camera. The system is designed to solve parts of the bin-picking problem. The above-mentioned 2D technique for object pose estimation is also evaluated within this system.
@phdthesis{diva2:21426,
author = {Viksten, Fredrik},
title = {{Methods for vision-based robotic automation}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1161}},
year = {2005},
address = {Sweden},
}
This thesis presents methods useful in a bin picking application, such as detection and representation of local features, pose estimation and multi-cue integration.
The scene tensor is a representation of multiple line or edge segments and was first introduced by Nordberg in [30]. A method for estimating scene tensors from gray-scale images is presented. The method is based on orientation tensors, where the scene tensor can be estimated by correlations of the elements in the orientation tensor with a number of 1D filters. Mechanisms for analyzing the scene tensor are described and an algorithm for detecting interest points and estimating feature parameters is presented. It is shown that the algorithm works on a wide spectrum of images with good results.
Representations that are invariant with respect to a set of transformations are useful in many applications, such as pose estimation, tracking and wide baseline stereo. The scene tensor itself is not invariant and three different methods for implementing an invariant representation based on the scene tensor are presented. One is based on a non-linear transformation of the scene tensor and is invariant to perspective transformations. Two versions of a tensor doublet are presented, which are based on the geometry of two interest points and are invariant to translation, rotation and scaling. The tensor doublet is used in a framework for view centered pose estimation of 3D objects. It is shown that the pose estimation algorithm has good performance even though the object is occluded and has a different scale compared to the training situation.
An industrial implementation of a bin picking application has to cope with several different types of objects. All pose estimation algorithms use some kind of model and there is yet no model that can cope with all kinds of situations and objects. This thesis presents a method for integrating cues from several pose estimation algorithms for increasing the system stability. It is also shown that the same framework can be used for increasing the accuracy of the system by using cues from several views of the object. An extensive test with several different objects, lighting conditions and backgrounds shows that multi-cue integration makes the system more robust and increases the accuracy.
Finally, a system for bin picking is presented, built from the previous parts of this thesis. An eye-in-hand setup is used with a standard industrial robot arm. It is shown that the system works for real bin-picking situations with a positioning error below 1 mm and an orientation error below 1 degree for most of the different situations.
@phdthesis{diva2:21387,
author = {Söderberg, Robert},
title = {{Compact Representations and Multi-cue Integration for Robotics}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 1160}},
year = {2005},
address = {Sweden},
}
This thesis introduces and explores a new type of representation for low and medium level vision operations called channel representation. The channel representation is a more general way to represent information than e.g. as numerical values, since it allows incorporation of uncertainty, and simultaneous representation of several hypotheses. More importantly it also allows the representation of “no information” when no statement can be given. A channel representation of a scalar value is a vector of channel values, which are generated by passing the original scalar value through a set of kernel functions. The resultant representation is sparse and monopolar. The word sparse signifies that information is not necessarily present in all channels. On the contrary, most channel values will be zero. The word monopolar signifies that all channel values have the same sign, e.g. they are either positive or zero. A zero channel value denotes “no information”, and for non-zero values, the magnitude signifies the relevance.
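A small sketch of channel encoding and local mode decoding of a scalar, using the cos^2 kernels commonly used with unit channel spacing (the kernel choice and decoding formula are standard examples, not necessarily the exact variants analysed in the thesis):

import numpy as np

OMEGA = np.pi / 3          # kernel width parameter for unit channel spacing

def encode(x, n_channels):
    # Pass the scalar through a set of overlapping cos^2 kernel functions on the integers:
    # the result is sparse (at most three non-zero channels) and monopolar (all values >= 0)
    k = np.arange(n_channels)
    d = x - k
    c = np.cos(OMEGA * d) ** 2
    c[np.abs(d) >= 1.5] = 0.0
    return c

def decode(c):
    # Local mode decoding from the strongest channel and its two neighbours
    k = int(np.clip(np.argmax(c), 1, len(c) - 2))
    z = c[k - 1] * np.exp(-2j * OMEGA) + c[k] + c[k + 1] * np.exp(2j * OMEGA)
    return k + np.angle(z) / (2 * OMEGA)

c = encode(3.3, 8)
print(c)            # non-zero only in channels 2, 3 and 4
print(decode(c))    # ~3.3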
In the thesis, a framework for channel encoding and local decoding of scalar values is presented. Averaging in the channel representation is identified as a regularised sampling of a probability density function. A subsequent decoding is thus a mode estimation technique.
The mode estimation property of channel averaging is exploited in the channel smoothing technique for image noise removal. We introduce an improvement to channel smoothing, called alpha synthesis, which deals with the problem of jagged edges present in the original method. Channel smoothing with alpha synthesis is compared to mean-shift filtering, bilateral filtering, median filtering, and normalized averaging with favourable results.
A fast and robust blob-feature extraction method for vector fields is developed. The method is also extended to cluster constant slopes instead of constant regions. The method is intended for view-based object recognition and wide baseline matching. It is demonstrated on a wide baseline matching problem.
A sparse scale-space representation of lines and edges is implemented and described. The representation keeps line and edge statements separate, and ensures that they are localised by inhibition from coarser scales. The result is however still locally continuous, in contrast to non-max-suppression approaches, which introduce a binary threshold.
The channel representation is well suited to learning, which is demonstrated by applying it in an associative network. An analysis of representational properties of associative networks using the channel representation is made.
Finally, a reactive system design using the channel representation is proposed. The system is similar in idea to recursive Bayesian techniques using particle filters, but the present formulation allows learning using the associative networks.
@phdthesis{diva2:244318,
author = {Forss\'{e}n, Per-Erik},
title = {{Low and Medium Level Vision Using Channel Representations}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 858}},
year = {2004},
address = {Sweden},
}
This thesis presents some concepts and methods for low level computer vision and learning, with object recognition as the primary application.
An efficient method for detection of local rotational symmetries in images is presented. Rotational symmetries include circle patterns, star patterns, and certain high curvature patterns. The method for detection of these patterns is based on local moments computed on a local orientation description in double angle representation, which makes the detection invariant to the sign of the local direction vectors. Some methods are also suggested to increase the selectivity of the detection method. The symmetries can serve as feature descriptors and interest points for use in hierarchical matching structures for object recognition and related problems.
A view-based method for 3D object recognition and estimation of object pose from a single image is also presented. The method is based on simple feature vector matching and clustering. Local orientation regions computed at interest points are used as features for matching. The regions are computed such that they are invariant to translation, rotation, and locally invariant to scale. Each match casts a vote on a certain object pose, rotation, scale, and position, and a joint estimate is found by a clustering procedure. The method is demonstrated on a number of real images and the region features are compared with the SIFT descriptor, which is another standard region feature for the same application.
Finally, a new associative network is presented which applies the channel representation for both input and output data. This representation is sparse and monopolar, and is a simple yet powerful representation of scalars and vectors. It is especially suited for representation of several values simultaneously, a property that is inherited by the network and something which is useful in many computer vision problems. The chosen representation enables us to use a simple linear model for non-linear mappings. The linear model parameters are found by solving a least squares problem with a non-negative constraint, which gives a sparse regularized solution.
@phdthesis{diva2:244321,
author = {Johansson, Björn},
title = {{Low Level Operations and Learning in Computer Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 912}},
year = {2004},
address = {Sweden},
}
Three-dimensional (3D) volume data has become increasingly common with the emergence and wide availability of modern 3D image acquisition techniques. The demand for computerized analysis and visualization techniques is constantly growing to utilize the abundant information embedded in these data.
This thesis consists of three parts. The first part presents methods of analyzing 3D volume data by using second derivatives. Harmonic functions are used to combine the non-orthogonal second derivative operators into an orthogonal basis. Three basic features, magnitude, shape, and orientation, are extracted from the second derivative responses after diagonalizing the Hessian matrix. Two applications on magnetic resonance angiography (MRA) data are presented. One of them utilizes a scale-space and the second order variation to enhance the vascular system by discriminating for string structures. The other one employs the local shape information to detect cases of stenosis.
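A 2D sketch of this kind of second-derivative shape analysis, computing Hessian eigenvalues at a chosen scale and forming a Frangi-style "stringness" measure (the thesis works in 3D with a harmonic-function operator basis; the constants below are illustrative):

import numpy as np
from skimage.feature import hessian_matrix, hessian_matrix_eigvals

def stringness(image, sigma=2.0, beta=0.5):
    # Eigenvalues of the Gaussian-smoothed Hessian, largest first
    H = hessian_matrix(image, sigma=sigma, order='rc')
    l1, l2 = hessian_matrix_eigvals(H)
    ratio = np.abs(l1) / (np.abs(l2) + 1e-12)        # 'shape' feature: small on string-like structures
    mag = np.hypot(l1, l2)                           # 'magnitude' feature: second-derivative energy
    c = 0.5 * mag.max() + 1e-12
    v = np.exp(-(ratio ** 2) / (2 * beta ** 2)) * (1 - np.exp(-(mag ** 2) / (2 * c ** 2)))
    v[l2 > 0] = 0                                    # keep only bright ridges (negative principal curvature)
    return v

image = np.zeros((64, 64))
image[30:33, :] = 1.0                                # a bright horizontal string structure
print(stringness(image).max())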
The second part of the thesis discusses some modifications of the fast marching method in 2D and 3D space. By shifting the input and output grids relative to each other, we show that the sampled cost functions are used in a more consistent way. We present new algorithms for anisotropic fast marching which incorporate orientation information during the marching process. Three applications illustrate the usage of the fast marching methods. The first one extracts a guide wire as a minimum-cost path on a salience distance map of a line detection result of a fluoroscopy image. The second application extracts the vascular tree from a whole body MRA volume. In the third application, a 3D guide wire is reconstructed from a pair of biplane images using the minimum-cost path formulation.
The third part of the thesis proposes a new frame-coherent volume rendering algorithm. It is an extension of the algorithm by Gudmundsson and Randén (1990). The new algorithm is capable of efficiently generating rotation sequences around an arbitrary axis. Essentially, it enables the ray-casting procedure to quickly approach the hull of the object using the so called shadow-lines recorded from the previous frame.
@phdthesis{diva2:302939,
author = {Lin, Qingfen},
title = {{Enhancement, Extraction, and Visualization of 3D Volume Data}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 824}},
year = {2003},
address = {Sweden},
}
This thesis introduces a new signal transform, called polynomial expansion, and based on this develops novel methods for estimation of orientation and motion. The methods are designed exclusively in the spatial domain and can be used for signals of any dimensionality.
Two important concepts in the use of the spatial domain for signal processing are projections into subspaces, e.g. the subspace of second degree polynomials, and representations by frames, e.g. wavelets. It is shown how these concepts can be unified in a least squares framework for representation of finite dimensional vectors by bases, frames, subspace bases, and subspace frames.
This framework is used to give a new derivation of normalized convolution, a method for signal analysis that takes uncertainty in signal values into account and also allows for spatial localization of the analysis functions.
Polynomial expansion is a transformation which at each point transforms the signal into a set of expansion coefficients with respect to a polynomial local signal model. The expansion coefficients are computed using normalized convolution. As a consequence polynomial expansion inherits the mechanism for handling uncertain signals and the spatial localization feature allows good control of the properties of the transform. It is shown how polynomial expansion can be computed very efficiently.
As an application of polynomial expansion, a novel method for estimation of orientation tensors is developed. A new concept for orientation representation, orientation functionals, is introduced and it is shown that orientation tensors can be considered a special case of this representation. By evaluation on a test sequence it is demonstrated that the method performs excellently.
Considering an image sequence as a spatiotemporal volume, velocity can be estimated from the orientations present in the volume. Two novel methods for velocity estimation are presented, with the common idea to combine the orientation tensors over some region for estimation of the velocity field according to a parametric motion model, e.g. affine motion. The first method involves a simultaneous segmentation and velocity estimation algorithm to obtain appropriate regions. The second method is designed for computational efficiency and uses local neighborhoods instead of trying to obtain regions with coherent motion. By evaluation on the Yosemite sequence, it is shown that both methods give substantially more accurate results than previously published methods.
Another application of polynomial expansion is a novel displacement estimation algorithm, i.e. an algorithm which estimates motion from only two consecutive frames rather than from a whole spatiotemporal volume. This approach is necessary when the motion is not temporally coherent, e.g. because the camera is affected by vibrations. It is shown how moving objects can robustly be detected in such image sequences by using the plane+parallax approach to separate out the background motion.
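OpenCV ships a dense two-frame estimator based on polynomial expansion (cv2.calcOpticalFlowFarneback), which can serve as a sketch of this kind of displacement estimation; the frames and parameters below are illustrative and the plane+parallax separation is not shown:

import numpy as np
import cv2

prev = (np.random.rand(240, 320) * 255).astype(np.uint8)   # hypothetical frame at time t
curr = np.roll(prev, 3, axis=1)                             # frame at t+1: content shifted 3 px right

# args: prev, next, flow, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
dx, dy = flow[..., 0], flow[..., 1]                          # per-pixel displacement field
print(dx.mean())                                             # roughly 3 for this toy input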
To demonstrate the power of being able to handle uncertain signals it is shown how normalized convolution and polynomial expansion can be computed for interlaced video signals. Together with the displacement estimation algorithm this gives a method to estimate motion from a single interlaced frame.
@phdthesis{diva2:302485,
author = {Farnebäck, Gunnar},
title = {{Polynomial expansion for orientation and motion estimation}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 790}},
year = {2002},
address = {Sweden},
}
This thesis presents a new method for detection of complex curvatures such as corners, circles, and star patterns. The method is based on a second degree local polynomial model applied to a local orientation description in double angle representation. The theory of rotational symmetries is used to compute curvature responses from the parameters of the polynomial model. The responses are made more selective using a scheme of inhibition between different symmetry models. These symmetries can serve as feature points at a high abstraction level for use in hierarchical matching structures for 3D estimation, object recognition, image database search, etc.
A very efficient approximative algorithm for single and multiscale polynomial expansion is developed, which is used for detection of the complex curvatures in one or several scales. The algorithm is based on the simple observation that polynomial functions multiplied with a Gaussian function can be described in terms of partial derivatives of the Gaussian. The approximative polynomial expansion algorithm is evaluated in an experiment to estimate local orientation on 3D data, and the performance is comparable to previously tested algorithms which are more computationally expensive.
The curvature algorithm is demonstrated on natural images and in an object recognition experiment. Phase histograms based on the curvature features are developed and shown to be useful as an alternative compact image representation.
The importance of curvature is furthermore motivated by reviewing examples from biological and perceptual studies. The usefulness of local orientation information to detect curvature is also motivated by an experiment about learning a corner detector.
@phdthesis{diva2:312510,
author = {Johansson, Björn},
title = {{Multiscale Curvature Detection in Computer Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 877}},
year = {2001},
address = {Sweden},
}
The art of medical computed tomography is constantly evolving and the last years have seen new ground-breaking systems with multi-row detectors. These tomographs are able to increase both scanning speed and image quality compared to the single-row systems more commonly found in hospitals today. This thesis deals with three-dimensional image reconstruction algorithms to be used in future generations of tomographs with even more detector rows than found in current multi-row systems.
The first practical algorithm for three-dimensional reconstruction from cone-beam projections acquired from a circular source trajectory is the FDK method. We present a novel version of this algorithm that produces images of higher quality. We also formulate a version of the FDK method that performs the backprojection in O(N^3 log N) steps instead of the O(N^4) steps traditionally required.
An efficient way to acquire volumetric patient data is to use a helical source trajectory together with a multi-row detector. We present an overview of existing reconstruction algorithms for this geometry. We also present a new family of algorithms, the PI methods, which seem to surpass other proposals in simplicity while delivering images of high quality.
The detector used in the PI methods is limited to a window that exactly fits the cylindrical section between two consecutive turns of the helical source path. A rebinning to oblique parallel beams yields a geometry with many attractive properties. The key property behind the simplicity of the PI methods is that each object point to be reconstructed is illuminated by the source during a rotation of exactly half a turn. This allows for fast and simple reconstruction.
@phdthesis{diva2:302800,
author = {Turbell, Henrik},
title = {{Cone-Beam Reconstruction Using Filtered Backprojection}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 672}},
year = {2001},
address = {Sweden},
}
In this thesis a new type of representation for medium level vision operations is explored. We focus on representations that are sparse and monopolar. The word sparse signifies that information in the feature sets used is not necessarily present at all points. On the contrary, most features will be inactive. The word monopolar signifies that all features have the same sign, e.g. are either positive or zero. A zero feature value denotes ``no information'', and for non-zero values, the magnitude signifies the relevance.
A sparse scale-space representation of local image structure (lines and edges) is developed.
A method known as the channel representation is used to generate sparse representations, and its ability to deal with multiple hypotheses is described. It is also shown how these hypotheses can be extracted in a robust manner.
The connection of soft histograms (i.e. histograms with overlapping bins) to the channel representation, as well as to the use of dithering in relaxation of quantisation errors is shown. The use of soft histograms for estimation of unknown probability density functions (PDF), and estimation of image rotation are demonstrated.
The advantage with the use of sparse, monopolar representations in associative learning is demonstrated.
Finally we show how sparse, monopolar representations can be used to speed up and improve template matching.
@phdthesis{diva2:288615,
author = {Forss\'{e}n, Per-Erik},
title = {{Sparse Representations for Medium Level Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 869}},
year = {2001},
address = {Sweden},
}
Three-dimensional (3D) volume data has become increasingly common with the emergence and wide availability of modern 3D image acquisition techniques. The demand for computerized analysis and visualization techniques is constantly growing to utilize the abundant information embedded in these data.
This thesis consists of two parts. The first part presents methods of analyzing 3D volume data by using second derivatives. Harmonic functions are used to combine the non-orthogonal second derivative operators into an orthogonal basis. Three basic features, magnitude, shape, and orientation, are extracted from the second derivative responses after diagonalizing the Hessian matrix. Two applications on magnetic resonance angiography (MRA) data are presented. One of them utilizes a scale-space and the second order variation to enhance the vascular system by discriminating for string structures. The other one employs the local shape information to detect cases of stenosis.
The second part of the thesis proposes a new frame-coherent volume rendering algorithm. It is an extension of the algorithm by Gudmundsson and Randén (1990). The new algorithm is capable of efficiently generating rotation sequences around an arbitrary axis. Essentially, it enables the ray-casting procedure to quickly approach the hull of the object using the so called shadow-lines recorded from the previous frame.
@phdthesis{diva2:288322,
author = {Lin, Qingfen},
title = {{Enhancement, Detection, and Visualization of 3D Volume Data}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 903}},
year = {2001},
address = {Sweden},
}
In this licentiate thesis, we discuss how to generate actions from percepts within an autonomous robotic system. In particular, we discuss and propose an original reactive architecture suitable for response generation, learning and self-organization.
The architecture uses incremental learning and supports self organization through distributed dynamic model generation and self-contained components. Signals to and from the architecture are represented using the channel representation, which is presented in that context.
The components of the architecture use a novel and flexible implementation of an artificial neural network. The learning rules for this implementation are derived.
A simulator is presented. It has been designed and implemented in order to test and evaluate the proposed architecture.
Results of a series of experiments on the reactive architecture are discussed and accounted for. The experiments have been performed within three different scenarios, using the developed simulator.
The problem of information representation in robotic architectures is illustrated by a problem of anchoring symbols to visual data. This is presented in the context of the WITAS project.
@phdthesis{diva2:288283,
author = {Andersson, Thord},
title = {{Learning in a Reactive Robotic Architecture}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 817}},
year = {2000},
address = {Sweden},
}
This thesis presents a number of methods to estimate 3D structures with a single translating camera. The camera is assumed to be calibrated and to have a known translation and rotation.
Applications for aircraft altitude estimation and ground structure estimation ahead of the aircraft are discussed. The idea is to mount a camera on the aircraft and use the motion estimates obtained in the inertial navigation system. One reason for this arrangement is to make the aircraft more passive, in comparison to conventional radar based altitude estimation.
Two groups of methods are considered, optical flow based and region tracking based. Both groups have advantages and drawbacks.
Two methods to estimate the optical flow are presented. The accuracy of the estimated ground structure is increased by varying the temporal distance between the frames used in the optical flow estimation algorithms.
Four region tracking algorithms are presented. Two of them use canonical correlation and the other two are based on sum of squared difference and complex correlation respectively.
The depth estimates are then temporally filtered using weighted least squares or a Kalman filter.
A simple estimation of the computational complexity and memory requirements for the algorithms is presented to aid estimation of the hardware requirements.
Tests on real flight sequences are performed, showing that the aircraft altitude can be estimated with a good accuracy.
@phdthesis{diva2:288278,
author = {Moe, Anders},
title = {{Passive Aircraft Altitude Estimation using Computer Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 847}},
year = {2000},
address = {Sweden},
}
This thesis presents a framework for estimation of motion fields both for single and multiple layers. All the methods have in common that they generate or use constraints on the local motion. Motion constraints are represented by vectors whose directions describe one component of the local motion and whose magnitude indicates confidence.
Two novel methods for estimating these motion constraints are presented. Both methods take two images as input and apply orientation sensitive quadrature filters. One method is similar to a gradient method applied on the phase from the complex filter outputs. The other method is based on novel results using canonical correlation presented in this thesis.
Parametric models, e.g. affine or FEM, are used to estimate motion from constraints on local motion. In order to estimate smooth fields for models with many parameters, cost functions on deformations are introduced.
Motions of transparent multiple layers are estimated by implicit or explicit clustering of motion constraints into groups. General issues and difficulties in analysis of multiple motions are described. An extension of the known EM algorithm is presented together with experimental results on multiple transparent layers with affine motions. Good accuracy in estimation allows reconstruction of layers using a backprojection algorithm. As an alternative to the EM algorithm, this thesis also introduces a method based on higher order tensors.
A result with potential applications in a number of different research fields is the extension of canonical correlation to handle complex variables. Correlation is maximized using a novel method that can handle singular covariance matrices.
@phdthesis{diva2:302892,
author = {Hemmendorff, Magnus},
title = {{Single and Multiple Motion Field Estimation}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 764}},
year = {1999},
address = {Sweden},
}
In this thesis, novel methods for estimation of orientation and velocity are presented. The methods are designed exclusively in the spatial domain.
Two important concepts in the use of the spatial domain for signal processing are projections into subspaces, e.g. the subspace of second degree polynomials, and representations by frames, e.g. wavelets. It is shown how these concepts can be unified in a least squares framework for representation of finite dimensional vectors by bases, frames, subspace bases, and subspace frames.
This framework is used to give a new derivation of Normalized Convolution, a method for signal analysis that takes uncertainty in signal values into account and also allows for spatial localization of the analysis functions.
With the help of Normalized Convolution, a novel method for orientation estimation is developed. The method is based on projection onto second degree polynomials and the estimates are represented by orientation tensors. A new concept for orientation representation, orientation functionals, is introduced and it is shown that orientation tensors can be considered a special case of this representation. A very efficient implementation of the estimation method is presented and by evaluation on a test sequence it is demonstrated that the method performs excellently.
Considering an image sequence as a spatiotemporal volume, velocity can be estimated from the orientations present in the volume. Two novel methods for velocity estimation are presented, with the common idea to combine the orientation tensors over some region for estimation of the velocity field according to a motion model, e.g. affine motion. The first method involves a simultaneous segmentation and velocity estimation algorithm to obtain appropriate regions. The second method is designed for computational efficiency and uses local neighborhoods instead of trying to obtain regions with coherent motion. By evaluation on the Yosemite sequence, it is shown that both methods give substantially more accurate results than previously published methods.
@phdthesis{diva2:302473,
author = {Farnebäck, Gunnar},
title = {{Spatial domain methods for orientation and velocity estimation}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 755}},
year = {1999},
address = {Sweden},
}
The subject of this dissertation is to show how learning can be used for multidimensional signal processing, in particular computer vision. Learning is a wide concept, but it can generally be defined as a system’s change of behaviour in order to improve its performance in some sense.
Learning systems can be divided into three classes: supervised learning, reinforcement learning and unsupervised learning. Supervised learning requires a set of training data with correct answers and can be seen as a kind of function approximation. A reinforcement learning system does not require a set of answers. It learns by maximizing a scalar feedback signal indicating the system’s performance. Unsupervised learning can be seen as a way of finding a good representation of the input signals according to a given criterion.
In learning and signal processing, the choice of signal representation is a central issue. For high-dimensional signals, dimensionality reduction is often necessary. It is then important not to discard useful information. For this reason, learning methods based on maximizing mutual information are particularly interesting.
A properly chosen data representation allows local linear models to be used in learning systems. Such models have the advantage of having a small number of parameters and can for this reason be estimated by using relatively few samples. An interesting method that can be used to estimate local linear models is canonical correlation analysis (CCA). CCA is strongly related to mutual information. The relation between CCA and three other linear methods is discussed. These methods are principal component analysis (PCA), partial least squares (PLS) and multivariate linear regression (MLR). An iterative method for CCA, PCA, PLS and MLR, in particular low-rank versions of these methods, is presented.
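A toy example of canonical correlation analysis on two signal sets that share a hidden source, using scikit-learn's CCA (the iterative low-rank algorithms discussed in the thesis are not shown; the data below is synthetic):

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))                            # shared hidden signal
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
Y = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

cca = CCA(n_components=2).fit(X, Y)
Xc, Yc = cca.transform(X, Y)                                  # canonical variates
print([np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(2)])  # both close to 1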
A novel method for learning filters for multidimensional signal processing using CCA is presented. By showing the system signals in pairs, the filters can be adapted to detect certain features and to be invariant to others. A new method for local orientation estimation has been developed using this principle. This method is significantly less sensitive to noise than previously used methods.
Finally, a novel stereo algorithm is presented. This algorithm uses CCA and phase analysis to detect the disparity in stereo images. The algorithm adapts filters in each local neighbourhood of the image in a way which maximizes the correlation between the filtered images. The adapted filters are then analysed to find the disparity. This is done by a simple phase analysis of the scalar product of the filters. The algorithm can even handle cases where the images have different scales. The algorithm can also handle depth discontinuities and give multiple depth estimates for semi-transparent images.
@phdthesis{diva2:302872,
author = {Borga, Magnus},
title = {{Learning Multidimensional Signal Processing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 531}},
year = {1998},
address = {Sweden},
}
The thesis describes novel methods for image motion computation and template matching.
A multiscale algorithm for energy-based estimation and representation of local spatiotemporal structure by second order symmetric tensors is presented. An efficient spatiotemporal implementation of a signal modelling method called normalized convolution is described. This provides a means to handle signals with varying degree of reliability.
As an application of the above results, a smooth pursuit motion tracking algorithm that uses observations of both target motion and position for camera head control and motion prediction is described. The target is detected using a novel motion field segmentation algorithm which assumes that the motion fields of the target and its immediate vicinity, at least occasionally, each can be modelled by a single parameterized motion model. A method to eliminate camera-induced background motion in the case of a pan/tilt rotating camera is suggested.
In a second application, a high-precision image motion estimation algorithm performing clustering in motion parameter space is developed. The algorithm, which can handle multiple motions by simultaneous motion parameter estimation and image segmentation, iteratively maximizes the posterior probability of the motion parameter set given the observed local spatiotemporal structure tensor field. The probabilistic formulation provides a natural way to incorporate additional prior information about the segmentation of the scene into the objective function. A simple homotopy continuation method (embedding algorithm) is used to increase the likelihood of convergence to a near-optimal solution.
The final part of the thesis is concerned with tracking of (partially) occluded targets. An algorithm for target tracking in head-up display sequences is presented. The method generalizes cross-correlation coefficient matching by introducing a signal confidence-based distance metric. To handle target shape changes, a method for template mask shape-adaptation based on geometric transformation parameter optimisation is introduced. The presence of occluding objects makes local structure descriptors (e.g., the gradient) unreliable, which means that only pixelwise comparisons of target and template can be made, unless the local structure operators are modified to take into account the varying signal certainty. Normalized convolution provides the means for such a modification. This is demonstrated in a section on phase-based target tracking, which also contains a presentation of a generic method for tracking of occluded targets by combining normalized convolution with iterative reweighting.
@phdthesis{diva2:302807,
author = {Karlholm, Jörgen},
title = {{Local Signal Models for Image Sequence Analysis}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 536}},
year = {1998},
address = {Sweden},
}
Reinforcement learning is a general and powerful way to formulate complex learning problems and acquire good system behaviour. The goal of a reinforcement learning system is to maximize a long term sum of instantaneous rewards provided by a teacher. In its most extreme form, reinforcement learning only requires that the teacher can provide a measure of success. This formulation does not require a training set with correct responses, and allows the system to become better than its teacher.
In reinforcement learning much of the burden is moved from the teacher to the training algorithm. The exact and general algorithms that exist for these problems are based on dynamic programming (DP), and have a computational complexity that grows exponentially with the dimensionality of the state space. These algorithms can only be applied to real world problems if an efficient encoding of the state space can be found.
To cope with these problems, heuristic algorithms and function approximation need to be incorporated. In this thesis it is argued that local models have the potential to help solve problems in high-dimensional spaces, whereas global models do not. This is motivated by the bias-variance dilemma, which is resolved with the assumption that the system is constrained to live on a low-dimensional manifold in the space of inputs and outputs. This observation leads to the introduction of bias in terms of continuity and locality.
A linear approximation of the system dynamics and a quadratic function describing the long term reward are suggested to constitute a suitable local model. For problems involving one such model, i.e. linear quadratic regulation problems, novel convergence proofs for heuristic DP algorithms are presented. This is one of few available convergence proofs for reinforcement learning in continuous state spaces.
Reinforcement learning is closely related to optimal control, where local models are commonly used. Relations to present methods are investigated, e.g. adaptive control, gain scheduling, fuzzy control, and jump linear systems. Ideas from these areas are compiled in a synergistic way to produce a new algorithm for heuristic dynamic programming where function parameters and locality, expressed as model applicability, are learned on-line. Both top-down and bottom-up versions are presented.
The emerging local models and their applicability need to be memorized by the learning system. The binary tree is put forward as a suitable data structure for on-line storage and retrieval of these functions.
@phdthesis{diva2:302961,
author = {Landelius, Tomas},
title = {{Reinforcement Learning and Distributed Local Model Synthesis}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 469}},
year = {1997},
address = {Sweden},
}
The thesis describes novel methods for efficient spatiotemporal filtering and modeling. A multiresolution algorithm for energy-based estimation and representation of local spatiotemporal structure by second order symmetric tensors is presented. The problem of how to properly process estimates with varying degree of reliability is addressed. An efficient spatiotemporal implementation of a certainty-based signal modeling method called normalized convolution is described. As an application of the above results, a smooth pursuit motion tracking algorithm that uses observations of both target motion and position for camera head control and motion prediction is described. The target is detected using a novel motion field segmentation algorithm which assumes that the motion fields of the target and its immediate vicinity, at least occasionally, each can be modeled by a single parameterized motion model. A method to eliminate camera-induced background motion in the case of a pan/tilt rotating camera is suggested.
@phdthesis{diva2:288261,
author = {Karlholm, Jörgen},
title = {{Efficient Spatiotemporal Filtering and Modelling}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 562}},
year = {1996},
address = {Sweden},
}
This thesis deals with focus of attention control in active vision systems. A framework for hierarchical gaze control in a robot vision system is presented, and an implementation for a simulated robot is described. The robot is equipped with a heterogeneously sampled imaging system, a fovea, resembling the spatially varying resolution of a human retina. The relation between foveas and multiresolution image processing as well as implications for image operations are discussed.
A stereo algorithm based on local phase differences is presented both as a stand alone algorithm and as a part of a robot vergence control system. The algorithm is fast and can handle large disparities while maintaining subpixel accuracy. The method produces robust and accurate estimates of displacement on synthetic as well as real life stereo images. Disparity filter design is discussed and a number of filters are tested, e.g. Gabor filters and lognorm quadrature filters. A design method for disparity filters having precisely one phase cycle is also presented.
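A minimal single-scale sketch of the phase-difference idea is given below; the Gabor filter, its parameters and the toy scanlines are placeholders, and the actual algorithm additionally relies on multiple scales and other filter designs.

import numpy as np

def gabor_response(signal, omega, sigma=4.0):
    # Complex (quadrature) Gabor filter response along a 1D scanline
    x = np.arange(-int(4 * sigma), int(4 * sigma) + 1)
    g = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * omega * x)
    return np.convolve(signal, g, mode='same')

# Toy stereo pair: the right scanline is the left one shifted by 3 pixels
rng = np.random.default_rng(1)
left = rng.normal(size=256)
right = np.roll(left, 3)

omega = np.pi / 8                      # centre frequency of the filter (assumed)
qL = gabor_response(left, omega)
qR = gabor_response(right, omega)

# Local phase difference divided by the centre frequency gives a disparity estimate
disparity = np.angle(qL * np.conj(qR)) / omega
print("median disparity estimate:", np.median(disparity))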
A theory for sequentially defined data modified focus of attention is presented. The theory is applied to a preattentive gaze control system consisting of three cooperating control strategies. The first is an object finder that uses circular symmetries as indications of possible objects and directs the fixation point accordingly. The second is an edge tracker that makes the fixation point follow structures in the scene. The third is a camera vergence control system which assures that both eyes are fixating on the same point. The coordination between the strategies is handled using potential fields in the robot parameter space.
Finally, a new focus of attention method for disregarding filter responses from already modelled structures is presented. The method is based on a filtering method, normalized convolution, originally developed for filtering incomplete and uncertain data. By setting the certainty of the input data to zero in areas of known or predicted signals, a purposive removal of operator responses can be obtained. On succeeding levels, image features from these areas become 'invisible' and consequently do not attract the attention of the system. This technique also allows the system to effectively explore new events. By cancelling known, or modeled, signals the attention of the system is shifted to new events not yet described.
@phdthesis{diva2:302463,
author = {Westelius, Carl-Johan},
title = {{Focus of attention and gaze control for robot vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 379}},
year = {1995},
address = {Sweden},
}
In this thesis, the theory of reinforcement learning is described and its relation to learning in biological systems is discussed. Some basic issues in reinforcement learning, the credit assignment problem and perceptual aliasing, are considered. The methods of temporal difference are described. Three important design issues are discussed: information representation and system architecture, rules for improving the behaviour and rules for the reward mechanisms. The use of local adaptive models in reinforcement learning is suggested and exemplified by some experiments. This idea is behind all the work presented in this thesis. A method for learning to predict the reward called the prediction matrix memory is presented. This structure is similar to the correlation matrix memory but differs in that it is not only able to generate responses to given stimuli but also to predict the rewards in reinforcement learning. The prediction matrix memory uses the channel representation, which is also described. A dynamic binary tree structure that uses the prediction matrix memories as local adaptive models is presented. The theory of canonical correlation is described and its relation to the generalized eigenproblem is discussed. It is argued that the directions of canonical correlations can be used as linear models in the input and output spaces respectively in order to represent input and output signals that are maximally correlated. It is also argued that this is a better representation in a response generating system than, for example, principal component analysis since the energy of the signals has nothing to do with their importance for the response generation. An iterative method for finding the canonical correlations is presented. Finally, the possibility of using the canonical correlation for response generation in a reinforcement learning system is indicated.
@phdthesis{diva2:288543,
author = {Borga, Magnus},
title = {{Reinforcement Learning Using Local Adaptive Models}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 507}},
year = {1995},
address = {Sweden},
}
This thesis presents a signal representation in terms of operators. The signal is assumed to be an element of a vector space and subject to transformations of operators. The operators form continuous groups, so-called Lie groups. The representation can be used for signals in general, in particular if spatial relations are undefined, and it does not require a basis of the signal space to be useful.
Special attention is given to orthogonal operator groups which are generated by anti-Hermitian operators by means of the exponential mapping. It is shown that the eigensystem of the group generator is strongly related to properties of the corresponding operator group. For one-parameter orthogonal operator groups, a phase concept is introduced. This phase can for instance be used to distinguish between spatially even and odd signals and, therefore, corresponds to the usual phase for multi-dimensional signals.
Given one operator group that represents the variation of the signal and one operator group that represents the variation of a corresponding feature descriptor, an equivariant mapping maps the signal to the descriptor such that the two operator groups correspond. Sufficient conditions are derived for a general mapping to be equivariant with respect to a pair of operator groups. These conditions are expressed in terms of the generators of the two operator groups. As a special case, second order homogeneous mappings are considered, and examples of how second order mappings can be used to obtain different types of feature descriptors are presented, in particular for operator groups that are homomorphic to rotations in two and three dimensions, respectively. A generalization of directed quadrature filters is made. All feature extraction algorithms that are presented are discussed in terms of phase invariance.
Simple procedures that estimate group generators which correspond to one-parameter groups are derived and tested on an example. The resulting generator is evaluated by using its eigensystem in implementations of two feature extraction algorithms. It is shown that the resulting feature descriptor has good accuracy with respect to the corresponding feature value, even in the presence of signal noise.
@phdthesis{diva2:302847,
author = {Nordberg, Klas},
title = {{Signal Representation and Processing using Operator Groups}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 366}},
year = {1994},
address = {Sweden},
}
This thesis deals with filtering of multidimensional signals. A large part of the thesis is devoted to a novel filtering method termed "Normalized convolution". The method performs local expansion of a signal in a chosen filter basis which not necessarily has to be orthonormal. A key feature of the method is that it can deal with uncertain data when additional certainty statements are available for the data and/or the filters. It is shown how false operator responses due to missing or uncertain data can be significantly reduced or eliminated using this technique. Perhaps the most well-known of such effects are the various 'edge effects' which invariably occur at the edges of the input data set. The method is an example of the signal/certainty philosophy, i.e. the separation of both data and operator into a signal part and a certainty part. An estimate of the certainty must accompany the data. Missing data are simply handled by setting the certainty to zero. Localization or windowing of operators is done using an applicability function, the operator equivalent to certainty, not by changing the actual operator coefficients. Spatially or temporally limited operators are handled by setting the applicability function to zero outside the window.
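The signal/certainty idea can be illustrated for a single local neighbourhood as a weighted least-squares projection onto a small basis, with weights given by applicability times certainty; the basis, window and data values below are placeholders, not the thesis implementation.

import numpy as np

x = np.arange(-3, 4)                                   # local coordinates
f = np.array([2.0, 2.5, 3.0, 0.0, 4.0, 4.5, 5.0])      # signal, one sample missing
c = np.array([1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0])      # certainty (0 = missing)
a = np.exp(-x**2 / 4.0)                                # applicability (localization)

# Basis: constant and linear functions (need not be orthonormal)
B = np.stack([np.ones_like(x), x], axis=1).astype(float)

# Normalized convolution at this point: weighted least squares with W = diag(a * c)
W = np.diag(a * c)
coeffs = np.linalg.solve(B.T @ W @ B, B.T @ W @ f)
print("local mean and slope estimates:", coeffs)       # the missing sample is ignored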
The use of tensors in estimation of local structure and orientation using spatiotemporal quadrature filters is reviewed and related to dual tensor bases. The tensor representation conveys the degree and type of local anisotropy. For image sequences, the shape of the tensors describes the local structure of the spatiotemporal neighbourhood and provides information about local velocity. The tensor representation also conveys information for deciding if true flow or only normal flow is present. It is shown how normal flow estimates can be combined into a true flow using averaging of this tensor field description.
Important aspects of representation and techniques for grouping local orientation estimates into global line information are discussed. The uniformity of some standard parameter spaces for line segmentation is investigated. The analysis shows that, to avoid discontinuities, great care should be taken when choosing the parameter space for a particular problem. A new parameter mapping well suited for line extraction, the Möbius strip parameterization, is defined. The method has similarities to the Hough Transform.
Estimation of local frequency and bandwidth is also discussed. Local frequency is an important concept which provides an indication of the appropriate range of scales for subsequent analysis. One-dimensional and two-dimensional examples of local frequency estimation are given. The local bandwidth estimate is used for defining a certainty measure. The certainty measure enables the use of a normalized averaging process increasing robustness and accuracy of the frequency statements.
@phdthesis{diva2:302457,
author = {Westin, Carl-Fredrik},
title = {{A Tensor Framework for Multidimensional Signal Processing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 348}},
year = {1994},
address = {Sweden},
}
The work presented in this thesis is based on the basic idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is the generality of such an approach, especially that the reinforcement learning paradigm allows systems to be designed which can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem the machine is to solve.
Learning is considered to be a bootstrapping procedure. Fragmented past experience, of what to do when performing well, is used for response generation. The new response, in its turn, adds more information to the system about the environment. Gained knowledge is represented by a behavior probability density function. This density function is approximated with a number of normal distributions which are stored in the nodes of a binary tree. The tree structure is grown by applying a recursive algorithm to the stored stimuli-response combinations, called decisions. By considering both the response and the stimulus, the system is able to bring meaning to structures in the input signal. The recursive algorithm is first applied to the whole set of stored decisions. A mean decision vector and a covariance matrix are calculated and stored in the root node. The decision space is then partitioned into two halves across the direction of maximal data variation. This procedure is now repeated recursively for each of the two halves of the decision space, forming a binary tree with mean vectors and covariance matrices in its nodes.
The tree is the system's guide to response generation. Given a stimulus, the system searches for responses likely to result in highly reinforced decisions. This is accomplished by treating the sum of the normal distributions in the leaves as distribution describing the behavior of the system. The sum of normal distributions, with the current stimulus held fixed, is finally used for random generation of the response.
This procedure makes it possible for the system to have several equally plausible responses to one stimulus. Not applying maximum likelihood principles will make the system more explorative and reduce its risk of being trapped in local minima.
The performance and complexity of the learning tree is investigated and compared to some well known alternative methods. Presented are also some simple, yet principally important, experiments verifying the behavior of the proposed algorithm.
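For illustration, the sketch below conditions a single stored normal distribution on an observed stimulus and draws a random response; the numbers are placeholders, and the full system instead uses the sum of distributions over all leaves of the tree.

import numpy as np

# Statistics of decisions d = (stimulus s, response r) in one node (placeholder values)
mu = np.array([0.5, 1.0, 2.0])          # [mean of s, mean of r1, mean of r2]
Sigma = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.5, 0.3],
                  [0.2, 0.3, 0.8]])

s = np.array([0.8])                      # observed stimulus

# Partition into stimulus and response blocks and condition on the stimulus
S_ss, S_sr = Sigma[:1, :1], Sigma[:1, 1:]
S_rs, S_rr = Sigma[1:, :1], Sigma[1:, 1:]
cond_mean = mu[1:] + S_rs @ np.linalg.solve(S_ss, s - mu[:1])
cond_cov = S_rr - S_rs @ np.linalg.solve(S_ss, S_sr)

# Random response generation from the conditioned distribution
response = np.random.default_rng(0).multivariate_normal(cond_mean, cond_cov)
print("sampled response:", response)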
@phdthesis{diva2:288301,
author = {Landelius, Tomas},
title = {{Behavior Representation by Growing a Learning Tree}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 397}},
year = {1993},
address = {Sweden},
}
A framework for a hierarchical approach to gaze control of a robot vision system is presented, and an implementation on a simulated robot is described. The robot is equipped with a heterogeneously sampled imaging system, a fovea, resembling the varying resolution in a human retina. The relation between the fovea and multiresolution image processing is discussed together with implications for image operations.
A stereo algorithm based on local phase differences is presented both as a stand alone algorithm and as a part of a vergence control system for the robot above. The algorithm is fast and can handle large disparities and still give subpixel accuracy. The algorithm uses a wavelet approach beginning at a coarse resolution and refining the disparity estimates while increasing the resolution. The method produces robust and accurate estimates of displacement on synthetic as well as real life stereo images. Disparity filter design is discussed and a number of filters are tested, e.g. Gabor filters and lognorm quadrature filters. A special disparity filter designed to have only one phase cycle is also presented.
A theory for sequentially defined, data modified focus of attention, based on nested regions of interest is presented. The theory is applied to a preattentive gaze control system consisting of three control levels: camera vergence, edge tracking and object finding. The object finder uses circular symmetries as indications of possible objects and directs the fixation point accordingly. The edge tracker makes the fixation point follow the structures in the scene and the camera vergence control assures that both eyes are fixating on the same point. The coordination between the levels is handled with potential fields in the robot parameter space.
@phdthesis{diva2:311047,
author = {Westelius, Carl-Johan},
title = {{Preattentive gaze control for robot vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 322}},
year = {1992},
address = {Sweden},
}
This thesis concerns robust estimation of low-level features for use in computer vision systems. The presentation consists of two parts.
The first part deals with controllable filters and models. A basis filter set is introduced which supports a computationally efficient synthesis of filters in arbitrary orientations. In contrast to many earlier methods, this approach allows the use of more complex models at an early stage of the processing. A new algorithm for robust estimation of orientation is presented. The algorithm is based on synthesized quadrature responses and supports the simultaneous representation and individual averaging of multiple events. These models are then extended to include estimation and representation of more complex image primitives such as line ends, T-junctions, crossing lines and curvature. The proposed models are based on symmetry properties in the Fourier domain as well as in the spatial plane and the feature extraction is performed by applying the original basis filters directly on the grey-level image. The basis filters and interpolation scheme are finally generalized to allow synthesis of 3-D filters. The performance of the proposed models and algorithms is demonstrated using test images of both synthetic and real world data.
The second part of the thesis concerns an image feature representation adapted for a robust analogue implementation. A possible use for this approach is in analogue VLSI or corresponding analogue hardware adapted for neural networks. The methods are based on projections of quadrature filter responses and mutual inhibition of magnitude signals.
@phdthesis{diva2:302868,
author = {Andersson, Mats T.},
title = {{Controllable Multi-dimensional Filters and Models in Low-Level Computer Vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 282}},
year = {1992},
address = {Sweden},
}
Feature extraction from a tensor based local image representation introduced by Knutsson in [37] is discussed. The tensor representation keeps statements of structure, certainty of statement and energy separate. Further processing for obtaining new features also having these three entities separate is achieved by the use of a new concept, tensor field filtering. Tensor filters for smoothing and for extraction of circular symmetries are presented and discussed in particular. These methods are used for corner detection and extraction of more global features such as lines in images. A novel method for grouping local orientation estimates into global line parameters is introduced. The method is based on a new parameter space, the Möbius Strip parameter space, which has similarities to the Hough transform. A local centroid clustering algorithm is used for classification in this space. The procedure automatically divides curves into line segments with appropriate lengths depending on the curvature. A linked list structure is built up for storing data in an efficient way.
@phdthesis{diva2:311051,
author = {Westin, Carl-Fredrik},
title = {{Feature extraction based on a tensor image description}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 288}},
year = {1991},
address = {Sweden},
}
This thesis concerns the estimation and description of curvature for computer vision applications. Different types of multi-dimensional data are considered: images (2D); volumes (3D); time sequences of images (3D); and time sequences of volumes (4D).
The methods are based on local Fourier domain models and use local operations such as filtering. A hierarchical approach is used. Firstly, the local orientation is estimated and represented with a vector field equivalent description. Secondly, the local curvature is estimated from the orientation description. The curvature algorithms are closely related to the orientation estimation algorithms and the methods as a whole give a unified approach to the estimation and description of orientation and curvature. In addition, the methodology avoids thresholding and premature decision making.
Results on both synthetic and real world data are presented to illustrate the algorithms' performance with respect to accuracy and noise insensitivity. Examples illustrating the use of the curvature estimates for tasks such as image enhancement are also included.
@phdthesis{diva2:311049,
author = {Bårman, Håkan},
title = {{Hierarchical curvature estimation in computer vision}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 253}},
year = {1991},
address = {Sweden},
}
This thesis contains a presentation and an analysis of adaptive filtering strategies for multidimensional data. The size, shape and orientation of the filter are signal controlled and thus adapted locally to each neighbourhood according to a predefined model. The filter is constructed as a linear weighting of fixed oriented bandpass filters having the same shape but different orientations. The adaptive filtering methods have been tested on both real data and synthesized test data in 2D, e.g. still images, 3D, e.g. image sequences or volumes, with good results. In 4D, e.g. volume sequences, the algorithm is given in its mathematical form. The weighting coefficients are given by the inner products of a tensor representing the local structure of the data and the tensors representing the orientation of the filters.
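A minimal sketch of this weighting step is given below, assuming each fixed filter is represented by the outer product of its direction vector; dual-basis details and the actual filter responses are omitted, and all values are placeholders.

import numpy as np

# Control tensor describing the local structure in one 2D neighbourhood (placeholder)
C = np.array([[0.9, 0.2],
              [0.2, 0.3]])

# Orientations of four fixed bandpass filters (assumed directions)
angles = np.deg2rad([0, 45, 90, 135])
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# Weight of each filter: inner product between C and the filter orientation tensor
weights = np.array([np.trace(C @ np.outer(n, n)) for n in directions])
print("filter weights:", weights)
# The adaptive output would then be a weighted sum of the fixed filter responses.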
The procedure and filter design in estimating the representation tensor are described. In 2D, the tensor contains information about the local energy, the optimal orientation and a certainty of the orientation. In 3D, the information in the tensor is the energy, the normal to the best fitting local plane and the tangent to the best fitting line, and certainties of these orientations. In the case of time sequences, a quantitative comparison of the proposed method and other (optical flow) algorithms is presented.
The estimation of control information is made in different scales. There are two main reasons for this. A single filter has a particular limited pass band which may or may not be tuned to the different sized objects to describe. Second, size or scale is a descriptive feature in its own right. All of this requires the integration of measurements from different scales. The increasing interest in wavelet theory supports the idea that a multiresolution approach is necessary. Hence the resulting adaptive filter will adapt also in size and to different orientations in different scales.
@phdthesis{diva2:302863,
author = {Haglund, Leif},
title = {{Adaptive Multidimensional Filtering}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 284}},
year = {1991},
address = {Sweden},
}
This paper presents a new method for detection and estimation of curvature. The algorithm is implemented in the hierarchical feature pyramid proposed in the GOP concept. Curvature is handled at the second level of the pyramid with a vector field description of the orientation of the image as input. This complex image is convolved with typically eight filters. The filter responses are combined into a description of curvature direction, curvature magnitude and curvature/linearity ratio. The procedure resembles in many ways the algorithms for the first level of the feature pyramid and seems to be a natural extension of these. The method is easy to implement and the tests made show that it performs well and can handle noisy conditions. Some comparisons with other algorithms have been carried out, and the results indicate that the methodology presented in this paper has a number of important advantages over other methods.
@phdthesis{diva2:288627,
author = {Bårman, Håkan},
title = {{Curvature Estimation and Description}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 167}},
year = {1989},
address = {Sweden},
}
@phdthesis{diva2:288315,
author = {Andersson, Mats},
title = {{Image Feature Representation for Analogue VLSI Representation}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 193}},
year = {1989},
address = {Sweden},
}
Scale analysis and description has over the last years become one of the major research fields in image processing. There are two main reasons for this. A single filter has a particular limited pass band which may or may not be tuned to the different sized objects to describe. Second, size or scale is a descriptive feature in its own right. All of this requires the integration of measurements from different scales.
The thesis describes a new algorithm which detects in what scale an event appears and also in what scale it disappears. In this way the scale space is subdivided into a number of intervals. Within each scale interval a consistency check is performed to get the certainty of the detection. It will be shown that using a three-dimensional phase representation of image data, it is possible to do both the subdivision and the consistency check in a simple manner. The scale levels between different events are detected when a certain dot product becomes negative and the consistency will be a vector summation between these scales. The specific levels where a split of scale space occurs will, of course, be contextually dependent and there will also be different numbers of levels in different parts of the images. Finally an application of size description of this information will be described.
@phdthesis{diva2:288312,
author = {Haglund, Leif},
title = {{Hierarchical Scale Analysis of Images Using Phase Description}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 168}},
year = {1989},
address = {Sweden},
}
The extraction of features is necessary for all aspects of image processing and analysis such as classification, segmentation, enhancement and coding. In the course of developing models to describe images, a need arises for description of more complex structures than lines. This need does not reject the importance of line structures but indicates the need to complement and utilize them in a more systematic way.
In this thesis, some new methods for extraction of local symmetry features as well as experimental results and applications are presented. The local images are expanded in terms of orthogonal functions with iso-value curves being harmonic functions. Circular, linear, hyperbolic and parabolic structures are studied in particular and some two-step algorithms involving only convolutions are given for detection purposes. Confidence measures, with a reliability verified by both theoretical and experimental studies, are proposed. The method is extended to symmetric patterns fulfilling certain general conditions. It is shown that in the general case the resulting algorithms are implementable through the same computing schemes used for detection of linear structures except for a use of different filters.
Multidimensional linear symmetry is studied, and an application problem in 3-D, in particular optical flow, together with the solution proposed by this general framework, is presented. The solution results in a closed form algorithm consisting of two steps, in which spatio-temporal gradient and Gaussian filtering are performed. The result consists of an optical flow estimate minimizing the linear symmetry criterion and a confidence measure based on the minimum error. The frequency band sensitivity of the obtained results is found to be controllable. Experimental results are presented.
@phdthesis{diva2:311053,
author = {Bigün, Josef},
title = {{Local symmetry features in image processing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 179}},
year = {1988},
address = {Sweden},
}
There are a number of different algorithms for object motion estimation in image sequences. Almost every algorithm is based on one of three different mathematical methods. An overview of these methods is given, together with some published application examples on object tracking.
A method for tracking of multiple moving objects has been developed on a GOP 300 image processing system. This method works on image sequences with a stationary background, and can be divided into the following steps:
- Find the positions for all objects that have moved.
- Predict the new positions for all known objects.
- Match these two sets of points.
- Produce the required output.
These steps are repeated for every sample of the sequence. As an output from every processed sample in a test sequence, both a resulting image and a record in a datafile have been generated. The resulting image is a copy of the actual sample with the active object identities overlayed at the corresponding positions. The resulting images have been stored on a video tape.
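For illustration, the prediction/matching step outlined above can be sketched with a modern assignment solver on placeholder coordinates; this is of course not the original GOP 300 implementation.

import numpy as np
from scipy.optimize import linear_sum_assignment

# Predicted positions of known objects and detected positions in the current frame
predicted = np.array([[10.0, 20.0], [40.0, 35.0], [70.0, 15.0]])
detected = np.array([[41.0, 34.0], [9.0, 21.0], [69.5, 16.0]])

# Match the two point sets by minimizing the total Euclidean distance
cost = np.linalg.norm(predicted[:, None, :] - detected[None, :, :], axis=2)
rows, cols = linear_sum_assignment(cost)
for obj_id, det_id in zip(rows, cols):
    print(f"object {obj_id} -> detection {det_id} (distance {cost[obj_id, det_id]:.1f})")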
@phdthesis{diva2:274283,
author = {Wiklund, Johan},
title = {{Image Sequence Analysis for Tracking of Moving Objects}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 107}},
year = {1987},
address = {Sweden},
}
New methods for feature extraction based on the spectral properties of local neighbourhoods are presented. The spectral behaviour of the neighbourhoods is investigated in the spatial domain using the Parseval relation applied to partial derivative pictures. Two types of such properties are considered for circular symmetric and linear symmetric neighbourhoods. These two properties are the existence of point concentration and line concentration in the spectra. For the circular symmetry investigation a new basis function set is introduced. To obtain a spectrum in the terms of these basis function sets, a scalar product is introduced for circular neighbourhoods. The same is carried out for linear symmetry spectra using the well-known basis set consisting of cosines and the 𝓛2(Ω) scalar product. Confidence parameters are introduced to measure the significance of the extracted features. These are basically different types of variance measures and they are shown to be specific for the desired information: the existence of point concentration or line concentration in the spectra of the local neighbourhoods.
@phdthesis{diva2:288297,
author = {Bigun, Josef},
title = {{Circular Symmetry Models in Image Processing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Thesis No. 85}},
year = {1986},
address = {Sweden},
}
Image processing is a broad field posing a wide range of problems. The work presented in this dissertation is mainly concerned with filter design subject to different criteria and constraints.
The first part describes the development of a new radiographic reconstruction method designated Ectomography. The method is novel in that it allows reconstruction of an arbitrarily thick layer of an object using a limited viewing angle.
The subject of the second part is estimation and filtering of local image information. Quadrature filters are designed enabling accurate orientation and frequency estimates. The extracted information is shown to provide a good basis for efficient image enhancement and coding procedures.
@phdthesis{diva2:311054,
author = {Knutsson, Hans},
title = {{Filtering and reconstruction in image processing}},
school = {Linköping University},
type = {{Linköping Studies in Science and Technology. Dissertations No. 88}},
year = {1982},
address = {Sweden},
}
Other
This work proposes a weakly-supervised temporal action localization framework, called D2-Net, which strives to temporally localize actions using video-level supervision. Our main contribution is the introduction of a novel loss formulation, which jointly enhances the discriminability of latent embeddings and robustness of the output temporal class activations with respect to foreground-background noise caused by weak supervision. The proposed formulation comprises a discriminative and a denoising loss term for enhancing temporal action localization. The discriminative term incorporates a classification loss and utilizes a top-down attention mechanism to enhance the separability of latent foreground-background embeddings. The denoising loss term explicitly addresses the foreground-background noise in class activations by simultaneously maximizing intra-video and inter-video mutual information using a bottom-up attention mechanism. As a result, activations in the foreground regions are emphasized whereas those in the background regions are suppressed, thereby leading to more robust predictions. Comprehensive experiments are performed on multiple benchmarks, including THUMOS14 and ActivityNet1.2. Our D2-Net performs favorably in comparison to the existing methods on all datasets, achieving gains as high as 2.3% in terms of mAP at IoU=0.5 on THUMOS14.
@misc{diva2:1600816,
author = {Narayan, Sanath and Cholakkal, Hisham and Hayat, Munawar and Khan, Fahad Shahbaz and Yang, Ming-Hsuan and Shao, Ling},
title = {{D2-Net}},
howpublished = {Weakly-Supervised Action Localization via Discriminative Embeddings and Denoised Activations},
year = {2021},
}
Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de-facto objective function. The CE loss encourages features of a class to have a higher projection score on the true class-vector compared to the negative classes. However, this is a relative constraint and does not explicitly force different class features to be well-separated. Motivated by the observation that ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed `Orthogonal Projection Loss' (OPL) which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints on the mini-batch level. As compared to other alternatives of CE, OPL offers unique advantages e.g., no additional learnable parameters, does not require careful negative mining and is not sensitive to the batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise.
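A rough sketch of the idea as stated above is to penalize deviations from intra-class similarity one and inter-class similarity zero on normalized mini-batch features; the exact formulation and weighting used in the paper may differ, and the toy batch below is a placeholder.

import numpy as np

def orthogonality_loss(features, labels):
    # Normalize features so that dot products become cosine similarities
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    same = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    diff = labels[:, None] != labels[None, :]
    s_same = sim[same].mean()   # intra-class similarity, pushed towards 1
    s_diff = sim[diff].mean()   # inter-class similarity, pushed towards 0
    return (1.0 - s_same) + abs(s_diff)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))          # toy mini-batch of embeddings
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])
print("orthogonality-style loss:", orthogonality_loss(features, labels))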
@misc{diva2:1600814,
author = {Ranasinghe, Kanchana and Naseer, Muzammal and Hayat, Munawar and Khan, Salman and Khan, Fahad Shahbaz},
title = {{Orthogonal Projection Loss}},
howpublished = {},
year = {2021},
}
Multi-label zero-shot learning (ZSL) is a more realistic counter-part of standard single-label ZSL since several objects can co-exist in a natural image. However, the occurrence of multiple objects complicates the reasoning and requires region-specific processing of visual features to preserve their contextual cues. We note that the best existing multi-label ZSL method takes a shared approach towards attending to region features with a common set of attention maps for all the classes. Such shared maps lead to diffused attention, which does not discriminatively focus on relevant locations when the number of classes are large. Moreover, mapping spatially-pooled visual features to the class semantics leads to inter-class feature entanglement, thus hampering the classification. Here, we propose an alternate approach towards region-based discriminability-preserving multi-label zero-shot classification. Our approach maintains the spatial resolution to preserve region-level characteristics and utilizes a bi-level attention module (BiAM) to enrich the features by incorporating both region and scene context information. The enriched region-level features are then mapped to the class semantics and only their class predictions are spatially pooled to obtain image-level predictions, thereby keeping the multi-class features disentangled. Our approach sets a new state of the art on two large-scale multi-label zero-shot benchmarks: NUS-WIDE and Open Images. On NUS-WIDE, our approach achieves an absolute gain of 6.9% mAP for ZSL, compared to the best published results.
@misc{diva2:1600810,
author = {Narayan, Sanath and Gupta, Akshita and Khan, Salman and Khan, Fahad Shahbaz and Shao, Ling and Shah, Mubarak},
title = {{Discriminative Region-based Multi-Label Zero-Shot Learning}},
howpublished = {},
year = {2021},
}
We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.
@misc{diva2:1600807,
author = {Bhunia, Ankan Kumar and Khan, Salman and Cholakkal, Hisham and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and Shah, Mubarak},
title = {{Handwriting Transformers}},
howpublished = {},
year = {2021},
}
Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called: ‘Open World Object Detection’, where a model is tasked to: 1) identify objects that have not been introduced to it as ‘unknown’, without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy based unknown identification. Our experimental evaluation and ablation studies analyse the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterising unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.
@misc{diva2:1600805,
author = {Joseph, KJ and Khan, Salman and Khan, Fahad Shahbaz and Balasubramanian, Vineeth N},
title = {{Towards Open World Object Detection}},
howpublished = {},
year = {2021},
}
This book contains material for an introductory course on homogeneous representations for geometry in 2 and 3 dimensions, camera projections, representations of 3D rotations, epipolar geometry, and estimation of various types of geometric objects. Based on these results, a set of applications are presented. It also contains a toolbox of general results that are useful for the presented material. The book is intended for undergraduate studies at advanced level in master programs, or in PhD-courses at introductory level.
@misc{diva2:1136229,
author = {Nordberg, Klas},
title = {{Introduction to Representations and Estimation in Geometry}},
howpublished = {},
year = {2018},
}
@misc{diva2:275356,
author = {Johansson, Björn},
title = {{Rotational Symmetries, a Quick Tutorial}},
howpublished = {},
year = {2001},
}
Reports
The purpose of this document is to reflect on novel and upcoming methods for computer vision that might have relevance for application in robot vision and video analytics. The document covers many different sub-fields of computer vision, most of which have been addressed by our research activity at the computer vision laboratory. The report has been written based on a request of, and supported by, FOI.
@techreport{diva2:1165440,
author = {Felsberg, Michael},
title = {{Five years after the Deep Learning revolution of computer vision:
State of the art methods for online image and video analysis}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2017},
type = {Other academic},
number = {, },
address = {Sweden},
}
The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, it saturates for large negative inputs and it is continuously differentiable. However, it remains open whether ELU has the optimal shape and we address the quest for a superior activation function.
We use a new formulation to tune a piecewise linear activation function during training, to investigate the above question, and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved and the resulting learned activation function is shown to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not exactly pass through the origin indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function.
Experiments on Cifar-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron.
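One plausible reading of the ShELU, sketched below, is an ELU applied to an input shifted by a learnable per-neuron bias; the report's exact parameterization may differ, and the values shown are placeholders.

import numpy as np

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * np.expm1(x))

def shelu(x, shift):
    # Shifted ELU: the nonlinearity's kink no longer has to sit exactly at the origin
    return elu(x + shift)

x = np.array([[-2.0, -0.5, 0.0, 1.5]])       # toy pre-activations for four neurons
shift = np.array([0.1, -0.2, 0.05, 0.0])     # per-neuron shift, learned during training
print(shelu(x, shift))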
@techreport{diva2:1154026,
author = {Grelsson, Bertil and Felsberg, Michael},
title = {{Performance boost in Convolutional Neural Networks by tuning shifted activation functions}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2017},
type = {Other academic},
number = {, },
address = {Sweden},
}
@techreport{diva2:1083263,
author = {Eldesokey, Abdelrahman},
title = {{Normalized Convolutional Neural Networks for Sparse Data}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2017},
type = {Other academic},
number = {LiTH-ISY-R, 3096},
address = {Sweden},
}
A common computer vision task is navigation and mapping. Many indoor navigation tasks require depth knowledge of flat, unstructured surfaces (walls, floor, ceiling). With passive illumination only, this is an ill-posed problem. Inspired by small children using a torchlight, we use a spotlight for active illumination. Using our torchlight approach, depth and orientation estimation of unstructured, flat surfaces boils down to estimation of ellipse parameters. The extraction of ellipses is very robust and requires little computational effort.
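The ellipse estimation itself can be illustrated with a standard least-squares conic fit; the SVD-based sketch below uses synthetic contour points and is not necessarily the fitting procedure used in the report.

import numpy as np

def fit_conic(x, y):
    # General conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0,
    # estimated as the null space of the design matrix via SVD
    D = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    return np.linalg.svd(D)[2][-1]           # coefficients, defined up to scale

# Synthetic points on an ellipse, standing in for the detected spotlight contour
t = np.linspace(0, 2 * np.pi, 50)
x = 3.0 * np.cos(t) + 0.5
y = 1.5 * np.sin(t) - 0.2
coeffs = fit_conic(x, y)
print("conic coefficients:", coeffs / np.linalg.norm(coeffs))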
@techreport{diva2:650756,
author = {Felsberg, Michael and Larsson, Fredrik and Wang, Han and Ynnerman, Anders and Schön, Thomas},
title = {{Torchlight Navigation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2011},
type = {Other academic},
number = {LiTH-ISY-R, 3004},
address = {Sweden},
}
This report introduces some simplifications to the method by Fitzgibbon et al. that allows for 3D model construction from turn-table sequences. It is assumed that the reader is already familiar with that work in order to fully understand this report.
Fitzgibbon et al. presents a method for 3D model construction that utilizes the extra constraints imposed by turn-table sequences. Restricting the scenario to a turn-table sequence with a single camera with fixed settings produces these extra constraints:
C1. The internal parameters for the camera are the same for all images
C2. The motion of the camera can be described by a rotation around a single axis
It is shown that in the uncalibrated case the number of parameters to estimate is m + 8 where m is the number of images.
We further simplify the problem by using extra constraints given by the fact that we know:
C3. The internal parameters of the camera, i.e the K matrix
C4. That the angle between each pair of consecutive cameras is the same
Using these extra simplifications makes it possible to create a 3D model from realistic data without using Bundle Adjustment.
@techreport{diva2:434353,
author = {Larsson, Fredrik},
title = {{Automatic 3D Model Construction for Turn-Table Sequences - A Simplification}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2011},
type = {Other academic},
number = {LiTH-ISY-R, 3022},
address = {Sweden},
}
We will present the basic theory for the camera geometry. Our goal is camera calibration and the tools necessary for this. We start with homogeneous matrices that can be used to describe geometric transformations in a simple manner. Then we consider the pinhole camera model, the simplified camera model that we will show how to calibrate.
A camera matrix describes the mapping from the 3D world to a camera image. The camera matrix can be determined through a number of corresponding points measured in the world and the image. We also demonstrate the common special case of camera calibration when it can be assumed that the world is flat. Then, a plane in the world is transformed to the image plane. Such a plane-to-plane mapping is called a homography.
Finally, we discuss some useful mathematical tools needed for camera calibration. We show that the solution we present for the determination of the camera matrix is equivalent to a least-squares solution. We also show how to solve a homogeneous system of equations using SVD (singular value decomposition).
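The SVD step can be illustrated as follows, here applied to a homography estimated with the direct linear transform from four placeholder correspondences; this is a generic textbook sketch rather than the report's own derivation.

import numpy as np

def solve_homogeneous(A):
    # Least-squares solution of A h = 0 with ||h|| = 1: the right singular
    # vector belonging to the smallest singular value
    return np.linalg.svd(A)[2][-1]

# Four point correspondences (x, y) <-> (u, v); each gives two DLT rows
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
dst = np.array([[0.1, 0.0], [1.2, 0.1], [1.1, 1.0], [0.0, 0.9]])
rows = []
for (x, y), (u, v) in zip(src, dst):
    rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
    rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])

H = solve_homogeneous(np.array(rows)).reshape(3, 3)
print(H / H[2, 2])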
@techreport{diva2:693117,
author = {Magnusson, Maria},
title = {{Short on camera geometry and camera calibration}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2010},
type = {Other academic},
number = {LiTH-ISY-R, 3070},
address = {Sweden},
}
In this work we present a region detector, an adaptation to range data of the popular Maximally Stable Extremal Regions (MSER) region detector. We call this new detector Maximally Robust Range Regions (MRRR). We apply the new detector to real range data captured by a commercially available laser range camera. Using this data we evaluate the repeatability of the new detector and compare it to some other recently published detectors. The presented detector shows a repeatability which is better or the same as the best of the other detectors. The MRRR detector also offers additional data on the detected regions. The additional data could be crucial in applications such as registration or recognition.
@techreport{diva2:325006,
author = {Viksten, Fredrik and Forss\'{e}n, Per-Erik},
title = {{Maximally Robust Range Regions}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2010},
type = {Other academic},
number = {LiTH-ISY-R, 2961},
address = {Sweden},
}
This document is an addendum to the main text in A local geometry-based descriptor for 3D data applied to object pose estimation by Fredrik Viksten and Klas Nordberg. This addendum gives proofs for propositions stated in the main document. This addendum also details how to extract information from the fourth order tensor referred to as S22 in the main document.
@techreport{diva2:325000,
author = {Nordberg, Klas and Viksten, Fredrik},
title = {{A local geometry based descriptor for 3D data:
Addendum on rank and segment extraction}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2010},
type = {Other academic},
number = {LiTH-ISY-R, 2951},
address = {Sweden},
}
This report gives an overview and motivates the design of a C++ framework for object recognition using channel-coded feature maps. The code was produced in connection to the work on my PhD thesis Channel-Coded Feature Maps for Object Recognition and Machine Learning. The package contains algorithms ranging from basic image processing routines to specific complex algorithms for creating channel-coded feature maps through piecewise polynomials. Much emphasis has been put in creating a flexible framework using virtual interfaces. This makes it easy e.g. to switch between different image primitives detectors or learning methods in an object recognizer. Some common design choices include an image class with a convenient but fast pixel access, a configurable assert macro for error handling and a common base class for object ownership management. The main computer vision algorithms are channel-coded feature maps (CCFMs) including their derivatives, single-sided colored lines, object detection using an abstract hypothesize-verify framework and tracking and pose estimation using locally weighted regression and CCFMs. The code is considered as having alpha status at best. It is available under the GNU General Public License (GPL) and is mainly intended for future research on the subject.
@techreport{diva2:288558,
author = {Jonsson, Erik},
title = {{Object Recognition using Channel-Coded Feature Maps: C++ Implementation Documentation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2008},
type = {Other academic},
number = {LiTH-ISY-R, 2838},
address = {Sweden},
}
One major goal of the COSPAL project is to develop an artificial cognitive system architecture with the capability of exploratory learning. Exploratory learning is a strategy that allows generalization to be applied at a conceptual level, resulting in an extension of competences. Whereas classical learning methods aim at the best possible generalization, i.e., concluding from a number of samples of a problem class to the problem class itself, exploration aims at applying acquired competences to a new problem class. Incremental or online learning is an inherent requirement for performing exploratory learning.
Exploratory learning requires new theoretic tools and new algorithms. In the COSPAL project, we mainly investigate reinforcement-type learning methods for exploratory learning, and in this paper we focus on its algorithmic aspect. Learning is performed in terms of four nested loops, where the outermost loop reflects the user-reinforcement-feedback loop, the intermediate two loops switch between different solution modes at the symbolic and sub-symbolic levels, respectively, and the innermost loop performs the acquired competences in terms of perception-action cycles. We present a system diagram which explains this process in more detail.
We discuss the learning strategy in terms of learning scenarios provided by the user. This interaction between user ('teacher') and system is a major difference from most existing systems, where the system designer places his world model into the system. We believe that this is the key to extendable, robust system behavior and successful interaction of humans and artificial cognitive systems.
We furthermore address the issue of bootstrapping the system and, in particular, the visual recognition module. We give some more in-depth details about our recognition method and how feedback from higher levels is implemented. The described system is, however, work in progress and no final results are available yet. The preliminary results achieved so far clearly point towards a successful proof of the architecture concept.
@techreport{diva2:302803,
author = {Felsberg, Michael and Wiklund, Johan and Jonsson, Erik and Moe, Anders and Granlund, Gösta},
title = {{Exploratory Learning Structure in Artificial Cognitive Systems}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2006},
type = {Other academic},
number = {LiTH-ISY-R, 2738},
address = {Sweden},
}
To summarize, the VISATEC project was initiated to combine the specific scientific competencies of the research groups at CAU and LiU, together with the industrial view on vision applications, in order to develop novel, more robust algorithms for object localization and recognition. This goal was achieved by a two-fold strategy, whereby on the one hand more robust basic algorithms were developed and on the other hand a method for the combination of these algorithms was devised. In particular, the latter confirmed the consortium's belief that an appropriate combination of a number of basic algorithms will lead to more robust results than any single method could achieve.
However, the multi-cue integration is just one algorithm of many that were developed in the VISATEC project. All developed algorithms are described in some detail in the remainder of this report. An overview of the respective publications can be found in the appendix.
Despite some difficulties that were encountered along the way, we as a consortium feel that the VISATEC project was a success. That this is not only our own opinion is reflected in the outcome of the final review. We believe that the work done during these three years of the project not only furthered our understanding of the matter, but also added to the knowledge within the scientific community and showed new possibilities for industrial vision applications.
@techreport{diva2:288604,
author = {Sommer, Gerald and Granlund, Gösta and Granert, Oliver and Krause, Martin and Nordberg, Klas and Perwass, Christian and Söderberg, Robert and Viksten, Fredrik and Chavarria, Marco},
title = {{Information Society Technologies (IST) programme:
Final Report}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
address = {Sweden},
}
This report introduces a robust contour descriptor for view-based object recognition. In recent years great progress has been made in the field of view-based object recognition, mainly due to the introduction of texture-based features such as SIFT and MSER. Although these are remarkably successful for textured objects, they have problems with man-made objects with little or no texture. For such objects, either explicit geometrical models, or contour and shading based features are also needed. This report introduces a robust contour descriptor which we hope can be combined with texture-based features to obtain object recognition systems that work in a wider range of situations. Each detected contour is described as a sequence of line and ellipse segments, both of which have well-defined geometrical transformations to other views. The feature detector is also quite fast, mainly because chains of contour points are detected first; these chains are then split into line segments, which are later either kept, grouped into ellipses, or discarded. We demonstrate the robustness of the feature detector with a repeatability test under general homography transformations of a planar scene. Through the repeatability test, we find that using ellipse segments instead of lines, where appropriate, improves repeatability. We also apply the features in a robotic setting where object appearances are learned by manipulating the objects.
@techreport{diva2:288582,
author = {Forssen, Per-Erik and Moe, Anders},
title = {{Contour Descriptors for View-Based Object Recognition}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
number = {LiTH-ISY-R, 2706},
address = {Sweden},
}
The MATLAB/C program take version 3.1 is a program for simulation of X-ray projections from 3D volume data. It is based on an older C version by Muller-Merbach as well as an extended C version by Turbell. The program can simulate 2D X-ray projections of 3D objects. These data can then be input to 3D reconstruction algorithms. Here, however, we only demonstrate a couple of 2D reconstruction algorithms, written in MATLAB. Simple MATLAB examples show how to generate the take projections followed by subsequent reconstruction. Compared to the old take version, the C code has been carefully revised. A preliminary, rather untested feature for using a polychromatic X-ray source with different energy levels was already included in the old take version. The polychromatic X-ray feature in the current version, however, has been carefully tested. For example, it has been compared with the results from the program described by Malusek et al. We also demonstrate experiments with a polychromatic X-ray source and a Plexiglass object giving the beam-hardening artefact. Detector sensitivity for different energy levels is not included in take. However, in section~\ref{sec:realexperiment}, we describe a technique to include the detector sensitivity in the energy spectrum. Finally, an experiment comparing real and simulated data was performed. The result was not completely successful, but we still demonstrate it. Contemporary analytical reconstruction methods for helical cone-beam CT have to be designed to handle the Long Object Problem. Normally, a moderate amount of over-scanning is sufficient for reconstruction of a certain Region-of-interest (ROI). Unfortunately, for iterative methods, it seems that the useful ROI will diminish with every iteration step. The remedies proposed here are twofold. Firstly, we use careful extrapolation and masking of projection data. Secondly, we generate and utilize projection data from incompletely reconstructed volume parts, which is rather counter-intuitive and contradictory to our initial assumptions. The results seem very encouraging. Even voxels close to the boundary of the original ROI are enhanced by the iterative loop as well as the middle part.
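To make the beam-hardening effect mentioned above concrete, here is a small numeric sketch with a made-up two-bin spectrum and attenuation values (not the spectra or materials used in take): the effective attenuation per unit length drops as the path length grows, which is what produces the cupping artefact in the reconstruction.

```python
import numpy as np

# Hypothetical two-bin spectrum S(E) and energy-dependent attenuation mu(E)
# for a Plexiglass-like material; the numbers are illustrative only.
S = np.array([0.6, 0.4])       # relative photon counts per energy bin
mu = np.array([0.35, 0.15])    # attenuation coefficients [1/cm] per bin

def transmitted(L):
    """Total transmitted intensity after path length L [cm]."""
    return np.sum(S * np.exp(-mu * L))

I0 = transmitted(0.0)
for L in [1.0, 2.0, 4.0, 8.0]:
    p = -np.log(transmitted(L) / I0)          # measured projection value
    print(f"L = {L:4.1f} cm   effective mu = {p / L:.3f} 1/cm")
# The effective mu decreases with L: low-energy photons are filtered out
# first, which is exactly the beam-hardening effect.
```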
@techreport{diva2:288581,
author = {Seger, Olle and Seger, Maria Magnusson},
title = {{The MATLAB/C program take - a program for simulation of X-ray projections from 3D volume data. Demonstration of beam-hardening artefacts in subsequent CT reconstruction.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
number = {LiTH-ISY-R, 2682},
address = {Sweden},
}
@techreport{diva2:262476,
author = {Jonsson, Erik and Felsberg, Michael and Granlund, Gösta},
title = {{Incremental Associative Learning}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
number = {LiTH-ISY-R, 2691},
address = {Sweden},
}
@techreport{diva2:257175,
author = {Forss\'{e}n, Per-Erik and Johansson, Björn and Granlund, Gösta},
title = {{Learning under Perceptual Aliasing}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
address = {Sweden},
}
This report describes a method to detect and recognize objects from 3D laser radar data. The method is based on local descriptors computed from triplets of planes that are estimated from the data set. Each descriptor that is computed on query data is compared with descriptors computed on object model data to get a hypothesis of object class and pose. A hypothesis is either verified or rejected using a similarity measure between the model data set and the query data set.
@techreport{diva2:257173,
author = {Johansson, Björn and Moe, Anders},
title = {{Object Recognition in 3D Laser Radar Data using Plane triplets}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2005},
type = {Other academic},
number = {LiTH-ISY-R, 2708},
address = {Sweden},
}
In this paper we propose a new operator which combines advantages of monogenic scale-space and Gaussian scale-space, of the monogenic signal and the structure tensor. The gradient energy tensor (GET) defined in this paper is based on Gaussian derivatives up to third order using different scales. These filters are commonly available, separable, and have an optimal uncertainty. The response of this new operator can be used like the monogenic signal to estimate the local amplitude, the local phase, and the local orientation of an image, but it also allows the coherence of image regions to be measured, as in the case of the structure tensor.
@techreport{diva2:288639,
author = {Felsberg, Michael},
title = {{The GET Operator}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2633},
address = {Sweden},
}
This report evaluates the stability of two image interest point detectors, star-pattern points and points based on the fourth order tensor. The Harris operator is also included for comparison. Different image transformations are applied and the repeatability of points between a reference image and each of the transformed images is computed. The transforms are plane rotation, change in scale, change in view, and change in lighting conditions. We conclude that the result largely depends on the image content. The star-pattern points and the fourth order tensor model the image as locally straight lines, while the Harris operator is based on simple/non-simple signals. The two methods evaluated here perform equally well or better than the Harris operator if the model is valid, and perform worse otherwise.
@techreport{diva2:288612,
author = {Johansson, Björn and Söderberg, Robert},
title = {{A Repeatability Test for Two Orientation Based Interest Point Detectors}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2606},
address = {Sweden},
}
In this paper we present a new and efficient method to implement robust smoothing of low-level signal features: B-spline channel smoothing. This method consists of three steps: encoding of the signal features into channels, averaging of the channels, and decoding of the channels. We show that linear smoothing of channels is equivalent to robust smoothing of the signal features, where we make use of quadratic B-splines to generate the channels. The linear decoding from B-spline channels allows us to derive a robust error norm which is very similar to Tukey's biweight error norm. Channel smoothing is superior to iterative robust smoothing implementations like non-linear diffusion, bilateral filtering, and mean-shift approaches for four reasons: it has lower computational complexity, it is easy to implement, it chooses the global minimum error instead of the nearest local minimum, and it can also be used on non-linear spaces, such as orientation space. In the experimental part of the paper we compare channel smoothing and the previously mentioned three other approaches for 2D orientation data.
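A minimal numpy sketch of the three steps (encoding, channel averaging, decoding) with quadratic B-spline channels; the integer channel placement and the simple strongest-group decoding below are illustrative assumptions rather than the exact implementation described in the report.

```python
import numpy as np

def b2(t):
    """Quadratic B-spline kernel with support [-1.5, 1.5]."""
    t = np.abs(t)
    return np.where(t <= 0.5, 0.75 - t**2,
           np.where(t <= 1.5, 0.5 * (1.5 - t)**2, 0.0))

def encode(x, centers):
    """Channel-encode each sample in x -> array of shape (len(x), len(centers))."""
    return b2(x[:, None] - centers[None, :])

def decode(c, centers):
    """Linear decoding: pick the strongest group of three consecutive
    channels and return its weighted mean (the nearest robust mode)."""
    group = c[:-2] + c[1:-1] + c[2:]
    k = np.argmax(group) + 1
    w = c[k - 1:k + 2]
    return np.dot(centers[k - 1:k + 2], w) / np.sum(w)

# Samples around 3.1 plus one outlier at 9.0; the arithmetic mean is biased,
# while channel averaging followed by decoding ignores the outlier.
x = np.array([2.9, 3.0, 3.1, 3.2, 3.3, 9.0])
centers = np.arange(0, 12, dtype=float)
c_mean = encode(x, centers).mean(axis=0)
print("arithmetic mean:  ", x.mean())
print("channel-smoothed: ", decode(c_mean, centers))
```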
@techreport{diva2:288553,
author = {Felsberg, Michael and Forssen, Per-Erik and Scharr, Hanno},
title = {{Efficient Robust Smoothing of Low-Level Signal Features}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2619},
address = {Sweden},
}
Most contemporary CT systems employ non-exact methods. This treatise reports on how these methods could be transformed from non-exact to exact reconstruction methods by means of iterative post-processing. Compared to traditional algebraic reconstruction (ART) we expect much faster convergence (in theory quadratic), due to a much improved first guess and the fact that each iteration includes the same non-exact analytical reconstruction step as the first guess.
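A rough sketch of the update form described above, using 2D parallel-beam operators from scikit-image as stand-ins for the forward projection and the analytical reconstruction step; the phantom, angle sampling, and number of iterations are assumptions for illustration, not the setup from the report.

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

# 2D parallel-beam stand-ins for the forward projector P and the analytical
# reconstruction R; the point is the update form f <- f + R(p - P f).
image = rescale(shepp_logan_phantom(), 0.25)
theta = np.linspace(0.0, 180.0, 60, endpoint=False)
p = radon(image, theta=theta)

f = iradon(p, theta=theta)                    # first guess (filtered backprojection)
for _ in range(3):
    residual = p - radon(f, theta=theta)      # reproject the current estimate
    f = f + iradon(residual, theta=theta)     # correct with the same analytical step
    print("RMSE:", np.sqrt(np.mean((f - image) ** 2)))
```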
@techreport{diva2:288551,
author = {Danielsson, Per-Erik and Seger, Maria Magnusson},
title = {{Combining Fourier and iterative methods in computer tomography:
Analysis of an iteration scheme. The 2D-case}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2634},
address = {Sweden},
}
In this paper we present a new method to implement a robust estimator: B-spline channel smoothing. We show that linear smoothing of channels is equivalent to a robust estimator, where we make use of the channel representation based upon quadratic B-splines. The linear decoding from B-spline channels allows us to derive a robust error norm which is very similar to Tukey's biweight error norm. Using channel smoothing instead of iterative robust estimator implementations like non-linear diffusion, bilateral filtering, and mean-shift approaches is advantageous since channel smoothing is faster, it is easy to implement, it chooses the global minimum error instead of the nearest local minimum, and it can also be used on non-linear spaces, such as orientation space. As an application, we implemented orientation smoothing and compared it to the other three approaches.
@techreport{diva2:288549,
author = {Felsberg, Michael and Forssen, Per-Erik and Scharr, Hanno},
title = {{B-Spline Channel Smoothing for Robust Estimation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2579},
address = {Sweden},
}
This report describes a fourth order tensor defined on projective spaces which can be used for the representation of medium-level features, e.g., one or more oriented segments. The tensor has one part which describes what type of local structures are present in a region, and one part which describes where they are located. This information can be used, e.g., to represent multiple orientations, corners, and line-endings. The tensor can be defined for arbitrary signal dimension, but the presentation focuses on the properties of the fourth order tensor for the case of 2D and 3D image data. A method for estimating the proposed tensor representation by means of simple computations directly from the structure tensor is presented. Given a simple matrix representation of the tensor, it can be shown that there is a direct correspondence between the number of oriented segments and the rank of the matrix, provided that the number of segments is three or less. The report also presents techniques for extracting information about the oriented segments which the tensor represents. Finally, it is shown that a small set of coefficients can be computed from the proposed tensor which are invariant to changes of the coordinate system.
@techreport{diva2:288343,
author = {Nordberg, Klas},
title = {{A fourth order tensor for representation of orientation and position of oriented segments}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2587},
address = {Sweden},
}
This report describes how blob features can be used for automatic estimation of the fundamental matrix from two perspective projections of a 3D scene. Blobs are perceptually salient, homogeneous, compact image regions. They are represented by their average colour, area, centre of gravity and inertia matrix. Coarse blob correspondences are found by voting using colour and local similarity transform matching on blob pairs. We then do RANSAC sampling of the coarse correspondences, and weight each estimate according to how well the approximating conics and colours of two blobs correspond. The initial voting significantly reduces the number of RANSAC samples required, and the extra information besides position allows us to reject false matches more accurately than RANSAC using point features.
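The blob-specific voting and conic weighting are not reproduced here, but the following numpy sketch shows the generic estimation loop they feed into: RANSAC over point correspondences with a normalized eight-point fundamental-matrix fit and a Sampson-distance inlier test. All details below are standard textbook choices, not taken from the report.

```python
import numpy as np

def normalize(pts):
    """Hartley normalisation: translate to the centroid, scale to mean distance sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
    ph = np.column_stack([pts, np.ones(len(pts))])
    return (T @ ph.T).T, T

def eight_point(x1, x2):
    """Fundamental matrix from >= 8 correspondences (n x 2 arrays)."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    U, s, Vt = np.linalg.svd(F)
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt      # enforce rank 2
    return T2.T @ F @ T1                          # undo the normalisation

def sampson_error(F, x1, x2):
    p1 = np.column_stack([x1, np.ones(len(x1))])
    p2 = np.column_stack([x2, np.ones(len(x2))])
    Fx1, Ftx2 = F @ p1.T, F.T @ p2.T
    num = np.sum(p2 * (F @ p1.T).T, axis=1) ** 2
    den = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return num / den

def ransac_fundamental(x1, x2, iters=500, thresh=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best_F, best_inliers = None, np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), 8, replace=False)
        F = eight_point(x1[idx], x2[idx])
        inliers = sampson_error(F, x1, x2) < thresh
        if inliers.sum() > best_inliers.sum():
            best_F, best_inliers = F, inliers
    return eight_point(x1[best_inliers], x2[best_inliers]), best_inliers
```

Given matched blob centroids as x1 and x2 (n x 2 arrays), ransac_fundamental(x1, x2) returns an estimate of F and the inlier mask.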
@techreport{diva2:288340,
author = {Forssen, Per-Erik and Moe, Anders},
title = {{Automatic Estimation of Epipolar Geometry from Blob Features}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2620},
address = {Sweden},
}
This report brings together a novel approach to some computer vision problems and a particular algorithmic development of the Landweber iterative algorithm. The algorithm solves a class of high-dimensional, sparse, and constrained least-squares problems, which arise in various computer vision learning tasks, such as object recognition and object pose estimation. The algorithm has recently been applied to these problems, but it has been used rather heuristically. In this report we describe the method and put it on firm mathematical ground. We consider a convexly constrained weighted least-squares problem and propose for its solution a projected Landweber method which employs oblique projections onto the closed convex constraint set. We formulate the problem, present the algorithm and work out its convergence properties, including a rate-of-convergence result. The results are put in perspective of currently available projected Landweber methods. The application to supervised learning is described, and the method is evaluated in a function approximation experiment.
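A minimal sketch of a projected Landweber iteration for a nonnegatively constrained, weighted least-squares problem. Note that it uses a plain orthogonal projection onto the constraint set, whereas the report analyses oblique projections, so this only illustrates the overall form of the iteration.

```python
import numpy as np

def projected_landweber(A, b, W=None, step=None, iters=200):
    """Approximately minimise ||A x - b||_W^2 subject to x >= 0 by a Landweber
    step followed by (orthogonal) projection onto the constraint set."""
    m, n = A.shape
    W = np.eye(m) if W is None else W
    if step is None:
        # any step below 2 / ||A^T W A|| makes the unconstrained iteration converge
        step = 1.0 / np.linalg.norm(A.T @ W @ A, 2)
    x = np.zeros(n)
    for _ in range(iters):
        x = x + step * A.T @ W @ (b - A @ x)   # Landweber step
        x = np.maximum(x, 0.0)                 # projection onto x >= 0
    return x

# Tiny example: recover a sparse nonnegative solution of an overdetermined system.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 10))
x_true = np.maximum(rng.standard_normal(10), 0.0)
b = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.round(projected_landweber(A, b), 2))
print(np.round(x_true, 2))
```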
@techreport{diva2:244368,
author = {Johansson, Björn and Elfving, Tommy and Kozlov, Vladimir and Censor, Yair and Granlund, Gösta},
title = {{The Application of an Oblique-Projected Landweber Method to a Model of Supervised Learning}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2004},
type = {Other academic},
number = {LiTH-ISY-R, 2623},
address = {Sweden},
}
Image intensity gradients can be encoded in a 2-dimensional channel representation. This report discusses the computation of such gradient channel matrices and what information can be extracted from them. In particular, this representation makes it possible to distinguish multiple orientations and magnitudes in a single representation. It is shown that this can be used to recover orientation very accurately. This holds in particular near orientation discontinuities, where classical orientation estimation fails.
@techreport{diva2:288613,
author = {Spies, Hagen},
title = {{Gradient Channel Matrices for Orientation Estimation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2003},
type = {Other academic},
number = {LiTH-ISY-R, 2540},
address = {Sweden},
}
In this paper we address the problem of appropriately representing the intrinsic dimensionality of image neighborhoods. This dimensionality describes the degrees of freedom of a local image patch and it gives rise to some of the most often applied corner and edge detectors. It is common to categorize the intrinsic dimensionality (iD) into three distinct cases: i0D, i1D, and i2D. Real images, however, contain combinations of all three dimensionalities, which has to be taken into account by a continuous representation. Based on considerations of the structure tensor, we derive a cone-shaped iD-space which leads to a probabilistic point of view on the estimation of intrinsic dimensionality.
@techreport{diva2:288326,
author = {Felsberg, Michael and Kruger, Norbert},
title = {{A Probabilistic Definition of Intrinsic Dimensionality for Images}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2003},
type = {Other academic},
number = {LiTH-ISY-R, 2520},
address = {Sweden},
}
The use of linear filters, i.e. convolutions, inevitably introduces dependencies in the uncertainties of the filter outputs. Such non-vanishing covariances appear both between different positions and between the responses from different filters (even at the same position). This report describes how these covariances between the output of linear filters can be computed. We then examine the induced covariance matrices for some typical 1D and 2D filters. Finally the total noise reduction properties are examined.
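For the 1D case with i.i.d. input noise of variance sigma^2, the covariance between the outputs of two filters g and h at relative displacement d reduces to the correlation of the kernels, Cov(y_g[n], y_h[n+d]) = sigma^2 * sum_k g[k] h[k+d]. The following sketch (with arbitrarily chosen kernels, not the filters examined in the report) checks this empirically.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
x = sigma * rng.standard_normal(200_000)        # white noise input

g = np.array([1.0, 2.0, 1.0]) / 4.0             # small smoothing kernel
h = np.array([-1.0, 0.0, 1.0]) / 2.0            # central-difference kernel

yg = np.convolve(x, g, mode='valid')
yh = np.convolve(x, h, mode='valid')

d = 1  # relative displacement between the two output samples
theory = sigma**2 * sum(g[k] * h[k + d] for k in range(len(g)) if 0 <= k + d < len(h))
empirical = np.cov(yg[:-d], yh[d:])[0, 1]
print("theory:", theory, "  empirical:", empirical)
```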
@techreport{diva2:288311,
author = {Spies, Hagen},
title = {{Covariances of Linear Filter Outputs in Computer Vision}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2003},
type = {Other academic},
number = {LiTH-ISY-R, 2504},
address = {Sweden},
}
This report describes a view-based method for object recognition and estimation of object pose in still images. The method is based on feature vector matching and clustering. A set of interest points, in this case star-patterns, is detected and combined into pairs. A pair of patches, centered around each point in the pair, is extracted from a local orientation image. The patch orientation and size depend on the relative positions of the points, which makes the representation invariant to translation, rotation, and scale. Each pair of patches constitutes a feature vector. The method is demonstrated on a number of real images.
@techreport{diva2:257174,
author = {Johansson, Björn and Moe, Anders},
title = {{Patch-Duplets for Object Recognition and Pose Estimation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2003},
type = {Other academic},
number = {LiTH-ISY-R, 2553},
address = {Sweden},
}
The structure tensor has been used mainly for representation of local orientation in spaces of arbitrary dimensions, where the eigenvectors represent the orientation and the corresponding eigenvalues indicate the type of structure which is represented. Apart from being local, the structure tensor may be referred to as "object centered" since it describes the corresponding structure relative to a local reference system. This paper proposes that the basic properties of the structure tensor can be extended to a tensor defined in a projective space rather than in a local Euclidean space. The result, the "projective tensor", is symmetric in the same way as the structure tensor, and also uses the eigensystem to carry the relevant information. However, instead of orientation, the projective tensor represents geometrical primitives such as points, lines, and planes (depending on dimensionality of the underlying space). Furthermore, this representation has the useful property of mapping the operation of forming the affine hull of points and lines to tensor summation, e.g., the sum of two projective tensors which represent two points amounts to a projective tensor that represents the line which passes through the two points, etc. The projective tensor may be referred to as "view centered" since each tensor, which still may be defined on a local scale, represents a geometric primitive relative to a global image based reference system. This implies that two such tensors may be combined, e.g., using summation, in a meaningful way over large regions.
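A small numerical illustration of the summation property, under the assumption that a point is represented by the outer product of its normalised homogeneous coordinates (a plausible but assumed normalisation, not necessarily the one used in the report): the sum of two point tensors has rank two and is annihilated by the dual line vector through the two points.

```python
import numpy as np

def point_tensor(p):
    """Assumed point representation: outer product of normalised
    homogeneous coordinates (for illustration only)."""
    p = np.asarray(p, dtype=float)
    p = p / np.linalg.norm(p)
    return np.outer(p, p)

p1 = np.array([1.0, 2.0, 1.0])   # homogeneous 2D points
p2 = np.array([3.0, 0.5, 1.0])

T = point_tensor(p1) + point_tensor(p2)
l = np.cross(p1, p2)             # dual homogeneous vector of the connecting line

# T has rank 2 and its null space is spanned by l, so the tensor sum indeed
# encodes the line through the two points (here read off via its dual).
print("rank:", np.linalg.matrix_rank(T))
print("T @ l ~ 0:", np.allclose(T @ l, 0.0))
```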
@techreport{diva2:288635,
author = {Nordberg, Klas},
title = {{The structure tensor in projective spaces}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2424},
address = {Sweden},
}
In this paper we consider the channel representation based upon quadratic B-splines from a statistical point of view. Interpreting the channel representation as a kernel method for estimating probability density functions, we establish a channel algebra which allows basic algebraic operations on measurements to be performed directly in the channel representation. Furthermore, as a central point, we identify the smoothing of channel values with a robust estimator, or equivalently, a diffusion process.
@techreport{diva2:288621,
author = {Felsberg, Michael and Scharr, Hanno and Forssen, Per-Erik},
title = {{The B-Spline Channel Representation: Channel Algebra and Channel Based Diffusion Filtering}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2461},
address = {Sweden},
}
Next generation helical cone-beam CT will feature pitches around 80 mm. It is predicted that reconstruction algorithms to be used in these machines with still rather modest cone angles may not necessarily be exact, but rather have an emphasis on simplicity and speed. The PI-methods are a family of non-exact algorithms, all of which are based on complete data capture with a detector collimated to the Tam-window followed by rebinning to obliquely parallel ray geometry. The non-exactness is identified as inconsistency in the space invariant one-dimensional ramp-filtering step. It is shown that this inconsistency can be reduced, resulting in a significant improvement in image quality and an increased tolerance for higher pitch and cone angle. A short theoretical background for the PI-methods is given but the algorithms themselves are not given in any detail. A set of experiments on mathematical phantoms illustrates (among other things) how the amount of artefacts grows with increasing cone angles.
@techreport{diva2:288610,
author = {Danielsson, Per-Erik and Seger, Maria Magnusson and Turbell, Henrik},
title = {{The PI-methods for Helical Cone-Beam Tomography}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2428},
address = {Sweden},
}
In this report we describe how an RGB component colour image may be expanded into a set of channel images, and how the original colour image may be reconstructed from these. We also demonstrate the effect of averaging on the channel images and how it differs from conventional averaging. Finally we demonstrate how boundaries can be detected as a change in the confidence of colour state.
@techreport{diva2:288277,
author = {Forssen, Per-Erik and Granlund, Gösta and Wiklund, Johan},
title = {{Channel Representation of Colour Images}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2418},
address = {Sweden},
}
In this paper we address the topics of scale-space and phase-based signal processing in a common framework. The involved linear scale-space is no longer based on the Gaussian kernel but on the Poisson kernel. The resulting scale-space representation is directly related to the monogenic signal, a 2D generalization of the analytic signal. Hence, the local phase arises as a natural concept in this framework which results in several advanced relationships that can be used in image processing.
@techreport{diva2:288275,
author = {Felsberg, Michael and Sommer, Gerald},
title = {{The Poisson Scale-Space: A Unified Approach to Phase-Based Image Processing in Scale-Space}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2453},
address = {Sweden},
}
This report describes how the choice of kernel affects a non-parametric density estimation. Methods for accurate localisation of peaks in the estimated densities are developed for Gaussian and cos^2 kernels. The accuracy and robustness of the peak localisation methods are studied with respect to noise, number of samples, and interference between peaks. Although the peak localisation is formulated in the framework of non-parametric density estimation, the results are also applicable to associative learning with localised responses.
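A sketch of the Gaussian-kernel case: a grid-based density estimate whose peak is refined by fitting a parabola through the maximum and its two neighbours. The cos^2 kernel and the report's exact localisation formulas are not reproduced; the refinement below is a generic sub-grid scheme used here for illustration.

```python
import numpy as np

def kde(samples, grid, h):
    """Gaussian-kernel density estimate evaluated on a grid."""
    d = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * d**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

def refine_peak(grid, density):
    """Sub-grid peak localisation via a parabola through the maximum and its neighbours."""
    i = np.argmax(density[1:-1]) + 1
    y0, y1, y2 = density[i - 1], density[i], density[i + 1]
    offset = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    return grid[i] + offset * (grid[1] - grid[0])

rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(2.0, 0.3, 200), rng.normal(5.0, 0.3, 50)])
grid = np.linspace(0, 7, 141)
density = kde(samples, grid, h=0.2)
print("strongest peak (near 2.0):", refine_peak(grid, density))
```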
@techreport{diva2:288272,
author = {Forssen, Per-Erik},
title = {{Observations Concerning Reconstructions with Local Support}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2425},
address = {Sweden},
}
The channel representation is a simple yet powerful representation of scalars and vectors. It is especially suited for representation of several scalars at the same time without mixing them up.
This report is partly intended to serve as a simple illustration of the channel representation. The report shows how the channels can be used to represent multiple orientations in two dimensions. The idea is to make a channel representation of the local orientation angle computed from the image gradient. The representation basically becomes an orientation histogram with overlapping bins.
The channel histogram is compared with the orientation tensor, which is another representation of orientation. The performance is comparable to that of tensors in the simple signal case, but decreases slightly with an increasing number of channels. The channel histogram outperforms the tensors on non-simple signals.
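A minimal sketch of such an overlapping-bin orientation histogram: the double angle of the gradient orientation is channel-encoded with cos^2 kernels and weighted by the gradient magnitude. The number of channels, the kernel width, and the magnitude weighting are assumptions for illustration, not the exact choices of the report.

```python
import numpy as np

def orientation_channel_histogram(gx, gy, n_channels=8):
    """Overlapping-bin histogram of local orientation in double-angle
    representation, using cos^2 channel kernels (three channels overlap);
    gradient magnitude is used as the vote weight."""
    phi = np.mod(2.0 * np.arctan2(gy, gx), 2.0 * np.pi).ravel()  # double angle
    w = np.hypot(gx, gy).ravel()
    centers = np.arange(n_channels) * 2.0 * np.pi / n_channels
    hist = np.zeros(n_channels)
    for k, c in enumerate(centers):
        d = (phi - c + np.pi) % (2.0 * np.pi) - np.pi            # circular distance
        d *= n_channels / (2.0 * np.pi)                          # in channel units
        hist[k] = np.sum(w * np.where(np.abs(d) < 1.5,
                                      np.cos(np.pi * d / 3.0) ** 2, 0.0))
    return hist

# Two superimposed gradient orientations (0 and 90 degrees) produce two
# separated peaks in the overlapping-bin histogram.
gx = np.array([[1.0, 0.0, 1.0, 0.0]])
gy = np.array([[0.0, 1.0, 0.0, 1.0]])
print(np.round(orientation_channel_histogram(gx, gy), 2))
```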
@techreport{diva2:257179,
author = {Johansson, Björn},
title = {{Representing Multiple Orientations in 2D with Orientation Channel Histograms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2475},
address = {Sweden},
}
A new architecture for learning systems has been developed. A number of particular design features in combination result in a very high performance and excellent robustness. The architecture uses a monopolar channel information representation. The channel representation implies a partially overlapping mapping of signals into a higher-dimensional space, such that a flexible but continuous restructuring mapping can be made. The high-dimensional mapping introduces locality in the information representation, which is directly available in wavelets or filter outputs. Single level maps using this representation can produce closed decision regions, thereby eliminating the need for costly back-propagation. The monopolar property implies that data only utilizes one polarity, say positive values, in addition to zero, allowing zero to represent no information. This leads to an efficient sparse representation.
The processing mode of the architecture is association where the mapping of feature inputs onto desired state outputs is learned from a representative training set. The sparse monopolar representation together with locality, using individual learning rates, allows a fast optimization, as the system exhibits linear complexity. Mapping into multiple channels gives a strategy to use confidence statements in data, leading to a low sensitivity to noise in features. The result is an architecture allowing systems with a complexity of some hundred thousand features described by some hundred thousand samples to be trained in typically less than an hour. Experiments that demonstrate functionality and noise immunity are presented. The architecture has been applied to the design of hyper complex operations for view centered object recognition in robot vision.
@techreport{diva2:257178,
author = {Granlund, Gösta and Forss\'{e}n, Per-Erik and Johansson, Björn},
title = {{HiperLearn:
A High Performance Learning Architecture}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2002},
type = {Other academic},
number = {LiTH-ISY-R, 2409},
address = {Sweden},
}
This report defines the rank complement of a diagonalizable matrix (i.e. a matrix which can be brought to a diagonal form by means of a change of basis) as the interchange of the range and the null space. Given a diagonalizable matrix A there is in general no unique matrix Ac which has a range equal to the null space of A and a null space equal to the range of A; only matrices of full rank have a unique rank complement: the zero matrix. Consequently, the rank complement operation is not a distinct operation, but rather a characterization of any operation which makes an interchange of the range and the null space. One particular rank complement operation is introduced here, which eventually leads to an implementation of rank complement operations in terms of polynomials in A. The main result is that for each possible rank r of A there is a polynomial in A which evaluates to a matrix Ac which is a rank complement of A. The report provides explicit expressions for matrix polynomials which compute a rank complement of a symmetric matrix. These results are then generalized to the case of diagonalizable matrices. Finally, a Matlab function is described that implements a rank complement operation based on the results derived.
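For reference, a rank complement of a symmetric matrix can also be built directly from an eigendecomposition, as in the sketch below; this is not the polynomial construction of the report, only a way of making the definition concrete.

```python
import numpy as np

def rank_complement_sym(A, tol=1e-10):
    """One rank complement of a symmetric matrix A: a matrix whose range is
    the null space of A and whose null space is the range of A. Here simply
    the orthogonal projector onto null(A); the report instead constructs
    such a matrix as a polynomial in A."""
    w, V = np.linalg.eigh(A)
    Vn = V[:, np.abs(w) < tol]      # eigenvectors spanning the null space
    return Vn @ Vn.T

# Rank-2 symmetric 3x3 example.
B = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
A = B @ B.T
Ac = rank_complement_sym(A)
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ac))   # 2 and 1
print(np.allclose(Ac @ A, 0.0), np.allclose(A @ Ac, 0.0))    # ranges and null spaces swapped
```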
@techreport{diva2:288596,
author = {Nordberg, Klas and Farnebäck, Gunnar},
title = {{Rank complement of diagonalizable matrices using polynomial functions}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2001},
type = {Other academic},
number = {LiTH-ISY-R, 2369},
address = {Sweden},
}
This report describes a novel window matching technique. We perform window matching by transforming image data into sparse features, and apply a computationally efficient matching technique in the sparse feature space. The gain in execution time for the matching is roughly 10 times compared to full window matching techniques such as SSD, but the total execution time for the matching also involves an edge filtering step. Since the edge responses may be used for matching of several regions, the proposed matching technique is increasingly advantageous when the number of regions to keep track of increases, and when the size of the search window increases. The technique is used in a real-time ego-motion estimation system in the WITAS project. Ego-motion is estimated by tracking of a set of structure points, i.e. regions that do not have the aperture problem. Comparisons with SSD with regard to speed and accuracy are made.
@techreport{diva2:288544,
author = {Forssen, Per-Erik},
title = {{Window Matching using Sparse Templates}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2001},
type = {Other academic},
number = {LiTH-ISY-R, 2392},
address = {Sweden},
}
This report starts with an introduction to the concepts active perception, reactive systems, and state dependency, and to fundamental aspects of perception such as the perceptual aliasing problem, and the number-of-percepts vs. number-of-states trade-off. We then introduce finite state machines, and extend them to accommodate active perception. Finally we demonstrate a state-transition mechanism that is applicable to autonomous navigation.
@techreport{diva2:288318,
author = {Forssen, Per-Erik},
title = {{Autonomous Navigation using Active Perception}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2001},
type = {Other academic},
number = {LiTH-ISY-R, 2395},
address = {Sweden},
}
This report describes an idea based on the work in [1], where an algorithm for learning automatic representation of visual operators is presented. The algorithm in [1] uses canonical correlation to find a suitable subspace in which the signal is invariant to some desired properties. This report presents a related approach specially designed for classification problems. The goal is to find a subspace in which the signal is invariant within each class, and at the same time compute the class representation in that subspace. This algorithm is closely related to the one in [1], but less computationally demanding, and it is shown that the two algorithms are equivalent if we have an equal number of training samples for each class. Even though the new algorithm is designed for pure classification problems it can still be used to learn visual operators as will be shown in the experiment section. [1] M. Borga. Learning Multidimensional Signal Processing. PhD thesis, Linköping University, Sweden, SE-581 83 Linköping, 1998. Dissertation No 531, ISBN 91-7219-202-X.
@techreport{diva2:288281,
author = {Johansson, Björn},
title = {{On Classification: Simultaneously Reducing Dimensionality and Finding Automatic Representation using Canonical Correlation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2001},
type = {Other academic},
number = {LiTH-ISY-R, 2375},
address = {Sweden},
}
This report is a complement to the working document [1], where a sparse associative network is described. This report shows that the net learning rule in [1] can be viewed as the solution to a weighted least squares problem. This means that we can apply the theory framework of least squares problems, and compare the net rule with some other iterative algorithms that solve the same problem. The learning rule is compared with the gradient search algorithm and the RPROP algorithm in a simple synthetic experiment. The gradient rule has the slowest convergence while the associative and the RPROP rules have similar convergence. The associative learning rule has a smaller initial error than the RPROP rule though.
It is also shown in the same experiment that we get a faster convergence if we have a monopolar constraint on the solution, i.e. if the solution is constrained to be non-negative. The least squares error is a bit higher but the norm of the solution is smaller, which gives a smaller interpolation error.
The report also discusses a generalization of the least squares model, which includes other known function approximation models.
[1] G. Granlund. Parallel Learning in Artificial Vision Systems: Working Document. Dept. EE, Linköping University, 2000.
@techreport{diva2:257177,
author = {Johansson, Björn},
title = {{On Sparse Associative Networks:
A Least Squares Formulation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2001},
type = {Other academic},
number = {LiTH-ISY-R, 2368},
address = {Sweden},
}
@techreport{diva2:288619,
author = {Granlund, Gösta H.},
title = {{The Use of Dynamics to Establish Knowledge of Invariant Structure}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2240},
address = {Sweden},
}
This report describes an experimental still image coder that grew out of a project in the graduate course ``Advanced Video Coding'' in spring 2000. The project investigated the idea of using local orientation histograms in fractal coding. Instead of performing a correlation-like grey-level matching of image regions, the block search is made by matching feature histograms of the block contents. The feature investigated in this report is local orientation, but in principle other features could be used as well. In its current state the coder does not outperform state-of-the-art still image coders, but the block-search strategy seems promising, and will probably prove useful in several other applications.
@techreport{diva2:288616,
author = {Forssen, Per-Erik and Johansson, Björn},
title = {{Fractal Coding by Means of Local Feature Histograms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2295},
address = {Sweden},
}
This report describes the principles of an algorithm developed within the WITAS project. The goal of the WITAS project is to build an autonomous helicopter that can navigate autonomously, using differential GPS, GIS-data of the underlying terrain (elevation models and digital orthophotographs) and a video camera. Using differential GPS and other non-visual sensory equipment, the system is able to obtain crude estimates of its position and heading direction. These estimates can be refined by matching camera images against the on-board GIS-data. This refinement process, however, is rather time consuming, and will thus only be made every once in a while. For real-time refinement of camera position and heading, the system will iteratively update the estimates using frame to frame correspondence only. In each frame a sparse set of image displacement estimates is calculated, and from these the perspective in the current image can be found. Using the calculated perspective and knowledge of the camera parameters, new values of camera position and heading can be obtained. The resultant camera position and heading can exhibit a slow drift if the original alignment was not perfect, and thus a corrective alignment with GIS-data should be performed once every minute or so.
@techreport{diva2:288566,
author = {Forssen, Per-Erik},
title = {{Updating Camera Location and Heading using a Sparse Displacement Field}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2318},
address = {Sweden},
}
@techreport{diva2:288548,
author = {Granlund, Gösta H.},
title = {{Channel Representation of Information}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2236},
address = {Sweden},
}
This report describes a technique to detect curvature. The technique uses local polynomial fitting on a local orientation description of an image. The idea is based on the theory of rotational symmetries which describes curvature, circles, star-patterns etc. The local polynomial fitting is shown to be equivalent to calculating partial derivatives on a lowpass version of the local orientation. The new method can therefore be very efficiently implemented both in the single-scale case and in the multi-scale case.
@techreport{diva2:288546,
author = {Johansson, Björn},
title = {{Curvature Detection using Polynomial Fitting on Local Orientation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2312},
address = {Sweden},
}
@techreport{diva2:288331,
author = {Granlund, Gösta H.},
title = {{Context Controllable Linkage Models}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2238},
address = {Sweden},
}
One important problem in image analysis is the localization of a template in a larger image. Applications where the solution of this problem can be used include: tracking, optical flow, and stereo vision. The matching method studied here solves this problem by defining a new similarity measurement between a template and an image neighborhood. This similarity is computed for all possible integer positions of the template within the image. The position for which we get the highest similarity is considered to be the match. The similarity is not necessarily computed using the original pixel values directly, but can of course be derived from higher level image features.
The similarity measurement can be computed in different ways, and the simplest approaches are correlation-type algorithms. Aschwanden and Guggenbühl [2] have done a comparison between such algorithms. One of the best and simplest algorithms they tested is normalized cross-correlation (NCC). Therefore this algorithm has been used for comparison with the PAIRS algorithm that is developed by the author and described in this text. PAIRS uses a completely different similarity measurement based on sets of bits extracted from the template and the image.
This work is done within WITAS, which is a project dealing with UAVs (unmanned aerial vehicles). Two specific applications of the developed template matching algorithm have been studied.
- One application is tracking of cars in video sequences from a helicopter.
- The other one is computing optical flow in such video sequences in order to detect moving objects, especially vehicles on roads.
The video from the helicopter is in color (RGB) and this fact is used in the presented tracking algorithm. The PAIRS algorithm has been applied to these two applications and the results are reported.
A part of this text concerns a general approach to template matching called Maximum Entropy Matching (MEM) that is developed here. The main idea of MEM is that the more data we compare on a computer, the longer it takes, and therefore the data that we compare should have maximum average information, that is, maximum entropy. We will see that this approach can be used to create template matching algorithms which are on the order of 10 times faster than correlation (NCC) without decreasing the performance.
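For context, here is a plain implementation of the NCC baseline referred to above (the MEM/PAIRS bit-set matching itself is not reproduced); this exhaustive numpy version is only meant to show what the faster methods are compared against.

```python
import numpy as np

def ncc_match(image, template):
    """Exhaustive normalised cross-correlation: returns the top-left
    position and score of the best integer match of template in image."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t**2).sum())
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.sqrt((w**2).sum()) * t_norm
            if denom > 0:
                score = (w * t).sum() / denom
                if score > best:
                    best, best_pos = score, (r, c)
    return best_pos, best

rng = np.random.default_rng(0)
image = rng.random((60, 80))
template = image[20:30, 40:52].copy()
print(ncc_match(image, template))   # ((20, 40), score close to 1.0)
```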
@techreport{diva2:288327,
author = {Lundberg, Frans},
title = {{Maximum Entropy Matching: An Approach to Fast Template Matching}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2313},
address = {Sweden},
}
@techreport{diva2:288317,
author = {Granlund, Gösta},
title = {{The Dichotomy of Strategies for Spatial-Cognitive Information Processing}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2241},
address = {Sweden},
}
Some image patterns, e.g. circles, hyperbolic curves, star patterns etc., can be described in a compact way using local orientation. The features mentioned above are part of a family of patterns called rotational symmetries. This theory can be used to detect image patterns from the local orientation in double angle representation of an image. Some of the rotational symmetries were originally described in terms of the local orientation without being designed to detect a certain feature. The question is then: given a description in double angle representation, what kind of image features does this description correspond to? This 'inverse', or backprojection, is not unambiguous - many patterns have the same local orientation description. This report answers this question for the case of rotational symmetries and also for some other descriptions.
@techreport{diva2:288305,
author = {Johansson, Björn},
title = {{Backprojection of Some Image Symmetries Based on a Local Orientation Description}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2311},
address = {Sweden},
}
@techreport{diva2:288280,
author = {Granlund, Gösta H.},
title = {{Learning Through Response-Driven Association}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2237},
address = {Sweden},
}
@techreport{diva2:288276,
author = {Granlund, Gösta H.},
title = {{Low Level Image Interpretation Using Associative Mapping}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2239},
address = {Sweden},
}
This survey contains links to and facts about a number of projects on content-based search in image databases around the world today. The main focus is on what kind of image features are used, but also on the user interface and the user's possibilities to interact with the system (i.e. what 'visual language' is used).
@techreport{diva2:257176,
author = {Johansson, Björn},
title = {{A Survey on:
Contents Based Search in Image Databases}},
institution = {Linköping University, Department of Electrical Engineering},
year = {2000},
type = {Other academic},
number = {LiTH-ISY-R, 2215},
address = {Sweden},
}
@techreport{diva2:288602,
author = {Reed, Todd},
title = {{A Baseline System for Image and Map Registration using Sparse Hierarchical Features}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1999},
type = {Other academic},
number = {LiTH-ISY-R, 2138},
address = {Sweden},
}
Computer vision systems used in autonomous mobile vehicles are typically linked to higher-level deliberation processes. One important aspect of this link is how to connect, or anchor, the symbols used at the higher level to the objects in the vision system that these symbols refer to. Anchoring is complicated by the fact that the vision data are inherently affected by uncertainty. We propose an anchoring technique that uses fuzzy sets to represent the uncertainty in the perceptual data. We show examples where this technique allows a deliberative system to reason about the objects (cars) detected by a vision system embarked in an unmanned helicopter, in the framework of the Witas project.
@techreport{diva2:288592,
author = {Andersson, Thord and Coradeschi, Silvia and Saffiotti, Alessandro},
title = {{Fuzzy matching of visual cues in an unmanned airborne vehicle}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1999},
type = {Other academic},
address = {Sweden},
}
@techreport{diva2:288634,
author = {Borga, Magnus and Knutsson, Hans},
title = {{An Adaptive Stereo Algorithm Based on Canonical Correlation Analysis}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1998},
type = {Other academic},
number = {LiTH-ISY-R, 2013},
address = {Sweden},
}
@techreport{diva2:288629,
author = {Granlund, Gösta},
title = {{Does Vision Inevitably Have to be Active?}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1998},
type = {Other academic},
number = {LiTH-ISY-R, 2068},
address = {Sweden},
}
This report introduces a signal processing strategy for depth segmentation and scene reconstruction that incorporates occlusion as a natural component. The work aims to make maximal use of connectivity in the temporal domain under the condition that the scene is static and that the camera motion is known. An object behind the foreground is reconstructed using the fact that different parts of the object have been seen in different images in the sequence. One of the main ideas in the reported work is the use of a spatiotemporal certainty volume c(x) with the same dimension as the input spatiotemporal volume s(x), which is then used as a 'blackboard' for rejecting already segmented image structures. The segmentation starts with searching for image structures in the foreground, eliminates their occluding influence, and then proceeds. Normalized convolution, which is a Weighted Least Mean Square technique for filtering data with varying spatial reliability, is used for all filtering. High spatial resolution near object borders is achieved and only neighboring structures with similar depth support each other.
@techreport{diva2:288324,
author = {Ulvklo, Morgan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Adaptive Reconstruction using Multiple Views}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1998},
type = {Other academic},
number = {LiTH-ISY-R, 2036},
address = {Sweden},
}
This paper presents our general strategy for designing learning machines as well as a number of particular designs. The search for methods allowing a sufficient level of adaptivity is based on two main principles: 1. Simple adaptive local models and 2. Adaptive model distribution. Particularly important concepts in our work are mutual information and canonical correlation. Examples are given of learning feature descriptors, modeling disparity, synthesis of a global 3-mode model and a setup for reinforcement learning of online video coder parameter control.
@techreport{diva2:288299,
author = {Knutsson, Hans and Borga, Magnus and Landelius, Tomas},
title = {{Learning Multidimensional Signal Processing}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1998},
type = {Other academic},
number = {LiTH-ISY-R, 2039},
address = {Sweden},
}
A recursive method to condense general multidimensional FIR-filters into a sequence of simple kernels with mainly one dimensional extent has been worked out. Convolver networks adapted for 2, 3 and 4D signals are presented and the performance is illustrated for spherically separable quadrature filters. The resulting filter responses are mapped to a non-biased tensor representation where the local tensor constitutes a robust estimate of both the shape and the orientation (velocity) of the neighbourhood. A qualitative evaluation of this General Sequential Filter concept results in no detectable loss in accuracy when compared to conventional FIR (Finite Impulse Response) filters, but the computational complexity is reduced by several orders of magnitude. For the examples presented in this paper the attained speed-up is 5, 25 and 300 times for 2D, 3D and 4D data respectively. The magnitude of the attained speed-up implies that complex spatio-temporal analysis can be performed using standard hardware, such as a powerful workstation, in close to real time. Due to the soft implementation of the convolver and the tree structure of the sequential filtering approach the processing is simple to reconfigure for the outer as well as the inner (vector length) dimensionality of the signal. The implementation was made in AVS (Application Visualization System) using modules written in C.
@techreport{diva2:288295,
author = {Andersson, Mats and Wiklund, Johan and Knutsson, Hans},
title = {{Sequential Filter Trees for Efficient 2D 3D and 4D Orientation Estimation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1998},
type = {Other academic},
number = {LiTH-ISY-R, 2070},
address = {Sweden},
}
This paper presents a novel algorithm for analysis of stochastic processes. The algorithm can be used to find the required solutions in the cases of principal component analysis (PCA), partial least squares (PLS), canonical correlation analysis (CCA) or multiple linear regression (MLR). The algorithm is iterative and sequential in its structure and uses on-line stochastic approximation to reach an equilibrium point. A quotient between two quadratic forms is used as an energy function and it is shown that the equilibrium points constitute solutions to the generalized eigenproblem.
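The on-line stochastic approximation itself is not reproduced here, but the following sketch shows the batch form of the same generalized eigenproblem for the CCA special case, built from sample covariances; the data, dimensions, and block structure are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
# Two views sharing a common latent signal -> one strong canonical correlation.
z = rng.standard_normal(2000)
X = np.column_stack([z + 0.5 * rng.standard_normal(2000),
                     rng.standard_normal(2000)])
Y = np.column_stack([rng.standard_normal(2000),
                     -z + 0.5 * rng.standard_normal(2000)])

X -= X.mean(axis=0); Y -= Y.mean(axis=0)
Cxx, Cyy = X.T @ X / len(X), Y.T @ Y / len(Y)
Cxy = X.T @ Y / len(X)

# CCA as a generalized eigenproblem  A w = rho * B w.
A = np.block([[np.zeros((2, 2)), Cxy], [Cxy.T, np.zeros((2, 2))]])
B = np.block([[Cxx, np.zeros((2, 2))], [np.zeros((2, 2)), Cyy]])
rho, W = eigh(A, B)
print("largest canonical correlation:", rho[-1])   # about 0.8 for this construction
```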
@techreport{diva2:288565,
author = {Borga, Magnus and Landelius, Tomas and Knutsson, Hans},
title = {{A Unified Approach to PCA, PLS, MLR and CCA}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1997},
type = {Other academic},
number = {LiTH-ISY-R, 1992},
address = {Sweden},
}
@techreport{diva2:288560,
author = {Karlholm, Jörgen},
title = {{Tracking of occluded targets in head-up display sequences}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1997},
type = {Other academic},
number = {LiTH-ISY-R, 1993},
address = {Sweden},
}
@techreport{diva2:288304,
author = {Ulvklo, Morgan and Uppsäll, Magnus},
title = {{Adaptive Reconstruction using Multiple Views - Results and Applications}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1997},
type = {Other academic},
address = {Sweden},
}
This paper reviews an existing algorithm for adaptive control based on explicit criterion maximization (ECM) and presents an extended version suited for reinforcement learning tasks. Furthermore, assumptions under which the algorithm converges to a local maximum of a long-term utility function are given. Such convergence theorems are very rare for reinforcement learning algorithms working with continuous state and action spaces. A number of similar algorithms, previously suggested to the reinforcement learning community, are briefly surveyed in order to give the presented algorithm a place in the field. The relations between the different algorithms are exemplified by checking their consistency on a simple problem of linear quadratic regulation (LQR).
@techreport{diva2:288584,
author = {Landelius, Tomas and Knutsson, Hans},
title = {{Reinforcement Learning Adaptive Control and Explicit Criterion Maximization}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1829},
address = {Sweden},
}
A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situations. To our knowledge only two such results have been presented for systems in the continuous state space domain. The first is due to Werbos and is concerned with linear function approximation and heuristic dynamic programming. Here no optimal strategy can be found, which is why the result is of limited importance. The second result is due to Bradtke and deals with linear quadratic systems and quadratic function approximators. Bradtke's proof is limited to ADHDP and policy iteration techniques where the optimal solution is found by a number of successive approximations. This paper deals with greedy techniques, where the optimal solution is directly aimed for. Convergence proofs for a number of adaptive critics, HDP, DHP, ADHDP and ADDHP, are presented. Optimal controllers for linear quadratic regulation (LQR) systems can be found by standard techniques from control theory, but the assumptions made in control theory can be weakened if adaptive critic techniques are employed. The main point of this paper is, however, not to emphasize the differences but to highlight the similarities and by so doing contribute to a theoretical understanding of adaptive critics.
@techreport{diva2:288542,
author = {Landelius, Tomas and Knutsson, Hans},
title = {{Greedy adaptive critics for LPQ [i.e. LQR] problems:
Convergence Proofs}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1896},
address = {Sweden},
}
This paper presents a novel algorithm for finding the solution of the generalized eigenproblem where the matrices involved contain expectation values from stochastic processes. The algorithm is iterative and sequential in its structure and uses on-line stochastic approximation to reach an equilibrium point. A quotient between two quadratic forms is suggested as an energy function for this problem and is shown to have zero gradient only at the points solving the eigenproblem. Furthermore, it is shown that the algorithm for the generalized eigenproblem can be used to solve three important problems as special cases. For a stochastic process the algorithm can be used to find the directions for maximal variance, covariance, and canonical correlation as well as their magnitudes.
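Only as a loose illustration of the on-line idea (the actual update rules and the convergence analysis are in the report), the toy sketch below ascends the quotient w^T A w / w^T B w using per-sample estimates of A w; the step size, the metric B and the data stream are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(2)
# Toy stream: find the direction of maximal variance of x relative to a metric B.
T = rng.standard_normal((4, 4))
B = T @ T.T + 4 * np.eye(4)                      # fixed positive definite metric
samples = rng.standard_normal((20000, 4)) * np.array([3.0, 1.0, 0.5, 0.2])

w = rng.standard_normal(4)
eta = 1e-3
for x in samples:
    Aw = x * (x @ w)                             # sample estimate of A w, A = E[x x^T]
    r = (w @ Aw) / (w @ B @ w)                   # current value of the quotient
    w += eta * (Aw - r * (B @ w))                # ascend the quotient r(w)
    w /= np.linalg.norm(w)

# Compare against the dominant generalized eigenvector computed in batch.
A = samples.T @ samples / len(samples)
vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
v = np.real(vecs[:, np.argmax(np.real(vals))])
print(abs(w @ v) / np.linalg.norm(v))            # close to 1 if the directions agree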
@techreport{diva2:288332,
author = {Knutsson, Hans and Borga, Magnus and Landelius, Tomas},
title = {{Generalized Eigenproblem for Stochastic Process Covariances}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1916},
address = {Sweden},
}
A scheme for performing generalized convolutions is presented. A flexible convolver, which runs on standard workstations, has been implemented. It is designed for maximum throughput and flexibility. The implementation incorporates spatio-temporal convolutions with configurable vector combinations. It can handle general multilinear operations, i.e. tensor operations on multidimensional data of any order. The input data and the kernel coefficients can be of arbitrary vector length. The convolver is configurable for IIR filters in the time dimension. Other features of the implemented convolver are scattered kernel data, region of interest and subsampling. The implementation is done as a C-library and a graphical user interface in AVS (Application Visualization System).
@techreport{diva2:288320,
author = {Wiklund, Johan and Knutsson, Hans},
title = {{A Generalized Convolver}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1830},
address = {Sweden},
}
This report documents work done at the request of the Swedish Defense Research Establishment. The studied problem is that of detecting point-shaped targets, i.e. targets whose only significant property is that of being very small, in a cluttered environment. Three approaches to the problem have been considered. The first one, based on motion compensation, was rejected at an early stage due to expected problems with robustness and computational demands. The second method, based on background modeling with principal components, turned out successful and has been studied in depth, including discussion of various extensions and improvements of the presented algorithm. Finally, a Wiener filter approach has also turned out successful, including an approximation with separable filters. The methods have been tested on sequences obtained by an IR sensor. While both of the latter approaches work well on the test sequences, the Wiener filter is simpler and computationally less expensive than the background modeling. On the other hand, the background modeling is likely to have better possibilities for extensions and improvements.
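A minimal sketch of the background-modelling idea, assuming synthetic clutter frames and an arbitrarily chosen PCA subspace dimension (the report's algorithm and its extensions differ in detail): a low-dimensional background basis is fitted, and pixels with a large reconstruction residual are flagged as potential point targets.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
# 100 background frames of 32x32 pixels with slowly varying clutter plus noise.
t = np.linspace(0, 1, 100)[:, None]
clutter = np.sin(2 * np.pi * (t + np.linspace(0, 1, 32 * 32)[None, :]))
frames = clutter + 0.1 * rng.standard_normal((100, 32 * 32))

pca = PCA(n_components=5).fit(frames)            # background subspace

test = frames[-1].copy()
test[5 * 32 + 20] += 3.0                         # inject a small point target
residual = test - pca.inverse_transform(pca.transform(test[None]))[0]
print(np.unravel_index(np.argmax(np.abs(residual)), (32, 32)))   # ~ (5, 20)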
@techreport{diva2:288286,
author = {Farnebäck, Gunnar and Knutsson, Hans and Granlund, Gösta},
title = {{Detection of point-shaped targets}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1921},
address = {Sweden},
}
Two new reinforcement learning algorithms are presented. Both use a binary tree to store simple local models in the leaf nodes and coarser global models towards the root. It is demonstrated that a meaningful partitioning into local models can only be accomplished in a fused space consisting of both input and output. The first algorithm uses a batch-like statistical procedure to estimate the reward functions in the fused space. The second one uses channel coding to represent the output and input vectors, allowing a simple iterative algorithm based on competing subsystems. The behaviors of both algorithms are illustrated in a preliminary experiment.
@techreport{diva2:288282,
author = {Landelius, Tomas and Borga, Magnus and Knutsson, Hans},
title = {{Reinforcement Learning Trees}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1996},
type = {Other academic},
number = {LiTH-ISY-R, 1828},
address = {Sweden},
}
@techreport{diva2:288633,
author = {Wilson, Roland and Knutsson, Hans},
title = {{Seeing Things II}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1995},
type = {Other academic},
number = {LiTH-ISY-R, 1787},
address = {Sweden},
}
This paper presents an algorithm for estimation of local curvature from gradients of a tensor field that represents local orientation. The algorithm is based on an operator representation of the orientation tensor, which means that a change of local orientation corresponds to a rotation of the eigenvectors of the tensor. The resulting curvature descriptor is a vector that points in the direction in the image in which the local orientation rotates anti-clockwise, and the norm of the vector is the inverse of the radius of curvature. Two coefficients are defined that relate the change of local orientation to either curves or radial patterns.
@techreport{diva2:288599,
author = {Nordberg, Klas and Knutsson, Hans and Granlund, Gösta},
title = {{Local Curvature from Gradients of the Orientation Tensor Field}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1995},
type = {Other academic},
number = {LiTH-ISY-R, 1783},
address = {Sweden},
}
This paper presents a novel learning algorithm that finds the linear combination of one set of multi-dimensional variates that is the best predictor, and at the same time finds the linear combination of another set which is the most predictable. This relation is known as the canonical correlation and has the property of being invariant with respect to affine transformations of the two sets of variates. The algorithm successively finds all the canonical correlations beginning with the largest one. It is shown that canonical correlations can be used in computer vision to find feature detectors by giving examples of the desired features. When used on the pixel level, the method finds quadrature filters and when used on a higher level, the method finds combinations of filter output that are less sensitive to noise compared to vector averaging.
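As a usage-level aside (the report derives its own iterative learning rule rather than a batch solver), scikit-learn's CCA computes the same maximally correlated linear combinations of two sets of variates; the synthetic two-view data below is purely illustrative.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(4)
latent = rng.standard_normal((500, 2))                 # shared structure
X = latent @ rng.standard_normal((2, 8)) + 0.3 * rng.standard_normal((500, 8))
Y = latent @ rng.standard_normal((2, 6)) + 0.3 * rng.standard_normal((500, 6))

cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)                             # canonical variates
for i in range(2):
    print(np.corrcoef(U[:, i], V[:, i])[0, 1])         # canonical correlations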
@techreport{diva2:288567,
author = {Knutsson, Hans and Borga, Magnus and Landelius, Tomas},
title = {{Learning Canonical Correlations}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1995},
type = {Other academic},
number = {LiTH-ISY-R, 1761},
address = {Sweden},
}
This paper presents novel algorithms for finding the singular value decomposition (SVD) of a general covariance matrix by stochastic approximation. General in the sense that also non-square, between sets, covariance matrices are dealt with. For one of the algorithms, convergence is shown using results from stochastic approximation theory. Proofs of this sort, establishing both the point of equilibrium and its domain of attraction, have been reported very rarely for stochastic, iterative feature extraction algorithms.
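A loose sketch, not the algorithms whose convergence is proved in the report: a stochastic-approximation update that tracks the leading singular pair of a between-sets covariance E[x y^T] from a stream of sample pairs. The coupling matrix, noise level and step size are assumptions.

import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 4))                      # underlying coupling: E[x y^T] = M
u = rng.standard_normal(6); u /= np.linalg.norm(u)
v = rng.standard_normal(4); v /= np.linalg.norm(v)

eta = 1e-2
for _ in range(20000):
    y = rng.standard_normal(4)
    x = M @ y + 0.1 * rng.standard_normal(6)         # sample pair with cross-covariance M
    u += eta * x * (y @ v); u /= np.linalg.norm(u)   # stochastic power-type updates
    v += eta * y * (x @ u); v /= np.linalg.norm(v)   # toward the leading singular pair

U, s, Vt = np.linalg.svd(M)
print(abs(u @ U[:, 0]), abs(v @ Vt[0]))              # both should be close to 1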
@techreport{diva2:288273,
author = {Landelius, Tomas and Knutsson, Hans and Borga, Magnus},
title = {{On-Line Singular Value Decomposition of Stochastic Process Covariances}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1995},
type = {Other academic},
number = {LiTH-ISY-R, 1762},
address = {Sweden},
}
We apply the 3D-orientation tensor representation to construct an object tracking algorithm. 2D-line normal flow is estimated by computing the eigenvector associated with the largest eigenvalue of 3D (two spatial dimensions plus time) tensors with a planar structure. The object's true 2D velocity is computed by averaging tensors with consistent normal flows, generating a 3D line representation that corresponds to a 2D point in motion. Flow induced by camera rotation is compensated for by ignoring points with velocity consistent with the ego-rotation. A region-of-interest growing process based on motion consistency generates estimates of object size and position.
@techreport{diva2:288608,
author = {Karlholm, Jörgen and Westelius, Carl-Johan and Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Object Tracking Based on the Orientation Tensor Concept}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1994},
type = {Other academic},
number = {LiTH-ISY-R, 1658},
address = {Sweden},
}
A robust, fast and general method for estimation of object properties is proposed. It is based on a representation of these properties in terms of channels. Each channel represents a particular value of a property, resembling the activity of biological neurons. Furthermore, each processing unit, corresponding to an artificial neuron, is a linear perceptron which operates on outer products of input data. This implies a more complex space of invariances than in the case of first-order characteristics, without abandoning linear theory. In general, the specific function of each processing unit has to be learned, and a fast and simple learning rule is presented. The channel representation, the processing structure and the learning rule have been tested on stereo image data showing a cube with various 3D positions and orientations. The system was able to learn a channel representation for the horizontal position, the depth, and the orientation of the cube, each property invariant to the other two.
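To make the channel idea concrete, here is a minimal sketch of one common channel encoding, truncated cos^2 kernels at regularly spaced positions (the report does not prescribe this exact kernel); a scalar property value becomes a vector of smooth, localized channel activations.

import numpy as np

def channel_encode(value, centers, width=1.0):
    """Encode a scalar as cos^2 channel activations around regularly spaced centers."""
    d = np.abs(value - centers)
    act = np.cos(np.pi * d / (2 * width)) ** 2
    act[d >= width] = 0.0                       # each channel has compact support
    return act

centers = np.arange(0.0, 10.0, 1.0)             # channels at 0, 1, ..., 9
print(channel_encode(3.3, centers, width=1.5))  # a few overlapping non-zero channels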
@techreport{diva2:288329,
author = {Nordberg, Klas and Granlund, Gösta and Knutsson, Hans},
title = {{Representation and Learning of Invariance}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1994},
type = {Other academic},
number = {LiTH-ISY-R, 1552},
address = {Sweden},
}
@techreport{diva2:288308,
author = {Westin, Carl-Fredrik and Westelius, Carl-Johan and Wiklund, Johan and Knutsson, Hans and Granlund, Gösta},
title = {{ESPRIT Basic Research Action 7108, Vision as Process, DR.B.2:
Integration of Multi-level Control Loops and FOA}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1994},
type = {Other academic},
number = {},
address = {Sweden},
}
A robust, general and computationally simple reinforcement learning system is presented. It uses a channel representation which is robust and continuous. The accumulated knowledge is represented as a reward prediction function in the outer product space of the input and output channel vectors. Each computational unit generates an output simply by a vector-matrix multiplication and the response can therefore be calculated fast. The response and a prediction of the reward are calculated simultaneously by the same system, which makes TD-methods easy to implement if needed. Several units can cooperate to solve more complicated problems. A dynamic tree structure of linear units is grown in order to divide the knowledge space into a sufficient number of regions in which the reward function can be properly described. The tree continuously tests split and prune criteria in order to adapt its size to the complexity of the problem.
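A tiny sketch of the core computation under assumed shapes: knowledge stored as a matrix W in the outer-product space of input and output channel vectors, so that the response and the reward prediction are both obtained by vector-matrix multiplications. The delta-rule update shown is illustrative and not necessarily the rule used in the report.

import numpy as np

rng = np.random.default_rng(6)
n_in, n_out = 8, 5
W = np.zeros((n_in, n_out))                   # reward prediction in outer-product space

def predict_reward(a, b):
    return a @ W @ b                          # bilinear form over the channel vectors

def choose_response(a):
    q = a @ W                                 # predicted reward for each output channel
    return np.eye(n_out)[np.argmax(q)]        # respond with the most promising channel

# One illustrative experience: input channels, chosen response, observed reward.
a = rng.random(n_in)
b = choose_response(a)
r = 1.0
W += 0.1 * (r - predict_reward(a, b)) * np.outer(a, b)   # simple delta-rule update
print(predict_reward(a, b))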
@techreport{diva2:288288,
author = {Borga, Magnus and Knutsson, Hans},
title = {{A Binary Competition Tree for Reinforcement Learning}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1994},
type = {Other academic},
number = {LiTH-ISY-R, 1623},
address = {Sweden},
}
This paper addresses the idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is its generality and especially that the reinforcement learning paradigm allows systems to be designed, which can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem the machine is to solve. Gained knowledge is represented by a behavior probability density function which is approximated with a number of normal distributions, stored in the nodes of a binary tree. It is argued that a meaningful partitioning into local models can only be accomplished in a fused space consisting of both stimuli and responses. Given a stimulus, the system searches for responses likely to result in highly reinforced decisions by treating the sum of the two normal distributions on each level in the tree as a distribution describing the system's behavior at that resolution. The resolution of the response, as well as the tree growing and pruning processes, are controlled by a random variable based on the difference in performance between two consecutive levels in the tree. This results in a system that will never be content but will indefinitely continue to search for better solutions.
@techreport{diva2:288270,
author = {Landelius, Tomas and Knutsson, Hans},
title = {{A Dynamic Tree Structure for Incremental Reinforcement Learning of Good Behavior}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1994},
type = {Other academic},
number = {LiTH-ISY-R, 1628},
address = {Sweden},
}
The tensor representation has proven a successful tool as a means to describe local multi-dimensional orientation. In this respect, the tensor representation is a map from the local orientation to a second order tensor. This paper investigates how variations of the orientation are mapped to variations of the tensor, thereby giving an explicit equivariance relation. The results may be used in order to design tensor based algorithms for extraction of image features defined in terms of local variations of the orientation, e.g. multi-dimensional curvature or circular symmetries. It is assumed that the variation of the local orientation can be described in terms of an orthogonal transformation group. Under this assumption a corresponding orthogonal transformation group, acting on the tensor, is constructed. Several correspondences between the two groups are demonstrated.
@techreport{diva2:288623,
author = {Nordberg, Klas and Knutsson, Hans and Granlund, Gösta},
title = {{On the Equivariance of the Orientation and the Tensor Field Representation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1530},
address = {Sweden},
}
@techreport{diva2:288594,
author = {Larsen, Rasmus},
title = {{Thoughts on Bayesian Estimation of Motion Vector Fields}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1521},
address = {Sweden},
}
@techreport{diva2:288587,
author = {Granlund, Gösta},
title = {{ESPRIT Project BRA 3038: Vision as Process, Final Report}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1473},
address = {Sweden},
}
@techreport{diva2:288577,
author = {Westin, Carl-Fredrik and Westelius, Carl-Johan},
title = {{ESPRIT Basic Research Action 7108, Vision as Process, DR.B.1: Integration of Low-level FOA \& Control Mechanisms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288569,
author = {Granum, Erik and others},
title = {{ESPRIT Basic Research Action 7108, Vision as Process, Periodic progress report}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288563,
author = {Wilson, Roland and Knutsson, Hans},
title = {{Seeing Things [1]}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1467},
address = {Sweden},
}
Three-dimensional data processing is becoming more and more common. Typical operations are for example estimation of optical flow in video sequences and orientation estimation in 3-D MR images. This paper proposes an efficient approach to robust low level feature extraction for 3-D image analysis. In contrast to many earlier algorithms the methods proposed in this paper support the use of relatively complex models at the initial processing steps. The aim of this approach is to provide the means to handle complex events at the initial processing steps and to enable reliable estimates in the presence of noise. A limited basis filter set is proposed which forms a basis on the unit sphere and is related to spherical harmonics. From these basis filters, different types of orientation selective filters are synthesized. An interpolation scheme that provides a rotation as well as a translation of the synthesized filter is presented. The purpose is to obtain a robust and invariant feature extraction at a manageable computational cost.
@techreport{diva2:288342,
author = {Andersson, Mats T. and Knutsson, Hans},
title = {{Controllable 3-D Filters for Low Level Computer Vision}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1526},
address = {Sweden},
}
@techreport{diva2:288290,
author = {Wiklund, Johan and Westin, Carl-Fredrik and Westelius, Carl-Johan},
title = {{AVS, Application Visualization System, Software Evaluation Report}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1993},
type = {Other academic},
number = {LiTH-ISY-R, 1469},
address = {Sweden},
}
@techreport{diva2:288624,
author = {Bårman, Håkan and Granlund, Gösta},
title = {{Hierarchical Feature Extraction for Computer-Aided Analysis of Mammograms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-R, 1448},
address = {Sweden},
}
@techreport{diva2:288561,
author = {Bårman, Håkan and Knutsson, Hans and Granlund, Gösta H.},
title = {{A Note on Estimation of Optical Flow and Acceleration}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-I, 1313},
address = {Sweden},
}
@techreport{diva2:288339,
author = {Wiklund, Johan and Westelius, Carl-Johan and Knutsson, Hans},
title = {{Hierarchical Phase Based Disparity Estimation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-I, 1327},
address = {Sweden},
}
This survey considers response generating systems that improve their behaviour using reinforcement learning. The difference between unsupervised learning, supervised learning, and reinforcement learning is described. Two general problems concerning learning systems are presented; the credit assignment problem and the problem of perceptual aliasing. Notations and some general issues concerning reinforcement learning systems are presented. Reinforcement learning systems are further divided into two main classes; memory mapping and projective mapping systems. Each of these classes is described and some examples are presented. Some other approaches are mentioned that do not fit into the two main classes. Finally some issues not covered by the surveyed articles are discussed, and some comments on the subject are made.
@techreport{diva2:288303,
author = {Borga, Magnus and Carlsson, Tomas},
title = {{A Survey of Current Techniques for Reinforcement Learning}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-I, 1391},
address = {Sweden},
}
@techreport{diva2:288294,
author = {Westin, Carl-Fredrik},
title = {{ESPRIT Basic Research Action 3038, Vision as Process, DR.A.2.1: Model Support and Local FOA Control}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {},
address = {Sweden},
}
The topic of this report is signal representation in the context of hierarchical image processing. An overview of hierarchical processing systems is included as well as a presentation of various approaches to signal representation, feature representation and feature extraction. It is claimed that image hierarchies based on feature extraction, so-called feature hierarchies, demand a signal representation other than the standard spatial or linear representation used today. A new representation, the operator representation, is developed. It is based on an interpretation of features in terms of signal transformations. This representation has no references to any spatial ordering of the signal elements and also gives an explicit representation of signal features. Using the operator representation, a generalization of the standard phase concept in image processing is introduced. Based on the operator representation, two algorithms for extraction of feature values are presented. Both have the capability of generating phase invariant feature descriptors. It is claimed that the operator representation in conjunction with some appropriate feature extraction algorithm is well suited as a general framework for defining multi level feature hierarchies. The report contains an appendix with the mathematical details necessary to comprehend the presentation.
@techreport{diva2:288284,
author = {Nordberg, Klas},
title = {{Signal Representation and Signal Processing using Operators}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-I, 1387},
address = {Sweden},
}
@techreport{diva2:288264,
author = {Westelius, Carl-Johan and Knutsson, Hans and Wiklund, Johan},
title = {{Robust Vergence Control Using Scale--Space Phase Information}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {LiTH-ISY-I, 1363},
address = {Sweden},
}
@techreport{diva2:288262,
author = {Westelius, Carl-Johan},
title = {{ESPRIT Basic Research Action 3038, Vision as Process, DS.A.2.1: Software for Model Support and Local FOA Control}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1992},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288626,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{ESPRIT Basic Research Action 3038, Vision as Process, DR.A.1.2: Definition of feature generating procedures}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288589,
author = {Wiklund, Johan and Knutsson, Hans and Wilson, Roland},
title = {{A Hierarchical Stereo Algorithm}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {LiTH-ISY-I, 1167},
address = {Sweden},
}
@techreport{diva2:288547,
author = {Bårman, Håkan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Using Principal Direction Estimates for Shape and Acceleration Description}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {LiTH-ISY-I, 1231},
address = {Sweden},
}
@techreport{diva2:288341,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{Line Segmentation by Clustering in Möbius-Hough Space}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {LiTH-ISY-I, 1221},
address = {Sweden},
}
@techreport{diva2:288333,
author = {Westelius, Carl-Johan and Granlund, Gösta},
title = {{Integrated Analysis-Control Structure for Robotic Systems}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288298,
author = {Westelius, Carl-Johan and Knutsson, Hans},
title = {{ESPRIT Basic Research Action 3038, Vision as Process, DS.A.1.1: Preliminary Software for Feature Extraction}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {},
address = {Sweden},
}
@techreport{diva2:288292,
author = {Wilson, Roland and Calway, Andrew and Pearson, Edward R. S.},
title = {{A generalised wavelet transform for Fourier analysis: The multiresolution Fourier transform and its application to image and audio signal analysis}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {LiTH-ISY-I, 1177},
address = {Sweden},
}
The problem of incorporating orientation selectivity into transforms which provide local frequency representation of image regions over a range of spatial scales is investigated. It is shown that this can be achieved if the local spectra are defined on a log-polar coordinate lattice and that by appropriate choice of window functions, the spectra can be designed to be steerable in arbitrary orientations. In addition, the resulting class of transforms can be defined to be invertible, be based on window functions having good localization in both the spatial and spatial frequency domains, and be efficiently implemented using FFT techniques. Results of using one such transform for linear feature extraction demonstrate its effectiveness when dealing with oriented features.
@techreport{diva2:288269,
author = {Calway, Andrew},
title = {{Incorporating Orientation Selectivity in Wavelet Transforms: For Multi--Resolution Fourier Analysis of Images}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1991},
type = {Other academic},
number = {LiTH-ISY-I, 1243},
address = {Sweden},
}
@techreport{diva2:288325,
author = {Westelius, Carl-Johan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Focus of Attention Control}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1990},
type = {Other academic},
number = {LiTH-ISY-I, 1140},
address = {Sweden},
}
@techreport{diva2:288319,
author = {Westin, Carl-Fredrik and Knutsson, Hans},
title = {{A Parameter Mapping for Line Segmentation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1990},
type = {Other academic},
number = {LiTH-ISY-I, 1151},
address = {Sweden},
}
@techreport{diva2:288293,
author = {Bårman, Håkan and Granlund, Gösta H. and Knutsson, Hans},
title = {{Hierarchical Curvature Estimation and Description}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1990},
type = {Other academic},
number = {LiTH-ISY-I, 1095},
address = {Sweden},
}
@techreport{diva2:288609,
author = {Bårman, Håkan and Knutsson, Hans and Granlund, Gösta H.},
title = {{Mechanisms for Striate Cortex Organization}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 1020},
address = {Sweden},
}
@techreport{diva2:288606,
author = {Westin, Carl-Fredrik and Westelius, Carl-Johan},
title = {{Brain chaos. A feature or a bug?}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 0990},
address = {Sweden},
}
This report is a survey of information representations in both biological and artificial neural networks. The correct information representation is crucial for the dynamics and the adaptation algorithms of neural networks. A number of examples of existing information representations are given.
@techreport{diva2:288541,
author = {Järvinen, Arto},
title = {{Information representation in neural networks -- A survey}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 0994},
address = {Sweden},
}
@techreport{diva2:288328,
author = {Granlund, Gösta H.},
title = {{Image Processing Systems and Components}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 1016},
address = {Sweden},
}
@techreport{diva2:288321,
author = {Granlund, Gösta H.},
title = {{Information Representation in Image Analysis Algorithms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 1017},
address = {Sweden},
}
@techreport{diva2:288313,
author = {Järvinen, Arto and Wiklund, Johan},
title = {{Study of information mapping in Kohonen--Networks}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 0978},
address = {Sweden},
}
@techreport{diva2:288296,
author = {Granlund, Gösta H.},
title = {{Discriminant Functions, Linear Operations and Learning}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1989},
type = {Other academic},
number = {LiTH-ISY-I, 1015},
address = {Sweden},
}
@techreport{diva2:288646,
author = {Granlund, Gösta H.},
title = {{Integrated Analysis-Response Structures for Robotics Systems}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0932},
address = {Sweden},
}
@techreport{diva2:288640,
author = {Granlund, Gösta H.},
title = {{Magnitude Representation of Feature Variables}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0933},
address = {Sweden},
}
@techreport{diva2:288600,
author = {Bårman, Håkan and Haglund, Leif and Granlund, Gösta H.},
title = {{Context Dependent Hierarchical Image Processing for Remote Sensing Data, Part Two: Contextual Classification and Segmentation}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0924},
address = {Sweden},
}
@techreport{diva2:288338,
author = {Granlund, Gösta H. and Knutsson, Hans},
title = {{Compact Associative Representation of Structural Information}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0931},
address = {Sweden},
}
@techreport{diva2:288336,
author = {Bigun, Josef},
title = {{Impressions from Picture Processing in USA and Japan}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0892},
address = {Sweden},
}
@techreport{diva2:288334,
author = {Andersson, Mats and Granlund, Gösta H.},
title = {{A Hybrid Image Processing Architecture}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0929},
address = {Sweden},
}
The symmetries in a neighbourhood of a gray value image are modelled by conjugate harmonic function pairs. A harmonic function pair is utilized to represent a coordinate transformation defining a symmetry type. In this coordinate representation the image parts, which are symmetric with respect to the chosen function pair, have iso-gray value curves which are simple lines or parallel line patterns. The detection is modelled in the special Fourier domain corresponding to the new variables by minimizing an error function. It is shown that the minimization process, or detection of these patterns, can be carried out for the whole image entirely in the spatial domain by convolutions. The convolution kernel is complex valued, as is the result. The magnitudes of the result are shown to correspond to a well defined certainty measure, while the orientation is the least square estimate of an orientation in the Fourier transform corresponding to the harmonic coordinates. Applications to four symmetries are given. These are circular, linear, hyperbolic and parabolic symmetries. Experimental results are presented.
@techreport{diva2:288323,
author = {Bigun, Josef},
title = {{Detection of Linear Symmetry in Multiple Dimensions for Description of Local Orientation and Optical Flow}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 893},
address = {Sweden},
}
@techreport{diva2:288287,
author = {Granlund, Gösta H.},
title = {{Bi-Directionally Adaptive Models in Image Analysis}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1988},
type = {Other academic},
number = {LiTH-ISY-I, 0930},
address = {Sweden},
}
The problem of optimal detection of orientation in arbitrary neighborhoods is solved in the least squares sense. It is shown that this corresponds to fitting an axis in the Fourier domain of the n-dimensional neighborhood, the solution of which is a well known solution of a matrix eigenvalue problem. The eigenvalues are the variance or inertia with respect to the axes given by their respective eigenvectors. The orientation is taken as the axis given by the least eigenvalue. Moreover, it is shown that the necessary computations can be pursued in the spatial domain without doing a Fourier transformation. An implementation for 2-D is presented. Two certainty measures are given corresponding to the orientation estimate. These are the relative or the absolute distances between the two eigenvalues, revealing whether the fitted axis is much better than an axis orthogonal to it. The result of the implementation is verified by experiments which confirm an accurate orientation estimation and a reliable certainty measure in the presence of additive noise at high as well as low levels.
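The construction described above, least-squares orientation from an eigenvalue problem evaluated entirely in the spatial domain, is essentially what is now called the structure tensor. The 2-D sketch below is a hedged modern rendering with assumed gradient operators and smoothing scale, not the original implementation; it also reports the relative eigenvalue distance as a certainty measure.

import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def orientation_2d(image, sigma=2.0):
    gx = sobel(image, axis=1, mode="nearest")
    gy = sobel(image, axis=0, mode="nearest")
    # Local averages of the outer product of the gradient (the inertia tensor).
    txx = gaussian_filter(gx * gx, sigma)
    tyy = gaussian_filter(gy * gy, sigma)
    txy = gaussian_filter(gx * gy, sigma)
    # Eigenvalues of [[txx, txy], [txy, tyy]] in closed form.
    tr, det = txx + tyy, txx * tyy - txy ** 2
    disc = np.sqrt(np.maximum((tr / 2) ** 2 - det, 0.0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc
    angle = 0.5 * np.arctan2(2 * txy, txx - tyy)       # dominant gradient orientation
    certainty = (lam1 - lam2) / (lam1 + lam2 + 1e-12)  # relative eigenvalue distance
    return angle, certainty

# Oriented test pattern: stripes at 30 degrees plus additive noise.
y, x = np.mgrid[0:128, 0:128]
img = np.sin(0.3 * (np.cos(np.radians(30)) * x + np.sin(np.radians(30)) * y))
ang, cert = orientation_2d(img + 0.2 * np.random.default_rng(7).standard_normal(img.shape))
print(np.degrees(np.median(ang)), np.median(cert))     # ~30 degrees, certainty near 1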
@techreport{diva2:691493,
author = {Bigun, Josef},
title = {{Optimal Orientation Detection of Linear Symmetry}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1987},
type = {Other academic},
number = {LiTH-ISY-I, 828},
address = {Sweden},
}
@techreport{diva2:288607,
author = {Albregtsen, Fritz},
title = {{Enhancing Satellite Images of the Antarctic Snow and Ice Cover by Context Dependent Anisotropic Nonstationary Filtering.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1987},
type = {Other academic},
number = {LiTH-ISY-I, 0852},
address = {Sweden},
}
@techreport{diva2:288274,
author = {Bigun, Josef},
title = {{Optimal Orientation Detection of Circular Symmetry.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1987},
type = {Other academic},
number = {LiTH-ISY-I, 0871},
address = {Sweden},
}
A definition of central symmetry for local neighborhoods of 2-D images is given. A complete ON-set of centrally symmetric basis functions is proposed. The local neighborhoods are expanded in this basis. The behavior of the coefficient spectrum obtained by this expansion is proposed to be the foundation of central symmetry parameters of the neighborhoods. Specifically, examination of two such behaviors is proposed: point concentration and line concentration of the energy spectrum. Moreover, the study of these types of behaviors of the spectrum is shown to be possible to carry out in the spatial domain.
@techreport{diva2:691498,
author = {Bigun, Josef and Granlund, Gösta H.},
title = {{Central Symmetry Modelling}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1986},
type = {Other academic},
number = {LiTH-ISY-I, 789},
address = {Sweden},
}
@techreport{diva2:288617,
author = {Bårman, Håkan and Granlund, Gösta H. and Knutsson, Hans and Näppä, L.},
title = {{Context Dependent Hierarchical Image Processing for Remote Sensing Data.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1986},
type = {Other academic},
number = {LiTH-ISY-I, 0824},
address = {Sweden},
}
@techreport{diva2:288554,
author = {Granlund, Gösta H.},
title = {{Introduction to GOP Computer Vision.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1986},
type = {Other academic},
number = {LiTH-ISY-I, 0849},
address = {Sweden},
}
@techreport{diva2:288310,
author = {Näppä, Lars and Granlund, Gösta H.},
title = {{Texture Analysis and Description.}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1985},
type = {Other academic},
number = {LiTH-ISY-I, 0775},
address = {Sweden},
}
@techreport{diva2:403796,
author = {Granlund, Gösta},
title = {{Images and Computers}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1984},
type = {Other academic},
number = {LiTH-ISY-I, 0701},
address = {Sweden},
}
@techreport{diva2:403809,
author = {Wilson, Roland and Granlund, Gösta},
title = {{The Uncertainty Principle in Image Processing}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0576},
address = {Sweden},
}
@techreport{diva2:403805,
author = {Wilson, Roland},
title = {{Uncertainty, Eigenvalue Problems and Filter Design}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0580},
address = {Sweden},
}
@techreport{diva2:403801,
author = {Wilson, Roland},
title = {{The Uncertainty Principle in Vision}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0581},
address = {Sweden},
}
@techreport{diva2:403800,
author = {Wilson, Roland},
title = {{Quad-Tree Predictive Coding:
A New Class of Image Data Compression Algorithms}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0609},
address = {Sweden},
}
@techreport{diva2:403798,
author = {Wilson, Roland},
title = {{A Class of Local Centroid Algorithms for Classification and Quantization in Spaces of Arbitrary Dimension}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0610},
address = {Sweden},
}
@techreport{diva2:288302,
author = {Wilson, Roland},
title = {{The Uncertainty Principle in Image Coding}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1983},
type = {Other academic},
number = {LiTH-ISY-I, 0579},
address = {Sweden},
}
Operators for extraction of local information are essential components in an image processing system. This paper concentrates on the design and evaluation of convolution kernel sets enabling easy estimation of local orientation and frequency.
Consideration of interpolation properties and the limiting effects of the uncertainty principle leads to the definition of an "ideal" quadrature filter function. An optimization procedure is utilized to produce pairs of convolution kernels which implement an approximation of the desired function. A number of optimization results are presented.
To evaluate the performance of the optimized kernels in an image processing task, a series of experiments has been carried out. Examples are given of local orientation and frequency estimates for images with different signal to noise ratios. An angle deviation measure is defined and a vector averaging scheme is introduced to increase angle estimation accuracy. Using a 0 dB SNR test image, orientation estimates are produced having an expected deviation of less than 7 degrees.
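As a hedged sketch of the kernel-optimization step (the report's "ideal" quadrature function and optimization criterion are more elaborate), the code below finds the small spatial kernel whose DFT best matches an assumed one-sided band-pass frequency response in the least-squares sense; the real and imaginary parts of the result form the pair of kernels.

import numpy as np

N, K = 256, 15                                   # DFT length and spatial kernel size
w = 2 * np.pi * np.fft.fftfreq(N)                # angular frequencies on the DFT grid

# Assumed target: a band-pass, one-sided ("quadrature") response around w0.
w0, bw = 1.0, 0.6
desired = np.where(w > 0, np.exp(-((w - w0) ** 2) / (2 * bw ** 2)), 0.0).astype(complex)

# Columns of F map the K centered kernel taps to the DFT grid.
taps = np.arange(K) - K // 2
F = np.exp(-1j * np.outer(w, taps))
kernel, *_ = np.linalg.lstsq(F, desired, rcond=None)   # complex kernel: even/odd pair
print(np.max(np.abs(F @ kernel - desired)))            # worst-case approximation error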
@techreport{diva2:319074,
author = {Knutsson, Hans},
title = {{Design of Convolution Kernels}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1982},
type = {Other academic},
number = {LiTH-ISY-I, 0557},
address = {Sweden},
}
@techreport{diva2:288540,
author = {Granlund, Gösta H.},
title = {{Hierarchical Distributed Data Structures and Operations}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1982},
type = {Other academic},
number = {LiTH-ISY-I, 0512},
address = {Sweden},
}
@techreport{diva2:288571,
author = {Granlund, Gösta H. and Knutsson, Hans and Hedlund, Martin},
title = {{Hierarchical Processing of Structural Information}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1981},
type = {Other academic},
number = {LiTH-ISY-I, 0481},
address = {Sweden},
}
@techreport{diva2:288309,
author = {Kunt, Murat},
title = {{Picture Coding with the General Operator Processor (GOP)}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1980},
type = {Other academic},
number = {LiTH-ISY-I, 0370},
address = {Sweden},
}
@techreport{diva2:288306,
author = {Knutsson, Hans},
title = {{3-D Reconstruction by Fourier Techniques with Error Estimates}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1978},
type = {Other academic},
number = {LiTH-ISY-I, 0214},
address = {Sweden},
}
@techreport{diva2:288337,
author = {Granlund, Gösta H.},
title = {{Computer Processing and Display of Chromosome Image Information}},
institution = {Linköping University, Department of Electrical Engineering},
year = {1973},
type = {Other academic},
number = {LiTH-ISY-I, 0023},
address = {Sweden},
}
Student theses
While markerless motion capture provided acceptable accuracy, no clear patterns emerged regarding the individual effects of surface properties on technique. This is most likely due to limitations such as sample size, the lack of a standardized data set (players) across facilities, and limited control over player behavior. However, analyzing one individual's motion capture data across surfaces showed potential for distinguishing turning styles based on facility parameters.
The method in this thesis demonstrates the potential of markerless motion capture for injury prevention research in football. Despite inconclusive results on the individual facility parameter effects, the ability to distinguish player styles across surfaces suggests valuable future directions for investigating personalized risk factors and optimizing playing surfaces. Further research with larger, more diverse samples and a broader set of biomechanical and facility features could provide deeper insight into injury prevention strategies.
@mastersthesis{diva2:1848290,
author = {Rommel, Kaspar},
title = {{Influence of artificial turf on football technique using motion capture and 3D modelling}},
school = {Linköping University},
type = {{}},
year = {2024},
address = {Sweden},
}
This thesis investigates the seasonal predictive capabilities of Neural Radiance Fields (NeRF) applied to satellite images. Focusing on the utilization of satellite data, the study explores how Sat-NeRF, a novel approach in computer vision, performs in predicting seasonal variations across different months. Through comprehensive analysis and visualization, the study examines the model's ability to capture and predict seasonal changes, highlighting specific challenges and strengths. Results showcase the impact of the sun on predictions, revealing nuanced details in seasonal transitions, such as snow cover, color accuracy, and texture representation in different landscapes. The research introduces modifications to the Sat-NeRF network. The implemented versions of the network include geometrically rendered shadows, a signed distance function, and a month embedding vector, where the last version mentioned resulted in Planet-NeRF. Comparative evaluations reveal that Planet-NeRF outperforms prior models, particularly in refining seasonal predictions. This advancement contributes to the field by presenting a more effective approach for seasonal representation in satellite imagery analysis, offering promising avenues for future research in this domain.
@mastersthesis{diva2:1841942,
author = {Ingerstad, Erica and Kåreborn, Liv},
title = {{Planet-NeRF:
Neural Radiance Fields for 3D Reconstruction on Satellite Imagery in Season Changing Environments}},
school = {Linköping University},
type = {{LiTH-ISY-EX--24/5631--SE}},
year = {2024},
address = {Sweden},
}
This thesis explores the integration of deep learning-based depth estimation models with the ORB-SLAM3 framework to address challenges in monocular Simultaneous Localization and Mapping (SLAM), particularly focusing on pure rotational movements. The study investigates the viability of using pre-trained generic depth estimation networks, and hybrid combinations of these networks, to replace traditional depth sensors and improve scale accuracy in SLAM systems. A series of experiments are conducted outdoors, utilizing a custom camera setup designed to isolate pure rotational movements. The analysis involves assessing each model's impact on the SLAM process as well as key performance indicators (KPIs) for both depth estimation and 3D tracking. Results indicate a correlation between depth estimation accuracy and SLAM performance, underscoring the potential of depth estimation models in enhancing SLAM systems. The findings contribute to the understanding of the role of monocular depth estimation in integrating with SLAM, especially in applications requiring precise spatial awareness for augmented reality.
@mastersthesis{diva2:1845865,
author = {Bladh, Daniel},
title = {{Deep Learning-Based Depth Estimation Models with Monocular SLAM:
Impacts of Pure Rotational Movements on Scale Drift and Robustness}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5630--SE}},
year = {2023},
address = {Sweden},
}
Automatic 3D reconstruction of birds can aid researchers in studying their behavior. Recently there has been an attempt to reconstruct a variety of birds from single-view images. However, the common murre's appearance is different from the birds that have been studied. Moreover, recent studies have focused on side views. This thesis studies the 3D reconstruction of the common murre from single-view top-view images. A template mesh is first optimized to fit a 3D scan. Then the result is used to optimize a species-specific mean from side-view images annotated with keypoints and silhouettes. The resulting mean mesh is used to initialize the optimization for top-down images. Using a mask loss, a pose prior loss, and a bone length loss that uses a mean vector from the side-view images improves the 3D reconstruction as rated by humans. Furthermore, the intersection over union (IoU) and percentage of correct keypoints (PCK), although used by other authors, are insufficient in a single-view top-view setting.
@mastersthesis{diva2:1779743,
author = {Hägerlind, Johannes},
title = {{3D-Reconstruction of the Common Murre}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5576--SE}},
year = {2023},
address = {Sweden},
}
The goal of this thesis is to use fringe-pattern phase analysis to calibrate the distortion of a camera lens. The benefit of using this method is that the distortion can be calculated using data from each individual pixel and the methodology does not need any model.
The phase used to calibrate the images is calculated in two different ways, either utilizing the monogenic signal or through fringe-pattern phase analysis.
The calibration approaches were also validated through different methods. Primarily by utilizing the Hough transform and calibrating simulated distortion. The thesis also introduces a validation approach utilizing the phase orientation calculated through the monogenic signal.
The thesis also implements different approaches, such as flat field correction, to limit the impact of image sensor noise and thereby mitigate the phase noise.
It is also investigated, through comparative analysis, which fringe-pattern frequencies are best suited for calibration. The comparative analysis identified problems with both too high and too low fringe-pattern frequencies when calibrating using fringe-pattern phase analysis.
@mastersthesis{diva2:1773375,
author = {Karlsson, Karl},
title = {{Camera Distortion Calibration through Fringe Pattern Phase Analysis}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5580--SE}},
year = {2023},
address = {Sweden},
}
In the digital age where video content is abundant, this thesis investigates the efficient adaptation of an existing video-language model (VLM) to new data. The research leverages CLIP, a robust language-vision model, for various video-related tasks including video retrieval. The study explores using pre-trained VLMs to extract video embeddings without the need for extensive retraining. The effectiveness of a smaller model using aggregation is compared with larger models, and the application of logistic regression for few-shot learning on video embeddings is examined. The aggregation was done both without learning, through mean-pooling, and by utilizing a transformer. The video-retrieval models were evaluated on the ActivityNet Captions dataset, which contains long videos with dense descriptions, while the linear probes were evaluated on ActivityNet200, a video classification dataset.
The study's findings suggest that most models improved when additional frames were employed through aggregation. A model trained with fewer frames was able to surpass those trained with two or four times more frames by instead using aggregation. The incorporation of patch dropout and the freezing of embeddings proved advantageous by enhancing performance and conserving training resources. Furthermore, using a linear probe showed that the extracted features were of high quality, requiring only 2-4 samples per class to match the zero-shot performance.
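A minimal sketch of the two simplest ingredients discussed above, mean-pooling frame embeddings into a video embedding and fitting a logistic-regression probe on a few samples per class; it assumes per-frame CLIP embeddings have already been extracted, and the random vectors below are stand-ins for them.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n_videos, n_frames, dim, n_classes = 200, 8, 512, 10

# Stand-ins for per-frame CLIP embeddings of shape (videos, frames, dim).
class_centers = rng.standard_normal((n_classes, dim))
labels = np.repeat(np.arange(n_classes), n_videos // n_classes)
frame_emb = class_centers[labels][:, None, :] + 0.5 * rng.standard_normal(
    (n_videos, n_frames, dim)
)

# Non-learned aggregation: mean-pool the frames, then L2-normalise the video embedding.
video_emb = frame_emb.mean(axis=1)
video_emb /= np.linalg.norm(video_emb, axis=1, keepdims=True)

# Few-shot linear probe: 4 labelled videos per class, evaluate on the rest.
train_idx = np.concatenate([np.where(labels == c)[0][:4] for c in range(n_classes)])
test_idx = np.setdiff1d(np.arange(n_videos), train_idx)
probe = LogisticRegression(max_iter=1000).fit(video_emb[train_idx], labels[train_idx])
print("few-shot accuracy:", probe.score(video_emb[test_idx], labels[test_idx]))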
@mastersthesis{diva2:1772807,
author = {Lindgren, Felix},
title = {{Efficient Utilization of Video Embeddings from Video-Language Models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5592--SE}},
year = {2023},
address = {Sweden},
}
In the field of autonomous driving a common scenario is to apply deep learning models on camera feeds to provide information about the surroundings. A recent trend is for such vision-based methods to be centralized, in that they fuse images from all cameras in one big model for a single comprehensive output. Designing and tuning such models is hard and time consuming, in both development and training. This thesis aims to reproduce the results of a paper about a centralized vision-based model performing 3D object detection, called BEVDet. Additional goals are to ablate the technique of class balanced grouping and sampling used in the model, to tune the model to improve generalization, and to change the detection head of the model to a Transformer decoder-based head.
The findings include a successful reproduction of the results of the paper, while adding depth supervision to BEVDet establishes a baseline for the subsequent experiments. An increasing validation loss during most of the training indicates that there is room for improvement in the generalization of the model. Several different methods are tested in order to resolve the increasing validation loss, but they all fail to do so. The ablation study shows that the class balanced grouping is important for the performance of the chosen configuration of the model, while the class balanced sampling does not contribute significantly. Without extensive tuning the replacement head gives performance similar to the PETR, the model that the head is adapted from, but fails to match the performance of the baseline model. In addition, the model with the Transformer decoder-based head shows a converging validation loss, unlike the baseline model.
@mastersthesis{diva2:1771747,
author = {Lidman, Erik},
title = {{Visual Bird's-Eye View Object Detection for Autonomous Driving}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5579--SE}},
year = {2023},
address = {Sweden},
}
In synthetic aperture radar (SAR) and inverse synthetic aperture radar (ISAR), an imaging radar emits electromagnetic waves of varying frequencies towards a target and the backscattered waves are collected. By either moving the radar antenna or rotating the target and combining the collected waves, a much longer synthetic aperture can be created. These radar measurements can be used to determine the radar cross-section (RCS) of the target and to reconstruct an estimate of the target. However, the reconstructed images will suffer from spectral leakage effects and are limited in resolution. Many methods of enhancing the images exist and some are based on deep learning. Most commonly the deep learning methods rely on high-resolution ground truth data of the scene to train a neural network to enhance the radar images. In this thesis, a method that does not rely on any high-resolution ground truth data is applied to train a convolutional neural network to enhance radar images. The network takes a conventional ISAR image subject to spectral leakage effects as input and outputs an enhanced ISAR image which contains much more defined features. New RCS measurements are created from the enhanced ISAR image and the network is trained to minimise the difference between the original RCS measurements and the new RCS measurements. A sparsity constraint is added to ensure that the proposed enhanced ISAR image is sparse. The synthetic training data consists of scenes containing point scatterers that are either individual or grouped together to form shapes. The scenes are used to create synthetic radar measurements which are then used to reconstruct ISAR images of the scenes. The network is tested using both synthetic data and measurement data from a cylinder and two aeroplane models. The network manages to minimise spectral leakage and increase the resolution of the ISAR images created from both synthetic and measured RCSs, especially on measured data from target models which have similar features to the synthetic training data.
The contributions of this thesis work are firstly a convolutional neural network that enhances ISAR images affected by spectral leakage. The neural network handles complex-valued signals as a single channel and does not perform any rescaling of the input. Secondly, it is shown that it is sufficient to calculate the new RCS for much fewer frequency samples and angular positions and compare those measurements to the corresponding frequency samples and angular positions in the original RCS to train the neural network.
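To make the described training objective concrete, the following is a minimal PyTorch-style sketch of such a loss, assuming a differentiable operator forward_op that re-simulates RCS samples from the proposed enhanced image; the operator, tensor shapes and the weight sparsity_weight are illustrative assumptions, not the thesis implementation.

import torch

def isar_training_loss(enhanced_image, measured_rcs, forward_op, sparsity_weight=1e-3):
    """Loss sketch: consistency between the original RCS measurements and new
    measurements re-simulated from the enhanced image, plus an L1 sparsity term."""
    simulated_rcs = forward_op(enhanced_image)             # re-create RCS samples from the enhanced image
    data_term = torch.mean(torch.abs(simulated_rcs - measured_rcs) ** 2)
    sparsity_term = torch.mean(torch.abs(enhanced_image))  # encourages a sparse scene estimate
    return data_term + sparsity_weight * sparsity_term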
@mastersthesis{diva2:1767511,
author = {Enåkander, Moltas},
title = {{ISAR Imaging Enhancement Without High-Resolution Ground Truth}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5572--SE}},
year = {2023},
address = {Sweden},
}
Detecting defects in industrially manufactured products is crucial to ensure their safety and quality. This process can be both expensive and error-prone if done manually, making automated solutions desirable. There is extensive research on industrial anomaly detection in images, but recent studies have shown that adding 3D information can increase the performance. This thesis aims to extend the 2D anomaly detection framework, PaDiM, to incorporate 3D information. The proposed methods combine RGB with depth maps or point clouds and the effects of using PointNet++ and vision transformers to extract features are investigated. The methods are evaluated on the MVTec 3D-AD public dataset using the metrics image AUROC, pixel AUROC and AUPRO, and on a small dataset collected with a Time-of-Flight sensor. This thesis concludes that the addition of 3D information improves the performance of PaDiM and vision transformers achieve the best results, scoring an average image AUROC of 86.2±0.2 on MVTec 3D-AD.
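For reference, the PaDiM baseline that the thesis extends models each patch position of pretrained features with a multivariate Gaussian and scores test patches by Mahalanobis distance; the sketch below illustrates that scoring step under assumed array shapes (the 3D extensions would concatenate depth or point-cloud features into the same embeddings).

import numpy as np

def fit_patch_gaussians(features):
    """features: (N, P, D) embeddings for N normal images, P patch positions, D dims.
    Returns per-position mean and regularized inverse covariance."""
    n, p, d = features.shape
    means = features.mean(axis=0)                                           # (P, D)
    inv_covs = np.empty((p, d, d))
    for i in range(p):
        cov = np.cov(features[:, i, :], rowvar=False) + 0.01 * np.eye(d)    # regularization
        inv_covs[i] = np.linalg.inv(cov)
    return means, inv_covs

def anomaly_scores(test_features, means, inv_covs):
    """Mahalanobis distance of each patch embedding (P, D) to its per-position Gaussian."""
    diffs = test_features - means
    return np.sqrt(np.einsum('pd,pde,pe->p', diffs, inv_covs, diffs))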
@mastersthesis{diva2:1766718,
author = {Bärudde, Kevin and Gandal, Marcus},
title = {{Industrial 3D Anomaly Detection and Localization Using Unsupervised Machine Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5569--SE}},
year = {2023},
address = {Sweden},
}
This thesis explores the application of Contrastive Language-Image Pre-Training (CLIP), a vision-language model, in an automated video surveillance system for anomaly detection. The ability of CLIP to perform zero-shot learning, coupled with its robustness against minor image alterations due to its lack of reliance on pixel-level image analysis, makes it a suitable candidate for this application.
The study investigates the performance of CLIP in tandem with various anomaly detection algorithms within a visual surveillance system. A custom dataset was created for video anomaly detection, encompassing two distinct views and two varying levels of anomaly difficulty. One view offers a more zoomed-in perspective, while the other provides a wider perspective. This was conducted to evaluate the capacity of CLIP to manage objects that occupy either a larger or smaller portion of the entire scene.
Several different anomaly detection methods were tested with varying levels of supervision, including unsupervised, one-class classification, and weakly-supervised algorithms, which were compared against each other. To create better separation between the CLIP embeddings, a metric learning model was trained and then used to transform the CLIP embeddings to a new embedding space.
The study found that CLIP performs effectively when anomalies take up a larger part of the image, such as in the zoomed-in view, where some of the One-Class Classification (OCC) and weakly-supervised methods demonstrated superior performance. When anomalies take up a significantly smaller part of the image in the wider view, CLIP has difficulty distinguishing anomalies from normal scenes, even when using the transformed CLIP embeddings. For the wider view, the results again favoured the OCC and weakly-supervised methods.
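As an illustration of the overall pipeline, the sketch below embeds frames with CLIP and scores them with a simple one-class k-nearest-neighbour rule; normal_frame_paths and test_frame_paths are placeholder names, and kNN distance is only one of several possible OCC scorers, not necessarily the one used in the thesis.

import clip                      # OpenAI CLIP package
import torch
from PIL import Image
from sklearn.neighbors import NearestNeighbors

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def embed(paths):
    """Return L2-normalised CLIP image embeddings for a list of image paths."""
    with torch.no_grad():
        batch = torch.stack([preprocess(Image.open(p)) for p in paths]).to(device)
        feats = model.encode_image(batch)
        return torch.nn.functional.normalize(feats, dim=-1).cpu().numpy()

# Fit on embeddings of normal frames, then score test frames by their mean
# distance to the k nearest normal embeddings (higher = more anomalous).
normal_feats = embed(normal_frame_paths)    # placeholder list of training frame paths
knn = NearestNeighbors(n_neighbors=5).fit(normal_feats)
test_feats = embed(test_frame_paths)        # placeholder list of test frame paths
scores = knn.kneighbors(test_feats)[0].mean(axis=1)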
@mastersthesis{diva2:1765573,
author = {Gärdin, Christoffer},
title = {{Anomaly Detection with Machine Learning using CLIP in a Video Surveillance Context}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5564--SE}},
year = {2023},
address = {Sweden},
}
In the area of Traffic Sign Recognition (TSR), deep learning models are trained to detect and classify images of traffic signs. The amount of data available to train these models is often limited, and collecting more data is time-consuming and expensive. A possible complement to traditional data acquisition is to generate synthetic images with a generative machine learning model. This thesis investigates the use of denoising diffusion probabilistic models for generating synthetic data of one or multiple traffic sign classes, when providing different amounts of real images for the class (or classes). In the few-sample method, the number of images used ranged from 1 to 1000, and zero images were used in the zero-shot method. The results from the few-sample method show that combining synthetic images with real images when training a traffic sign classifier increases the performance in 3 out of 6 investigated cases. The results indicate that the developed zero-shot method is useful if further refined, and could potentially enable generation of realistic images of signs not seen in the training data.
@mastersthesis{diva2:1764694,
author = {Carlson, Johanna and Byman, Lovisa},
title = {{Generation of Synthetic Traffic Sign Images using Diffusion Models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5563--SE}},
year = {2023},
address = {Sweden},
}
Today the process of sorting second-hand clothes and textiles is mostly manual. In this master’s thesis, methods for automating this process as well as improving the manual sorting process have been investigated. The methods explored include the automatic prediction of price and intended usage for second-hand clothes, as well as different types of image retrieval to aid manual sorting. Two models were examined: CLIP, a multi-modal model, and MAE, a self-supervised model. Quantitatively, the results favored CLIP, which outperformed MAE in both image retrieval and prediction. However, MAE may still be useful for some applications in terms of image retrieval as it returns items that look similar, even if they do not necessarily have the same attributes. In contrast, CLIP is better at accurately retrieving garments with as many matching attributes as possible. For price prediction, the best model was CLIP. When fine-tuned on the dataset used, CLIP achieved an F1-Score of 38.08 using three different price categories in the dataset. For predicting the intended usage (either reusing the garment or exporting it to another country) the best model managed to achieve an F1-Score of 59.04.
@mastersthesis{diva2:1763534,
author = {Hermansson, Simon},
title = {{Learning Embeddings for Fashion Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5567--SE}},
year = {2023},
address = {Sweden},
}
Point cloud registration with data measured by a photon-counting LIDAR sensor from a large distance (500 m - 1.5 km) is an expanding field. Data measured from far away is sparse and has low detail, which can make the registration process difficult, and registering this type of data is fairly unexplored. In recent years, machine learning for point cloud registration has been explored with promising results. This work compares the performance of the point cloud registration algorithm Iterative Closest Point (ICP) with state-of-the-art algorithms, using data from a photon-counting LIDAR sensor. The data was provided by the Swedish Defense Research Agency (FOI). The chosen state-of-the-art algorithms were the non-learning-based Fast Global Registration and the learning-based D3Feat and SpinNet. The results indicated that all state-of-the-art algorithms achieve a substantial increase in performance compared to the Iterative Closest Point method. All the state-of-the-art algorithms utilize their calculated features to obtain better correspondence points and can therefore achieve higher performance in point cloud registration. D3Feat performed point cloud registration with the highest accuracy of all the evaluated algorithms, including ICP.
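For orientation, the ICP baseline can be run in a few lines with Open3D, as sketched below; the voxel size and correspondence distance are illustrative assumptions and would need tuning for sparse long-range photon-counting data, while the learned methods (D3Feat, SpinNet) instead replace the correspondence step with learned descriptors.

import numpy as np
import open3d as o3d

def icp_register(source_path, target_path, voxel_size=0.5, max_corr_dist=2.0):
    """Point-to-point ICP baseline: downsample both clouds, then refine from an
    identity initialisation. Returns the estimated 4x4 rigid transform."""
    source = o3d.io.read_point_cloud(source_path).voxel_down_sample(voxel_size)
    target = o3d.io.read_point_cloud(target_path).voxel_down_sample(voxel_size)
    result = o3d.pipelines.registration.registration_icp(
        source, target, max_corr_dist, np.eye(4),
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation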
@mastersthesis{diva2:1761482,
author = {Boström, Maja},
title = {{Point Cloud Registration using both Machine Learning and Non-learning Methods:
with Data from a Photon-counting LIDAR Sensor}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5558--SE}},
year = {2023},
address = {Sweden},
}
Image fusion is a technique that aims to combine semantic information from different source images into a new synthesized image that contains information from both source images. It is a technique that can be useful in many different areas, such as reconnaissance, surveillance and medical diagnostics. A crucial aspect of image fusion is finding important features in the source images and preserving these in the fused image. A possible method to find and preserve the features could be to utilize deep learning. This thesis trains and evaluates an unsupervised network on two new datasets created for the fusion of visual near-infrared (VNIR) and long-wave infrared (LWIR) images. Feature representations obtained from a pre-trained network are implemented in the loss function, followed by training and evaluation of that model as well. Both deep learning models are compared with results obtained from a traditional image fusion method. The trained models performed well, although the traditional method performed better on dataset 1. The deep learning models did perform better on dataset 2, which contained images captured in daylight and dusk conditions. The resulting fused images from the deep learning approaches demonstrated better contrast compared to the fused images obtained by averaging. The additional feature representations obtained from the pre-trained network did not improve the results on either of the datasets. An explanation for these results could be that the loss function already helps to preserve the semantic information in the features.
@mastersthesis{diva2:1737202,
author = {Granqvist, Matilda},
title = {{Infrared and Visible Image Fusion with an Unsupervised Network}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5540--SE}},
year = {2023},
address = {Sweden},
}
With over 53 million articles and 11 million images, Wikipedia is the largest encyclopedia in history. The number of users is equally significant, with daily views surpassing 1 billion. Such an enormous system needs automation of tasks to make it possible for the volunteers to maintain. When it comes to textual data, there is a machine-learning-based system called ORES that automates tasks such as article quality estimation and article topic routing. A visual counterpart also needs to be developed to support tasks such as vandalism detection in images and to better understand the visual data of Wikipedia. Researchers from the Wikimedia Foundation identified a hindrance to implementing the visual counterpart of ORES: the images of Wikipedia lack topical metadata. Thus, this work aims to develop a deep learning model that classifies images into a set of topics, which have been pre-determined in parallel work. State-of-the-art image classification models and other methods to mitigate the existing class imbalance are used. The conducted experiments show, among other things, that: using the data that considers the hierarchy of labels performs better; resampling techniques are ineffective at mitigating imbalance due to the high label concurrence; sample weighting improves metrics; and initializing parameters as pre-trained on ImageNet rather than randomly yields better metrics. Moreover, we find interesting outlier labels that, despite having fewer samples, obtain better performance metrics, which is believed to be due either to bias from pre-training or simply more signal in the label. The distribution of the visual data predicted by the models is also presented. Finally, some qualitative examples of the model predictions for selected images are presented, demonstrating the ability of the model to find correct labels that are missing in the ground truth.
@mastersthesis{diva2:1729493,
author = {Vieira Bernat, Matheus},
title = {{Topical Classification of Images in Wikipedia:
Development of topical classification models followed by a study of the visual content of Wikipedia}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5538--SE}},
year = {2023},
address = {Sweden},
}
Harness racing horses are exposed to high workload and consequently, they are at risk of joint injuries and lameness. In recent years, the interest in applications to improve animal welfare has increased and there is a demand for objective assessment methods that can enable early and robust diagnosis of injuries.
In this thesis, experiments were conducted on video recordings collected by a helmet camera mounted on the driver of a sulky. The aim was to take the first steps toward equine gait analysis by investigating how semantic segmentation and 3D reconstruction of such data could be performed. Since these were the first experiments made on this data, no expectations of the results existed in advance.
Manual pixel-wise annotations were created on a small set of extracted frames and a deep learning model for semantic segmentation was trained to localize the horse, as well as the sulky and reins. The results are promising and could probably be further improved by expanding the annotated dataset and using a larger image resolution. Structure-from-motion using COLMAP was performed to estimate the camera motion in part of a video recording. A method to filter out dynamic objects based on masks created from predicted segmentation maps was investigated and the results showed that the reconstruction was part-wise successful, but struggled when dynamic objects were not filtered out and when the equipage was moving at high speed along a straight stretch.
Overall the results are promising, but further development needs to be conducted to ensure robustness and conclude whether data collected by the investigated helmet camera configuration is suitable for equine gait analysis.
@mastersthesis{diva2:1729598,
author = {Hult, Evelina},
title = {{Toward Equine Gait Analysis:
Semantic Segmentation and 3D Reconstruction}},
school = {Linköping University},
type = {{LiTH-ISY-EX--23/5539--SE}},
year = {2023},
address = {Sweden},
}
This master thesis project was done together with Saab Dynamics in Linköping during the spring of 2022 and aims to perform an online IMU-camera calibration using an AprilTag board. Experiments are conducted on two different types of datasets, the public dataset Euroc and internal datasets from Saab. The calibration is done iteratively by solving a series of nonlinear optimization problems without any initial knowledge of the sensor configuration. The method is largely based on work by Huang and collaborators. Besides finding the transformation between the IMU and the camera, the biases in the IMU and the time delay between the two sensors are also explored. By comparing the resulting transformation with Kalibr, the current state-of-the-art offline calibration toolbox, it is possible to conclude that the model can find and correct for the biases in the gyroscope. It is therefore important to include these biases in the model. The model is able to roughly find the time shift between the two sensors but has more difficulties correcting for it. The thesis also aims to explore ways of compiling a good dataset for calibration. Results show that it is desirable to avoid rapid movements as well as images gathered at distances from the AprilTag board that vary a lot. Also, a shorter exposure time is useful to avoid losing AprilTag detections.
@mastersthesis{diva2:1701458,
author = {Karlhede, Arvid},
title = {{Online Camera-IMU Calibration}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5524--SE}},
year = {2022},
address = {Sweden},
}
With advancements in space technology, remote sensing applications, and computer vision, significant improvements in the data describing our planet are seen today. Researchers want to gather different kinds of data and perform data fusion techniques between them to increase our understanding of the world. Two such data types are Electro-Optical images and Synthetic Aperture Radar images. For data fusion, the images need to be accurately aligned. Researchers have investigated methods for robustly and accurately registering these images for many years. However, recent advancements in imaging systems have made the problem more complex than ever.
Currently, the imaging satellites that capture information around the globe have achieved a resolution of less than a meter per pixel. There is an increase in signal complexity for high-resolution SAR images due to how the imaging system operates. Interference between waves gives rise to speckled noise and geometric distortions, making the images very difficult to interpret. This directly affects the image registration accuracy.
In this thesis, the complexity of the problem regarding registration between SAR and EO data was described, and methods for registering the images were investigated. The methods were feature- and area-based. The feature-based method used a KAZE filter and SURF descriptor. The method found many key points but few correct correspondences. The area-based methods used FFT and MI, respectively. FFT was deemed best for higher quality images, whereas MI better dealt with the non-linear intensity difference. More complex techniques, such as dense neural networks, were excluded. No method achieved satisfying results on the entire data set, but the area-based methods accomplished complementary results.
A conclusion was drawn that the distortions in the SAR images are too significant to register accurately using only CV algorithms. Since the area-based methods achieved good results on images excluding significant distortions, future work should focus on solving the geometrical errors and increasing the registration accuracy.
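As background for the area-based FFT method mentioned above, a minimal translation-only phase-correlation sketch is given below (assuming two equally sized single-channel images); the MI-based alternative would instead optimise mutual information between intensities, and neither snippet reflects the exact thesis implementation.

import numpy as np

def phase_correlation_shift(img_a, img_b):
    """Estimate the integer translation between two equally sized single-channel
    images using FFT-based phase correlation (an area-based registration method)."""
    fa, fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
    cross_power = fa * np.conj(fb)
    cross_power /= np.abs(cross_power) + 1e-12            # keep only phase information
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)  # correlation peak location
    dims = np.array(corr.shape, dtype=float)
    shifts = np.array(peak, dtype=float)
    shifts[shifts > dims / 2] -= dims[shifts > dims / 2]  # wrap to signed displacements
    return shifts                                         # (row shift, column shift)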
@mastersthesis{diva2:1682316,
author = {Hansson, Niclas},
title = {{Investigation of Registration Methods for High Resolution SAR-EO Imagery}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5506--SE}},
year = {2022},
address = {Sweden},
}
In recent years, pictures from handheld devices such as smartphones have been increasingly utilized as a documentation tool by medical practitioners not trained to take professional photographs. Similarly to the other types of image modalities, the images should be taken in a way to capture the vital information in the region of interest. Nevertheless, image capturing cannot always be done as desired, so images may exhibit different blur types at the region of interest. Having blurry images does not serve medical purposes, therefore, the patients might have to schedule a second appointment several days later to retake the images. A solution to this problem is to create an algorithm which immediately after capturing an image determines if it is medically useful and notifies the user of the result. The algorithm needs to perform the analysis at a reasonable speed, and at best, with a limited number of operations to make the calculations directly in the smartphone device. A large number of medical images must be available to create such an algorithm. Medical images are difficult to acquire, and it is specifically difficult to acquire blurry images since they are usually deleted.
The main objective of this thesis is to determine the medical usefulness of images taken with smartphone cameras, using both machine learning and handcrafted algorithms, with a low number of floating point operations and a high performance. Seven different algorithms (one hand-crafted and six machine learned) are created and compared regarding both number of floating point operations and performance. Fast Walsh-Hadamard transforms are the basis of the hand-crafted algorithm. The employed machine learning algorithms are both based on common convolutional neural networks (MobileNetV3 and ResNet50) and on our own designs. The issue with the low number of medical images acquired is solved by training the machine learning models on a synthetic dataset, where the non-medically useful images are generated by applying blur on the medically useful images. These models do, however, undergo evaluation using a real dataset, containing medically useful images as well as non-medically useful images.
Our results indicate that a real-time determination of the medical usefulness of images is possible on handheld devices, since our machine-learned model DeepLAD-Net reaches the highest accuracy with 42 · 10^6 floating point operations. In terms of accuracy, MobileNetV3-large is the second best model, with 31 times as many floating point operations as our best model.
@mastersthesis{diva2:1670428,
author = {Zahra, Hasseli and Raamen, Anwia Odisho},
title = {{Automatic Quality Assessment of Dermatology Images:
A Comparison Between Machine Learning and Hand-Crafted Algorithms}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5486--SE}},
year = {2022},
address = {Sweden},
}
The development of autonomous driving systems has been one of the most popular research areas in the 21st century. One key component of these kinds of systems is the ability to perceive and comprehend the physical world. Two techniques that address this are object detection and semantic segmentation. During the last decade, CNN based models have dominated these types of tasks. However, in 2021, transformer based networks were able to outperform the existing CNN approach, therefore, indicating a paradigm shift in the domain. This thesis aims to explore the use of a vision transformer, particularly a Swin Transformer, in an object detection and semantic segmentation framework, and compare it to a classical CNN on road scenes. In addition, since real-time execution is crucial for autonomous driving systems, the possibility of a parameter reduction of the transformer based network is investigated. The results appear to be advantageous for the Swin Transformer compared to the convolutional based network, considering both object detection and semantic segmentation. Furthermore, the analysis indicates that it is possible to reduce the computational complexity while retaining the performance.
@mastersthesis{diva2:1678704,
author = {Hardebro, Mikaela and Jirskog, Elin},
title = {{Transformer Based Object Detection and Semantic Segmentation for Autonomous Driving}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5487--SE}},
year = {2022},
address = {Sweden},
}
In recent years, the EU has observed a decrease in the stocks of certain fish species due to unrestricted fishing. To combat the problem, many fisheries are investigating how to automatically estimate the catch size and composition using sensors onboard the vessels. Yet, measuring the size of fish in marine imagery is a difficult task. The images generally suffer from complex conditions caused by cluttered fish, motion blur and dirty sensors.
In this thesis, we propose a novel method for automatic measurement of fish size that can enable measuring both visible and occluded fish. We use a Mask R-CNN to segment the visible regions of the fish, and then fill in the shape of the occluded fish using a U-Net. We train the U-Net to perform shape completion in a semi-supervised manner, by simulating occlusions on an open-source fish dataset. In contrast to previous shape completion work, we teach the U-Net when to fill in the shape and when not to, by including a small portion of fully visible fish in the input training data.
Our results show that our proposed method succeeds in filling in the shape of the synthetically occluded fish as well as of some of the cluttered fish in real marine imagery. We achieve an mIoU score of 93.9 % on 1 000 synthetic test images and present qualitative results on real images captured onboard a fishing vessel. The qualitative results show that the U-Net can fill in the shapes of lightly occluded fish, but struggles when the tail fin is hidden and only parts of the fish body are visible. This task is difficult even for a human, and the performance could perhaps be increased by including the fish appearance in the shape completion task. The simulation-to-reality gap could perhaps also be reduced by fine-tuning the U-Net on some real occlusions, which could increase the performance on the heavy occlusions in the real marine imagery.
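The occlusion simulation used for the semi-supervised training could, in its simplest form, look like the sketch below, which cuts a random rectangle out of a full fish mask; the rectangular cut-out is a simplifying assumption, and the thesis additionally mixes in fully visible fish so the network also learns when not to fill in.

import numpy as np

def simulate_occlusion(mask, max_cut=0.5, rng=None):
    """Given a full binary fish mask (H, W), remove a random rectangular region to
    mimic occlusion by neighbouring fish. Returns (occluded input, full target)."""
    rng = rng or np.random.default_rng()
    h, w = mask.shape
    cut_h = int(h * rng.uniform(0.2, max_cut))
    cut_w = int(w * rng.uniform(0.2, max_cut))
    top = rng.integers(0, h - cut_h + 1)
    left = rng.integers(0, w - cut_w + 1)
    occluded = mask.copy()
    occluded[top:top + cut_h, left:left + cut_w] = 0
    return occluded, mask    # U-Net input and its shape-completion target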
@mastersthesis{diva2:1677704,
author = {Gustafsson, Stina},
title = {{Learning to Measure Invisible Fish}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5517--SE}},
year = {2022},
address = {Sweden},
}
Object tracking can be done in numerous ways, where the goal is to track a target through all frames in a sequence. The ground truth bounding box is used to initialize the object tracking algorithm. Object tracking can be carried out on infrared imagery suitable for military applications to execute tracking even without illumination. Objects, such as aircraft, can deploy countermeasures to impede tracking. The countermeasures most often mainly impact one wavelength band. Therefore, using two different wavelength bands for object tracking can counteract the impact of the countermeasures. The dataset was created from simulations. The countermeasures applied to the dataset are flares and Directional Infrared Countermeasures (DIRCMs).
Different object tracking algorithms exist, and many are based on discriminative correlation filters (DCF). The thesis investigated the DCF-based trackers STRCF and ECO on the created dataset. The STRCF and the ECO trackers were analyzed using one and two wavelength bands. The following features were investigated for both trackers: grayscale, Histogram of Oriented Gradients (HOG), and pre-trained deep features.
The results indicated that the STRCF and the ECO trackers using two wavelength bands instead of one improved performance on sequences with countermeasures. The use of HOG, deep features, or a combination of both improved the performance of the STRCF tracker using two wavelength bands. Likewise, the performance of the ECO tracker using two wavelength bands was improved by the use of deep features. However, the negative aspect of using two wavelength bands and introducing more features is that it resulted in a lower frame rate.
@mastersthesis{diva2:1676100,
author = {Modorato, Sara},
title = {{Tracking Under Countermeasures Using Infrared Imagery}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5473--SE}},
year = {2022},
address = {Sweden},
}
Unmanned aerial vehicles (UAVs) with high-resolution cameras are common in today’s society. Industries, such as the forestry industry, use drones to get a fast overview of tree populations. More advanced sensors, such as near-infrared light or depth data, can increase the amount of information that UAV images provide about the forest, such as tree quantity or forest health. However, the fast-expanding field of deep learning could help expand the information acquired using only RGB cameras. Three deep learning models, Faster R-CNN, RetinaNet, and YOLOR, were compared to investigate this. It was also investigated whether initializing the models using transfer learning from the MS COCO dataset could increase the performance of the models. The dataset used was Swedish Forest Agency (2021): Forest Damages-Spruce Bark Beetle 1.0 National Forest Data Lab, together with drone images provided by IT-Bolaget Per & Per. The deep learning models were to detect five different tree species: spruce, pine, birch, aspen, and others. The results show potential for the usage of deep learning to detect tree species in images from UAVs.
@mastersthesis{diva2:1676909,
author = {Sievers, Olle},
title = {{CNN-Based Methods for Tree Species Detection in UAV Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5502--SE}},
year = {2022},
address = {Sweden},
}
Estimation of forest parameters using remote sensing information could streamline the forest industry from a time and economic perspective. This thesis utilizes object detection and semantic segmentation to detect and classify individual trees from images over 3D models reconstructed from satellite images. This thesis investigated two methods that showed different strengths in detecting and classifying trees in deciduous, evergreen, or mixed forests. These methods are not just valuable for forest inventory but can be greatly useful for telecommunication companies and in defense and intelligence applications. This thesis also presents methods for estimating tree volume and estimating tree growth in 3D models. The results from the methods show the potential to be used in forest management. Finally, this thesis shows several benefits of managing a digitalized forest, economically, environmentally, and socially.
@mastersthesis{diva2:1673885,
author = {Dahm\'{e}n, Gustav and Strand, Erica},
title = {{Forest Growth And Volume Estimation Using Machine Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5508--SE}},
year = {2022},
address = {Sweden},
}
Ceramic materials contain several defects, one of which is porosity. At the time of writing, porosity measurement is a manual and time-consuming process performed by a human operator. With advances in deep learning for computer vision, this thesis explores to what degree convolutional neural networks and semantic segmentation can reliably measure porosity from microscope images. Combining classical image processing techniques with deep learning, images were automatically labeled and then used for training semantic segmentation neural networks leveraging transfer learning. Deep learning-based methods were more robust and could more reliably identify porosity in a larger variety of images than solely relying on classical image processing techniques.
@mastersthesis{diva2:1674176,
author = {Isaksson, Filip},
title = {{Measuring Porosity in Ceramic Coating using Convolutional Neural Networks and Semantic Segmentation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5490--SE}},
year = {2022},
address = {Sweden},
}
With the increasing demand for labeled data in machine learning for visual perception tasks, the interest in using synthetically generated data has grown. Due to the existence of a domain gap between synthetic and real data, strategies in domain adaptation are necessary to achieve high performance with models trained on synthetic or mixed data.
With a dataset of synthetically blocked fish-eye lenses in traffic environments, we explore different strategies to train a neural network. The neural network is a binary classifier for full blockage detection. The different strategies tested are data mixing, fine-tuning, domain adversarial training, and adversarial discriminative domain adaptation. Different ratios between synthetically generated data and real data are also tested. Our experiments showed that fine-tuning had slightly superior results in this test environment. To fully take advantage of the domain adversarial training, training until domain indiscriminate features are learned is necessary and helps the model attain higher performance than using random data mixing.
@mastersthesis{diva2:1671549,
author = {Tran, Hoang},
title = {{Learning with Synthetically Blocked Images for Sensor Blockage Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5509--SE}},
year = {2022},
address = {Sweden},
}
An autonomous vehicle is a complex system that requires a good perception of the surrounding environment to operate safely. One part of that is multiple object tracking, which is an essential component in camera-based perception whose responsibility is to estimate object motion from a sequence of images. This requires an association problem to be solved where newly estimated object positions are mapped to previously predicted trajectories, for which different solution strategies exist.
In this work, a multiple hypothesis tracking algorithm is implemented. The purpose is to demonstrate that measurement associations are improved compared to less compute-intensive alternatives. It was shown that the implemented algorithm performed 13 percent better than an intersection over union tracker when evaluated using a standard evaluation metric.
Furthermore, this work also investigates the usage of abstraction layers to accelerate time-critical parallel operations on the GPU. It was found that the execution time of the tracking algorithm could be reduced by 42 percent by replacing four functions with implementations written in the purely functional array language Futhark. Finally, it was shown that a GPU code abstraction layer can reduce the knowledge barrier required to write efficient CUDA kernels.
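For context, the simpler intersection-over-union tracker used as a comparison point solves a one-to-one association between existing tracks and new detections, as in the minimal sketch below; the multiple hypothesis tracker instead keeps several association hypotheses alive across frames. The box format and threshold are illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(tracks, detections, min_iou=0.3):
    """Single-hypothesis baseline: one-to-one matching that maximises total IoU."""
    if not tracks or not detections:
        return []
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - min_iou]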
@mastersthesis{diva2:1670800,
author = {Nolkrantz, Marcus},
title = {{Efficient multiple hypothesis tracking using a purely functional array language}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5482--SE}},
year = {2022},
address = {Sweden},
}
Lens distortions appear in almost all digital images and cause straight lines to appear curved in the image. This can contribute to errors in position estimations and 3D reconstruction and it is therefore of interest to correct for the distortion. If the camera is available, the distortion parameters can be obtained when calibrating the camera. However, when the camera is unavailable the distortion parameters can not be found with the standard camera calibration technique and other approaches must be used. Recently, variants of Perspective-n-Point (PnP) extended with lens distortion and focal length parameters have been proposed. Given a set of 2D-3D point correspondences, the PnP-based methods can estimate distortion parameters without the camera being available or with modified settings. In this thesis, the performance of PnP-based methods is compared to Zhang’s camera calibration method. The methods are compared both quantitatively, using the errors in reprojection and distortion parameters, and qualitatively by comparing images before and after lens distortion correction. A test set for the comparison was obtained from a camera and a 3D laser scanner of an indoor scene. The results indicate that one of the PnP-based models can achieve a similar reprojection error as the baseline method for one of the cameras. It could also be seen that two PnP-based models could reduce lens distortion when visually comparing the test images to the baseline. Moreover, it was noted that a model can have a small reprojection error even though the distortion coefficient error is large and the lens distortion is not completely removed. This indicates that it is important to include both quantitative measures, such as reprojection error and distortion coefficient errors, as well as qualitative results when comparing lens distortion correction methods. It could also be seen that PnP-based models with more parameters in the estimation are more sensitive to noise.
@mastersthesis{diva2:1670770,
author = {Olsson, Emily},
title = {{Lens Distortion Correction Without Camera Access}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5476--SE}},
year = {2022},
address = {Sweden},
}
Automatic detection of weeds could be used for more efficient weed control in agriculture. In this master thesis, weed detectors have been trained and examined on data collected by RISE to investigate whether an accurate weed detector could be trained on the collected data. When only using annotations of the weed class Creeping thistle for training and evaluation, a detector achieved a mAP of 0.33. When using four classes of weed, a detector was trained to a mAP of 0.07. The performance was worse than in a previous study also dealing with weed detection. Hypotheses for why the performance was lacking were examined. Experiments indicated that the problem could not fully be explained by the model being underfitted, nor by the objects’ backgrounds being too similar to the foreground, nor by the quality of the annotations being too low. The performance was better when training the model with as much data as possible than when only selected segments of the data were used.
@mastersthesis{diva2:1666845,
author = {Ahlqvist, Axel},
title = {{Examining Difficulties in Weed Detection}},
school = {Linköping University},
type = {{}},
year = {2022},
address = {Sweden},
}
Radiologists often have to look through many different patients and examinations in quick succession, and to aid the workflow the different types of images should be presented to the radiologist in the same manner and order for each new examination, thus decreasing the time needed for the radiologist to either find the correct image or rearrange the images to their liking. A step in this process requires a comparison between two images to be made, producing a score between 0 and 1 describing how similar the images are. A similar algorithm already exists at Sectra, but that algorithm only uses the metadata from the images without considering the actual pixel data.
The aim of this thesis was to explore different methods of doing the same comparison as the previous algorithm but using only the pixel data. Considering only 3D volumes from CT examinations of the abdomen and thorax region, this thesis explores the possibility of using SSIM, SIFT, and SIFT together with a histogram comparison using the Bhattacharyya distance for this task. It was deemed very important that the ranking produced when ordering the images in terms of similarity to one reference image followed a specific order. This order was determined by consulting personnel at Sectra who work closely with the clinical side of radiology.
SSIM was able to differentiate between different plane orientations, since they usually had large resolution differences in each direction, but it could not be made to follow the desired ranking and was thus disregarded as a reliable option for this problem. The method using SIFT followed the desired ranking better, but struggled a lot with differentiating between the different contrast phases. A histogram component was also added to this method, which increased the accuracy and improved the ranking. However, further development is still needed for this method to be a reliable option that could be used in a clinical setting.
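A combined SIFT-plus-histogram similarity of the kind described could be sketched with OpenCV as below, for two grayscale uint8 images; the match-ratio construction and the weighting between the two components are illustrative assumptions rather than the thesis's exact scoring.

import cv2

def similarity_score(img_a, img_b, w_hist=0.5):
    """Combine a SIFT match ratio with a Bhattacharyya histogram comparison into a
    single 0-1 similarity score for two grayscale images."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        sift_score = 0.0
    else:
        matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
        good = [m[0] for m in matches
                if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]  # Lowe's ratio test
        sift_score = len(good) / max(1, min(len(kp_a), len(kp_b)))
    hist_a = cv2.calcHist([img_a], [0], None, [64], [0, 256])
    hist_b = cv2.calcHist([img_b], [0], None, [64], [0, 256])
    cv2.normalize(hist_a, hist_a)
    cv2.normalize(hist_b, hist_b)
    hist_score = 1.0 - cv2.compareHist(hist_a, hist_b, cv2.HISTCMP_BHATTACHARYYA)
    return w_hist * hist_score + (1.0 - w_hist) * min(1.0, sift_score)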
@mastersthesis{diva2:1665838,
author = {Castenbrandt, Felicia},
title = {{Image Similarity Scoring for Medical Images in 3D}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5484--SE}},
year = {2022},
address = {Sweden},
}
Deep learning has shown to be successful on the task of semantic segmentation of three-dimensional (3D) point clouds, which has many interesting use cases in areas such as autonomous driving and defense applications. A common type of sensor used for collecting 3D point cloud data is Light Detection and Ranging (LiDAR) sensors. In this thesis, a time-correlated single-photon counting (TCSPC) LiDAR is used, which produces very accurate measurements over long distances up to several kilometers. The dataset collected by the TCSPC LiDAR used in the thesis contains two classes, person and other, and it comes with several challenges due to it being limited in terms of size and variation, as well as being extremely class imbalanced. The thesis aims to identify, analyze, and evaluate state-of-the-art deep learning models for semantic segmentation of point clouds produced by the TCSPC sensor. This is achieved by investigating different loss functions, data variations, and data augmentation techniques for a selected state-of-the-art deep learning architecture. The results showed that loss functions tailored for extremely imbalanced datasets performed the best with regard to the metric mean intersection over union (mIoU). Furthermore, an improvement in mIoU could be observed when some combinations of data augmentation techniques were employed. In general, the performance of the models varied heavily, with some achieving promising results and others achieving much worse results.
@mastersthesis{diva2:1667072,
author = {Süsskind, Caspian},
title = {{Deep Learning Semantic Segmentation of 3D Point Cloud Data from a Photon Counting LiDAR}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5467--SE}},
year = {2022},
address = {Sweden},
}
When a camera system in a car is mounted behind the windshield, light rays will be refracted by the windshield. The distortion can be significant, especially for wide field-of-view cameras. Traditional approaches handle the windshield distortion along with the calibration that calculates the intrinsic and extrinsic parameters. However, these approaches do not handle the windshield distortion explicitly, and understanding the image formation requires understanding more about the windshield distortion effect. In this thesis, data is collected from a camera system viewed with and without the windshield. The windshield distortion effect has been studied by varying the windshield’s tilt and the camera’s setup. Points are then found in both images and matched. From this, a distortion difference is calculated and analyzed. Next, a preliminary model of the windshield distortion effect is presented and evaluated. The results show that the model works well for all cases and the two windshields considered in this thesis.
@mastersthesis{diva2:1638117,
author = {Luong, Therese},
title = {{Windshield Distortion Modelling}},
school = {Linköping University},
type = {{LiTH-ISY-EX--22/5455--SE}},
year = {2022},
address = {Sweden},
}
Being able to train machine learning models on simulated data can be of great interest in several applications, one of them being for autonomous driving of cars. The reason is that it is easier to collect large labeled datasets as well as performing reinforcement learning in simulations. However, transferring these learned models to the real-world environment can be hard due to differences between the simulation and the reality; for example, differences in material, textures, lighting and content. One approach is to use domain adaptation, by making the simulations as similar as possible to the reality. The thesis's main focus is to investigate domain adaptation as a way to meet the reality-gap, and also compare it to an alternative method, domain randomization.
Two different methods of domain adaptation; one adapting the simulated data to reality, and the other adapting the test data to simulation, are compared to using domain randomization. These are evaluated with a classifier making decisions for a robot car while driving in reality. The evaluation consists of a quantitative evaluation on real-world data and a qualitative evaluation aiming to observe how well the robot is driving and avoiding obstacles. The results show that the reality-gap is very large and that the examined methods reduce it, with the two using domain adaptation resulting in the largest decrease. However, none of them led to satisfactory driving.
@mastersthesis{diva2:1624770,
author = {Forsberg, Fanny},
title = {{Domain Adaptation to Meet the Reality-Gap from Simulation to Reality}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5453--SE}},
year = {2022},
address = {Sweden},
}
This thesis investigates methods for automatic colour transfer when working with geodata and possible metrics to evaluate the results. Several methods for colour transfer, as well as methods to create an objective measurement, were tested. The method was evaluated using a subjective score generated by surveying eight people working with geodata. In the survey, the participants were asked to “Rank the images from most similar to least similar, with what you imagine the result would have been if you would have made the colour transfer manually”. The method with the best overall performance in this study was colour transfer in the CIElαβ colour space. This method was only matched by a method that first segments the image based on colour information, as that method had the highest average subjective score but a larger standard deviation than the other methods. This was suspected to be largely due to the deviation in quality of the segmentation algorithm. Using a different method for segmenting the image, this method might perform even better. The objective measurements proposed in this study were not found to have a consistent correlation with the subjective measurement, with the exception of gradient structural similarity. Other methods could have a use in some cases but not as a general colour transfer objective measurement, though a larger study and more data would be needed to confirm the findings.
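The statistics-transfer family of colour transfer methods referred to above can be sketched as below, using OpenCV's Lab conversion as a stand-in for whichever decorrelated colour space the thesis used; the snippet is a generic Reinhard-style illustration, not the thesis implementation.

import cv2
import numpy as np

def colour_transfer_lab(source, reference):
    """Statistics transfer: shift and scale each Lab channel of the source image to
    match the mean and standard deviation of the reference image."""
    src = cv2.cvtColor(source, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(reference, cv2.COLOR_BGR2LAB).astype(np.float32)
    src_mean, src_std = src.reshape(-1, 3).mean(0), src.reshape(-1, 3).std(0) + 1e-6
    ref_mean, ref_std = ref.reshape(-1, 3).mean(0), ref.reshape(-1, 3).std(0)
    result = (src - src_mean) / src_std * ref_std + ref_mean
    return cv2.cvtColor(np.clip(result, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)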
@mastersthesis{diva2:1601738,
author = {Ågren, Anton},
title = {{Automatic Colour Transfer for Geodata}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5378--SE}},
year = {2021},
address = {Sweden},
}
In this thesis, three well-known self-supervised methods have been implemented and trained on road scene images. The three so-called pretext tasks RotNet, MoCov2, and DeepCluster were used to train a neural network in a self-supervised manner. The self-supervised pre-trained networks were then evaluated with different amounts of labeled data on two downstream tasks, object detection and semantic segmentation. The performance of the self-supervised methods is compared to networks trained from scratch on the respective downstream task. The results show that it is possible to achieve a performance increase using self-supervision on a dataset containing road scene images only. When only a small amount of labeled data is available, the performance increase can be substantial, e.g., an mIoU increase from 33 to 39 when training semantic segmentation on 1750 images with a RotNet pre-trained backbone compared to training from scratch. However, it seems that when a large amount of labeled images is available (>70000 images), the self-supervised pretraining does not increase the performance as much, or at all.
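Of the three pretext tasks, RotNet has the simplest data preparation: every image is rotated by a multiple of 90 degrees and the network predicts which rotation was applied. A minimal PyTorch sketch of that step is shown below (MoCo v2 and DeepCluster use different pretext objectives and are not covered here).

import torch

def rotnet_batch(images):
    """RotNet pretext task: rotate each image by 0, 90, 180 and 270 degrees and let
    the network predict the rotation class (0-3). images: (N, C, H, W) tensor."""
    rotated = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    inputs = torch.cat(rotated, dim=0)
    labels = torch.arange(4).repeat_interleave(images.shape[0])
    return inputs, labels    # feed to any classifier backbone with a cross-entropy loss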
@mastersthesis{diva2:1608285,
author = {Gustavsson, Simon},
title = {{Object Detection and Semantic Segmentation Using Self-Supervised Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5357--SE}},
year = {2021},
address = {Sweden},
}
Point set registration is a well-researched yet still not very exploited area in computer vision. As the field of machine learning grows, the possibilities of application expand. This thesis investigates the possibility of expanding an already implemented probabilistic machine learning approach to point set registration to more complex, larger datasets gathered in a forest environment. The system used as a starting point was created by Järemo Lawin et al. [10]. The aim of the thesis was to investigate the possibility of registering the forest data with the existing system, without ground-truth poses, with different optimizers, and to implement a SLAM pipeline. Also, older methods were used as a benchmark for evaluation, more specifically iterative closest point (ICP) and fast global registration (FGR). To enable the gathered data to be processed by the registration algorithms, preprocessing was required, transforming the data points from the coordinate system of the sensor to world-relative coordinates via LiDAR base coordinates. Subsequently, the registration was performed with different approaches. Both the KITTI odometry dataset, with which RLLReg was originally evaluated [10], and the gathered forest data were used. Data augmentation was utilized to enable ground-truth-independent training and to increase diversity in the data. In addition, the registration results were used to create a SLAM pipeline, enabling mapping and localization in the scanned areas. The results showed great potential for using RLLReg to register forest scenes compared to other, older approaches. In particular, the lack of ground truth was manageable using data augmentation to create training data. Moreover, there was no evidence that AdaBound improves the system when replacing the Adam optimizer. Finally, forest models with plotted sensor paths were generated with decent results, although there is potential for further refinement through post-processing. Nevertheless, the possibility of point set registration and LiDAR-SLAM using machine learning has been confirmed.
@mastersthesis{diva2:1612438,
author = {Hjert, Anton},
title = {{Machine Learning for LiDAR-SLAM:
In Forest Terrains}},
school = {Linköping University},
type = {{}},
year = {2021},
address = {Sweden},
}
Hyperspectral imaging based on the use of an exponentially variable filter gives the possibility to construct a lightweight hyperspectral sensor. The exponentially variable filter captures the whole spectral range in each image where each column captures a different wavelength. To gather the full spectrum for any given point in the image requires the fusion of several gathered images with movement in between captures. The construction of a hyperspectral cube requires registration of the gathered images. With a lightweight sensor comes the possibility to mount the hyperspectral sensor on an unmanned aerial vehicle to collect aerial footage. This thesis presents a registration algorithm capable of constructing a complete hyperspectral cube of almost any chosen area in the captured region. The thesis presents the result of a construction method using a multi-frame super-resolution algorithm trying to increase the spectral resolution and a spline interpolation method interpolating missing spectral data. The result of an algorithm trying to suggest the optimal spectral and spatial resolution before constructing the hyperspectral cube is also presented. Lastly, the result of an algorithm providing information about the quality of the constructed hyperspectral cube is also presented.
@mastersthesis{diva2:1596253,
author = {Freij, Hannes},
title = {{Hyperspectral Image Registration and Construction From Irregularly Sampled Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5408--SE}},
year = {2021},
address = {Sweden},
}
In Storytel’s application, in which a user can read and listen to digitalized literature, a user is displayed a list of books where the first thing the user encounters is the book title and cover. A book cover is therefore essential to attract a consumer’s attention. In this study, we take a data-driven approach to investigating the design principles for book covers through deep learning models and explainable AI. The first aim is to explore how well a Convolutional Neural Network (CNN) can interpret and classify a book cover image according to its genre in a multi-class classification task. The second aim is to increase model interpretability and investigate model feature-to-genre correlations. With the help of the explanatory artificial intelligence method Gradient-weighted Class Activation Mapping (Grad-CAM), we analyze the pixel-wise contribution to the model prediction. In addition, object detection by YOLOv3 was implemented to investigate which objects are detectable and reoccurring in the book covers. An interplay between Grad-CAM and YOLOv3 was used to investigate how identified objects and features correlate to a specific book genre and ultimately answer what makes a good book cover. Using a state-of-the-art CNN model architecture, we achieve an accuracy of 48%, with the best class-wise accuracies for the genres Erotica, Economy & Business and Children with accuracies 73%, 67% and 66%. Quantitative results from the Grad-CAM and YOLOv3 interplay show some strong associations between objects and genres, while indicating weak associations between abstract design principles and genres. Furthermore, a qualitative analysis of Grad-CAM visualizations shows strong relevance of certain objects and text fonts for specific book genres. It was also observed that the portrayal of a feature was relevant for the model prediction of certain genres.
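The Grad-CAM computation referred to above can be summarized in a short sketch: activations of a chosen layer are weighted by the spatially averaged gradients of the class score, followed by a ReLU and normalisation. The model, target layer and tensor shapes below are placeholders, and the snippet is a generic illustration rather than the thesis code.

import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image, class_idx):
    """Minimal Grad-CAM heatmap for a single (C, H, W) image and a target class."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))
    try:
        model.zero_grad()
        score = model(image.unsqueeze(0))[0, class_idx]
        score.backward()
        weights = grads['v'].mean(dim=(2, 3), keepdim=True)            # channel importances
        cam = F.relu((weights * acts['v']).sum(dim=1, keepdim=True))   # weighted activations
        cam = F.interpolate(cam, size=image.shape[1:], mode='bilinear', align_corners=False)
        return ((cam - cam.min()) / (cam.max() - cam.min() + 1e-8))[0, 0]  # (H, W) in [0, 1]
    finally:
        h1.remove()
        h2.remove()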
@mastersthesis{diva2:1576364,
author = {Velander, Alice and Gumpert Harrysson, David},
title = {{Do Judge a Book by its Cover!
Predicting the genre of book covers using supervised deep learning. Analyzing the model predictions using explanatory artificial intelligence methods and techniques.}},
school = {Linköping University},
type = {{}},
year = {2021},
address = {Sweden},
}
Hyperspectral imaging is an expanding topic within the field of computer vision that uses images of high spectral granularity. Contrastive learning is a discriminative approach to self-supervised learning, a form of unsupervised learning where the network is trained using self-created pseudo-labels. This work combines these two research areas and investigates how a pretrained network based on contrastive learning can be used for hyperspectral images. The hyperspectral images used in this work are generated from simulated RGB images and spectra from a spectral library. The network is trained with a pretext task based on data augmentations, and is evaluated through transfer learning and fine-tuning for a downstream task. The goal is to determine the impact of the pretext task on the downstream task and to determine the required amount of labelled data. The results show that the downstream task (a classifier) based on the pretrained network barely performs better than a classifier without a pretrained network. In the end, more research needs to be done to confirm or reject the benefit of a pretrained network based on contrastive learning for hyperspectral images. Also, the pretrained network should be tested on real-world hyperspectral data and trained with a pretext task designed for hyperspectral images.
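A common instantiation of an augmentation-based contrastive pretext task is the NT-Xent loss, sketched below in PyTorch for two augmented views of the same batch of (hyperspectral) images; the temperature value and batch conventions are illustrative assumptions and the thesis may use a different contrastive formulation.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive (NT-Xent) loss: z1, z2 are (N, D) embeddings of two augmented views
    of the same batch; matching rows are positives, all other rows are negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2N, D)
    sim = z @ z.t() / temperature                           # cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))              # remove self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)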
@mastersthesis{diva2:1593358,
author = {Syr\'{e}n Grönfelt, Natalie},
title = {{Pretraining a Neural Network for Hyperspectral Images Using Self-Supervised Contrastive Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5382--SE}},
year = {2021},
address = {Sweden},
}
In the glass wool industry, the molten glass flow is monitored for regulation purposes. Given the progress in the computer vision field, the current monitoring solution might be replaced by a camera based solution. The aim of this thesis is to investigate the possibility of using optical flow techniques for estimation of the molten glass flow displacement.
Three glass melt flow datasets were recorded, as well as two additional melt flow datasets, using a NIR camera. The block matching techniques Full Search (FS) and Adaptive Rood Pattern Search (ARPS), as well as the local feature methods ORB and A-KAZE were considered. These four techniques were compared to RAFT, the state-of-the-art approach for optical flow estimation, using available pre-trained models, as well as an approach of using the tracking method ECO for the optical flow estimation.
The methods have been evaluated using the metrics MAE, MSE, and SSIM to compare the warped flow to the target image. In addition, ground truth for 50 frames from each dataset was manually annotated so that the optical flow metric End-Point Error could be used. To investigate the computational complexity, the average computational time per frame was calculated.
The investigation found that RAFT does not perform well on the given data, due to the large displacements of the flows. For simulated displacements of up to about 100 pixels at full resolution, the performance is satisfactory, with results comparable to the traditional methods.
Using ECO for optical flow estimation encounters similar problems as RAFT, where the large displacement proved challenging for the tracker. Simulating smaller motions of up to 60 pixels resulted in good performance, though computation time of the used implementation is much too high for a real-time implementation.
The four traditional block matching and local feature approaches examined in this thesis outperform the state-of-the-art approaches. FS, ARPS, A-KAZE, and ORB all have similar performance on the glass flow datasets, whereas the block matching approaches fail on the alternative melt flow data as the template extraction approach is inadequate. The two local feature approaches, though working reasonably well on all datasets given full resolution, struggle to identify features on down-sampled data. This might be mitigated by fine-tuning the settings of the methods. Generally, ORB mostly outperforms A-KAZE with respect to the evaluation metrics, and is considerably faster.
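The End-Point Error metric used in the evaluation is the per-pixel Euclidean distance between estimated and ground-truth flow vectors, averaged over the (annotated) pixels; a minimal sketch, assuming (H, W, 2) flow arrays and an optional validity mask, is shown below.

import numpy as np

def end_point_error(flow_est, flow_gt, valid=None):
    """Average End-Point Error between estimated and ground-truth flow fields (H, W, 2),
    optionally restricted to a boolean validity mask."""
    err = np.linalg.norm(flow_est - flow_gt, axis=-1)
    return err[valid].mean() if valid is not None else err.mean()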
@mastersthesis{diva2:1592777,
author = {Rudin, Malin},
title = {{Evaluation of Optical Flow for Estimation of Liquid Glass Flow Velocity}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5433--SE}},
year = {2021},
address = {Sweden},
}
When radiologists examine X-rays, it is crucial that they are aware of the laterality of the examined body part. The laterality refers to which side of the body that is considered, e.g. Left and Right. The consequences of a mistake based on information regarding the incorrect laterality could be disastrous. This thesis aims to address this problem by providing a deep neural network model that classifies X-rays based on their laterality.
X-ray images contain markers that are used to indicate the laterality of the image. In this thesis, both a classification model and a detection model have been trained to detect these markers and to identify the laterality. The models have been trained and evaluated on four body parts: knees, feet, hands and shoulders. The images can be divided into three laterality classes: Bilateral, Left and Right.
The model proposed in this thesis is a combination of two classification models: one for distinguishing between Bilateral and Unilateral images, and one for classifying Unilateral images as Left or Right. The latter utilizes the confidence of the predictions to categorize some of them as less accurate (Uncertain), which includes images where the marker is not visible or very hard to identify.
The model was able to correctly distinguish Bilateral from Unilateral images with an accuracy of 100.0 %. For the Unilateral images, 5.00 % were categorized as Uncertain, and 99.99 % of the remaining images were classified correctly as Left or Right.
@mastersthesis{diva2:1587188,
author = {Björn, Martin},
title = {{Laterality Classification of X-Ray Images:
Using Deep Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5417-SE}},
year = {2021},
address = {Sweden},
}
Deep learning methods for medical image segmentation are hindered by the lack of training data. This thesis aims to develop a method that overcomes this problem. A basic U-net trained on XCAT phantom data was tested first. The segmentation results were unsatisfactory even when artificial quantum noise was added. As a workaround, CycleGAN was used to add tissue textures to the XCAT phantom images by analyzing patient CT images. The generated images were used to train the network. The textures introduced by CycleGAN improved the segmentation, but some errors remained. The basic U-net was replaced with Attention U-net, which further improved the segmentation. More work is needed to fine-tune and thoroughly evaluate the method. The results obtained so far demonstrate the potential of this method for the segmentation of medical images. The proposed algorithms may be used in iterative image reconstruction algorithms in multi-energy computed tomography.
@mastersthesis{diva2:1584712,
author = {ZHAO, HANG},
title = {{Segmentation and synthesis of pelvic region CT images via neural networks trained on XCAT phantom data}},
school = {Linköping University},
type = {{}},
year = {2021},
address = {Sweden},
}
This thesis provides a comparison between instance segmentation methods using point clouds and depth images. Specifically, their performance on cluttered scenes of irregular objects in an industrial environment is investigated.
Recent work by Wang et al. [1] has suggested potential benefits of a point cloud representation when performing deep learning on data from 3D cameras. However, little work has been done to enable quantifiable comparisons between methods based on different representations, particularly on industrial data.
Generating synthetic data provides accurate grayscale, depth map, and point cloud representations for a large number of scenes and can thus be used to compare methods regardless of datatype. The datasets in this work are created using a tool provided by SICK. They simulate postal packages on a conveyor belt scanned by a LiDAR, closely resembling a common industry application. Two datasets are generated. One dataset has low complexity, containing only boxes. The other has higher complexity, containing a combination of boxes and multiple types of irregularly shaped parcels.
State-of-the-art instance segmentation methods are selected based on their performance on existing benchmarks. We chose PointGroup by Jiang et al. [2], which uses point clouds, and Mask R-CNN by He et al. [3], which uses images.
The results support that there may be benefits of using a point cloud representation over depth images. PointGroup performs better in terms of the chosen metric on both datasets. On low complexity scenes, the inference times are similar between the two methods tested. However, on higher complexity scenes, Mask R-CNN is significantly faster.
@mastersthesis{diva2:1584003,
author = {Konradsson, Albin and Bohman, Gustav},
title = {{3D Instance Segmentation of Cluttered Scenes:
A Comparative Study of 3D Data Representations}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5421--SE}},
year = {2021},
address = {Sweden},
}
This thesis investigates the possibility of utilizing data from multiple modalities to enable an automated recycling system to separate ferrous from non-ferrous debris. The two methods sensor fusion and hallucinogenic sensor fusion were implemented in a four-step approach of deep CNNs. Sensor fusion implies that multiple modalities are run simultaneously during the operation of the system. The individual outputs are then fused, and the joint performance is expected to be superior to having only one of the sensors. In hallucinogenic sensor fusion, the goal is to achieve the benefits of sensor fusion with respect to cost and complexity even when one of the modalities is removed from the system. This is achieved by leveraging data from a more complex modality onto a simpler one in a student/teacher approach. As a result, the teacher modality will train the student sensor to hallucinate features beyond its visual spectrum. Based on the results of a prestudy involving multiple types of modalities, a hyperspectral sensor was deployed as the teacher to complement a simple RGB camera. Three studies involving differently composed datasets were conducted to evaluate the effectiveness of the methods. The results show that the joint performance of a hyperspectral sensor and an RGB camera is superior to either sensor on its own. It can also be concluded that training a network with hyperspectral images can improve the classification accuracy when operating with only RGB data. However, the addition of a hyperspectral sensor might be considered superfluous, as this report shows that the standardized shapes of industrial debris enable a single RGB camera to achieve an accuracy above 90%. The material used in this thesis can also be concluded to be suboptimal for hyperspectral analysis. Compared to vegetation scenes, only a limited amount of additional data could be obtained by including wavelengths besides the ones representing red, green and blue.
@mastersthesis{diva2:1582328,
author = {Brundin, Sebastian and Gräns, Adam},
title = {{Efficient Recycling Of Non-Ferrous Materials Using Cross-Modal Knowledge Distillation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5403--SE}},
year = {2021},
address = {Sweden},
}
This master thesis studies the learning of dense feature descriptors where camera poses are the only supervisory signal. The use of camera poses as a supervisory signal has only been published once before, and this thesis expands on this previous work by utilizing a couple of different techniques meant to increase the robustness of the method, which is particularly important when not having access to ground-truth correspondences. Firstly, an adaptive robust loss is utilized to better differentiate inliers and outliers. Secondly, statistical properties during training are both enforced and adapted to, in an attempt to alleviate problems with uncertainties introduced by not having true correspondences available. These additions are shown to slightly increase performance, and also highlight some key ideas related to prediction certainty and robustness when working with camera poses as a supervisory signal. Finally, possible directions for future work are discussed.
@mastersthesis{diva2:1573398,
author = {Dahlqvist, Marcus},
title = {{Adaptive Losses for Camera Pose Supervision}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5422--SE}},
year = {2021},
address = {Sweden},
}
Image segmentation through neural networks and deep learning has, in the recent decade, become a successful tool for automated decision-making. For Luossavaara-Kiirunavaara Aktiebolag (LKAB), this means identifying the amount of slag inside a furnace through computer vision.
There are many prominent convolutional neural network architectures in the literature, and this thesis explores two: a modified U-Net and the PSPNet. The architectures were combined with three loss functions and three class weighting schemes resulting in 18 model configurations that were evaluated and compared. This thesis also explores transfer learning techniques for neural networks tasked with identifying slag in images from inside a furnace. The benefit of transfer learning is that the network can learn to find features from already labeled data of another context. Finally, the thesis explored how temporal information could be utilised by adding an LSTM layer to a model taking pairs of images as input, instead of one.
The results show (1) that the PSPNet outperformed the U-Net for all tested configurations in all relevant metrics, (2) that the model is able to find more complex features while converging quicker by using transfer learning, and (3) that utilising temporal information reduced the variance of the predictions, and that the modified PSPNet using an LSTM layer showed promise in handling images with outlying characteristics.
@mastersthesis{diva2:1572304,
author = {von Koch, Christian and Anz\'{e}n, William},
title = {{Detecting Slag Formation with Deep Learning Methods:
An experimental study of different deep learning image segmentation models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5427--SE}},
year = {2021},
address = {Sweden},
}
Detecting and outlining products in images is beneficial for many use cases in e-commerce, such as automatically identifying and locating products within images and proposing matches for the detections. This study investigated how the utilisation of metadata associated with images of products could help boost the performance of an existing approach with the ultimate goal of reducing manual labour needed to annotate images. This thesis explored if approximate pseudo masks could be generated for products in images by leveraging metadata as image-level labels and subsequently using the masks to train a Mask R-CNN. However, this approach did not result in satisfactory results. Further, this study found that by incorporating the metadata directly in the Mask R-CNN, an mAP performance increase of nearly 5% was achieved. Furthermore, utilising the available metadata to divide the training samples for a KNN model into subsets resulted in an increased top-3 accuracy of up to 16%. By representing the data with embeddings created by a pre-trained CNN, the KNN model performed better with both higher accuracy and more reasonable suggestions.
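As a rough sketch of the embedding-plus-KNN idea described above, the snippet below extracts features with a pre-trained ResNet-50 (its final classification layer replaced by an identity) and fits a scikit-learn k-nearest-neighbour model on them. The backbone choice, tensor shapes, and random data are assumptions for illustration, not the thesis' actual setup, and the weights enum requires a recent torchvision.

import numpy as np
import torch
import torchvision.models as models
from sklearn.neighbors import KNeighborsClassifier

# Backbone without the final classification layer -> 2048-d embeddings.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

def embed(images):
    """images: float tensor of shape (N, 3, 224, 224), already normalised."""
    with torch.no_grad():
        return backbone(images).numpy()

# Hypothetical data: train_imgs/train_labels would come from one metadata subset.
train_imgs = torch.randn(32, 3, 224, 224)
train_labels = np.random.randint(0, 4, size=32)
query_imgs = torch.randn(8, 3, 224, 224)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(embed(train_imgs), train_labels)
top3_neighbours = knn.kneighbors(embed(query_imgs), return_distance=False)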
@mastersthesis{diva2:1570488,
author = {Wahlquist, Gustav},
title = {{Improving Automatic Image Annotation Using Metadata}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5398--SE}},
year = {2021},
address = {Sweden},
}
Learning-based multi-view stereo (MVS) has shown promising results in the domain of general 3D reconstruction. However, no work before this thesis has applied learning-based MVS to urban 3D reconstruction from satellite images. In this thesis, learning-based MVS is used to infer depth maps from satellite images. Models are trained on both synthetic and real satellite images from Las Vegas with ground truth data from a high-resolution aerial-based 3D model. This thesis also evaluates different methods for reconstructing digital surface models (DSM) and compares them to existing satellite-based 3D models at Maxar Technologies. The DSMs are created by either post-processing point clouds obtained from predicted depth maps or by an end-to-end approach where the depth map for an orthographic satellite image is predicted.
This thesis concludes that learning-based MVS can be used to predict accurate depth maps. Models trained on synthetic data yielded relatively good results, but not nearly as good as models trained on real satellite images. The trained models also generalize relatively well to cities not present in training. This thesis also concludes that the reconstructed DSMs achieve better quantitative results than the existing 3D model in Las Vegas and similar results for the test sets from other cities. Compared to ground truth, the best-performing method achieved L1 and L2 errors 14 % and 29 % lower than Maxar's current 3D model, respectively. The method that uses a point cloud as an intermediate step achieves better quantitative results than the end-to-end system. Very promising qualitative results are achieved with the proposed methods, especially when utilizing an end-to-end approach.
@mastersthesis{diva2:1567722,
author = {Yngesjö, Tim},
title = {{3D Reconstruction from Satellite Imagery Using Deep Learning}},
school = {Linköping University},
type = {{}},
year = {2021},
address = {Sweden},
}
The increasing popularity of drones has made it convenient to capture a large number of images of a property, which can then be used to build a 3D model. The conditions of buildings can be analyzed to plan renovations. This creates an interest in automatically identifying building materials, a task well suited for machine learning.
With access to drone imagery of buildings as well as depth maps and normal maps, we created a dataset for semantic segmentation. Two different convolutional neural networks were trained and evaluated, to see how well they perform material segmentation. DeepLabv3+, which uses RGB data, was compared to Depth-Aware CNN, which uses RGB-D data. Our experiments showed that DeepLabv3+ achieved higher mean intersection over union.
To investigate if the information in the depth maps and normal maps could give a performance boost, we conducted experiments with an encoding we call HMN - horizontal disparity, magnitude of normal with ground, normal parallel with gravity. This three channel encoding was used to jointly train two CNNs, one with RGB and one with HMN, and then sum their predictions. This led to improved results for both DeepLabv3+ and Depth-Aware CNN.
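A minimal sketch of the late-fusion step described above, summing the per-pixel class scores from an RGB stream and an HMN stream before taking the argmax; the tensor shapes and class count are assumptions, and the actual trained networks are omitted.

import torch

def fused_prediction(rgb_logits, hmn_logits):
    """Late fusion: sum the per-pixel class scores from the two streams and
    take the argmax. Both tensors have shape (N, num_classes, H, W)."""
    return (rgb_logits + hmn_logits).argmax(dim=1)

# Hypothetical example with 5 material classes on a 4x4 crop.
rgb_logits = torch.randn(1, 5, 4, 4)
hmn_logits = torch.randn(1, 5, 4, 4)
labels = fused_prediction(rgb_logits, hmn_logits)   # shape (1, 4, 4)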
@mastersthesis{diva2:1567671,
author = {Rydgård, Jonas and Bejgrowicz, Marcus},
title = {{Semantic Segmentation of Building Materials in Real World Images Using 3D Information}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5405--SE}},
year = {2021},
address = {Sweden},
}
Generic visual object tracking is the task of tracking one or several objects in all frames in a video, knowing only the location and size of the target in the initial frame. Visual tracking can be carried out in both the infrared and the visual spectrum simultaneously; this is known as multi-modal tracking. Utilizing both spectra can result in a more diverse tracker, since visual tracking in infrared imagery makes it possible to detect objects even in poor visibility or in complete darkness. However, infrared imagery lacks the level of detail present in visual images. A common method for visual tracking is to use discriminative correlation filters (DCF). These correlation filters are then used to detect an object in every frame of an image sequence. This thesis focuses on investigating aspects of a DCF based tracker operating in the two different modalities, infrared and visual imagery. First, it was investigated whether the tracking benefits from using two channels instead of one and what happens to the tracking result if one of those channels is degraded by an external cause. It was also investigated if the addition of image features can further improve the tracking. The result shows that the tracking improves when using two channels instead of only a single channel. It also shows that utilizing two channels is a good way to create a robust tracker, which is still able to perform even though one of the channels is degraded. Deep features, extracted from a pre-trained convolutional neural network, were the image feature that improved the tracking the most, although the implementation of the deep features made the tracking significantly slower.
@mastersthesis{diva2:1566492,
author = {Wettermark, Emma and Berglund, Linda},
title = {{Multi-Modal Visual Tracking Using Infrared Imagery}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5401--SE}},
year = {2021},
address = {Sweden},
}
This thesis investigates the development and use of software to measure respiratory frequency of cows using optronics and computer vision. It examines mainly two different strategies of image and signal processing and their performance for different input qualities. The effect of heat stress on dairy cows and the high transmission risk of pneumonia for calves make the investigation done during this thesis highly relevant, since both have the same symptom: increased respiratory frequency. The data set used in this thesis consisted of recordings of dairy cows in different environments and from varying angles. Recordings where the authors could determine a true breathing frequency by monitoring body movements were accepted into the data set and used to test and develop the algorithms. One method developed in this thesis estimated the breathing rate in the frequency domain by Fast Fourier Transform and was named "N-point Fast Fourier Transform". The other method was called "Breathing Movement Zero-Crossing Counting". It estimated a signal in the time domain, whose fundamental frequency was determined as the breathing frequency by a zero-crossing algorithm. The results showed that both developed algorithms successfully estimated a breathing frequency with a reasonable error margin for most of the data set. The zero-crossing algorithm showed the most consistent result, with an error margin lower than 0.92 breaths per minute (BPM) for twelve of thirteen recordings. However, it is limited to recordings where the camera is placed above the cow. The N-point FFT algorithm estimated the breathing frequency with error margins between 0.44 and 5.20 BPM for the same recordings as the zero-crossing algorithm. This method is not limited to a specific camera angle but requires the cow to be relatively stationary to get accurate results. It could therefore also be evaluated on the remaining three recordings of the data set, where the error margins were measured between 1.92 and 10.88 BPM. Both methods had execution times acceptable for a real-time implementation. The data set was, however, too incomplete to determine performance for recordings from different optronic devices.
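The two estimation ideas described above can be sketched as follows: picking the dominant FFT peak of a breathing-motion signal, and counting rising zero crossings of the mean-removed signal. The sampling rate, synthetic signal, and function names are illustrative assumptions, not the thesis' implementation.

import numpy as np

def bpm_from_fft(signal, fps):
    """Breaths per minute from the dominant frequency of the motion signal."""
    signal = signal - np.mean(signal)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return freqs[np.argmax(spectrum[1:]) + 1] * 60.0   # skip the DC bin

def bpm_from_zero_crossings(signal, fps):
    """Breaths per minute from the number of rising zero crossings."""
    signal = signal - np.mean(signal)
    rising = np.sum((signal[:-1] < 0) & (signal[1:] >= 0))
    duration_min = len(signal) / fps / 60.0
    return rising / duration_min

# Synthetic check: 0.5 Hz breathing motion sampled at 30 fps for 20 s.
fps = 30.0
t = np.arange(0, 20, 1 / fps)
motion = np.sin(2 * np.pi * 0.5 * t) + 0.02 * np.random.randn(len(t))
print(bpm_from_fft(motion, fps), bpm_from_zero_crossings(motion, fps))  # both roughly 30 BPM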
@mastersthesis{diva2:1563490,
author = {Antonsson, Per and Johansson, Jesper},
title = {{Measuring Respiratory Frequency Using Optronics and Computer Vision}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5376--SE}},
year = {2021},
address = {Sweden},
}
Reconstruction of sonar images is an inverse problem, which is normally solved with model-based methods. These methods may introduce undesired artifacts called angular and range leakage into the reconstruction. In this thesis, a method called Learned Primal-Dual Reconstruction, which combines a data-driven and a model-based approach, is used to investigate the use of data-driven methods for reconstruction within sonar imaging. The method uses primal and dual variables inspired by classical optimization methods where parts are replaced by convolutional neural networks to iteratively find a solution to the reconstruction problem. The network is trained and validated with synthetic data on eight models with different architectures and training parameters. The models are evaluated on measurement data and the results are compared with those from a purely model-based method. Reconstructions performed on synthetic data, where a ground truth image is available, show that it is possible to achieve reconstructions with the data-driven method that have less leakage than reconstructions from the model-based method. For reconstructions performed on measurement data where no ground truth is available, some variants of the learned model achieve a good result with less leakage.
@mastersthesis{diva2:1561999,
author = {Nilsson, Lovisa},
title = {{Data-Driven Methods for Sonar Imaging}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5381--SE}},
year = {2021},
address = {Sweden},
}
As photos and videos are increasingly used as evidence material, it is important to know whether they can be trusted as evidence or whether there is a risk that they have been forged. This thesis investigates methods for detecting anomalous regions in images and videos using photo-response non-uniformity -- a fixed-pattern sensor noise that can be estimated from photos or videos.
For photos, experiments were performed on a method that assumes other photos from the same camera are available. For videos, experiments were performed on a method further developed from the still image method, with other videos from the same camera being available. The last experiments were performed on videos when only the video that was about to be investigated was available.
The experiments on the still image method were performed on images with three different kinds of forged regions: a forged region from somewhere else in the same photo, a forged region from a photo taken by another camera, and a forged region from the same sensor position in a photo taken by the same camera. The method should not be able to detect the third kind of forged region. Experiments performed on videos had a forged region in several adjacent frames in the video. The forged region was from another video, and it moved and changed shape between the frames.
The methods mainly consist of a classification process and some post-processing. In the classification process, features were extracted from the images/videos and used in a random forest classifier. The results are presented in terms of precision, recall, F1 score and false positive rate.
The quality of the still images was generally better than that of the videos, which also resulted in better results. For the cameras used in the experiments, it seemed easier to estimate a good PRNU pattern from photos and videos from older cameras, probably due to sensor differences and additional processing in newer camera models. How the images and videos are compressed also affects the possibility of estimating a good PRNU pattern, because important information may then be lost.
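A minimal sketch of the PRNU idea underlying the methods above: a camera fingerprint is estimated by averaging noise residuals from several images, and a query residual is correlated against it. A simple Gaussian filter stands in for the wavelet-based denoisers commonly used in PRNU work, and the synthetic images are assumptions for illustration only.

import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(image, sigma=2.0):
    """Residual = image minus a denoised version (Gaussian here for simplicity)."""
    image = image.astype(float)
    return image - gaussian_filter(image, sigma)

def estimate_fingerprint(images):
    """Average the residuals of several images from the same camera."""
    return np.mean([noise_residual(img) for img in images], axis=0)

def correlation(a, b):
    a = a - a.mean(); b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Synthetic cameras: a fixed multiplicative pattern plus shot noise per image.
rng = np.random.default_rng(0)
prnu = rng.normal(0, 0.02, (64, 64))                       # this camera's pattern
shots = [128 * (1 + prnu) + rng.normal(0, 2, (64, 64)) for _ in range(20)]
fingerprint = estimate_fingerprint(shots)
same_cam = 128 * (1 + prnu) + rng.normal(0, 2, (64, 64))
other_cam = 128 * (1 + rng.normal(0, 0.02, (64, 64))) + rng.normal(0, 2, (64, 64))
print(correlation(noise_residual(same_cam), fingerprint))   # clearly positive
print(correlation(noise_residual(other_cam), fingerprint))  # close to zero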
@mastersthesis{diva2:1552602,
author = {Söderqvist, Kerstin},
title = {{Anomaly Detection in Images and Videos Using Photo-Response Non-Uniformity}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5367--SE}},
year = {2021},
address = {Sweden},
}
Instance segmentation has great potential for improving the current state of littering by autonomously detecting and segmenting different categories of litter. With this information, litter could, for example, be geotagged to aid litter pickers or to give precise locational information to unmanned vehicles for autonomous litter collection. Land-based litter instance segmentation is a relatively unexplored field, and this study aims to give a comparison of the instance segmentation models Mask R-CNN and DetectoRS using the multiclass litter dataset called Trash Annotations in Context (TACO) in conjunction with the Common Objects in Context precision and recall scores. TACO is an imbalanced dataset, and therefore imbalanced data-handling is addressed, exercising a second-order relation iterative stratified split, and additionally oversampling when training Mask R-CNN. Mask R-CNN without oversampling resulted in a segmentation of 0.127 mAP, and with oversampling 0.163 mAP. DetectoRS achieved 0.167 segmentation mAP and improves the segmentation mAP of small objects most noticeably, by a factor of at least 2, which is important within the litter domain since small objects such as cigarettes are overrepresented. In contrast, oversampling with Mask R-CNN does not seem to improve the general precision of small and medium objects, but only improves the detection of large objects. It is concluded that DetectoRS improves results compared to Mask R-CNN, as does oversampling. However, using a dataset that cannot have an all-class representation for train, validation, and test splits, together with an iterative stratification that does not guarantee all-class representations, makes it hard for future works to do exact comparisons to this study. Results are therefore approximate when considering all categories, since 12 categories are missing from the test set, 4 of which were impossible to split into train, validation, and test sets. Further image collection and annotation to mitigate the imbalance would most noticeably improve results, since results depend on class-averaged values. Doing oversampling with DetectoRS would also help improve results. There is also the option to combine the two datasets TACO and MJU-Waste to enforce training of more categories.
@mastersthesis{diva2:1546705,
author = {Sievert, Rolf},
title = {{Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling:
A Deep Learning Model Comparison}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5365--SE}},
year = {2021},
address = {Sweden},
}
Training data is an essential ingredient within supervised learning, yet time consuming, expensive and for some applications impossible to retrieve. Thus it is of interest to use synthetic training data. However, the domain shift of synthetic data makes it challenging to obtain good results when used as training data for deep learning models. It is therefore of interest to refine synthetic data, e.g. using image-to-image translation, to improve results. The aim of this work is to compare different methods to do image-to-image translation of synthetic training data of thermal IR-images using GANs. Translation is done both using synthetic thermal IR-images alone, as well as including pixelwise depth and/or semantic information. To evaluate, a new measure based on the Fréchet Inception Distance, adapted to work for thermal IR-images, is proposed. The results show that the model trained using IR-images alone translates the generated images closest to the domain of authentic thermal IR-images. The training where IR-images are complemented by corresponding pixelwise depth data performs second best. However, given more training time, inclusion of depth data has the potential to outperform training with IR data alone. This gives a valuable insight on how to best translate images from the domain of synthetic IR-images to that of authentic IR-images, which is vital for quick and low cost generation of training data for deep learning models.
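The core of the proposed measure is the Fréchet distance between Gaussians fitted to two sets of feature activations, as in FID. A minimal sketch is given below; the random placeholder features stand in for embeddings from an Inception-style (or IR-adapted) network, which is an assumption for illustration.

import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets (N, D)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):          # discard tiny imaginary residue
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

rng = np.random.default_rng(0)
real_feats = rng.normal(0.0, 1.0, (500, 64))   # placeholder "authentic" features
fake_feats = rng.normal(0.3, 1.2, (500, 64))   # placeholder "translated" features
print(frechet_distance(real_feats, fake_feats))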
@mastersthesis{diva2:1543340,
author = {Hamrell, Hanna},
title = {{Image-to-Image Translation for Improvement of Synthetic Thermal Infrared Training Data Using Generative Adversarial Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5364--SE}},
year = {2021},
address = {Sweden},
}
Perception of depth, ego-motion and robust keypoints is critical for SLAM and structure from motion applications. Neural networks have achieved great performance in perception tasks in recent years. But collecting labeled data for supervised training is labor intensive and costly. This thesis explores recent methods in unsupervised training of neural networks that can predict depth, ego-motion, keypoints and do geometric consensus maximization. The benefit of unsupervised training is that the networks can learn from raw data collected from the camera sensor, instead of labeled data. The thesis focuses on training on images from a monocular camera, where no stereo or LIDAR data is available. The experiments compare different techniques for depth and ego-motion prediction from previous research, and show how the techniques can be combined successfully. A keypoint prediction network is evaluated and its performance is compared with the ORB detector provided by OpenCV. A geometric consensus network is also implemented and its performance is compared with the RANSAC algorithm in OpenCV. The consensus maximization network is trained on the output of the keypoint prediction network. For future work it is suggested that all networks could be combined and trained jointly to reach a better overall performance. The results show (1) which techniques in unsupervised depth prediction are most effective, (2) that the keypoint predicting network outperformed the ORB detector, and (3) that the consensus maximization network was able to classify outliers with comparable performance to the RANSAC algorithm of OpenCV.
@mastersthesis{diva2:1534180,
author = {Örjehag, Erik},
title = {{Unsupervised Learning for Structure from Motion}},
school = {Linköping University},
type = {{LiTH-ISY-EX--21/5361--SE}},
year = {2021},
address = {Sweden},
}
In one of the facilities at the Stena Recycling plant in Halmstad, Sweden, about 300 tonnes of metallic waste is processed each day with the aim of sorting out all non-ferrous material. At the end of this process, non-ferrous materials are manually sorted out from the ferrous materials. This thesis investigates a computer vision based approach to identify and localize the non-ferrous materials and eventually automate the sorting.
Images were captured of ferrous and non-ferrous materials. The images are processed and segmented to be used as annotation data for a deep convolutional neural segmentation network. Network models have been trained on different kinds and amounts of data. The resulting models are evaluated and tested in accordance with different evaluation metrics. Methods of creating advanced training data by merging imaging information were tested. Experiments with using classifier prediction confidence to identify objects of unknown classes were performed.
This thesis shows that it is possible to discern ferrous from non-ferrous material with a purely vision based system. The thesis also shows that it is possible to automatically create annotated training data. It becomes evident that it is possible to create better training data, tailored for the task at hand, by merging image data. A segmentation network trained on more than two classes yields lower prediction confidence for objects unknown to the classifier.
Substituting manual sorting with a purely vision based system seems like a viable approach. Before a substitution is considered, the automatic system needs to be evaluated in comparison to the manual sorting.
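The prediction-confidence idea mentioned above can be sketched as a simple softmax threshold: objects whose top class probability falls below a chosen value are flagged as potentially unknown. The threshold and the toy scores are assumptions for illustration, not the thesis' actual values.

import numpy as np

def flag_unknown(logits, threshold=0.6):
    """Softmax over class scores; objects whose top probability falls below
    the threshold are flagged as potentially belonging to an unknown class."""
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = exp / exp.sum(axis=-1, keepdims=True)
    top = probs.max(axis=-1)
    return probs.argmax(axis=-1), top < threshold

scores = np.array([[4.0, 0.5, 0.2],    # confident prediction
                   [1.1, 1.0, 0.9]])   # uncertain -> likely unknown object
labels, unknown = flag_unknown(scores)
print(labels, unknown)                 # [0 0] [False  True]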
@mastersthesis{diva2:1552630,
author = {Almin, Fredrik},
title = {{Detection of Non-Ferrous Materials with Computer Vision}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5321--SE}},
year = {2020},
address = {Sweden},
}
3D reconstruction can be used in forensic science to reconstruct crime scenes and objects so that measurements and further information can be acquired off-site. It is desirable to use image based reconstruction methods, but there is currently no procedure available for determining the uncertainty of such reconstructions. In this thesis the uncertainty of Structure from Motion is investigated. This is done by exploring the literature available on the subject and compiling the relevant information in a literature summary. Also, Monte Carlo simulations are conducted to study how the feature position uncertainty affects the uncertainty of the parameters estimated by bundle adjustment.
The experimental results show that poses of cameras that contain few image correspondences are estimated with higher uncertainty. The poses of such cameras are estimated with lesser uncertainty if they have feature correspondences in cameras that contain a higher number of projections.
@mastersthesis{diva2:1499090,
author = {Lindberg, Mimmi},
title = {{Forensic Validation of 3D models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5346--SE}},
year = {2020},
address = {Sweden},
}
CNN-based (Convolutional Neural Network) visual object detectors often reach human level of accuracy but need to be trained with large amounts of manually annotated data. Collecting and annotating this data can frequently be time-consuming and financially expensive. Using generative models to augment the data can help minimize the amount of data required and increase detection performance. Many state-of-the-art generative models are Generative Adversarial Networks (GANs). This thesis investigates if and how one can utilize image data to generate new data through GANs to train a YOLO-based (You Only Look Once) object detector, and how CAD (Computer-Aided Design) models can aid in this process.
In the experiments, different models of GANs are trained and evaluated by visual inspection or with the Fréchet Inception Distance (FID) metric. The data provided by Ericsson Research consists of images of antenna and baseband equipment along with annotations and segmentations. Ericsson Research supplied the YOLO detector, and no modifications are made to this detector. Finally, the YOLO detector is trained on data generated by the chosen model and evaluated by the Average Precision (AP).
The results show that the generative models designed in this work can produce RGB images of high quality. However, the quality reduces if binary segmentation masks are to be generated as well. The experiments with CAD input data did not result in images that could be used for the training of the detector.
The GAN designed in this work is able to successfully replace objects in images with the style of other objects. The results show that training the YOLO detector with GAN-modified data compared to training with real data leads to the same detection performance. The results also show that the shapes and backgrounds of the antennas contributed more to detection performance than their style and colour.
@mastersthesis{diva2:1484523,
author = {Thaung, Ludwig},
title = {{Advanced Data Augmentation:
With Generative Adversarial Networks and Computer-Aided Design}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5340--SE}},
year = {2020},
address = {Sweden},
}
Forged videos of swapped faces, so-called deepfakes, have gained a lot of attention in recent years. Methods for automated detection of this type of manipulation are also seeing rapid progress in their development. The purpose of this thesis work is to evaluate the possibility and effectiveness of using deep embeddings from facial recognition networks as a base for detection of such deepfakes. In addition, the thesis aims to answer whether or not the identity embeddings contain information that can be used for detection when analyzed over time, and if it is suitable to include information about the person's head pose in this analysis. To answer these questions, three classifiers are created with the intent to answer one question each. Their performances are compared with each other and it is shown that identity embeddings are suitable as a basis for deepfake detection. Temporal analysis of the embeddings also seems effective, at least for deepfake methods that only work on a frame-by-frame basis. Including information about head poses in the videos is shown not to improve such a classifier.
@mastersthesis{diva2:1476999,
author = {Emir, Alkazhami},
title = {{Facial Identity Embeddings for Deepfake Detection in Videos}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5341--SE}},
year = {2020},
address = {Sweden},
}
Multi-pedestrian tracking (MPT) is the task of localizing and following the trajectory of pedestrians in a sequence. Using an MPT algorithm is an important part in preventing pedestrian-vehicle collisions in Automated Driving (AD) and Advanced Driving Assistance Systems (ADAS). It has benefited greatly from the advances in computer vision and machine learning in the last decades. Using a pedestrian detector, the tracking consists of associating the detections between frames and maintaining pedestrian identities throughout the sequence. This can be a challenging task due to occlusions, missed detections and complex scenes. The number of pedestrians is unknown, and it varies with time. Finding new methods for improving MPT is an active research field and there are many approaches found in the literature. This work focuses on improving the detection-to-track association, the data association, with the help of extracted color features for each pedestrian. Utilizing the recent improvements in object detection, this work shows that classical color features are still relevant in pedestrian tracking for real-time applications with limited computational resources. The appearance is not only used in the data association but also integrated in a newly proposed method to avoid tracking errors due to missed detections. The results show that even with simple models the color appearance can be used to improve the tracking results. Evaluation on the commonly used Multi-Object Tracking benchmark shows an improvement in the Multi-Object Tracking Accuracy and identity switches, while keeping other measures essentially unchanged.
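As a sketch of using classical colour features in the data association, the snippet below computes a normalised colour histogram per detection crop and compares histograms with the Bhattacharyya distance as an appearance cost; the bin count, crop sizes, and colours are illustrative assumptions, not the thesis' exact features.

import cv2
import numpy as np

def colour_descriptor(bgr_patch, bins=8):
    """Normalised 3D colour histogram of a detection crop."""
    hist = cv2.calcHist([bgr_patch], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def appearance_cost(desc_track, desc_detection):
    """Bhattacharyya distance in [0, 1]; lower means more similar appearance."""
    return cv2.compareHist(desc_track.astype(np.float32),
                           desc_detection.astype(np.float32),
                           cv2.HISTCMP_BHATTACHARYYA)

# Hypothetical crops of the same pedestrian in two consecutive frames.
crop_t0 = np.full((64, 32, 3), (30, 60, 200), dtype=np.uint8)
crop_t1 = np.full((64, 32, 3), (32, 58, 205), dtype=np.uint8)
print(appearance_cost(colour_descriptor(crop_t0), colour_descriptor(crop_t1)))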
@mastersthesis{diva2:1467160,
author = {Flodin, Frida},
title = {{Improved Data Association for Multi-Pedestrian Tracking Using Image Information}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5329--SE}},
year = {2020},
address = {Sweden},
}
The task of 6D pose estimation with deep learning is to train networks to, from an image of an object, determine the rotation and translation of the object. Impressive results have recently been shown in deep learning based 6D pose estimation. However, many current solutions rely on real-world data when training, which, as opposed to synthetic data, requires time-consuming annotation. In this thesis, we introduce a pipeline for generating synthetic ground truth data for deep 6D pose estimation, where annotation is done automatically. With a 3D CAD model, we use Blender to render 2D images of the model from different view points. We also create all other relevant data needed for pose estimation, e.g., the poses of an object, mask images and 3D keypoints on the object. Using this pipeline, it is possible to adjust different settings to reduce the domain gap between synthetic data and real-world data and get better pose estimation results. Such settings could be changing the method of extracting 3D keypoints and varying the scale of the object or the light settings in the scene. The network used to test the performance of training on our synthetic data is PVNet, which achieves state-of-the-art results for 6D pose estimation. This architecture learns to find 2D keypoints of the object in the image, as well as 2D-3D keypoint correspondences. With these correspondences, the Perspective-n-Point (PnP) algorithm is used to extract a pose. We evaluate the pose estimation of the different settings on the synthetic data and compare these results to other state-of-the-art work. We find that using only real-world data for training is worse than using a combination of synthetic and real-world data. Several other findings are that varying scale and lighting, in addition to adding random background images to the rendered images, improves results. Four different novel keypoint selection methods are introduced in this work and tried against methods used in previous work. We observe that our methods achieve similar or better results. Finally, we use the best possible settings from the synthetic data pipeline, but with memory limitations on the amount of training data. We are close to state-of-the-art results, and could get closer with more data.
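A minimal sketch of the PnP step mentioned above, recovering a pose from 2D-3D keypoint correspondences with OpenCV's solvePnP; the 3D keypoints, camera intrinsics, and ground-truth pose are made up solely to synthesise the 2D points for the example.

import cv2
import numpy as np

# Hypothetical 3D keypoints on the object (metres) and camera intrinsics.
object_points = np.array([[0, 0, 0], [0.1, 0, 0], [0, 0.1, 0], [0, 0, 0.1],
                          [0.1, 0.1, 0], [0.1, 0, 0.1]], dtype=np.float64)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)

# Ground-truth pose used only to synthesise 2D detections for this example.
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.02, -0.01, 0.5])
image_points, _ = cv2.projectPoints(object_points, rvec_gt, tvec_gt, K, None)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
print(ok, rvec.ravel(), tvec.ravel())   # should recover the ground-truth pose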
@mastersthesis{diva2:1467210,
author = {Löfgren, Tobias and Jonsson, Daniel},
title = {{Generating Synthetic Data for Evaluation and Improvement of Deep 6D Pose Estimation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5339--SE}},
year = {2020},
address = {Sweden},
}
Light Detection and Ranging (LiDAR) sensors have many different application areas, from revealing archaeological structures to aiding navigation of vehicles. However, it is challenging to interpret and fully use the vast amount of unstructured data that LiDARs collect. Automatic classification of LiDAR data would ease the utilization, whether it is for examining structures or aiding vehicles.
In recent years, there have been many advances in deep learning for semantic segmentation of automotive LiDAR data, but there is less research on aerial LiDAR data. This thesis investigates the current state-of-the-art deep learning architectures, and how well they perform on LiDAR data acquired by an Unmanned Aerial Vehicle (UAV). It also investigates different training techniques for class imbalanced and limited datasets, which are common challenges for semantic segmentation networks. Lastly, this thesis investigates if pre-training can improve the performance of the models.
The LiDAR scans were first projected to range images and then a fully convolutional semantic segmentation network was used. Three different training techniques were evaluated: weighted sampling, data augmentation, and grouping of classes. No improvement was observed from the weighted sampling, nor did grouping of classes have a substantial effect on the performance. Pre-training on the large public dataset SemanticKITTI resulted in a small performance improvement, but the data augmentation seemed to have the largest positive impact. The mIoU of the best model, which was trained with data augmentation, was 63.7% and it performed very well on the classes Ground, Vegetation, and Vehicle. The other classes in the UAV dataset, Person and Structure, had very little data and were challenging for most models to classify correctly. In general, the models trained on UAV data performed similarly to the state-of-the-art models trained on automotive data.
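A minimal sketch of the range-image projection used before the segmentation network: each LiDAR point is mapped to a pixel via its yaw and pitch angles, and the pixel stores the range. The image size and vertical field of view are assumptions (typical spinning-LiDAR values), not the thesis' exact settings.

import numpy as np

def spherical_projection(points, height=64, width=1024,
                         fov_up_deg=15.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud to an (H, W) range image.
    Each pixel stores the range of the point falling into it (0 = empty)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                    # [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    u = 0.5 * (1.0 - yaw / np.pi) * width                     # column index
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * height  # row index
    u = np.clip(np.floor(u), 0, width - 1).astype(int)
    v = np.clip(np.floor(v), 0, height - 1).astype(int)
    image = np.zeros((height, width), dtype=np.float32)
    order = np.argsort(-r)              # write closest points last so they win
    image[v[order], u[order]] = r[order]
    return image

cloud = np.random.uniform(-20, 20, size=(10000, 3))   # hypothetical scan
range_image = spherical_projection(cloud)              # shape (64, 1024)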
@mastersthesis{diva2:1459609,
author = {Serra, Sabina},
title = {{Deep Learning for Semantic Segmentation of 3D Point Clouds from an Airborne LiDAR}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5331--SE}},
year = {2020},
address = {Sweden},
}
Autonomous cars are now becoming a reality, but there are still technical hurdles that need to be overcome for the technology to be safe and reliable. One of these issues is the cars' ability to estimate braking distances. This function relies heavily on one parameter: friction. Friction is difficult to estimate for a car since the friction coefficient is dependent on both surfaces in contact - the tires and the road. This thesis presents a novel approach to the problem using a neural network classifier trained on features extracted from images of the road. One major advantage the presented method has over the few existing conventional methods is the ability to estimate friction on road segments ahead of the vehicle. This gives the vehicle time to slow down while the friction is still sufficient. The estimation pipeline performs significantly better than the baseline methods explored in the thesis and provides satisfying results, which demonstrates its potential.
@mastersthesis{diva2:1454043,
author = {Svensson, Erik},
title = {{Transfer Learning for Friction Estimation:
Using Deep Reduced Features}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5312--SE}},
year = {2020},
address = {Sweden},
}
In digital image correlation, an optical full-field analysis method that can determine displacements of an object under load, high-resolution images are preferable. One way to improve the resolution is to improve the camera hardware. This can be expensive, hence another way to enhance the image is to use various image processing techniques to increase the resolution of the image. There are several ways of doing this, and these techniques are called super-resolution. In this thesis, the theory behind several different approaches to super-resolution is presented and discussed. The goal of this thesis has been to investigate if super-resolution is possible in a scene with moving objects as well as movement of the camera. It became clear early on that image registration, a step in many super-resolution methods that will be explained in this thesis, was of utmost importance, and a major part of the work became comparing image registration methods. Data has been recorded and then two different super-resolution algorithms have been evaluated on a data set, showing that super-resolution is possible.
@mastersthesis{diva2:1450740,
author = {Dahlström, Erik},
title = {{Super-Resolution Using Dynamic Cameras}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5315--SE}},
year = {2020},
address = {Sweden},
}
Recent improvements in pose estimation have opened up the possibility of new areas of application. One of them is gait recognition, the task of identifying persons based on their unique style of walking, which is increasingly being recognized as an important method of biometric identification. This thesis has explored the possibilities of using a pose estimation system, OpenPose, together with deep Recurrent Neural Networks (RNNs) in order to see if there is sufficient information in sequences of 2D poses to use for gait recognition. For this to be possible, a new multi-camera dataset consisting of persons walking on a treadmill was gathered, dubbed the FOI dataset. The results show that this approach has some promise. It achieved an overall classification accuracy of 95.5 % on classes it had seen during training and 83.8 % for classes it had not seen during training. It was, however, unable to recognize sequences from angles it had not seen during training. For that to be possible, more data pre-processing will likely be required.
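In the spirit of the RNN approach above, the sketch below classifies sequences of 2D poses with a single LSTM layer followed by a linear head; the 25 keypoints per frame follow OpenPose's body model, while the hidden size, number of identities, and random input are assumptions for illustration.

import torch
import torch.nn as nn

class GaitLSTM(nn.Module):
    """Classify a walking sequence of 2D poses into one of num_persons identities."""
    def __init__(self, num_keypoints=25, hidden=128, num_persons=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_keypoints * 2,
                            hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_persons)

    def forward(self, poses):            # poses: (batch, frames, 25, 2)
        b, t = poses.shape[:2]
        feats, _ = self.lstm(poses.reshape(b, t, -1))
        return self.head(feats[:, -1])   # logits from the last time step

model = GaitLSTM()
sequence = torch.randn(4, 60, 25, 2)     # 4 sequences of 60 frames each
logits = model(sequence)                 # shape (4, 10)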
@mastersthesis{diva2:1447593,
author = {Persson, Martin},
title = {{Automatic Gait Recognition:
using deep metric learning}},
school = {Linköping University},
type = {{LIU-ISY/LITH-EX-A--20/5316--SE}},
year = {2020},
address = {Sweden},
}
Object detection is a classical computer vision task, encountered in many practical applications such as robotics and autonomous driving. The latter involves serious consequences of failure and a multitude of challenging demands, including high computational efficiency and detection accuracy. Distant objects are notably difficult to detect accurately due to their small scale in the image, consisting of only a few pixels. This is especially problematic in autonomous driving, as objects should be detected at the earliest possible stage to facilitate handling of hazardous situations. Previous work has addressed small objects via use of feature pyramids and super-resolution techniques, but the efficiency of such methods is limited as computational cost increases with image resolution. Therefore, a trade-off must be made between accuracy and cost. Opportunely though, a common characteristic of driving scenarios is the predominance of distant objects in the centre of the image. Thus, the full-frame image can be downsampled to reduce computational cost, and a crop can be extracted from the image centre to preserve resolution for distant vehicles. In this way, short- and long-range images are generated. This thesis investigates the fusion of such images in a convolutional neural network, particularly the fusion level, fusion operation, and spatial alignment. A novel framework — DetSLR — is proposed for the task and examined via the aforementioned aspects. Through adoption of the framework for the well-established SSD detector and MobileNetV2 feature extractor, it is shown that the framework significantly improves upon the original detector without incurring additional cost. The fusion level is shown to have great impact on the performance of the framework, favouring high-level fusion, while only insignificant differences exist between investigated fusion operations. Finally, spatial alignment of features is demonstrated to be a crucial component of the framework.
@mastersthesis{diva2:1447580,
author = {Luusua, Emil},
title = {{Vehicle Detection, at a Distance:
Done Efficiently via Fusion of Short- and Long-Range Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5328--SE}},
year = {2020},
address = {Sweden},
}
In this thesis capsule networks are investigated, both theoretically and empirically. The properties of the dynamic routing [42] algorithm proposed for capsule networks, as well as a routing algorithm in a follow-up paper by Wang et al. [50] are thoroughly investigated. It is conjectured that there are three key attributes that are needed for a good routing algorithm, and these attributes are then related to previous algorithms. A novel routing algorithm EntMin is proposed based on the observations from the investigation of previous algorithms. A thorough evaluation of the performance of different aspects of capsule networks is conducted, and it is shown that EntMin outperforms both dynamic routing and Wang routing. Finally, a capsule network using EntMin routing is compared to a very deep Convolutional Neural Network and it is shown that it achieves comparable performance.
@mastersthesis{diva2:1445181,
author = {Edstedt, Johan},
title = {{Towards Understanding Capsule Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5309--SE}},
year = {2020},
address = {Sweden},
}
This thesis investigates the use of Generative Adversarial Networks (GANs) for detecting images containing non-natural objects in natural environments and if the introduction of stereo data can improve the performance. The state-of-the-art GAN-based anomaly detection method presented by A. Berg et al. in [5] (BergGAN) was the base of this thesis. By modifying BergGAN to not only accept three channel input, but also four and six channel input, it was possible to investigate the effect of introducing stereo data in the method. The input to the four channel network was an RGB image and its corresponding disparity map, and the input to the six channel network was a stereo pair consisting of two RGB images. The three datasets used in the thesis were constructed from a dataset of aerial video sequences provided by SAAB Dynamics, where the scene was mostly wooded areas. The datasets were divided into training and validation data, where the latter was used for the performance evaluation of the respective network. The evaluation method suggested in [5] was used in the thesis, where each sample was scored on the likelihood of it containing anomalies; Receiver Operating Characteristics (ROC) analysis was then applied and the area under the ROC curve was calculated. The results showed that BergGAN was successfully able to detect images containing non-natural objects in natural environments using the dataset provided by SAAB Dynamics. The adaptation of BergGAN to also accept four and six input channels increased the performance of the method, showing that there is information in stereo data that is relevant for GAN-based anomaly detection. There was, however, no substantial performance difference between the network trained with two RGB images and the one trained with an RGB image and its corresponding disparity map.
@mastersthesis{diva2:1442532,
author = {Gehlin, Nils and Antonsson, Martin},
title = {{Detecting Non-Natural Objects in a Natural Environment using Generative Adversarial Networks with Stereo Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5324--SE}},
year = {2020},
address = {Sweden},
}
The field of autonomous driving is as active as it has ever been, but the reality where an autonomous vehicle can drive on all roads is currently decades away. Instead, using an on-the-fly learning method, such as qHebb learning, a system can, after some demonstration, learn the appearance of any road and take over the steering wheel. By training in a simulator, the amount and variation of training can increase substantially; however, an on-rails auto-pilot does not sufficiently populate the learning space of such a model. This study aims to explore concepts that can increase the variance in the training data whilst the vehicle trains online. Three computationally light concepts are proposed, each of which results in a model that can navigate through a simple environment, thus performing better than a model trained solely on the auto-pilot. The most noteworthy approach uses multiple thresholds to detect when the vehicle deviates too much and replicates the action of a human correcting its trajectory. After training on less than 300 frames, a vehicle successfully completed the full test environment using this method.
@mastersthesis{diva2:1444702,
author = {Kindstedt, Mathias},
title = {{Exploring the Training Data for Online Learning of Autonomous Driving in a Simulated Environment}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5325--SE}},
year = {2020},
address = {Sweden},
}
Automatic Face Recognition (AFR) can be useful in the forensic field when identifying people in surveillance footage. In AFR systems it is common to use deep neural networks, which perform well if the quality of the images maintains a certain level. This is a problem when applying AFR to surveillance data, since the quality of those images can be very poor. In this thesis the CNN FaceNet has been used to evaluate how different quality parameters influence the accuracy of the face recognition. The goal is to be able to draw conclusions about how to improve the recognition by using and avoiding certain parameters based on the conditions. Parameters that have been experimented with are angle of the face, image quality, occlusion, colour and lighting. This has been achieved by using datasets with different properties or by altering the images. The parameters are meant to simulate different situations that can occur in surveillance footage that are difficult for the network to recognise. Three different models have been evaluated with different amounts of embeddings and different training data. The results show that the two models trained on the VGGFace2 dataset perform much better than the one trained on CASIA-WebFace. All models' performance drops on images with low quality compared to images with high quality, because the training data includes mostly high-quality images. In some cases, the recognition results can be improved by applying some alterations to the images. This could be by using one frontal and one profile image when trying to identify a person, or by occluding parts of the shape of the face if it gets recognized as other persons with similar face shapes. One main improvement would be to extend the training datasets with more low-quality images. To some extent, this could be achieved by different kinds of data augmentation like artificial occlusion and down-sampled images.
@mastersthesis{diva2:1444005,
author = {Tuvskog, Johanna},
title = {{Evaluation of Face Recognition Accuracy in Surveillance Video}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5302--SE}},
year = {2020},
address = {Sweden},
}
The main result of this thesis is a deep learning model named BearNet, which can be trained to detect an arbitrary amount of objects as a set of points. The model is trained using the Weighted Hausdorff distance as loss function. BearNet has been applied and tested on two problems from the industry. These are:
- From an intensity image, detect two pocket points of an EU-pallet which an autonomous forklift could utilize when determining where to insert its forks.
- From a depth image, detect the start, bend and end points of a straw attached to a juice package, in order to help determine if the straw has been attached correctly.
In the development process of BearNet I took inspiration from the designs of U-Net, UNet++ and a high resolution network named HRNet. Further, I used a dataset containing RGB-images from a surveillance camera located inside a mall, on which the aim was to detect head positions of all pedestrians. In an attempt to reproduce a result from another study, I found that the mall dataset suffers from training set contamination when a model is trained, validated, and tested on it with random sampling. Hence, I propose that the mall dataset is evaluated with a sequential data split strategy, to limit the problem.
I found that the BearNet architecture is well suited for both the EU-pallet and straw datasets, and that it can be successfully used on either RGB, intensity or depth images. On the EU-pallet and straw datasets, BearNet consistently produces point estimates within five and six pixels of ground truth, respectively. I also show that the straw dataset only constitutes a small subset of all the challenges that exist in the problem domain related to the attachment of a straw to a juice package, and that one therefore cannot train a robust deep learning model on it. As an example of this, models trained on the straw dataset cannot correctly handle samples in which there is no straw visible.
@mastersthesis{diva2:1442869,
author = {Runow, Björn},
title = {{Deep Learning for Point Detection in Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5295--SE}},
year = {2020},
address = {Sweden},
}
The process of locating moving objects through video sequences is a fundamental computer vision problem. This process is referred to as video tracking and has a broad range of applications. Even though video tracking is an open research topic that has received much attention during recent years, developing accurate and robust algorithms that can handle complicated tracking tasks and scenes is still challenging. One challenge in computer vision is to develop systems that, like humans, can understand, interpret and recognize visual information in different situations.
In this master thesis work, a tracking algorithm based on eye tracking data is proposed. The aim was to compare the tracking performance of the proposed algorithm with a state-of-the-art video tracker. The algorithm was tested on gaze signals from five participants, recorded with an eye tracker while the participants were exposed to dynamic stimuli. The stimuli were moving objects displayed on a stationary computer screen. The proposed algorithm works offline, meaning that all data is collected before analysis.
The results show that the overall performance of the proposed eye tracking algorithm is comparable to the performance of a state-of-the-art video tracker. The main weaknesses are low accuracy for the proposed eye tracking algorithm and handling of occlusion for the video tracker. We also suggest a method for using eye tracking as a complement to object tracking methods. The results show that the eye tracker can be used in some situations to improve the tracking result of the video tracker. The proposed algorithm can be used to help the video tracker redetect objects that have been occluded or for some other reason are not detected correctly. However, ATOM achieves higher accuracy.
@mastersthesis{diva2:1435385,
author = {Ejnestrand, Ida and Jakobsson, Linn\'{e}a},
title = {{Object Tracking based on Eye Tracking Data:
A comparison with a state-of-the-art video tracker}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5294--SE}},
year = {2020},
address = {Sweden},
}
Previously well-aligned image sensors, mounted on the same camera, may become misaligned due to external vibrations. It is of interest to automatically detect and correct for this misalignment, and to separate the deviation into pointing and/or parallax errors. Two methods were evaluated for this purpose: an area-based image registration method and a feature-based image registration method. In the area-based method, normalized cross-correlation was used to estimate translation parameters. In the feature-based method, SIFT or LIOP descriptors were used to extract features that were matched between the two image modalities to estimate transformation parameters. In both methods, only image points that were in focus were extracted, to avoid detection of false alignment deviations. The results indicate that the area-based image registration method has the potential to automatically detect and correct for an alignment deviation. Moreover, the area-based method showed potential to separate the deviation into pointing errors and parallax errors. The feature-based method was limited to specific scenes but could be used as a complement to the area-based method in order to additionally correct for rotation and/or scaling.
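To make the area-based idea concrete, the sketch below estimates an integer translation between two single-channel images by locating the peak of a zero-mean, FFT-based cross-correlation, a simplified stand-in for the normalized cross-correlation used in the thesis; the masking of out-of-focus points is omitted and the function name is illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

def estimate_translation(reference, target):
    """Estimate an integer (dy, dx) translation between two grayscale images
    from the peak of their zero-mean cross-correlation. Simplified sketch:
    no per-window normalization and no focus masking; the sign convention
    and a possible one-pixel offset for even image sizes depend on the setup."""
    ref = reference - reference.mean()
    tgt = target - target.mean()
    corr = fftconvolve(ref, tgt[::-1, ::-1], mode="same")    # cross-correlation
    corr /= np.linalg.norm(ref) * np.linalg.norm(tgt) + 1e-12
    peak = np.array(np.unravel_index(np.argmax(corr), corr.shape))
    center = np.array(corr.shape) // 2
    return center - peak                                      # approximate shift
```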
@mastersthesis{diva2:1434095,
author = {Bjerwe, Ida},
title = {{Automatic Alignment Detection and Correction in Infrared and Visual Image Pairs}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5292--SE}},
year = {2020},
address = {Sweden},
}
In today’s society, we experience an increasing challenge to provide healthcare to everyone in need due to the increasing number of patients and the shortage of medical staff. Computers have contributed to mitigating this challenge by offloading the medical staff from some of the tasks. With the rise of deep learning, countless new possibilities have opened up to help the medical staff even further. One domain where deep learning can be applied is the analysis of ultrasound images. In this thesis we investigate the problem of classifying standard views of the heart in ultrasound images with the help of deep learning. We conduct mainly three experiments. First, we use NasNet mobile, InceptionV3, VGG16 and MobileNet, pre-trained on ImageNet, and fine-tune them to ultrasound heart images. We compare the accuracy of these networks to each other and to the baseline model, a CNN that was proposed in [23]. Then we assess a neural network’s capability to generalize to images from ultrasound machines that the network is not trained on. Lastly, we test how the performance of the networks degrades with decreasing amounts of training data. Our first experiment shows that all networks considered in this study have very similar performance in terms of accuracy, with InceptionV3 being slightly better than the rest. The best performance is achieved when the whole network is fine-tuned to our problem instead of fine-tuning only a part of it, while gradually unlocking more layers for training. The generalization experiment shows that neural networks have the potential to generalize to images from ultrasound machines that they are not trained on. It also shows that having a mix of multiple ultrasound machines in the training data increases generalization performance. In our last experiment we compare the performance of the CNN proposed in [23] with MobileNet pre-trained on ImageNet and MobileNet randomly initialized. This shows that the performance of the baseline model suffers the least with decreasing amounts of training data and that pre-training helps performance drastically on smaller training datasets.
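As a rough illustration of this kind of fine-tuning setup, the Keras sketch below attaches a new classification head to an ImageNet-pretrained InceptionV3 backbone and freezes the earliest layers; the layer index, input size, optimizer and learning rate are illustrative assumptions rather than the thesis's actual configuration.

```python
import tensorflow as tf

def build_finetune_model(num_classes, unlock_from=100):
    """ImageNet-pretrained InceptionV3 backbone with a new softmax head.
    Layers with index below `unlock_from` stay frozen; lowering this index
    corresponds to gradually unlocking more layers for training."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", input_shape=(299, 299, 3))
    for layer in base.layers[:unlock_from]:
        layer.trainable = False                    # keep early layers frozen
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```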
@mastersthesis{diva2:1425635,
author = {Pop, David},
title = {{Classification of Heart Views in Ultrasound Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5288--SE}},
year = {2020},
address = {Sweden},
}
When creating a photo-realistic 3D model of the world using satellite imagery, image classification is an important part of the process. In this thesis the specific part of automated building extraction is investigated. This is done by investigating the difference in performance between instance segmentation and semantic segmentation for extraction of building footprints in orthorectified imagery. Semantic segmentation of the images is solved by using U-net, a Fully Convolutional Network that outputs a pixel-wise segmentation of the image. Instance segmentation of the images is done by a network called Mask R-CNN. The performance of the models is measured using precision, recall and the F1 score, which is the harmonic mean of precision and recall. The resulting F1 scores of the two methods are similar, with U-net achieving an F1 score of 0.684 without any post-processing and Mask R-CNN achieving an F1 score of 0.676 without post-processing.
@mastersthesis{diva2:1417200,
author = {Fritz, Karin},
title = {{Instance Segmentation of Buildings in Satellite Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5283--SE}},
year = {2020},
address = {Sweden},
}
The performance of conventional deep neural networks tends to degrade when a domain shift is introduced, such as collecting data from a new site. Model-Agnostic Meta-Learning, or MAML, has achieved state-of-the-art performance in few-shot learning by finding initial parameters that adapt easily for new tasks.
This thesis studies MAML in a digital pathology setting. Experiments show that a conventional model generalises poorly to data collected from another site. By annotating a few samples during inference however, a model with initial parameters obtained through MAML training can adapt to achieve better generalisation performance. It is also demonstrated that a simple transfer learning approach using a kNN classifier on features extracted from a conventional model yields good generalisation, but the variance caused by random sampling is higher.
The results indicate that meta learning can lead to a lower annotation effort for machine learning in digital pathology while maintaining accuracy.
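For reference, the PyTorch sketch below shows a first-order MAML step (a common simplification of full MAML that drops second-order derivatives): each task adapts a copy of the shared initialization on its support set, and the query-set gradients of the adapted copies are averaged into the meta-update. The task structure, inner learning rate and single inner step are assumptions for illustration, not the thesis's exact training loop.

```python
import copy
import torch

def fomaml_step(model, loss_fn, tasks, meta_opt, inner_lr=1e-2):
    """One first-order MAML meta-update. `tasks` is a list of
    ((x_support, y_support), (x_query, y_query)) tuples."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for (x_s, y_s), (x_q, y_q) in tasks:
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        # Inner loop: one adaptation step on the support set.
        inner_opt.zero_grad()
        loss_fn(learner(x_s), y_s).backward()
        inner_opt.step()
        # Outer loss: evaluate the adapted learner on the query set.
        learner.zero_grad()
        loss_fn(learner(x_q), y_q).backward()
        for g, p in zip(meta_grads, learner.parameters()):
            g += p.grad / len(tasks)
    # Apply the averaged query gradients to the shared initialization.
    meta_opt.zero_grad()
    for p, g in zip(model.parameters(), meta_grads):
        p.grad = g
    meta_opt.step()
```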
@mastersthesis{diva2:1414984,
author = {Fagerblom, Freja},
title = {{Model-Agnostic Meta-Learning for Digital Pathology}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5284--SE}},
year = {2020},
address = {Sweden},
}
Classifying clothing attributes in surveillance images can be useful in the forensic field, making it easier to, for example, find suspects based on eyewitness accounts. Deep Neural Networks are often used successfully in image classification, but require a large amount of annotated data. Since labeling data can be time consuming or difficult, and it is easier to get hold of labeled fashion images, this thesis investigates how the domain shift from a fashion domain to a surveillance domain, with little or no annotated data, affects a classifier.
In the experiments, two deep networks of different depth are used as a base and trained on only fashion images as well as both labeled and unlabeled surveillance images, with and without domain adaptation regularizers. The surveillance dataset is new and consists of images that were collected from different surveillance cameras and annotated during this thesis work.
The results show that there is a degradation in performance for a classifier trained on the fashion domain when tested on the surveillance domain, compared to when tested on the fashion domain. The results also show that if no labeled data in the surveillance domain is used for these experiments, it is more effective to use the deeper network and train it on only fashion data, rather than to use the more complicated unsupervised domain adaptation method.
@mastersthesis{diva2:1392992,
author = {Härnström, Denise},
title = {{Classification of Clothing Attributes Across Domains}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5276--SE}},
year = {2020},
address = {Sweden},
}
Deep learning has been intensively researched in computer vision tasks like image classification. Collecting and labeling images that these neural networks are trained on is labor-intensive, which is why alternative methods of collecting images are of interest. Virtual environments allow rendering images and automatic labeling, which could speed up the process of generating training data and reduce costs. This thesis studies the problem of transfer learning in image classification when the classifier has been trained on rendered images using a game engine and tested on real images. The goal is to render images using a game engine to create a classifier that can separate images depicting people wearing civilian clothing or camouflage. The thesis also studies how domain adaptation techniques using generative adversarial networks could be used to improve the performance of the classifier. Experiments show that it is possible to generate images that can be used for training a classifier capable of separating the two classes. However, the experiments with domain adaptation were unsuccessful. It is instead recommended to improve the quality of the rendered images in terms of features used in the target domain to achieve better results.
@mastersthesis{diva2:1431281,
author = {Thornström, Johan},
title = {{Domain Adaptation of Unreal Images for Image Classification}},
school = {Linköping University},
type = {{LiTH-ISY-EX--20/5282--SE}},
year = {2019},
address = {Sweden},
}
In this thesis we investigate the use of GANs for texture enhancement. To achieve this, we have studied whether synthetic satellite images generated by GANs will improve the texture in satellite-based 3D maps.
We investigate two GANs: SRGAN and pix2pix. SRGAN increases the pixel resolution of the satellite images by generating upsampled images from low-resolution images. As for pix2pix, the GAN performs image-to-image translation by translating a source image to a target image, without changing the pixel resolution.
We trained the GANs in two different approaches, named SAT-to-AER and SAT-to-AER-3D, where SAT, AER and AER-3D are different datasets provided by the company Vricon. In the first approach, aerial images were used as ground truth and in the second approach, rendered images from an aerial-based 3D map were used as ground truth.
The procedure of enhancing the texture in a satellite-based 3D map was divided in two steps: the generation of synthetic satellite images and the re-texturing of the 3D map. Synthetic satellite images generated by two SRGAN models and one pix2pix model were used for the re-texturing. The best results were obtained using SRGAN in the SAT-to-AER approach, where the re-textured 3D map had enhanced structures and an increased perceived quality. SRGAN also presented a good result in the SAT-to-AER-3D approach, where the re-textured 3D map had a changed color distribution and the road markers were easier to distinguish from the ground. The images generated by the pix2pix model presented the worst result. As for the SAT-to-AER approach, even though the synthetic satellite images generated by pix2pix were somewhat enhanced and contained less noise, they had no significant impact in the re-texturing. In the SAT-to-AER-3D approach, none of the investigated models based on the pix2pix framework presented any successful results.
We concluded that GANs can be used as a texture enhancer using both aerial images and images rendered from an aerial-based 3D map as ground truth. The use of GANs as a texture enhancer has great potential and there are several interesting areas for future work.
@mastersthesis{diva2:1375054,
author = {Birgersson, Anna and Hellgren, Klara},
title = {{Texture Enhancement in 3D Maps using Generative Adversarial Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5266--SE}},
year = {2019},
address = {Sweden},
}
The organization International Aid Services (IAS) provides people in East Africa with clean water through well drilling. The wells are located in surroundings far away from the investors, and IAS therefore wishes to monitor the wells to get a better overview of whether different types of improvements need to be made. Of particular interest is to see the load on different water sources at different times of the day and during the year, and to know how many people are visiting the wells. In this paper, a method is proposed for counting people around the wells. The goal is to choose a suitable method for detecting humans in images and evaluate how it performs. Counting humans in images is not a new topic, but the situation at hand implies some restrictions. A Raspberry Pi with an associated camera is used, which is a small embedded system that cannot handle large and complex software. There is also a limited amount of data in the project. The method proposed in this project uses a pre-trained convolutional neural network based object detector called the Single Shot Detector, which is adapted to suit smaller devices and applications. The pre-trained network that it is based on is called MobileNet, a network developed to be used on smaller systems. To see how well the chosen detector performs, it is compared with some other models, among them a detector based on the Inception network, a significantly larger network than MobileNet. The base network is modified by transfer learning. Results show that a fine-tuned and modified network can achieve better results, from an F1-score of 0.49 for a non-fine-tuned model to 0.66 for the fine-tuned one.
@mastersthesis{diva2:1352472,
author = {Kastberg, Maria},
title = {{Using Convolutional Neural Networks to Detect People Around Wells in South Sudan}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5200--SE}},
year = {2019},
address = {Sweden},
}
For a long time stereo-cameras have been deployed in visual Simultaneous Localization And Mapping (SLAM) systems to gain 3D information. Even though stereo-cameras show good performance, the main disadvantage is the complex and expensive hardware setup it requires, which limits the use of the system. A simpler and cheaper alternative are monocular cameras, however monocular images lack the important depth information. Recent works have shown that having access to depth maps in monocular SLAM system is beneficial since they can be used to improve the 3D reconstruction. This work proposes a deep neural network that predicts dense high-resolution depth maps from monocular RGB images by casting the problem as a supervised regression task. The network architecture follows an encoder-decoder structure in which multi-scale information is captured and skip-connections are used to recover details. The network is trained and evaluated on the KITTI dataset achieving results comparable to state-of-the-art methods. With further development, this network shows good potential to be incorporated in a monocular SLAM system to improve the 3D reconstruction.
@mastersthesis{diva2:1347284,
author = {Larsson, Susanna},
title = {{Monocular Depth Estimation Using Deep Convolutional Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5234--SE}},
year = {2019},
address = {Sweden},
}
Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification. One of these classification methods used domain knowledge to classify objects while the other used Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method in most of the evaluation metrics. The second approach is more conventional, with different types of Convolutional Neural Networks. A segmentation network was used as well as a few detection networks and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which potentially could have impacted the results of the Machine Learning models in a negative way.
@mastersthesis{diva2:1346426,
author = {Grahn, Fredrik and Nilsson, Kristian},
title = {{Object Detection in Domain Specific Stereo-Analysed Satellite Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5254--SE}},
year = {2019},
address = {Sweden},
}
Watermarking is a technique used to mark the ownership of media such as audio or images by embedding a watermark, e.g. copyright information, into the media. A good watermarking method should perform this embedding without affecting the quality of the media. Recent methods for watermarking images use deep learning to embed and extract the watermark. In this thesis, we investigate watermarking in the hearable frequencies of audio using deep learning. More specifically, we try to create a watermarking method for audio that is robust to noise in the carrier, and that allows for the extraction of the embedded watermark from the audio after being played over-the-air. The proposed method consists of two deep convolutional neural networks trained end-to-end on music with simulated noise. Experiments show that the proposed method successfully creates watermarks robust to simulated noise with moderate quality reductions, but it is not robust to the real-world noise introduced after playing and recording the audio over-the-air.
@mastersthesis{diva2:1340077,
author = {Tegendal, Lukas},
title = {{Watermarking in Audio using Deep Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5246--SE}},
year = {2019},
address = {Sweden},
}
Finding disparity maps between stereo images is a well studied topic within computer vision. While both classical and machine learning approaches exist in the literature, they frequently struggle to correctly solve the disparity in regions with low texture, sharp edges or occlusions. Finding approximate solutions to these problem areas is frequently referred to as disparity refinement, and is usually carried out separately after an initial disparity map has been generated.
In the recent literature, the use of Normalized Convolution in Convolutional Neural Networks has shown remarkable results when applied to the task of stereo depth completion. This thesis investigates how well this approach performs in the case of disparity refinement. Specifically, we investigate how well such a method can improve the initial disparity maps generated by the stereo matching algorithm developed at Saab Dynamics using a rectified stereo rig.
To this end, a dataset of ground truth disparity maps was created using equipment at Saab, namely a setup for structured light and the stereo rig cameras. Because the end goal is a dataset fit for training networks, we investigate an approach that allows for efficient creation of significant quantities of dense ground truth disparities.
The method for generating ground truth disparities generates several disparity maps for every scene measured by using several stereo pairs. A densified disparity map is generated by merging the disparity maps from the neighbouring stereo pairs. This resulted in a dataset of 26 scenes and 104 dense and accurate disparity maps.
Our evaluation results show that the chosen Normalized Convolution Network based method can be adapted for disparity map refinement, but is dependent on the quality of the input disparity map.
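The core idea behind normalized convolution, on which the networks above build, can be stated in a few lines: filter the signal weighted by a per-pixel certainty (zero for missing or unreliable disparities) and renormalize by the filtered certainty. The sketch below is a zeroth-order NumPy/SciPy version of that idea; the learned, layered variant used for depth completion is considerably more elaborate.

```python
import numpy as np
from scipy.ndimage import convolve

def normalized_convolution(signal, certainty, applicability):
    """Zeroth-order normalized convolution: certainty-weighted filtering
    followed by renormalization, so that pixels with zero certainty are
    filled in from their reliable neighbours."""
    c = np.asarray(certainty, dtype=float)
    numerator = convolve(signal * c, applicability, mode="nearest")
    denominator = convolve(c, applicability, mode="nearest")
    return numerator / np.maximum(denominator, 1e-8)

# Example: fill holes in a sparse disparity map with a small box filter.
# filled = normalized_convolution(disparity, disparity > 0, np.ones((5, 5)) / 25)
```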
@mastersthesis{diva2:1333176,
author = {Cranston, Daniel and Skarfelt, Filip},
title = {{Normalized Convolution Network and Dataset Generation for Refining Stereo Disparity Maps}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5252--SE}},
year = {2019},
address = {Sweden},
}
When subtitles are burned into a video, an error can sometimes occur in the encoder that results in the same subtitle being burned into several frames, resulting in subtitles becoming frozen. This thesis provides a way to detect frozen video subtitles with the help of an implemented text detector and classifier.
Two types of classifiers, naïve classifiers and machine learning classifiers, are tested and compared on a variety of different videos to see how much a machine learning approach can improve the performance. The naïve classifiers are evaluated using ground truth data to gain an understanding of the importance of good text detection. To understand the difficulty of the problem, two different machine learning classifiers are tested, logistic regression and random forests.
The result shows that machine learning improves the performance over using naïve classifiers by improving the specificity from approximately 87.3% to 95.8% and improving the accuracy from 93.3% to 95.5%. Random forests achieve the best overall performance, but the difference compared to when using logistic regression is small enough that more computationally complex machine learning classifiers are not necessary. Using the ground truth shows that the weaker naïve classifiers would be improved by at least 4.2% accuracy, thus a better text detector is warranted. This thesis shows that machine learning is a viable option for detecting frozen video subtitles.
@mastersthesis{diva2:1331490,
author = {Sjölund, Jonathan},
title = {{Detection of Frozen Video Subtitles Using Machine Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5206--SE}},
year = {2019},
address = {Sweden},
}
One fundamental task in robotics is random bin-picking, where it is important to be able to detect an object in a bin and estimate its pose to plan the motion of a robotic arm. For this purpose, this thesis work aimed to investigate and evaluate algorithms for 6D pose estimation when the object was given by a CAD model. The scene was given by a point cloud illustrating a partial 3D view of the bin with multiple instances of the object. Two algorithms were thus implemented and evaluated. The first algorithm was an approach based on Point Pair Features, and the second was Fast Global Registration. For evaluation, four different CAD models were used to create synthetic data with ground truth annotations.
It was concluded that the Point Pair Feature approach provided a robust localization of objects and can be used for bin-picking. The algorithm appears to be able to handle different types of objects, however, with small limitations when the object has flat surfaces and weak texture or many similar details. The disadvantage with the algorithm was the execution time. Fast Global Registration, on the other hand, did not provide a robust localization of objects and is thus not a good solution for bin-picking.
@mastersthesis{diva2:1330419,
author = {Lef, Annette},
title = {{CAD-Based Pose Estimation - Algorithm Investigation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5239--SE}},
year = {2019},
address = {Sweden},
}
In recent years semantic segmentation models utilizing Convolutional Neural Networks (CNN) have seen significant success for multiple different segmentation problems. Models such as U-Net have produced promising results within the medical field for both regular 2D and volumetric imaging, rivalling some of the best classical segmentation methods.
In this thesis we examined the possibility of using a convolutional neural network-based model to perform segmentation of discrete bone fragments in CT-volumes with segmentation-hints provided by a user. We additionally examined different classical segmentation methods used in a post-processing refinement stage and their effect on the segmentation quality. We compared the performance of our model to similar approaches and provided insight into how the interactive aspect of the model affected the quality of the result.
We found that the combined approach of interactive segmentation and deep learning produced results on par with some of the best methods presented, provided there was an adequate amount of annotated training data. We additionally found that the number of segmentation hints provided to the model by the user significantly affected the quality of the result, with convergence of the result at around 8 provided hints.
@mastersthesis{diva2:1326942,
author = {Estgren, Martin},
title = {{Bone Fragment Segmentation Using Deep Interactive Object Selection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5197--SE}},
year = {2019},
address = {Sweden},
}
Semantic segmentation is a key approach to comprehensive image data analysis. It can be applied to analyze 2D images, videos, and even point clouds that contain 3D data points. On the first two problems, CNNs have achieved remarkable progress, but on point cloud segmentation the results are less satisfactory due to challenges such as limited memory resources and difficulties in 3D point annotation. One of the research studies carried out by the Computer Vision Lab at Linköping University aimed to ease the semantic segmentation of 3D point clouds. The idea is that by first projecting 3D data points to 2D space and then focusing only on the analysis of 2D images, we can reduce the overall workload of the segmentation process as well as exploit existing well-developed 2D semantic segmentation techniques. In order to improve the performance of CNNs for 2D semantic segmentation, the study used input data derived from different modalities. However, how different modalities can be optimally fused is still an open question. Based on the above-mentioned study, this thesis aims to improve the multistream framework architecture. More concretely, we investigate how different singlestream architectures impact the multistream framework with a given fusion method, and how different fusion methods contribute to the overall performance of a given multistream framework. As a result, our proposed fusion architecture outperformed all the investigated traditional fusion methods. Along with the best singlestream candidate and a few additional training techniques, our final proposed multistream framework obtained a relative gain of 7.3% mIoU compared to the baseline on the Semantic3D point cloud test set, increasing the ranking from 12th to 5th position on the benchmark leaderboard.
@mastersthesis{diva2:1327473,
author = {He, Linbo},
title = {{Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data:
Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5190--SE}},
year = {2019},
address = {Sweden},
}
Multiple object tracking is the process of assigning unique and consistent identities to objects throughout a video sequence. A popular approach to multiple object tracking, and object tracking in general, is to use a method called tracking-by-detection. Tracking-by-detection is a two-stage procedure: an object detection algorithm first detects objects in a frame, and these objects are then associated with already tracked objects by a tracking algorithm. One of the main concerns of this thesis is to investigate how different object detection algorithms perform on surveillance video supplied by the National Forensic Centre. The thesis then goes on to explore how the stand-alone performance of the object detection algorithm correlates with the overall performance of a tracking-by-detection system. Finally, the thesis investigates how the use of visual descriptors in the tracking stage of a tracking-by-detection system affects performance.
Results presented in this thesis suggest that the capacity of the object detection algorithm is highly indicative of the overall performance of the tracking-by-detection system. Further, this thesis also shows how the use of visual descriptors in the tracking stage can reduce the number of identity switches and thereby increase performance of the whole system.
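The association step of a tracking-by-detection pipeline can be illustrated with a few lines of code: build a cost matrix between existing tracks and new detections and solve the assignment with the Hungarian algorithm. The sketch below uses IoU-only costs; in a system like the one studied here, a visual-descriptor distance would be added as an extra cost term. The box format and threshold are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def associate(track_boxes, detection_boxes, iou_threshold=0.3):
    """Match detections to tracks by maximizing total IoU (Hungarian
    assignment); pairs below the IoU threshold are left unmatched."""
    cost = np.array([[1.0 - iou(t, d) for d in detection_boxes]
                     for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_threshold]
```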
@mastersthesis{diva2:1326842,
author = {Nyström, Axel},
title = {{Evaluation of Multiple Object Tracking in Surveillance Video}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5245--SE}},
year = {2019},
address = {Sweden},
}
The interest in autonomous driving assistance, and in the end, self-driving cars, has increased vastly over the last decade. Automotive safety continues to be a priority for manufacturers, politicians and people alike. Visual-based systems aiding the drivers have lately been boosted by advances in computer vision and machine learning. In this thesis, we evaluate the concept of an end-to-end machine learning solution for detecting and classifying road lane markings, and compare it to a more classical semantic segmentation solution. The analysis is based on the frame-by-frame scenario, and shows that our proposed end-to-end system has clear advantages when it comes to detecting the existence of lanes and producing a consistent, lane-like output, especially in adverse conditions such as weak lane markings. Our proposed method allows the system to predict its own confidence, thereby allowing the system to suppress its own output when it is not deemed safe enough. The thesis finishes with proposed future work needed to achieve optimal performance and create a system ready for deployment in an active safety product.
@mastersthesis{diva2:1326388,
author = {Vigren, Malcolm and Eriksson, Linus},
title = {{End-to-End Road Lane Detection and Estimation using Deep Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5219--SE}},
year = {2019},
address = {Sweden},
}
In large scale productions of metal sheets, it is important to maintain an effective way to continuously inspect the products passing through the production line. The inspection mainly consists of detection of defects and tracking of ID numbers. This thesis investigates the possibilities to create an automatic inspection system by evaluating different machine learning algorithms for defect detection and optical character recognition (OCR) on metal sheet data. Digit recognition and defect detection are solved separately, where the former compares the object detection algorithm Faster R-CNN and the classical machine learning algorithm NCGF, and the latter is based on unsupervised learning using a convolutional autoencoder (CAE).
The advantage of the feature extraction method is that it only needs a couple of samples to be able to classify new digits, which is desirable in this case due to the lack of training data. Faster R-CNN, on the other hand, needs much more training data to solve the same problem. NCGF does however fail to classify noisy images and images of metal sheets containing an alloy, while Faster R-CNN seems to be a more promising solution with a final mean average precision of 98.59%.
The CAE approach for defect detection showed promising results. The algorithm learned to reconstruct only images without defects, resulting in reconstruction errors whenever a defect appears. The errors are initially classified using a basic thresholding approach, resulting in a 98.9% accuracy. However, this classifier requires supervised learning, which is why the clustering algorithm Gaussian mixture model (GMM) is investigated as well. The result shows that it should be possible to use GMM, but that it requires a lot of GPU resources to use it in an end-to-end solution with a CAE.
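The thresholding idea behind the autoencoder-based defect detector is simple enough to sketch: since the CAE is trained only on defect-free sheets, a large per-image reconstruction error signals an anomaly. The snippet below assumes a Keras-style model and a threshold calibrated on validation data; replacing the fixed threshold with a GMM over the errors corresponds to the extension discussed above.

```python
import numpy as np

def detect_defects(autoencoder, images, threshold):
    """Flag images whose mean squared reconstruction error exceeds a
    threshold. `autoencoder` is assumed to expose a Keras-style predict()."""
    reconstructions = autoencoder.predict(images)
    errors = np.mean((images - reconstructions) ** 2, axis=(1, 2, 3))
    return errors > threshold, errors
```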
@mastersthesis{diva2:1325083,
author = {Grönlund, Jakob and Johansson, Angelina},
title = {{Defect Detection and OCR on Steel}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5220--SE}},
year = {2019},
address = {Sweden},
}
Traffic sign recognition is an important problem for autonomous cars and driver assistance systems. With recent developments in the field of machine learning, high performance can be achieved, but typically at a large computational cost.
This thesis aims to investigate the relation between classification accuracy and computational complexity for the visual recognition problem of classifying traffic signs. In particular, the benefits of partitioning the classification problem into smaller sub-problems using prior knowledge in the form of shape or current region are investigated.
In the experiments, the convolutional neural network (CNN) architecture MobileNetV2 is used, as it is specifically designed to be computationally efficient. To incorporate prior knowledge, separate CNNs are used for the different subsets generated when partitioning the dataset based on region or shape. The separate CNNs are trained from scratch or initialized by pre-training on the full dataset.
The results support the intuitive idea that performance initially increases with network size and indicate a network size where the improvement stops. Including shape information using the two investigated methods does not result in a significant improvement. Including region information using pretrained separate classifiers results in a small improvement for small complexities, for one of the regions in the experiments.
In the end, none of the investigated methods of including prior knowledge are considered to yield an improvement large enough to justify the added implementational complexity. However, some other methods are suggested, which would be interesting to study in future work.
@mastersthesis{diva2:1324051,
author = {Ekman, Carl},
title = {{Traffic Sign Classification Using Computationally Efficient Convolutional Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5216--SE}},
year = {2019},
address = {Sweden},
}
This report is the result of a master thesis made by two students at Linköping University. The aim was to find an image registration method for visual and infrared images and to find an error measure for grading the registration performance. In practice this could be used for position determination by registering the infrared image taken at the current position to a set of visual images with known positions and determining which visual image matches the best. Two methods were tried, using different image feature extractors and different ways to match the features. The first method used phase information in the images to generate soft features and then minimised the square error of the optical flow equation to estimate the transformation between the visual and infrared image. The second method used the Canny edge detector to extract hard features from the images and Chamfer distance as an error measure. Both methods were evaluated for registration as well as position determination and yielded promising results. However, the performance of both methods was image dependent. The soft edge method proved to be more robust and precise and worked better than the hard edge method for both registration and position determination.
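The Chamfer-distance error measure used in the hard-edge method has a compact formulation: take the distance transform of the reference edge map and average it over the edge pixels of the moving image. A minimal SciPy sketch of such a measure, assuming boolean Canny edge maps, is shown below.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_error(edges_reference, edges_moving):
    """Mean distance from each edge pixel in the moving image to the nearest
    edge pixel in the reference image; both inputs are boolean edge maps."""
    distance_to_reference = distance_transform_edt(~edges_reference)
    return distance_to_reference[edges_moving].mean()
```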
@mastersthesis{diva2:1323680,
author = {Fridman, Linnea and Nordberg, Victoria},
title = {{Two Multimodal Image Registration Approaches for Positioning Purposes}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5208--SE}},
year = {2019},
address = {Sweden},
}
Recently, the deep neural network structure CapsNet was proposed by Sabour et al. [11]. Capsule networks are designed to learn the relative geometry between the features of a layer and the features of the next layer. The capsule network's main building blocks are capsules, which are represented by vectors. The idea is that each capsule will represent a feature as well as traits or subfeatures of that feature. This allows for smart information routing: capsule traits are used to predict the traits of the capsules in the next layer, and information is sent to the next-layer capsules on which the predictions agree. This is called routing by agreement. This thesis investigates theoretical support for new and existing routing algorithms as well as evaluates their performance on the MNIST [16] and CIFAR-10 [8] datasets. A variation of the dynamic routing algorithm presented in the original paper [11] achieved the highest accuracy and fastest execution time.
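For orientation, the dynamic routing-by-agreement procedure from the original paper can be sketched in a few lines of PyTorch; the tensor layout (batch, lower capsules, higher capsules, capsule dimension) and the number of routing iterations are assumptions, and the thesis compares several variations of this loop.

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity: keeps direction, bounds length below 1."""
    norm2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

def dynamic_routing(u_hat, n_iterations=3):
    """Routing by agreement. `u_hat` holds the prediction of every lower
    capsule for every higher capsule, shape (batch, n_lower, n_higher, dim)."""
    b = torch.zeros(u_hat.shape[:-1], device=u_hat.device)   # routing logits
    for _ in range(n_iterations):
        c = torch.softmax(b, dim=2)                           # coupling coefficients
        v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))      # higher-layer capsules
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)          # agreement update
    return v
```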
@mastersthesis{diva2:1314210,
author = {Malmgren, Christoffer},
title = {{A Comparative Study of Routing Methods in Capsule Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5188--SE}},
year = {2019},
address = {Sweden},
}
Visual object detection is a popular computer vision task that has been intensively investigated using deep learning on real data. However, data from virtual environments have not received the same attention. A virtual environment enables generating data for locations that are not easily reachable for data collection, e.g. aerial environments. In this thesis, we study the problem of object detection in virtual environments, more specifically an aerial virtual environment. We use a simulator to generate a synthetic data set of 16 different types of vehicles captured from an airplane.
To study the performance of existing methods in virtual environments, we train and evaluate two state-of-the-art detectors on the generated data set. Experiments show that both detectors, You Only Look Once version 3 (YOLOv3) and Single Shot MultiBox Detector (SSD), reach similar performance quality as previously presented in the literature on real data sets.
In addition, we investigate different fusion techniques between detectors which were trained on two different subsets of the data set, in this case a subset in which cars have fixed colors and a subset in which cars have varying colors. Experiments show that it is possible to train multiple instances of the detector on different subsets of the data set, and to combine these detectors in order to boost the performance.
@mastersthesis{diva2:1307568,
author = {Norrstig, Andreas},
title = {{Visual Object Detection using Convolutional Neural Networks in a Virtual Environment}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5195--SE}},
year = {2019},
address = {Sweden},
}
Visual object tracking is one of the fundamental problems in computer vision, with a wide range of practical applications in e.g. robotics and surveillance. Given a video sequence and the target bounding box in the first frame, a tracker is required to find the target in all subsequent frames. It is a challenging problem due to the limited training data available. An object tracker is generally evaluated using two criteria, namely robustness and accuracy. Robustness refers to the ability of a tracker to track for long durations without losing the target. Accuracy, on the other hand, denotes how accurately a tracker can estimate the target bounding box.
Recent years have seen significant improvement in tracking robustness. However, the problem of accurate tracking has seen less attention. Most current state-of-the-art trackers resort to a naive multi-scale search strategy which has fundamental limitations. Thus, in this thesis, we aim to develop a general target estimation component which can be used to determine accurate bounding box for tracking. We will investigate how bounding box estimators used in object detection can be modified to be used for object tracking. The key difference between detection and tracking is that in object detection, the classes to which the objects belong are known. However, in tracking, no prior information is available about the tracked object, other than a single image provided in the first frame. We will thus investigate different architectures to utilize the first frame information to provide target specific bounding box predictions. We will also investigate how the bounding box predictors can be integrated into a state-of-the-art tracking method to obtain robust as well as accurate tracking.
@mastersthesis{diva2:1291564,
author = {Bhat, Goutam},
title = {{Accurate Tracking by Overlap Maximization}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5189--SE}},
year = {2019},
address = {Sweden},
}
Visual Simultaneous Localization And Mapping (SLAM) allows for three-dimensional reconstruction from a camera’s output and simultaneous positioning of the camera within the reconstruction. With use cases ranging from autonomous vehicles to augmented reality, the SLAM field has garnered interest both commercially and academically.
A SLAM system performs odometry as it estimates the camera’s movement through the scene. The incremental estimation of odometry is not error free and exhibits drift over time, with map inconsistencies as a result. Detecting the return to a previously seen place, a loop, means that this new information regarding our position can be incorporated to correct the trajectory retroactively. Loop detection can also facilitate relocalization if the system loses tracking due to e.g. heavy motion blur.
This thesis proposes an odometric system making use of bundle adjustment within a keyframe-based stereo SLAM application. This system is capable of detecting loops by utilizing the algorithm FAB-MAP. Two aspects of this system are evaluated, the odometry and the capability to relocate. Both of these are evaluated using the EuRoC MAV dataset, with an absolute trajectory RMS error ranging from 0.80 m to 1.70 m for the machine hall sequences.
The capability to relocate is evaluated using a novel methodology that can be interpreted intuitively. Results are given for different levels of strictness to encompass different use cases. The method makes use of reprojection of points seen in keyframes to define whether a relocalization is possible or not. The system shows a capability to relocate in up to 85% of all cases when a keyframe exists that can project 90% of its points into the current view. Errors in estimated poses were found to be correlated with the relative distance, with errors less than 10 cm in 23% to 73% of all cases.
The evaluation of the whole system is augmented with an evaluation of local image descriptors and pose estimation algorithms. The descriptor SIFT was found to perform best overall, but is demanding to compute. BRISK was deemed the best alternative for a fast yet accurate descriptor.
A conclusion that can be drawn from this thesis is that FAB-MAP works well for detecting loops as long as the addition of keyframes is handled appropriately.
@mastersthesis{diva2:1287320,
author = {Ringdahl, Viktor},
title = {{Stereo Camera Pose Estimation to Enable Loop Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--19/5186--SE}},
year = {2019},
address = {Sweden},
}
This thesis presents and evaluates different methods to semantically segment 3D-models by rendered 2D-views. The 2D-views are segmented separately and then merged together. The thesis evaluates three different merge strategies, two different classification architectures, how many views should be rendered and how these rendered views should be arranged. The results are evaluated both quantitatively and qualitatively and then compared with the current classifier at Vricon presented in [30].
The conclusion of this thesis is that there is a performance gain to be had using this method. The best model uses two views and attains an accuracy of 90.89%, which can be compared with the 84.52% achieved by the single-view network from [30]. The best nine-view system achieved 87.72%. The difference in accuracy between the two- and nine-view systems is attributed to the higher quality mesh on the sunny side of objects, which typically is the south side.
The thesis provides a proof of concept and there are still many areas where the system can be improved. One of them is the extraction of training data, which seemingly would have a large impact on the performance.
@mastersthesis{diva2:1278684,
author = {Tranell, Victor},
title = {{Semantic Segmentation of Oblique Views in a 3D-Environment}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5185--SE}},
year = {2019},
address = {Sweden},
}
Visual tracking concerns the problem of following an arbitrary object in a video sequence. In this thesis, we examine how to use stereo images to extend existing visual tracking algorithms, which methods exist to obtain information from stereo images, and how the results change as the parameters of each tracker vary. For this purpose, four abstract approaches are identified, with five distinct implementations. Each tracker implementation is an extension of a baseline algorithm, MOSSE. The free parameters of each model are optimized with respect to two different evaluation strategies, called nor- and wir-tests, and four different objective functions, which are then fixed when comparing the models against each other. The results are created on single-target tracks extracted from the KITTI tracking dataset, and the optimization results show that none of the objective functions are sensitive to the exposed parameters under the joint selection of model and dataset. The evaluation results also show that none of the extensions improve the results of the baseline tracker.
@mastersthesis{diva2:1277154,
author = {Dehlin, Carl},
title = {{Visual Tracking Using Stereo Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5181--SE}},
year = {2019},
address = {Sweden},
}
Visual tracking is a computer vision problem where the task is to follow a target through a video sequence. Tracking has many important real-world applications in several fields such as autonomous vehicles and robot vision. Since visual tracking does not assume any prior knowledge about the target, it faces different challenges such as occlusion, appearance change, background clutter and scale change. In this thesis we try to improve the capabilities of tracking frameworks using discriminative correlation filters by incorporating scene depth information. We utilize scene depth information on three main levels. First, we use raw depth information to segment the target from its surroundings, enabling occlusion detection and scale estimation. Second, we investigate different visual features calculated from depth data to decide which features are good at encoding geometric information available solely in depth data. Third, we investigate handling missing data in the depth maps using a modified version of the normalized convolution framework. Finally, we introduce a novel approach for parameter search using genetic algorithms to find the best hyperparameters for our tracking framework. Experiments show that depth data can be used to estimate scale changes and handle occlusions. In addition, visual features calculated from depth are more representative if combined with color features. It is also shown that utilizing normalized convolution improves the overall performance in some cases. Lastly, the usage of genetic algorithms for hyperparameter search leads to accuracy gains as well as some insights on the performance of different components within the framework.
@mastersthesis{diva2:1266346,
author = {Stynsberg, John},
title = {{Incorporating Scene Depth in Discriminative Correlation Filters for Visual Tracking}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5178--SE}},
year = {2018},
address = {Sweden},
}
Training data is the bottleneck when training Convolutional Neural Networks. A larger dataset gives better accuracy but also requires longer training time. It is shown that fine-tuning neural networks on synthetic rendered images increases the mean average precision. This method was applied to two different datasets with five distinctive objects in each. The first dataset consisted of random objects with different geometric shapes. The second dataset contained objects used to assemble IKEA furniture. The neural network with the best performance, trained on 5400 images, achieved a mean average precision of 0.81 on a test set sampled from a video sequence. The impact of dataset size, batch size, number of training epochs and different network architectures was analysed. Using synthetic images to train CNNs is a promising path for object detection when access to large amounts of annotated image data is hard to come by.
@mastersthesis{diva2:1267446,
author = {Vi, Margareta},
title = {{Object Detection Using Convolutional Neural Network Trained on Synthetic Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5180--SE}},
year = {2018},
address = {Sweden},
}
In recent years, development of Convolutional Neural Networks has enabled high performing semantic segmentation models. Generally, these deep learning based segmentation methods require a large amount of annotated data. Acquiring such annotated data for semantic segmentation is a tedious and expensive task.
Within machine learning, active learning involves the selection of new data in order to limit the usage of annotated data. In active learning, the model is trained for several iterations, and additional samples that the model is uncertain of are selected. The model is then retrained on the additional samples and the process is repeated. In this thesis, an active learning framework has been applied to road segmentation, which is semantic segmentation of objects related to road scenes.
The uncertainty in the samples is estimated with Monte Carlo dropout. In Monte Carlo dropout, several dropout masks are applied to the model and the variance is captured, working as an estimate of the model’s uncertainty. Other metrics to rank the uncertainty evaluated in this work are: a baseline method that selects samples randomly, the entropy in the default predictions and three additional variations/extensions of Monte Carlo dropout.
Both the active learning framework and uncertainty estimation are implemented in the thesis. Monte Carlo dropout performs slightly better than the baseline in 3 out of 4 metrics. Entropy outperforms all other implemented methods in all metrics. The three additional methods do not perform better than Monte Carlo dropout.
An analysis of what kind of uncertainty Monte Carlo dropout captures is performed, together with a comparison of the samples selected by the baseline and by Monte Carlo dropout. Future development and possible improvements are also discussed.
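As a reference for the uncertainty estimation step, the PyTorch sketch below computes a Monte Carlo dropout uncertainty for a segmentation model: dropout is kept active at test time, several stochastic forward passes are made, and the variance of the softmax outputs is used to rank unlabeled samples. Enabling train mode for the whole model is a simplification; in practice only the dropout layers would be made stochastic.

```python
import torch

def mc_dropout_uncertainty(model, x, n_samples=20):
    """Per-pixel uncertainty from Monte Carlo dropout: variance of the class
    probabilities over several stochastic forward passes, averaged over classes."""
    model.train()   # keeps dropout active (simplification, see note above)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_samples)])
    return probs.var(dim=0).mean(dim=1)
```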
@mastersthesis{diva2:1259079,
author = {Sörsäter, Michael},
title = {{Active Learning for Road Segmentation using Convolutional Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5176--SE}},
year = {2018},
address = {Sweden},
}
This master thesis explores the possibility of using Generative Adversarial Networks (GANs) to refine labeled synthetic code images to resemble real code images while preserving label information. The GAN used in this thesis consists of a refiner and a discriminator. The discriminator tries to distinguish between real images and refined synthetic images. The refiner tries to fool the discriminator by producing refined synthetic images such that the discriminator classifies them as real. By updating these two networks iteratively, the idea is that they will push each other to get better, resulting in refined synthetic images with real image characteristics.
The aspiration, if the exploration of GANs turns out successful, is to be able to use refined synthetic images as training data in Semantic Segmentation (SS) tasks and thereby eliminate the laborious task of gathering and labeling real data. Starting off from a foundational GAN-model, different network architectures, hyperparameters and other design choices are explored to find the best performing GAN-model.
As is widely acknowledged in the relevant literature, GANs can be difficult to train and the results in this thesis are varying and sometimes ambiguous. Based on the results from this study, the best performing models do however perform better in SS tasks than the unrefined synthetic set they are based on and benchmarked against, with regards to Intersection over Union.
@mastersthesis{diva2:1254973,
author = {Stenhagen, Petter},
title = {{Improving Realism in Synthetic Barcode Images using Generative Adversarial Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5169--SE}},
year = {2018},
address = {Sweden},
}
Thermal spectrum cameras are gaining interest in many applications due to their long wavelength which allows them to operate under low light and harsh weather conditions. One disadvantage of thermal cameras is their limited visual interpretability for humans, which limits the scope of their applications. In this thesis, we try to address this problem by investigating the possibility of transforming thermal infrared (TIR) images to perceptually realistic visible spectrum (VIS) images by using Convolutional Neural Networks (CNNs). Existing state-of-the-art colorization CNNs fail to provide the desired output as they were trained to map grayscale VIS images to color VIS images. Instead, we utilize an auto-encoder architecture to perform cross-spectral transformation between TIR and VIS images. This architecture was shown to quantitatively perform very well on the problem while producing perceptually realistic images. We show that the quantitative differences are insignificant when training this architecture using different color spaces, while there exist clear qualitative differences depending on the choice of color space. Finally, we found that a CNN trained from daytime examples generalizes well on tests from night time.
@mastersthesis{diva2:1255342,
author = {Nyberg, Adam},
title = {{Transforming Thermal Images to Visible Spectrum Images Using Deep Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5167--SE}},
year = {2018},
address = {Sweden},
}
Recently, sensors such as radars and cameras have been widely used in automotives, especially in Advanced Driver-Assistance Systems (ADAS), to collect information about the vehicle's surroundings. Stereo cameras are very popular as they can be used passively to construct a 3D representation of the scene in front of the car. This has allowed the development of several ADAS algorithms that need 3D information to perform their tasks. One interesting application is Road Surface Preview (RSP), where the task is to estimate the road height along the future path of the vehicle. An active suspension control unit can then use this information to regulate the suspension, improving driving comfort, extending the durability of the vehicle and warning the driver about potential risks on the road surface. Stereo cameras have been successfully used in RSP and have demonstrated very good performance. However, the main disadvantages of stereo cameras are their high production cost and high power consumption. This limits the installation of several ADAS features in economy-class vehicles. A less expensive alternative are monocular cameras, which have a significantly lower cost and power consumption. Therefore, this thesis investigates the possibility of solving the Road Surface Preview task using a monocular camera. We try two different approaches: structure-from-motion and Convolutional Neural Networks. The proposed methods are evaluated against the stereo-based system. Experiments show that both structure-from-motion and CNNs have good potential for solving the problem, but they are not yet reliable enough to be a complete solution to the RSP task and be used in an active suspension control unit.
@mastersthesis{diva2:1253882,
author = {Ekström, Marcus},
title = {{Road Surface Preview Estimation Using a Monocular Camera}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5173--SE}},
year = {2018},
address = {Sweden},
}
The purpose of the thesis was to investigate the possibility of using machine learning for automation of liver fat measurements in fat-water magnetic resonance imaging (MRI). The thesis presents methods for texture-based liver classification and Proton Density Fat Fraction (PDFF) regression using multi-layer perceptrons utilizing 2D and 3D textural image features. The first proposed method was a data classification method with the goal to distinguish between suitable and unsuitable regions to measure PDFF in. The second proposed method was a combined classification and regression method where the classification distinguishes between liver and non-liver tissue. The goal of the regression model was to predict the difference d = PDFF_mean − PDFF_ROI between the manual ground truth mean and the fat fraction of the active Region of Interest (ROI). Tests were performed on varying sizes of Image Feature Regions (froi) and combinations of image features on both of the proposed methods. The tests showed that 3D measurements using image features from discrete wavelet transforms produced measurements similar to the manual fat measurements. The first method resulted in lower relative errors while the second method had a higher method agreement compared to manual measurements.
@mastersthesis{diva2:1248500,
author = {Grundström, Tobias},
title = {{Automated Measurements of Liver Fat Using Machine Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5166--SE}},
year = {2018},
address = {Sweden},
}
Robotic bin picking is the problem of emptying a bin of randomly distributed objects through a robotic interface. This thesis examines an SVM approach to extract grasping points for a vacuum-type gripper. The SVM is trained on synthetic data and used to classify the points of a non-synthetic 3D-scanned point cloud as either graspable or non-graspable. The classified points are then clustered into graspable regions from which the grasping points are extracted.
The SVM models and the algorithm as a whole are trained and evaluated against cubic and cylindrical objects. Separate SVM models are trained for each type of object in addition to one model being trained on a dataset containing both types of objects. It is shown that the performance of the SVM in terms of accuracy is dependent on the objects and their geometrical properties. Further, it is shown that the algorithm is reasonably robust in terms of successfully picking objects, regardless of the scale of the objects.
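As a rough, hypothetical sketch of this kind of per-point classification step (not the thesis implementation), an SVM can be trained on synthetic per-point features and applied to a scanned cloud; the feature names below are assumptions made for illustration:

# Hypothetical sketch: per-point SVM classification of graspable points
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Synthetic training data: one row per point, e.g. [normal_z, local_flatness, curvature]
X_train = np.random.rand(1000, 3)
y_train = (X_train[:, 0] > 0.8).astype(int)   # toy rule: near-vertical normals are graspable

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

# Classify the points of a (here random) scanned cloud and keep the graspable ones
X_scan = np.random.rand(5000, 3)
graspable_mask = clf.predict(X_scan).astype(bool)
graspable_points = X_scan[graspable_mask]
print(f"{graspable_mask.sum()} of {len(X_scan)} points classified as graspable")

In practice the graspable points would then be grouped into regions, for instance with a clustering algorithm, before grasp points are extracted.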
@mastersthesis{diva2:1243310,
author = {Olsson, Fredrik},
title = {{Feature Based Learning for Point Cloud Labeling and Grasp Point Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5165--SE}},
year = {2018},
address = {Sweden},
}
Data about the earth is increasing in value and demand from customers, but it is difficult to produce accurately and cheaply. This thesis examines whether it is possible to take low-resolution and distorted 3D data and increase the accuracy of building geometry by performing building reconstruction. Building reconstruction is performed with a Markov chain Monte Carlo method where building primitives are placed iteratively until a good fit is found. The digital height model and pixel classification used are produced by Vricon. The method is able to correctly place primitive models, but often overestimates their dimensions by about 15%.
@mastersthesis{diva2:1223969,
author = {Nilsson, Mats},
title = {{Building Reconstruction of Digital Height Models with the Markov Chain Monte Carlo Method}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5130--SE}},
year = {2018},
address = {Sweden},
}
When a Time-of-Flight (ToF) depth camera is used to monitor a region of interest, it has to be mounted correctly and have information regarding its position. Manual configuration currently requires managing captured 3D ToF data in a 2D environment, which limits the user and might give rise to errors due to misinterpretation of the data. This thesis investigates if a real-time 3D reconstruction mesh from a Microsoft HoloLens can be used as a target for point cloud registration using the ToF data, thus configuring the camera autonomously. Three registration algorithms, Fast Global Registration (FGR), Joint Registration of Multiple Point Clouds (JR-MPC) and Prerejective RANSAC, were evaluated for this purpose.
It was concluded that despite using different sensors, it is possible to perform accurate registration. It was also shown that the registration can be done accurately within a reasonable time, compared with the inherent time to perform 3D reconstruction on the HoloLens. All algorithms could solve the problem, but it was concluded that FGR provided the most satisfying results, though requiring several constraints on the data.
@mastersthesis{diva2:1222450,
author = {Kjell\'{e}n, Kevin},
title = {{Point Cloud Registration in Augmented Reality using the Microsoft HoloLens}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5160--SE}},
year = {2018},
address = {Sweden},
}
Volume measurements of timber loads are performed in conjunction with timber trade. When dealing with goods of major economic value such as these, it is important to achieve an impartial and fair assessment when determining the volumes on which prices are based.
With the help of Saab’s missile targeting technology, CIND AB develops products for digital volume measurement of timber loads. Currently there is a system in operation that automatically reconstructs timber trucks in motion to create measurable images of them. Future iterations of the system are expected to fully automate the scaling by generating a volumetric representation of the timber and calculating its external gross volume. The first challenge towards this development is to separate the timber load from the truck.
This thesis aims to evaluate and implement an appropriate method for semantic pixel-wise segmentation of timber loads in real time. Image segmentation is a classic but difficult problem in computer vision. To achieve greater robustness, it is therefore important to carefully study and make use of the conditions given by the existing system. Variations in timber type, truck type and packing together create unique combinations that the system must be able to handle. The system must work around the clock in different weather conditions while maintaining high precision and performance.
@mastersthesis{diva2:1222024,
author = {Sällqvist, Jessica},
title = {{Real-time 3D Semantic Segmentation of Timber Loads with Convolutional Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5131--SE}},
year = {2018},
address = {Sweden},
}
The cost and environmental damage of reclaims is a large problem within the paper industry. With certain types of paper, so-called crepe marks on the paper’s surface are a common issue, leading to printing defects and consequently reclaims. This thesis compares four different image analysis methods for evaluating crepe marks and predicting printing results. The methods evaluated consist of one established method, two adaptations of established methods and one novel method. All methods were evaluated on the same data: topographic height images of paper samples from four paper rolls of similar type but differing in roughness. The method based on 1D Fourier analysis and the method based on fully convolutional networks perform best, depending on whether speed or detailed characteristics is the priority.
@mastersthesis{diva2:1219118,
author = {Strömberg, Isak},
title = {{Characterization of creping marks in paper}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5151--SE}},
year = {2018},
address = {Sweden},
}
In this thesis we study a perception problem in the context of autonomous driving. Specifically, we study the computer vision problem of 3D object detection, in which objects should be detected from various sensor data and their position in the 3D world should be estimated. We also study the application of Generative Adversarial Networks in domain adaptation techniques, aiming to improve the 3D object detection model's ability to transfer between different domains.
The state-of-the-art Frustum-PointNet architecture for LiDAR-based 3D object detection was implemented and found to closely match its reported performance when trained and evaluated on the KITTI dataset. The architecture was also found to transfer reasonably well from the synthetic SYN dataset to KITTI, and is thus believed to be usable in a semi-automatic 3D bounding box annotation process. The Frustum-PointNet architecture was also extended to explicitly utilize image features, which surprisingly degraded its detection performance. Furthermore, an image-only 3D object detection model was designed and implemented, which was found to compare quite favourably with current state-of-the-art in terms of detection performance.
Additionally, the PixelDA approach was adopted and successfully applied to the MNIST to MNIST-M domain adaptation problem, which validated the idea that unsupervised domain adaptation using Generative Adversarial Networks can improve the performance of a task network for a dataset lacking ground truth annotations. Surprisingly, the approach did however not significantly improve upon the performance of the image-based 3D object detection models when trained on the SYN dataset and evaluated on KITTI.
@mastersthesis{diva2:1218149,
author = {Gustafsson, Fredrik and Linder-Nor\'{e}n, Erik},
title = {{Automotive 3D Object Detection Without Target Domain Annotations}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5138--SE}},
year = {2018},
address = {Sweden},
}
3D reconstruction is the process of constructing a three-dimensional model from images. It contains multiple steps where each step can induce errors. When doing 3D reconstruction of outdoor scenes, there are some types of scene content that regularly cause problems and affect the resulting 3D model. Two of these are water, due to its fluctuating nature, and sky, because it contains no useful (3D) data. These areas cause different problems throughout the process and generally do not benefit it in any way. Therefore, masking them early in the reconstruction chain could be a useful step in an outdoor scene reconstruction pipeline. Manual masking of images is a time-consuming and tedious task, especially for the large data sets that are often used in large-scale 3D reconstructions. This master thesis explores if this can be done automatically using Convolutional Neural Networks for semantic segmentation, and to what degree the masking would benefit a 3D reconstruction pipeline.
@mastersthesis{diva2:1216761,
author = {Kernell, Björn},
title = {{Improving Photogrammetry using Semantic Segmentation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5118--SE}},
year = {2018},
address = {Sweden},
}
The aim of this work is to find a method for removing haze from satellite imagery. This is done by taking two algorithms developed for images taken from the surface of the earth and adapting them for satellite images. The two algorithms are Single Image Haze Removal Using Dark Channel Prior by He et al. and Color Image Dehazing Using the Near-Infrared by Schaul et al. Both algorithms, altered to fit satellite images, plus the combination are applied on four sets of satellite images. The results are compared with each other and the unaltered images. The evaluation is both qualitative, i.e. looking at the images, and quantitative using three properties: colorfulness, contrast and saturated pixels. Both the qualitative and the quantitative evaluation determined that using only the altered version of Dark Channel Prior gives the result with the least amount of haze and whose colors look most like reality.
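For reference, a minimal sketch of the dark channel prior idea (He et al.) is given below; the patch size and the omega/t0 constants follow the original paper's defaults, and this is not the adapted satellite variant developed in the thesis:

# Minimal dark channel prior sketch, assuming a float RGB image in [0, 1]
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    # Per-pixel minimum over color channels, then a local minimum filter
    return minimum_filter(img.min(axis=2), size=patch)

def dehaze(img, patch=15, omega=0.95, t0=0.1):
    dark = dark_channel(img, patch)
    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels
    n = max(1, int(0.001 * dark.size))
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = img[idx].mean(axis=0)
    # Transmission estimate and scene radiance recovery
    t = 1.0 - omega * dark_channel(img / A, patch)
    t = np.clip(t, t0, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)

dehazed = dehaze(np.random.rand(128, 128, 3))   # stand-in image; a real photo would be loaded instead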
@mastersthesis{diva2:1215181,
author = {Hultberg, Johanna},
title = {{Dehazing of Satellite Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5121--SE}},
year = {2018},
address = {Sweden},
}
Deep learning has been growing rapidly in recent years, obtaining excellent results for many computer vision applications, such as image classification and object detection. One reason for the increased popularity of deep learning is that it mitigates the need for hand-crafted features. This thesis work investigates deep learning as a methodology to solve the problem of autonomous collision avoidance for a small robotic car. To accomplish this, transfer learning is used with the VGG16 deep network pre-trained on the ImageNet dataset. A dataset has been collected and then used to fine-tune and validate the network offline. The deep network has been used with the robotic car in real time: the robotic car sends images to an external computer, which runs the network, and the predictions from the network are sent back to the robotic car, which takes actions based on those predictions. The results show that deep learning has great potential in solving the collision avoidance problem.
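A minimal transfer learning sketch of this kind, written here with PyTorch/torchvision (recent versions) purely for illustration, is shown below; the two-class output and the training details are assumptions and differ from the thesis pipeline:

# Illustrative sketch: fine-tune ImageNet pre-trained VGG16 for a binary decision
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for param in model.features.parameters():
    param.requires_grad = False            # keep the convolutional features fixed

model.classifier[6] = nn.Linear(4096, 2)   # replace the final ImageNet layer

optimizer = torch.optim.SGD(model.classifier[6].parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One toy training step on a random batch; real data would come from a DataLoader
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()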
@mastersthesis{diva2:1204063,
author = {Strömgren, Oliver},
title = {{Deep Learning for Autonomous Collision Avoidance}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5115--SE}},
year = {2018},
address = {Sweden},
}
Industrial applications of computer vision often utilize traditional image processing techniques whereas state-of-the-art methods in most image processing challenges are almost exclusively based on convolutional neural networks (CNNs). Thus there is a large potential for improving the performance of many machine vision applications by incorporating CNNs.
One such application is the classification of juice boxes with straws, where the baseline solution uses classical image processing techniques on depth images to reject or accept juice boxes. This thesis aims to investigate how CNNs perform on the task of semantic segmentation (pixel-wise classification) of said images and if the result can be used to increase classification performance.
A drawback of CNNs is that they usually require large amounts of labelled data for training to be able to generalize and learn anything useful. As labelled data is hard to come by, two ways to get cheap data are investigated, one being synthetic data generation and the other being automatic labelling using the baseline solution.
The implemented network performs well on semantic segmentation, even when trained on synthetic data only, though the performance increases with the ratio of real (automatically labelled) to synthetic images. The classification task is very sensitive to small errors in semantic segmentation and the results are therefore not as good as the baseline solution. It is suspected that the drop in performance between validation and test data is due to a domain shift between the data sets, e.g. variations in data collection and straw and box type, and fine-tuning to the target domain could definitely increase performance.
When trained on synthetic data the domain shift is even larger and the performance on classification is next to useless. It is likely that the results could be improved by using more advanced data generation, e.g. a generative adversarial network (GAN), or more rigorous modelling of the data.
@mastersthesis{diva2:1189501,
author = {Carlsson, Mattias},
title = {{Neural Networks for Semantic Segmentation in the Food Packaging Industry}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5113--SE}},
year = {2018},
address = {Sweden},
}
Photos captured in the shortwave infrared (SWIR) spectrum are interesting in military applications because they are independent of the time of day the picture is captured, since the sun, moon, stars and night glow constantly illuminate the earth with short-wave infrared radiation. A major problem with today’s SWIR cameras is that they are very expensive to produce and hence not broadly available, either within the military or to civilians. A relatively new technology called compressive sensing (CS) enables a new type of camera with only a single-pixel sensor (an SPC). This new type of camera only needs a fraction of measurements relative to the number of pixels to be reconstructed and reduces the cost of a short-wave infrared camera by a factor of 20. The camera uses a micromirror array (DMD) to select which mirrors (pixels) to be measured in the scene, thus creating an underdetermined linear equation system that can be solved using the techniques described in CS to reconstruct the image. Given the new technology, it is in the Swedish Defence Research Agency's (FOI) interest to evaluate the potential of a single-pixel camera. With an SPC architecture developed by FOI, the goal of this thesis was to develop methods for sampling, reconstructing images and evaluating their quality. This thesis shows that structured random matrices and fast transforms have to be used to enable high-resolution images and to speed up the process of reconstructing images significantly. The evaluation of the images could be done with standard measurements associated with camera evaluation and showed that the camera can reproduce high-resolution images with relatively high image quality in daylight.
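As a toy illustration of the underlying compressive sensing idea (not FOI's SPC pipeline), a sparse signal can be recovered from a small number of structured measurements; here rows of a Hadamard matrix stand in for the measurement patterns and orthogonal matching pursuit serves as the solver, with sizes chosen arbitrarily:

# Toy compressive-sensing sketch: structured measurements + sparse recovery
import numpy as np
from scipy.linalg import hadamard
from sklearn.linear_model import OrthogonalMatchingPursuit

n = 256                     # signal length (e.g. a flattened image row)
m = 64                      # number of single-pixel measurements (25% of n)

x = np.zeros(n)
x[np.random.choice(n, 8, replace=False)] = np.random.randn(8)    # sparse scene

H = hadamard(n)[np.random.choice(n, m, replace=False)] / np.sqrt(n)   # random Hadamard rows
y = H @ x                                                              # measurements

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=8, fit_intercept=False)
omp.fit(H, y)
x_hat = omp.coef_
print("relative reconstruction error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))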
@mastersthesis{diva2:1185507,
author = {Brorsson, Andreas},
title = {{Compressive Sensing: Single Pixel SWIR Imaging of Natural Scenes}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5108--SE}},
year = {2018},
address = {Sweden},
}
In this report I summarize my master’s thesis work, in which I have investigated different approaches for fusing imaging modalities for semantic segmentation with deep convolutional networks. State-of-the-art methods for semantic segmentation of RGB images use pre-trained models, which are fine-tuned to learn task-specific deep features. However, the use of pre-trained model weights constrains the model input to images with three channels (e.g. RGB images). In some applications, e.g. classification of satellite imagery, there are other imaging modalities that can complement the information from the RGB modality and thus improve the performance of the classification. In this thesis, semantic segmentation methods designed for RGB images are extended to handle multiple imaging modalities without compromising the benefits that pre-training on RGB datasets offers.
In the experiments of this thesis, RGB images from satellites have been fused with the normalised difference vegetation index (NDVI) and a digital surface model (DSM). The evaluation shows that the modality fusion can significantly improve the performance of semantic segmentation networks in comparison with a corresponding network with only RGB input. However, the different investigated approaches to fuse the modalities proved to achieve similar performance. The conclusion of the experiments is that the fusion of imaging modalities is necessary, but that the method of fusion is of less importance.
@mastersthesis{diva2:1182913,
author = {Sundelius, Carl},
title = {{Deep Fusion of Imaging Modalities for Semantic Segmentation of Satellite Imagery}},
school = {Linköping University},
type = {{LiTH-ISY-EX--18/5110--SE}},
year = {2018},
address = {Sweden},
}
The thesis work evaluates a method to estimate the volume of stone and gravel piles using only a cellphone to collect video and sensor data from the gyroscopes and accelerometers. The project is commissioned by Escenda Engineering with the motivation to replace more complex and resource-demanding systems with a cheaper and easy-to-use handheld device. The implementation features popular computer vision methods such as KLT tracking, Structure-from-Motion and Space Carving, together with some sensor fusion. The results imply that it is possible to estimate volumes up to a certain accuracy, limited by the sensor quality and subject to a bias.
@mastersthesis{diva2:1172784,
author = {Fallqvist, Marcus},
title = {{Automatic Volume Estimation Using Structure-from-Motion Fused with a Cellphone's Inertial Sensors}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5107--SE}},
year = {2017},
address = {Sweden},
}
Semantic segmentation of a scene aims to give meaning to the scene by dividing it into meaningful — semantic — parts. Understanding the scene is of great interest for all kinds of autonomous systems, but manual annotation is simply too time consuming, which is why there is a need for an alternative approach. This thesis investigates the possibility of automatically segmenting 3D-models of urban scenes, such as buildings, into a predetermined set of labels. The approach was to first acquire ground truth data by manually annotating five 3D-models of different urban scenes. The next step was to extract features from the 3D-models and evaluate which ones constitutes a suitable feature space. Finally, three supervised learners were implemented and evaluated: k-Nearest Neighbour (KNN), Support Vector Machine (SVM) and Random Classification Forest (RCF). The classifications were done point-wise, classifying each 3D-point in the dense point cloud belonging to the model being classified.
The results showed that the most suitable feature space is not necessarily the one containing all features. The KNN classifier got the highest average accuracy over all models, classifying 42.5% of the 3D points correctly. The RCF classifier managed to classify 66.7% of the points correctly in one of the models, but had worse performance for the rest of the models, resulting in a lower average accuracy compared to KNN. In general, KNN, SVM and RCF seemed to have different benefits and drawbacks. KNN is simple and intuitive but by far the slowest classifier when dealing with a large set of training data. SVM and RCF are both fast but difficult to tune as there are more parameters to adjust. Whether the reason for the relatively low best accuracy was the lack of ground truth training data, unbalanced validation models, or the capacity of the learners was never investigated due to the limited time span. However, this ought to be investigated in future studies.
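A hypothetical sketch of the point-wise classification setup with scikit-learn is given below; the per-point features and class names are assumptions for illustration, not the feature space evaluated in the thesis:

# Hypothetical sketch: point-wise k-nearest-neighbour classification of a 3D point cloud
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Toy feature matrix: [height_above_ground, normal_z, r, g, b] per point
X = np.random.rand(20000, 5)
y = np.random.randint(0, 4, size=20000)   # e.g. ground / facade / roof / vegetation

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(X_train, y_train)
print("point-wise accuracy:", knn.score(X_test, y_test))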
@mastersthesis{diva2:1166634,
author = {Lind, Johan},
title = {{Make it Meaningful:
Semantic Segmentation of Three-Dimensional Urban Scene Models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5103--SE}},
year = {2017},
address = {Sweden},
}
Barcodes are ubiquitous in modern society and they have had industrial applications for decades. However, for noisy images modern methods can underperform: poor lighting conditions, occlusions and low resolution can be problematic in decoding. This thesis aims to solve this problem by using neural networks, which have enjoyed great success in many computer vision competitions in recent years. We investigate how three different networks perform on data sets with noisy images. The first network is a single classifier, the second network is an ensemble classifier and the third is based on a pre-trained feature extractor. For comparison, we also test two baseline methods that are used in industry today. We generate training data using software and modify it to ensure proper generalization. Testing data is created by photographing barcodes in different settings, creating six image classes: normal, dark, white, rotated, occluded and wrinkled. The proposed single classifier and ensemble classifier outperform the baseline as well as the pre-trained feature extractor by a large margin. The thesis work was performed at SICK IVP, a machine vision company in Linköping, in 2017.
@mastersthesis{diva2:1164104,
author = {Fridborn, Fredrik},
title = {{Reading Barcodes with Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5102--SE}},
year = {2017},
address = {Sweden},
}
Being able to reconstruct real-world environments into digital 3D models is something that has many different types of interesting applications. With the current state of the art, the results can be very impressive, but there is naturally still room for improvements. This thesis looks into essentially two different parts. The first part is about finding out whether it is feasible to detect geometric primitives, mainly planes, in the initially reconstructed point cloud. The second part looks into using the information about which points have been fitted to a geometric primitive to improve the final model.
Detection of the geometric primitives is done using the RANSAC-algorithm, which is a method for discovering if a given model is present in a data set.
A few different alternatives are evaluated for using the information about the geometric primitives to improve the final surface. The first option is to project points onto their identified shape. The second option is to remove points that have not been matched to a shape. The last option is to evaluate the possibility of changing the weights of individual points, which is an alternative available in the chosen surface reconstruction method.
The detection of geometric primitives shows some potential, but it often requires manual intervention to find correct parameters for different types of data sets. As for using the information about the geometric primitives to improve the final model, both projecting points and removal of non-matched points, does not quite address the problem at hand. Increasing the weights on matched points does show some potential, however, but is still far from being a complete method.
A small part of the thesis looks into the possibility of automatically finding areas where there are significant differences between the initial point cloud and a reconstructed surface. For this, hierarchical clustering is used. This part is, however, not evaluated quantitatively.
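To illustrate the plane-detection step described above, a minimal RANSAC plane fit on an N x 3 point cloud could look as follows; the threshold, iteration count and synthetic data are illustrative assumptions, not the values used in the thesis:

# Minimal RANSAC plane detection on a point cloud
import numpy as np

def ransac_plane(points, n_iter=1000, threshold=0.01):
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        sample = points[np.random.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                      # degenerate (collinear) sample
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

pts = np.random.rand(5000, 3)
pts[:3000, 2] = 0.5 + 0.005 * np.random.randn(3000)   # synthetic planar region
mask = ransac_plane(pts)
print(f"{mask.sum()} points matched the dominant plane")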
@mastersthesis{diva2:1153573,
author = {Norlander, Robert},
title = {{Make it Complete:
Surface Reconstruction Aided by Geometric Primitives}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5096--SE}},
year = {2017},
address = {Sweden},
}
The ability to automatically estimate the volume of timber is becoming increasingly important within the timber industry. The large number of timber trucks arriving each day at Swedish timber terminals fortifies the need for a volume estimation performed in real-time and on-the-go as the trucks arrive.
This thesis investigates if a volumetric integration of disparity maps acquired from a Multi-View Stereo (MVS) system is a suitable approach for automatic volume estimation of timber loads. As real-time execution is preferred, efforts were made to provide a scalable method. The proposed method was quantitatively evaluated on datasets containing two geometric objects of known volume. A qualitative comparison to manual volume estimates of timber loads was also made on datasets recorded at a Swedish timber terminal.
The proposed method is shown to be both accurate and precise under specific circumstances. However, robustness is poor to varying weather conditions, although a more thorough evaluation of this aspect needs to be performed. The method is also parallelizable, which means that future efforts can be made to significantly decrease execution time.
@mastersthesis{diva2:1153580,
author = {Rundgren, Emil},
title = {{Automatic Volume Estimation of Timber from Multi-View Stereo 3D Reconstruction}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5093--SE}},
year = {2017},
address = {Sweden},
}
During flights with manned or unmanned aircraft, continuous recording can result in a very high number of images to analyze and evaluate. To simplify image analysis and to minimize data link usage, appropriate images should be suggested for transfer and further analysis. This thesis investigates features used for selection of images worthy of further analysis using machine learning. The selection is done based on the criteria of having good quality, salient content and being unique compared to the other selected images. The investigation is approached by implementing two binary classifications, one regarding content and one regarding quality. The classifications are made using support vector machines. For each of the classifications three feature extraction methods are performed and the results are compared against each other. The feature extraction methods used are histograms of oriented gradients, features from the discrete cosine transform domain and features extracted from a pre-trained convolutional neural network. The images classified as both good and salient are then clustered based on similarity measures retrieved using color coherence vectors. One image from each cluster is retrieved and those are the resulting images from the image selection. The performance of the selection is evaluated using the measures precision, recall and accuracy. The investigation showed that using features extracted from the discrete cosine transform provided the best results for the quality classification. For the content classification, features extracted from a convolutional neural network provided the best results. The similarity retrieval showed to be the weakest part and the entire system together provides an average accuracy of 83.99%.
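As a rough sketch of one of the compared pipelines (histograms of oriented gradients feeding a support vector machine for a binary decision), assuming grayscale frames and illustrative parameters rather than the thesis configuration:

# Illustrative HOG + SVM pipeline for a binary image-selection decision
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(images):
    return np.array([hog(img, orientations=9, pixels_per_cell=(16, 16),
                         cells_per_block=(2, 2)) for img in images])

# Toy grayscale images standing in for aerial frames, with binary labels
images = np.random.rand(40, 128, 128)
labels = np.random.randint(0, 2, size=40)

clf = SVC(kernel="rbf", gamma="scale")
clf.fit(hog_features(images), labels)
selected = clf.predict(hog_features(np.random.rand(10, 128, 128)))
print("frames flagged as worth keeping:", selected)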
@mastersthesis{diva2:1151145,
author = {Lorentzon, Matilda},
title = {{Feature Extraction for Image Selection Using Machine Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5097--SE}},
year = {2017},
address = {Sweden},
}
The recent emergence of time-of-flight cameras has opened up new possibilities in the world of computer vision. These compact sensors, capable of recording the depth of a scene in real-time, are very advantageous in many applications, such as scene or object reconstruction. This thesis first addresses the problem of fusing depth data with color images. A complete process to combine a time-of-flight camera with a color camera is described and its accuracy is evaluated. The results show that a satisfying precision is reached and that the step of calibration is very important.
The second part of the work consists of applying super-resolution techniques to the time-of-flight camera in order to improve its low resolution. Different types of super-resolution algorithms exist, but this thesis focuses on the combination of multiple shifted depth maps. The proposed framework is made of two steps: registration and reconstruction. Different methods for each step are tested and compared according to the improvements reached in terms of level of detail, sharpness and noise reduction. The results obtained show that Lucas-Kanade performs best for the registration and that a non-uniform interpolation gives the best results in terms of reconstruction. Finally, a few suggestions are made about future work and extensions for our solutions.
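A toy sketch of the reconstruction step is given below: samples from several registered, sub-pixel-shifted depth maps are fused onto a finer grid by non-uniform interpolation (here scipy's griddata; the shifts and depth data are synthetic stand-ins, and the thesis compares several interpolation schemes):

# Toy non-uniform interpolation of shifted depth maps onto a finer grid
import numpy as np
from scipy.interpolate import griddata

low_res = 32
shifts = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]   # known sub-pixel offsets

samples, values = [], []
for dx, dy in shifts:
    yy, xx = np.mgrid[0:low_res, 0:low_res].astype(float)
    depth = np.sin(0.2 * (xx + dx)) + np.cos(0.2 * (yy + dy))   # stand-in depth map
    samples.append(np.column_stack([(xx + dx).ravel(), (yy + dy).ravel()]))
    values.append(depth.ravel())

points = np.vstack(samples)
depths = np.concatenate(values)

hi_y, hi_x = np.mgrid[0:low_res:0.5, 0:low_res:0.5]          # 2x finer grid
high_res = griddata(points, depths, (hi_x, hi_y), method="linear")
print("super-resolved depth map shape:", high_res.shape)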
@mastersthesis{diva2:1149382,
author = {Zins, Matthieu},
title = {{Color Fusion and Super-Resolution for Time-of-Flight Cameras}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5089--SE}},
year = {2017},
address = {Sweden},
}
Extracting foreground objects from an image is a hot research topic. Doing this for high-quality real-world images in real time on limited hardware, such as a smartphone, is a demanding task. This master thesis shows how this problem can be addressed using Otsu’s method together with Gaussian probability distributions to create classifiers in different colour channels. We also show how classifiers can be combined, resulting in higher accuracy than using only the individual classifiers. We also propose using inter-class variance together with image variance to estimate classifier quality. A data set was produced to evaluate performance. The data set features real-world images captured by a smartphone and objects of varying complexity against plain backgrounds that can be found in a typical office or urban space.
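A minimal sketch of the per-channel thresholding idea with OpenCV is given below, using a synthetic stand-in image and a simple majority-vote combination; the thesis additionally fits Gaussian models and weighs the channel classifiers by estimated quality:

# Per-channel Otsu thresholding with a simple majority-vote combination
import cv2
import numpy as np

img = (np.random.rand(120, 160, 3) * 255).astype(np.uint8)   # stand-in for a real photo
channels = cv2.split(img)

masks = []
for ch in channels:
    _, mask = cv2.threshold(ch, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    masks.append(mask > 0)

# Foreground if at least two of the three channel classifiers agree
foreground = np.sum(masks, axis=0) >= 2
cv2.imwrite("foreground_mask.png", foreground.astype(np.uint8) * 255)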
@mastersthesis{diva2:1144357,
author = {Poole, Alexander},
title = {{Real-Time Image Segmentation for Augmented Reality by Combining Multi-Channel Thresholds}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5083--SE}},
year = {2017},
address = {Sweden},
}
The objective of this master’s thesis work is to evaluate the potential benefit of a superpixel preprocessing step for general object detection in a traffic environment. The various effects of different superpixel parameters on object detection performance, as well as the benefit of including depth information when generating the superpixels are investigated.
In this work, three superpixel algorithms are implemented and compared, including a proposal for an improved version of the popular Simple Linear Iterative Clustering superpixel algorithm (SLIC). The proposed improved algorithm utilises a coarse-to-fine approach which outperforms the original SLIC for high-resolution images. An object detection algorithm is also implemented and evaluated. The algorithm makes use of depth information obtained by a stereo camera to extract superpixels corresponding to foreground objects in the image. Hierarchical clustering is then applied, with the segments formed by the clustered superpixels indicating potential objects in the input image.
The object detection algorithm managed to detect on average 58% of the objects present in the chosen dataset. It performed especially well for detecting pedestrians or other objects close to the car. Altering the density distribution of the superpixels in the image yielded an increase in detection rate, and could be achieved both with or without utilising depth information. It was also shown that the use of superpixels greatly reduces the amount of computations needed for the algorithm, indicating that a real-time implementation is feasible.
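For reference, producing such a superpixel over-segmentation with scikit-image's SLIC implementation is straightforward; the thesis implements its own coarse-to-fine variant, and the image and parameters below are only illustrative:

# Quick superpixel preprocessing with scikit-image's SLIC
import numpy as np
from skimage.segmentation import slic
from skimage.data import astronaut

image = astronaut()                                   # stand-in for a traffic frame
segments = slic(image, n_segments=800, compactness=10, start_label=0)

print("number of superpixels:", segments.max() + 1)
# Mean color per superpixel, a typical per-segment feature for later clustering
means = np.array([image[segments == s].mean(axis=0) for s in range(segments.max() + 1)])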
@mastersthesis{diva2:1141088,
author = {Wälivaara, Marcus},
title = {{General Object Detection Using Superpixel Preprocessing}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5085--SE}},
year = {2017},
address = {Sweden},
}
The two main bottlenecks using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network. This thesis introduces the usage of sparse dictionaries. A sparse dictionary optimized on domain specific data can be seen as a set of intelligent feature extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers in the neural network. How do they affect the training time and final performance?
The dataset used here is the Cityscapes dataset, a library of 25000 labeled road scene images. The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually, one architecture being much deeper than the other. The results are presented for both networks and show that filter initialization is an important aspect which should be taken into consideration when training deep networks for semantic segmentation.
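A sketch of the idea is given below, using scikit-learn's MiniBatchDictionaryLearning in place of K-SVD (which scikit-learn does not provide): sparse dictionary atoms are learned from image patches and reshaped into a bank of convolution kernels that could initialize a first convolutional layer; the image, patch size and atom count are assumptions:

# Learn a sparse filter bank from image patches and reshape it into conv kernels
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

image = np.random.rand(256, 256)                    # stand-in for a road-scene image
patches = extract_patches_2d(image, (3, 3), max_patches=5000)
patches = patches.reshape(len(patches), -1)
patches -= patches.mean(axis=1, keepdims=True)      # remove the DC component per patch

dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, batch_size=256)
dico.fit(patches)

filters = dico.components_.reshape(64, 3, 3)        # 64 kernels for a first conv layer
print("learned filter bank shape:", filters.shape)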
@mastersthesis{diva2:1127291,
author = {Andersson, Viktor},
title = {{Semantic Segmentation:
Using Convolutional Neural Networks and Sparse dictionaries}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5054--SE}},
year = {2017},
address = {Sweden},
}
All dairy cows in Europe wear unique identification tags in their ears. These ear tags are standardized and contain the cow's identification number, today only used for visual identification by the farmer. The cow also needs to be identified by an automatic identification system connected to milk machines and other robotics used at the farm. Currently this is solved with a non-standardized radio transmitter which can be placed in different positions on the cow, and different receivers need to be used on different farms. Other drawbacks of the currently used identification system are that it is expensive and unreliable. This thesis explores the possibility to replace this non-standardized radio frequency based identification system with a standardized computer vision based system. The method proposed in this thesis uses a color threshold approach for detection, a flood fill approach followed by a Hough transform and a projection method for segmentation, and evaluates template matching, k-nearest neighbour and support vector machines as optical character recognition methods. The results from the thesis show that the quality of the data used as input to the system is vital. With good data, k-nearest neighbour, which showed the best results of the three OCR approaches, correctly handles 98% of the digits.
@mastersthesis{diva2:1120668,
author = {Ilestrand, Maja},
title = {{Automatic Eartag Recognition on Dairy Cows in Real Barn Environment}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5072--SE}},
year = {2017},
address = {Sweden},
}
Image registration is the process of geometrically deforming a template image into a reference image. This technique is important and widely used within the field of medical IT. The purpose could be to detect image variations, pathological development or, in the company AMRA’s case, to quantify fat tissue in various parts of the human body. From an MRI (Magnetic Resonance Imaging) scan, a water and fat tissue image is obtained. Currently, AMRA is using the Morphon algorithm to register and segment the water image in order to quantify fat and muscle tissue. During the first part of this master thesis, two alternative registration methods were evaluated. The first algorithm was Free Form Deformation, which is a non-linear parametric based method. The second algorithm was a non-parametric optical flow based method known as the Demon algorithm. During the second part of the thesis, the Demon algorithm was used to evaluate the effect of using the fat images for registrations.
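A hedged sketch of a Demons-style deformable registration with SimpleITK is shown below, only to illustrate the kind of algorithm evaluated; the synthetic images, iteration count and smoothing are placeholders, and this is not AMRA's pipeline:

# Demons-style deformable registration sketch with SimpleITK on synthetic 2D images
import numpy as np
import SimpleITK as sitk

# Synthetic stand-ins for the fixed (reference) and moving (template) water images
yy, xx = np.mgrid[0:64, 0:64].astype(np.float32)
fixed_arr = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / 200.0)
moving_arr = np.exp(-((xx - 36) ** 2 + (yy - 30) ** 2) / 200.0)   # slightly displaced blob
fixed = sitk.GetImageFromArray(fixed_arr)
moving = sitk.GetImageFromArray(moving_arr)

demons = sitk.DemonsRegistrationFilter()
demons.SetNumberOfIterations(200)
demons.SetStandardDeviations(1.5)            # Gaussian smoothing of the displacement field

displacement = sitk.Cast(demons.Execute(fixed, moving), sitk.sitkVectorFloat64)
warped = sitk.Resample(moving, fixed, sitk.DisplacementFieldTransform(displacement))
print("mean absolute error after registration:",
      np.abs(sitk.GetArrayFromImage(warped) - fixed_arr).mean())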
@mastersthesis{diva2:1118172,
author = {Ivarsson, Magnus},
title = {{Evaluation of 3D MRI Image Registration Methods}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5037--SE}},
year = {2017},
address = {Sweden},
}
Modern cars are often equipped with sensors like radar, infrared cameras and stereo cameras that collect information about its surroundings. By using a stereo camera, it is possible to receive information about the distance to points in front of the car. This information can be used to estimate the height of the predicted path of the car. An application which does this is the stereo based Road surface preview (RSP) algorithm. By using the output from the RSP algorithm it is possible to use active suspension control, which controls the vertical movement of the wheels relative to the chassis. This application primarily makes the driving experience more comfortable, but also extends the durability of the vehicle. The idea behind this Master’s thesis is to create an evaluation tool for the RSP algorithm, which can be used at arbitrary roads.
The thesis describes the proposed evaluation tool, where focus has been to make an accurate comparison of camera data received from the RSP algorithm and laser data used as ground truth in this thesis. Since the tool shall be used at the company proposing this thesis, focus has also been on making the tool user friendly. The report discusses the proposed methods, possible sources to errors and improvements. The evaluation tool considered in this thesis shows good results for the available test data, which made it possible to include an investigation of a possible improvement of the RSP algorithm.
@mastersthesis{diva2:1115333,
author = {Manfredsson, Johan},
title = {{Evaluation Tool for a Road Surface Algorithm}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5063--SE}},
year = {2017},
address = {Sweden},
}
Deep learning has dominated the computer vision field since 2012, but a common criticism of deep learning methods is their dependence on large amounts of data. To combat this criticism, research into data-efficient deep learning is growing. The foremost success in data-efficient deep learning is transfer learning with networks pre-trained on the ImageNet dataset. Pre-trained networks have achieved state-of-the-art performance on many tasks. We consider the pre-trained network method for a new task where we have to collect the data. We hypothesize that the data efficiency of pre-trained networks can be improved through informed data collection. After exhaustive experiments on CaffeNet and VGG16, we conclude that the data efficiency indeed can be improved. Furthermore, we investigate an alternative approach to data-efficient learning, namely adding domain knowledge in the form of a spatial transformer to the pre-trained networks. We find that spatial transformers are difficult to train and seem not to improve data efficiency.
@mastersthesis{diva2:1112122,
author = {Lundström, Dennis},
title = {{Data-efficient Transfer Learning with Pre-trained Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5051--SE}},
year = {2017},
address = {Sweden},
}
Visual Object Tracking is the computer vision problem of estimating a target trajectory in a video given only its initial state. A visual tracker often acts as a component in the intelligent vision systems seen in, for instance, surveillance, autonomous vehicles or robots, and unmanned aerial vehicles. Applications may require robust tracking performance on difficult sequences depicting targets undergoing large changes in appearance, while enforcing a real-time constraint. Discriminative correlation filters have shown promising tracking performance in recent years, and have consistently improved the state of the art. With the advent of deep learning, new robust deep features have improved tracking performance considerably. However, methods based on discriminative correlation filters learn a rigid template describing the target appearance. This implies an assumption of target rigidity which is not fulfilled in practice. This thesis introduces an approach which integrates deformability into a state-of-the-art tracker. The approach is thoroughly tested on three challenging visual tracking benchmarks, achieving state-of-the-art performance.
@mastersthesis{diva2:1111930,
author = {Johnander, Joakim},
title = {{Visual Tracking with Deformable Continuous Convolution Operators}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5047--SE}},
year = {2017},
address = {Sweden},
}
There is a need for quantitative CT data in radiation therapy. Currently there are only a few algorithms that address this issue, for instance the commercial DirectDensity algorithm. In the scientific literature, an example of such an algorithm is DIRA. DIRA is an iterative model-based reconstruction method for dual-energy CT whose goal is to determine the material composition of the patient from accurate linear attenuation coefficients (LACs). It had been implemented in a two-dimensional geometry, i.e., it could process axial scans only, and there was a need to extend DIRA so that it could process projection data generated in helical scanning geometries. The newly developed algorithm (DIRA-3D) implemented (i) polyenergetic semi-parallel projection generation, (ii) mono-energetic parallel projection generation and (iii) the PI-method for image reconstruction. The computation experiments showed that the accuracies of the resulting LACs and mass fractions were comparable to those of the original DIRA. The results converged after 10 iterations.
@mastersthesis{diva2:1111894,
author = {Björnfot, Magnus},
title = {{Extension of DIRA (Dual-Energy Iterative Algorithm) to 3D Helical CT}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5057--SE}},
year = {2017},
address = {Sweden},
}
This work investigates the landscape of aerial image stereo matching (AISM) methods suitable for large scale forest variable estimation. AISM methods are an important source of remotely collected information used in modern forestry to keep track of a growing forest's condition.
A total of 17 AISM methods are investigated, out of which 4 are evaluated by processing a test data set consisting of three aerial images. The test area is located in southern Sweden and consists mainly of Norway spruce and Scots pine. From the resulting point clouds and height raster images, a total of 30 different metrics of both height and density types are derived. Linear regression is used to fit functions from metrics derived from AISM data to a set of forest variables including tree height (HBW), tree diameter (DBW), basal area and volume. As ground truth, data collected by dense airborne laser scanning is used. Results are presented as RMSE and standard deviation concluded from the linear regression.
For tree height, tree diameter, basal area and volume, the RMSE ranged from 7.442% to 10.11%, 11.58% to 13.96%, 32.01% to 35.10% and 34.01% to 38.26%, respectively. The results show that all four tested methods achieved comparable estimation quality, with only small differences among them. Keystone and SURE performed somewhat better, while MicMac placed third and Photoscan achieved the least accurate result.
@mastersthesis{diva2:1109735,
author = {Svensk, Joakim},
title = {{Evaluation of Aerial Image Stereo Matching Methods for Forest Variable Estimation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5036--SE}},
year = {2017},
address = {Sweden},
}
Now and then train accidents occur. Collisions between trains and objects such as animals, humans, cars, and fallen trees can result in casualties, severe damage on the train, and delays in the train traffic. Thus, train collisions are a considerable problem with consequences affecting society substantially.
The company Termisk Systemteknik AB has, on commission by Rindi Solutions AB, investigated the possibility to detect anomalies on the railway using a train-mounted thermal imaging camera. Rails are also detected in order to determine if an anomaly is on the rail or not. However, the rail detection method does not work satisfactorily at long range.
The purpose of this master’s thesis is to improve the previous rail detector at long range by using machine learning, and in particular deep learning and a convolutional neural network. Of interest is also to investigate if there are any advantages using cross-modal transfer learning.
A labelled dataset for training and testing was produced manually. Also, a loss function tailored to the particular problem at hand was constructed. The loss function was used both for improving the system during training and evaluate the system’s performance during testing. Finally, eight different approaches were evaluated, each one resulting in a different rail detector.
Several of the rail detectors, and in particular all the rail detectors using cross-modal transfer learning, perform better than the previous rail detector. Thus, the new rail detectors show great potential for the rail detection problem.
@mastersthesis{diva2:1111486,
author = {Wedberg, Magnus},
title = {{Detecting Rails in Images from a Train-Mounted Thermal Camera Using a Convolutional Neural Network}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5058--SE}},
year = {2017},
address = {Sweden},
}
Automated navigability assessment based on image sensor data is an important concern in the design of autonomous robotic systems. The problem consists in finding a mapping from input data to the navigability status of different areas of the surrounding world. Machine learning techniques are often applied to this problem. This thesis investigates an approach to navigability assessment in the image plane, based on offline learning using deep convolutional neural networks, applied to RGB and depth data collected using a robotic platform. Training outputs were generated by manually marking out instances of near collision in the sequences and tracing back the location of the near-collision frame through the previous frames. Several combinations of network inputs were tried out. Inputs included grayscale gradient versions of the RGB frames, depth maps, image coordinate maps and motion information in the form of a previous RGB frame or heading maps. Some improvement compared to simple depth thresholding was demonstrated, mainly in the handling of noise and missing pixels in the depth maps. The resulting networks appear to be mostly dependent on depth information; an attempt to train a network without the depth frames was unsuccessful, and a network trained using the depth frames alone performed similarly to networks trained with additional inputs. An unsuccessful attempt at training a network towards a more motion-dependent navigability concept was also made. It was done by including training frames captured as the robot was moving away from the obstacle, where the corresponding training outputs were marked as obstacle-free.
@mastersthesis{diva2:1110839,
author = {Wimby Schmidt, Ebba},
title = {{Navigability Assessment for Autonomous Systems Using Deep Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5045--SE}},
year = {2017},
address = {Sweden},
}
This thesis investigates if support vector machine classification is a suitable approach when performing automatic segmentation of knee cartilage using quantitative magnetic resonance imaging data. The data sets used are part of a clinical project that investigates if patients that have suffered recent knee damage will develop cartilage damage. Therefore the thesis also investigates if the segmentation results can be used to predict the clinical outcome of the patients.
Two methods that perform the segmentation using support vector machine classification are implemented and evaluated. The evaluation indicates that it is a good approach for the task, but the implemented methods need to be further improved and tested on more data sets before clinical use.
It was not possible to relate the cartilage properties to clinical outcome using the segmentation results. However, the investigation demonstrated good promise of how the segmentation results, if they are improved, can be used in combination with quantitative magnetic resonance imaging data to analyze how the cartilage properties change over time or vary between knees.
@mastersthesis{diva2:1109911,
author = {Lind, Marcus},
title = {{Automatic Segmentation of Knee Cartilage Using Quantitative MRI Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5041--SE}},
year = {2017},
address = {Sweden},
}
When forensic examiners try to identify the perpetrator of a felony, they use individual facial marks when comparing the suspect with the perpetrator. Facial marks are often used for identification and they are nowadays found manually. To speed up this process, it is desired to detect interesting facial marks automatically. This master thesis describes a method to automatically detect and separate permanent and non-permanent marks. It uses a fast radial symmetry algorithm as a core element in the mark detector. After candidate skin mark extraction, the false detections are removed depending on their size, shape and number of hair pixels. The classification of the skin marks is done with a support vector machine and the different features are examined. The results show that the facial mark detector has a good recall while the precision is poor. The elimination methods of false detection were analysed as well as the different features for the classifier. One can conclude that the color of facial marks is more relevant than the structure when classifying them into permanent and non-permanent marks.
@mastersthesis{diva2:1107743,
author = {Moulis, Armand},
title = {{Automatic Detection and Classification of Permanent and Non-Permanent Skin Marks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5048--SE}},
year = {2017},
address = {Sweden},
}
After a digital photo has been taken by a camera, it can be manipulated to be more appealing. Two ways of doing that are to reduce noise and to increase the saturation. With time and skills in an image manipulating program, this is usually done by hand. In this thesis, automatic image improvement based on artificial neural networks is explored and evaluated qualitatively and quantitatively. A new approach, which builds on an existing method for colorizing gray scale images is presented and its performance compared both to simpler methods and the state of the art in image denoising. Saturation is lowered and noise added to original images, which the methods receive as inputs to improve upon. The new method is shown to improve in some cases but not all, depending on the image and how it was modified before given to the method.
@mastersthesis{diva2:1098332,
author = {Lind, Benjamin},
title = {{Artificial Neural Networks for Image Improvement}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5025--SE}},
year = {2017},
address = {Sweden},
}
In computer vision, it has in recent years become more popular to use point clouds to represent 3D data. To understand what a point cloud contains, methods like semantic segmentation can be used. Semantic segmentation is the problem of segmenting images or point clouds and understanding what the different segments are. One application of semantic segmentation of point clouds is autonomous driving, where the car needs information about objects in its surroundings.
Our approach to the problem is to project the point clouds into 2D virtual images using the Katz projection. Then we use pre-trained convolutional neural networks to semantically segment the images. To get the semantically segmented point clouds, we project the scores from the segmentation back into the point cloud. Our approach is evaluated on the Semantic3D dataset. We find our method is comparable to the state of the art, without any fine-tuning on the Semantic3D dataset.
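A rough sketch of the visibility step behind such a projection is shown below, using Open3D's hidden-point-removal operator (based on the Katz et al. method); the synthetic cloud, camera placement and radius are illustrative assumptions, and the rendering of virtual images and back-projection of scores are omitted:

# Visibility filtering of a point cloud from a virtual camera with Open3D
import numpy as np
import open3d as o3d

pts = np.random.rand(2000, 3)                                   # stand-in point cloud
pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts))

camera = [0.0, 0.0, 2.0]                                        # virtual camera position
diameter = np.linalg.norm(pcd.get_max_bound() - pcd.get_min_bound())

_, visible_idx = pcd.hidden_point_removal(camera, radius=diameter * 100)
visible = pcd.select_by_index(visible_idx)
print(f"{len(visible_idx)} of {len(pcd.points)} points visible from the virtual camera")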
@mastersthesis{diva2:1091059,
author = {Tosteberg, Patrik},
title = {{Semantic Segmentation of Point Clouds Using Deep Learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--17/5029--SE}},
year = {2017},
address = {Sweden},
}
In many situations after a big catastrophe such as the one in Fukushima, the disaster area is highly dangerous for humans to enter. It is in such environments that a semi-autonomous robot could limit the risks to humans by exploring and mapping the area on its own. This thesis intends to design and implement a software based SLAM system which has potential to run in real-time using a Kinect 2 sensor as input.
The focus of the thesis has been to create a system which allows for efficient storage and representation of the map, in order to be able to explore large environments. This is done by separating the map in different abstraction levels corresponding to local maps connected by a global map.
During the implementation, this structure has been kept in mind in order to allow modularity. This makes it possible for each sub-component in the system to be exchanged if needed.
The thesis is broad in the sense that it uses techniques from distinct areas to solve the sub-problems that exist. Some examples being, object detection and classification, point-cloud registration and efficient 3D-based occupancy trees.
@mastersthesis{diva2:1065996,
author = {Holmquist, Karl},
title = {{SLAMIt A Sub-Map Based SLAM System:
On-line creation of multi-leveled map}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/5021--SE}},
year = {2017},
address = {Sweden},
}
Generic visual tracking is a challenging computer vision problem, where the position of a specified target is estimated through a sequence of frames. The only given information is the initial location of the target. Therefore, the tracker has to adapt and learn any kind of object, which it describes through visual features used to differentiate target from background. Standard appearance features only capture momentary visual information. This master’s thesis investigates the use of deep features extracted through optical flow images processed in a deep convolutional network. The optical flow is calculated using two consecutive images, and thereby captures the dynamic nature of the scene. Results show that this information is complementary to the standard appearance features, and improves performance of the tracker. Deep features are typically very high dimensional. Employing dimensionality reduction can increase both the efficiency and performance of the tracker. As a second aim in this thesis, PCA and PLS were evaluated and compared. The evaluations show that the two methods are almost equal in performance, with PLS actually receiving slightly better score than the popular PCA. The final proposed tracker was evaluated on three challenging datasets, and was shown to outperform other state-of-the-art trackers.
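To illustrate the two ingredients discussed above, the sketch below computes a dense optical flow field from two consecutive frames with OpenCV's Farnebäck method and compresses a set of (here random) high-dimensional deep features with PCA; it does not reproduce the tracker itself, and all data are synthetic stand-ins:

# Dense optical flow between consecutive frames plus PCA compression of deep features
import cv2
import numpy as np
from sklearn.decomposition import PCA

prev = (np.random.rand(240, 320) * 255).astype(np.uint8)   # stand-in grayscale frame
curr = np.roll(prev, 2, axis=1)                             # simulated horizontal motion
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)   # H x W x 2 motion field

# Suppose deep features were extracted from flow images: N samples x 4096 dimensions
deep_features = np.random.rand(200, 4096)
pca = PCA(n_components=64)
compressed = pca.fit_transform(deep_features)
print("flow field shape:", flow.shape, "| compressed feature shape:", compressed.shape)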
@mastersthesis{diva2:1071737,
author = {Gladh, Susanna},
title = {{Visual Tracking Using Deep Motion Features}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/5005--SE}},
year = {2016},
address = {Sweden},
}
This thesis presents a way to generate a Digital Terrain Model (DTM) from a Digital Surface Model (DSM) and multispectral images (including the near-infrared (NIR) color band). An artificial neural network (ANN) is used to pre-classify the DSM and multispectral images, which in turn is used to filter the DSM into a DTM. Using an ANN as a classifier provided good results, and adding the NIR color band further improved the accuracy of the classifier. Using the classifier, a DTM was easily extracted without removing natural edges or height variations in forests and cities, challenges that are handled considerably better than by earlier methods.
@mastersthesis{diva2:1058430,
author = {Tapper, Gustav},
title = {{Extraction of DTM from Satellite Images Using Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/5017--SE}},
year = {2016},
address = {Sweden},
}
Since most people now have a high-performing computing device with an attached camera in their pocket, in the form of a smartphone, robotics and computer vision researchers are thrilled about the possibility this creates. Such devices have previously been used in robotics to create 3D maps of environments and objects by feeding the camera data to a 3D reconstruction algorithm.
The big downside with smartphones is that their cameras use a different sensor than what is usually used in robotics, namely a rolling shutter camera. These cameras are cheaper to produce but are not as well suited for general 3D reconstruction algorithms as the global shutter cameras typically used in robotics research. One recent, accurate and computationally efficient 3D reconstruction method which could be used on a mobile device, if tweaked, is LSD-SLAM.
This thesis uses the LSD-SLAM method developed for global shutter cameras and incorporates additional methods developed to allow the use of rolling shutter data. The developed method is evaluated by counting the number of failed 3D reconstructions before a successful one is obtained when using rolling shutter data. The result is a method which improves this metric by about 70% compared to the unmodified LSD-SLAM method.
@mastersthesis{diva2:1058367,
author = {Tallund, Lukas},
title = {{Handling of Rolling Shutter Effects in Monocular Semi-Dense SLAM Algorithms}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/5016--SE}},
year = {2016},
address = {Sweden},
}
This master's thesis presents an approach to track and count the number of fruit in commercial mango orchards. The algorithm is intended to enable precision agriculture and to facilitate labour and post-harvest storage planning. The primary objective is to develop a multi-view algorithm and investigate how it can be used to mitigate the effects of visual occlusion, to improve upon estimates from methods that use a single central or two opposite viewpoints. Fruit are detected in images using two classification methods: dense pixel-wise CNN and region-based R-CNN detection. Pair-wise fruit correspondences are established between images by using geometry provided by navigation data, and lidar data is used to generate image masks for each separate tree, to isolate fruit counts to individual trees. The tracked fruit are triangulated to locate them in 3D space, and spatial statistics are calculated over whole orchard blocks. The estimated tree counts are compared to single-view estimates and validated against ground truth data of 16 mango trees from a Bundaberg mango orchard in Queensland, Australia. The results show a high R2 value of 0.99335 for four hand-labelled trees and a highest R2 value of 0.9165 for the machine-labelled images using the R-CNN classifier for the 16 target trees.
@mastersthesis{diva2:1045302,
author = {Stein, Madeleine},
title = {{Improving Image Based Fruitcount Estimates Using Multiple View-Points}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/5003--SE}},
year = {2016},
address = {Sweden},
}
Cardiovascular diseases are among the most common causes of death worldwide. One recently developed flow analysis technique, 4D flow magnetic resonance imaging (MRI), allows an early detection of such diseases. Due to the limited resolution and contrast between blood pool and myocardium of 4D flow images, cine MR images are often used for cardiac segmentation. The delineated structures are then transferred to the 4D Flow images for cardiovascular flow analysis. Cine MR images are however acquired with multiple breath-holds, which can be challenging for some people, especially when a cardiovascular disease is present. Consequently, unexpected breathing motion by a patient may lead to misalignments between the acquired cine MR images.
The goal of the thesis is to test the feasibility of an automatic image registration method to correct the misalignment caused by respiratory motion in morphological 2D cine MR images by using the 4D Flow MR as the reference image. As a registration method relies on a set of optimal parameters to provide desired results, a comprehensive investigation was performed to find such parameters. Different combinations of registration parameter settings were applied on 20 datasets from both healthy volunteers and patients. The best combinations, selected on the basis of normalized cross-correlation, were evaluated using the clinical gold-standard by employing widely used geometric measures of spatial correspondence. The accuracy of the best parameters from the geometric evaluation was finally validated by using simulated misalignments.
Using a registration method consisting of only translation improved the results both for the datasets from healthy volunteers and patients and for the simulated misalignment data. For the datasets from healthy volunteers and patients, the registration improved the results from 0.7074 ± 0.1644 to 0.7551 ± 0.0737 in Dice index and from 1.8818 ± 0.9269 to 1.5953 ± 0.5192 for point-to-curve error. These values are mean values over all 20 datasets.
The results from geometric evaluation on the data from both healthy volunteers and patients show that the developed correction method is able to improve the alignment of the cine MR images. This allows a reliable segmentation of 4D flow MR images for cardiac flow assessment.
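For reference, a minimal sketch of the two quantities used above, normalized cross-correlation (the basis for parameter selection) and the Dice index (the geometric evaluation measure), together with an exhaustive integer translation search; this is illustrative only and not the registration framework used in the thesis:

import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def dice(mask_a, mask_b):
    """Dice index between two binary segmentation masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * inter / (mask_a.sum() + mask_b.sum() + 1e-12)

def best_translation(moving, reference, max_shift=10):
    """Exhaustively search integer (dy, dx) shifts and keep the one maximizing NCC."""
    best, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            score = ncc(shifted, reference)
            if score > best:
                best, best_shift = score, (dy, dx)
    return best_shift, best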
@mastersthesis{diva2:972664,
author = {Härd, Victoria},
title = {{Automatic Alignment of 2D Cine Morphological Images Using 4D Flow MRI Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4992--SE}},
year = {2016},
address = {Sweden},
}
Object Recognition is the art of localizing predefined objects in image sensor data. In this thesis a depth sensor was used which has the benefit that the 3D pose of the object can be estimated. This has applications in e.g. automatic manufacturing, where a robot picks up parts or tools with a robot arm.
This master thesis presents an implementation and an evaluation of a system for object recognition of 3D models in depth sensor data. The system uses several depth images rendered from a 3D model and describes their characteristics using so-called feature descriptors. These are then matched with the descriptors of a scene depth image to find the 3D pose of the model in the scene. The pose estimate is then refined iteratively using a registration method. Different descriptors and registration methods are investigated.
One of the main contributions of this thesis is that it compares two different types of descriptors, local and global, which has seen little attention in research. This is done for two different scene scenarios, and for different types of objects and depth sensors. The evaluation shows that global descriptors are fast and robust for objects with a smooth visible surface, whereas the local descriptors perform better for larger objects in clutter and occlusion. This thesis also presents a novel global descriptor, the CESF, which is observed to be more robust than other global descriptors. As for the registration methods, standard ICP is shown to be the most accurate, while point-to-plane ICP is the more robust.
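A minimal sketch of the point-to-point ICP refinement step mentioned above, assuming (N, 3) NumPy point arrays and nearest-neighbour correspondences from a k-d tree; this is a generic textbook formulation, not the thesis implementation:

import numpy as np
from scipy.spatial import cKDTree

def icp_point_to_point(src, dst, iters=30):
    """Align src to dst; returns the accumulated rotation R and translation t."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                 # closest-point correspondences
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)    # cross-covariance of centred point sets
        U, _, Vt = np.linalg.svd(H)
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:            # guard against reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = (R_step @ cur.T).T + t_step
        R, t = R_step @ R, R_step @ t + t_step
    return R, t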
@mastersthesis{diva2:972438,
author = {Grankvist, Ola},
title = {{Recognition and Registration of 3D Models in Depth Sensor Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4993--SE}},
year = {2016},
address = {Sweden},
}
Automatic stack measurement is a measurement system that measures the volume of wood on timber trucks. The system consists of six sensor systems. Each sensor is first calibrated individually and then jointly, to give a common world coordinate system. Each sensor generates a depth image and a reflectance image, where the values in the depth image represent the distance from the camera. The client has developed an algorithm that estimates the wood volume from the measurement data (the images) to an accuracy that fulfils the requirements set by the forest industry for automatic measurement of stacks on timber trucks. This report investigates whether better measurement results can be achieved, for example with other methods or combinations of them. About 125 datasets of stacks with ground truth are available, where the ground truth consists of manual sample measurements in which each individual log was measured separately. Initially, a deliberate choice was made not to study the client's algorithm, in order not to be biased by how their results were obtained. Primarily, the front and back images of a stack are used to find the logs. The found logs are then interpolated towards the middle of the stack, or the logs are paired up from the two sides. Sometimes there are problems with the images; most often at least one of the sides is occluded by the truck cab, the crane or another stack. In those cases an estimate must be made from the visible data in order to fill in the occluded regions. At the beginning of the thesis work, two methods (MSER and the point-plane method) were used to investigate whether good results could be achieved by measuring the data directly and using it as an initial guess of the volume. However, it was discovered that valuable details in the datasets, such as the distribution of diameters of the found log ends, were missed and are needed to determine the wood volume more precisely. The volume also tended to be heavily overestimated when the stacks contained a certain amount of branches or poorly delimbed logs. A geometric method was therefore constructed, and this is the method on which most time was spent. The figures below show a table and a graph with the results under bark (UB) of all three methods, together with the interval limits for fulfilling the requirements set by the forest industry.
@mastersthesis{diva2:968712,
author = {Lindberg, Pontus},
title = {{Automatisk volymmätning av virkestravar på lastbil}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4955--SE}},
year = {2016},
address = {Sweden},
}
Detection and positioning of anatomical landmarks, also called points of interest (POI), is often a concept of interest in medical image processing. Different measures or automatic image analyses are often directly based upon positions of such points, e.g. in organ segmentation or tissue quantification. Manual positioning of these landmarks is a time consuming and resource demanding process. In this thesis, a general method for positioning of anatomical landmarks is outlined, implemented and evaluated. The evaluation of the method is limited to three different POI: left femur head, right femur head and vertebra T9. These POI are used to define the range of the abdomen in order to measure the amount of abdominal fat in 3D data acquired with quantitative magnetic resonance imaging (MRI). By getting more detailed information about the abdominal body fat composition, medical diagnoses can be issued with higher confidence. Examples of applications could be identifying patients with high risk of developing metabolic or catabolic disease and characterizing the effects of different interventions, e.g. training, bariatric surgery and medications. The proposed method is shown to be highly robust and accurate for positioning of the left and right femur head. Due to insufficient performance regarding T9 detection, a modified method is proposed for T9 positioning. The modified method shows promise of accurate and repeatable results but has to be evaluated more extensively in order to draw further conclusions.
@mastersthesis{diva2:957048,
author = {Järrendahl, Hannes},
title = {{Automatic Detection of Anatomical Landmarks in Three-Dimensional MRI}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4990--SE}},
year = {2016},
address = {Sweden},
}
Simultaneous localization and mapping (SLAM) is the problem of mapping your surroundings while simultaneously localizing yourself in the map. It is an important and active area of research for robotics. In this master thesis two approaches are attempted to reduce the drift which appears over time in SLAM algorithms. The first approach tries three different motion models for the camera. Two of the models exploit the a priori knowledge that the camera is mounted on a trolley, and these two are shown to improve the results. The second approach attempts to reduce the drift by reducing noise in the point cloud data used for mapping. This is done by finding planar surfaces in the point clouds. Median filtering is used as an alternative to compare the result of the noise reduction. The plane estimation approach is also shown to reduce the drift, while the median filtering makes it worse.
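As an illustrative sketch of the plane-based noise reduction idea (assuming the points belonging to one detected planar surface have already been grouped; not the thesis code), a least-squares plane can be fitted with an SVD and the noisy points snapped onto it:

import numpy as np

def fit_plane(points):
    """Least-squares plane fit: unit normal n and offset d such that n.x + d is approx 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                         # direction of least variance
    return normal, -normal @ centroid

def project_to_plane(points, normal, d):
    """Snap noisy points onto their fitted plane (the de-noising step)."""
    dist = points @ normal + d              # signed point-to-plane distances
    return points - dist[:, None] * normal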
@mastersthesis{diva2:957728,
author = {Bondemark, Richard},
title = {{Improving SLAM on a TOF Camera by Exploiting Planar Surfaces}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4984--SE}},
year = {2016},
address = {Sweden},
}
Measurements from stereo reconstruction can be obtained with high accuracy with correctly calibrated cameras. A stereo camera rig mounted in an outdoor environment is exposed to temperature changes, which have an impact on the calibration of the cameras.
The aim of the master's thesis was to investigate the thermal impact on a calibrated stereo camera rig. This was performed by placing a stereo rig in a temperature chamber and collecting data of a calibration board at different temperatures. Data was collected with two different cameras and lenses and used for calibration of the stereo camera rig in different scenarios. The obtained parameters were plotted and analyzed.
The result of the master's thesis is that thermal variation has an impact on the accuracy of the calibrated stereo camera rig. A calibration obtained at one temperature cannot be used at a different temperature without a degradation of the accuracy. The plotted parameters from the calibration had a high noise level due to problems with the calibration methods, and no visible trend from temperature changes could be seen.
@mastersthesis{diva2:941863,
author = {Andersson, Elin},
title = {{Thermal Impact of a Calibrated Stereo Camera Rig}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4980--SE}},
year = {2016},
address = {Sweden},
}
Segmentation of the brain into sub-volumes has many clinical applications. Many neurological diseases are connected with brain atrophy (tissue loss). By dividing the brain into smaller compartments, volume comparison between the compartments can be made, as well as monitoring of local volume changes over time. The former is especially interesting for the left and right cerebral hemispheres, due to their symmetric appearance. By using automatic segmentation, the time consuming step of manually labelling the brain is removed, allowing for larger scale research. In this thesis, three automatic methods for segmenting the brain from magnetic resonance (MR) images are implemented and evaluated. Since none of the evaluated methods resulted in sufficiently good segmentations to be clinically relevant, a novel segmentation method, called SB-GC (shape bottleneck detection incorporated in graph cuts), is also presented. SB-GC utilizes quantitative MRI data as input, together with shape bottleneck detection and graph cuts, to segment the brain into the left and right cerebral hemispheres, the cerebellum and the brain stem. SB-GC shows promise of highly accurate and repeatable results for both healthy, adult brains and more challenging cases such as children and brains containing pathologies.
@mastersthesis{diva2:933699,
author = {Stacke, Karin},
title = {{Automatic Brain Segmentation into Substructures Using Quantitative MRI}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4956--SE}},
year = {2016},
address = {Sweden},
}
The usage of 3D modelling is increasing fast, both for civilian and military areas, such as navigation, targeting and urban planning. When creating a 3D model from satellite images, clouds can be problematic. Thus, automatic detection of clouds in the images is of great use. This master thesis was carried out at Vricon, who produces 3D models of the earth from satellite images. The thesis aimed to investigate if Support Vector Machines could classify pixels into cloud or non-cloud, with a combination of texture and color as features. To solve the stated goal, the task was divided into several subproblems, where the first part was to extract features from the images. Then the images were preprocessed before being fed to the classifier. After that, the classifier was trained, and finally evaluated. The two methods that gave the best results in this thesis had approximately 95 % correctly classified pixels. This result is better than the existing cloud segmentation method at Vricon, for the tested terrain and cloud types.
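A hedged sketch of the classification step described above, using a scikit-learn SVM on per-pixel feature vectors; the feature layout and the synthetic training data are placeholders, not the thesis features or data:

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# X: one row per pixel with e.g. [R, G, B, local_stddev, texture_energy]; y: 1 = cloud, 0 = not cloud
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # synthetic stand-in for labelled pixels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X[:1500], y[:1500])
print("held-out accuracy:", clf.score(X[1500:], y[1500:]))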
@mastersthesis{diva2:932606,
author = {Gasslander, Maja},
title = {{Segmentation of Clouds in Satellite Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4945--SE}},
year = {2016},
address = {Sweden},
}
Face recognition is the problem of identifying individuals in images. This thesis evaluates two methods used to determine if pairs of face images belong to the same individual or not. The first method is a combination of principal component analysis and a neural network and the second method is based on state-of-the-art convolutional neural networks. They are trained and evaluated using two different data sets. The first set contains many images with large variations in, for example, illumination and facial expression. The second consists of fewer images with small variations.
Principal component analysis allowed the use of smaller networks. The largest network has 1.7 million parameters compared to the 7 million used in the convolutional network. The use of smaller networks lowered the training time and evaluation time significantly. Principal component analysis proved to be well suited for the data set with small variations, outperforming the convolutional network, which needs larger data sets to avoid overfitting. The reduction in data dimensionality, however, led to difficulties classifying the data set with large variations. The generous amount of images in this set allowed the convolutional method to reach higher accuracies than the principal component method.
@mastersthesis{diva2:931705,
author = {Habrman, David},
title = {{Face Recognition with Preprocessing and Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4953--SE}},
year = {2016},
address = {Sweden},
}
In the field of Natural Language Processing, supervised machine learning is commonly used to solve classification tasks such as sentiment analysis and text categorization. The classical way of representing the text has been to use the well known Bag-Of-Words representation. However, lately low-dimensional dense word vectors have come to dominate the input to state-of-the-art models. While few studies have made a fair comparison of the models' sensitivity to the text representation, this thesis tries to fill that gap. We especially seek insight into the impact various unsupervised pre-trained vectors have on the performance. In addition, we take a closer look at the Random Indexing representation and try to optimize it jointly with the classification task. The results show that while low-dimensional pre-trained representations often have computational benefits and have also reported state-of-the-art performance, they do not necessarily outperform the classical representations in all cases.
@mastersthesis{diva2:928411,
author = {Norlund, Tobias},
title = {{The Use of Distributional Semantics in Text Classification Models:
Comparative performance analysis of popular word embeddings}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4926--SE}},
year = {2016},
address = {Sweden},
}
The art of reconstructing a real-world scene digitally has been on the minds of researchers for decades. Recently, it has attracted more and more attention from companies seeing a chance to bring this kind of technology to the market. Digital reconstruction of buildings in particular is a niche that has both potential and room for improvement. With this background, this thesis presents the design and evaluation of a pipeline made to find and correct approximately flat surfaces in architectural scenes. The scenes are 3D-reconstructed triangle meshes based on RGB images. The thesis also comprises an evaluation of a few different components available for doing this, leading to a choice of the best components. The goal is to improve the visual quality of the reconstruction.
The final pipeline is designed with two blocks - one to detect initial plane seeds and one to refine the detected planes. The first block makes use of a multi-label energy formulation on the graph that describes the reconstructed surface. Penalties are assigned to each vertex and each edge of the graph based on the vertex labels, effectively describing a Markov Random Field. The energy is minimized with the help of the alpha-expansion algorithm. The second block uses heuristics for growing the detected plane seeds, merging similar planes together and extracting deviating details.
Results on several scenes are presented, showing that the visual quality has been improved while maintaining accuracy compared with ground truth data.
@mastersthesis{diva2:917230,
author = {Jonsson, Mikael},
title = {{Make it Flat:
Detection and Correction of Planar Regions in Triangle Meshes}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4930--SE}},
year = {2016},
address = {Sweden},
}
Lung cancer is the most common type of cancer in the world and always manifests as lung nodules. Nodules are small tumors that consist of lung tissue. They are usually spherical in shape and their cores can be either solid or subsolid. Nodules are common in lungs, but not all of them are malignant. To determine if a nodule is malignant or benign, attributes like nodule size and volume growth are commonly used. The procedure to obtain these attributes is time consuming, and therefore calls for tools to simplify the process.
The purpose of this thesis work was to investigate the feasibility of a semi-automatic lung nodule segmentation pipeline including volume estimation. This was done by implementing, tuning and evaluating image processing algorithms with different characteristics to create pipeline candidates. These candidates were compared using a similarity index between their segmentation results and ground truth markings to determine the most promising one.
The best performing pipeline consisted of a fixed region of interest together with a level set segmentation algorithm. Its segmentation accuracy was not consistent for all nodules evaluated, but the pipeline showed great potential when dynamically adapting its parameters for each nodule. The use of dynamic parameters was only briefly explored, and further research would be necessary to determine its feasibility.
@mastersthesis{diva2:911649,
author = {Berglin, Lukas},
title = {{Design, Evaluation and Implementation of a Pipeline for Semi-Automatic Lung Nodule Segmentation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--16/4925--SE}},
year = {2016},
address = {Sweden},
}
Generic visual tracking is one of the classical problems in computer vision. In this problem, no prior knowledge of the target is available aside from a bounding box in the initial frame of the sequence. Generic visual tracking is a difficult task due to a number of factors such as momentary occlusions, target rotations, changes in target illumination and variations in the target size. In recent years, discriminative correlation filter (DCF) based trackers have shown promising results for visual tracking. These DCF based methods use the Fourier transform to efficiently calculate detection and model updates, allowing significantly higher frame rates than competing methods. However, existing DCF based methods only estimate translation of the object while ignoring changes in size. This thesis investigates the problem of accurately estimating the scale variations within a DCF based framework. A novel scale estimation method is proposed by explicitly constructing translation and scale filters. The proposed scale estimation technique is robust and significantly improves the tracking performance, while operating in real time. In addition, a comprehensive evaluation of feature representations in a DCF framework is performed. Experiments are performed on the benchmark OTB-2015 dataset, as well as the VOT 2014 dataset. The proposed methods are shown to significantly improve the performance of existing DCF based trackers.
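For intuition, a minimal single-channel correlation filter in the Fourier domain (a MOSSE-style ridge regression), illustrating why DCF training and detection are cheap; this is a simplified sketch, not the multi-channel translation and scale filters proposed in the thesis:

import numpy as np

def train_filter(patch, sigma=2.0, lam=1e-2):
    """Learn a filter whose correlation response to the training patch is a centred Gaussian."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    G, F = np.fft.fft2(g), np.fft.fft2(patch)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)    # closed-form ridge solution

def detect(H, patch):
    """Apply the filter to a new patch and return the (row, col) of the response peak."""
    resp = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(resp), resp.shape)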
@mastersthesis{diva2:910736,
author = {Häger, Gustav},
title = {{Improving Discriminative Correlation Filters for Visual Tracking}},
school = {Linköping University},
type = {{LiTH-ISY-EX-15/4919--SE}},
year = {2015},
address = {Sweden},
}
Cars have become increasingly intelligent throughout the years. Today's radar and vision based safety systems can warn a driver and brake the vehicle automatically if obstacles are detected. Research projects such as the Google Car have even succeeded in creating fully autonomous cars.
The demands for obtaining the highest rating in safety tests such as Euro NCAP are also steadily increasing, and as a result, the development of these systems has become more attractive for car manufacturers. In the near future, a car must have a system for detecting, and performing automatic braking for, pedestrians to receive the highest safety rating of five stars. The prospect is that the volume of active safety systems will increase drastically when car manufacturers start installing them not only in luxury cars, but also in regularly priced ones. The use of automatic braking comes with a high demand on the performance of active safety systems; false positives must be avoided at all costs.
Dollar et al. [2014] introduced Aggregated Channel Features (ACF), which is based on a 10-channel LUV+HOG feature map. The method uses decision trees learned from boosting and has been shown to outperform previous algorithms in object detection tasks. The rediscovery of neural networks, and especially Convolutional Neural Networks (CNN), has increased the performance in almost every field of machine learning, including pedestrian detection. Recently Yang et al. [2015] combined the two approaches by using the feature maps from a CNN as input to a decision tree based boosting framework. This resulted in state-of-the-art performance on the challenging Caltech pedestrian data set.
This thesis presents an approach to improve the performance of a cascade of boosted classifiers by investigating the impact of using color information for pedestrian detection. The color self-similarity feature introduced by Walk et al. [2010] was used to create a version better adapted for boosting. This feature is then used in combination with a gradient based feature at the last step of a cascade.
The presented feature increases the performance compared to currently used classifiers at Autoliv, on data recorded by Autoliv and on the benchmark Caltech pedestrian data set.
@mastersthesis{diva2:867888,
author = {Hansson, Niklas},
title = {{Color Features for Boosted Pedestrian Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4899--SE}},
year = {2015},
address = {Sweden},
}
In the steel industry, laser triangulation based measurement systems can be utilized for evaluating the flatness of steel products. Shapeline is a company in Linköping that manufactures such measurement systems. This thesis work presents a series of experiments on a Shapeline measurement system in a relatively untested environment, the hot rolling mill at SSAB in Borlänge. The purpose of this work is to evaluate how the conditions at a hot rolling mill affect the measurement performance. It was anticipated that measuring in a high temperature environment would introduce difficulties that do not exist when measuring in cold environments. A number of different experiments were conducted, where equipment such as the laser and the camera bandpass filter were alternated. Via the experiments, information about noise due to the environment in the hot rolling mill was gained. The most significant noise was caused by heat shimmering. Using the presented methods, the magnitude and frequency spectrum of the heat shimmering noise could be determined. The results also indicate that heat shimmering causes large errors and is quite troublesome to counter. In addition to this, the quality of the line detections under the hot rolling mill circumstances was examined. It could be observed that the line detections did not introduce any significant errors despite the harmful conditions.
@mastersthesis{diva2:857691,
author = {Larsson, Oliver},
title = {{Evaluation of Flatness Gauge for Hot Rolling Mills}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4894--SE}},
year = {2015},
address = {Sweden},
}
In a time when cattle herds grow continually larger, the need for automatic methods to detect diseases is ever increasing. One possible method to discover diseases is to use thermal images and automatic head and eye detectors. In this thesis an eye detector and a head detector are implemented using the Random Forests classifier. During the implementation the classifier is evaluated using three different descriptors: Histogram of Oriented Gradients, Local Binary Patterns, and a descriptor based on pixel differences. An alternative classifier, the Support Vector Machine, is also evaluated for comparison against Random Forests.
The thesis results show that Histogram of Oriented Gradients performs well as a description of cattle heads, while Local Binary Patterns performs well as a description of cattle eyes. The provided descriptor performs almost equally well in both cases. The results also show that Random Forests performs approximately as good as the Support Vector Machine, when the Support Vector Machine is paired with Local Binary Patterns for both heads and eyes.
Finally the thesis results indicate that it is easier to detect and locate cattle heads than it is to detect and locate cattle eyes. For eyes, combining a head detector and an eye detector is shown to give a better result than only using an eye detector. In this combination heads are first detected in images, followed by using the eye detector in areas classified as heads.
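As a hedged sketch of detector training of this kind (synthetic stand-in data, illustrative window size and parameters; not the thesis pipeline), HOG descriptors from scikit-image can be fed to a scikit-learn Random Forest:

import numpy as np
from skimage.feature import hog
from sklearn.ensemble import RandomForestClassifier

def hog_descriptor(window):
    """HOG descriptor of a fixed-size grayscale window (e.g. a 64x64 head candidate)."""
    return hog(window, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

# windows: candidate windows cut out of thermal images; labels: 1 = head, 0 = background
rng = np.random.default_rng(0)
windows = rng.random((200, 64, 64))            # synthetic stand-ins for real windows
labels = rng.integers(0, 2, 200)

X = np.array([hog_descriptor(w) for w in windows])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print("head probability of first window:", clf.predict_proba(X[:1])[0, 1])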
@mastersthesis{diva2:856339,
author = {Sandsveden, Daniel},
title = {{Evaluation of Random Forests for Detection and Localization of Cattle Eyes}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4885--SE}},
year = {2015},
address = {Sweden},
}
Anomaly detection is a general theory of detecting unusual patterns or events in data. This master thesis investigates the subject of anomaly detection in two different applications. The first application is product inspection using a camera and the second application is surveillance using a 2D laser scanner.
The first part of the thesis presents a system for automatic visual defect inspection. The system is based on aligning the images of the product to a common template and doing pixel-wise comparisons. The system is trained using only images of products that are defined as normal, i.e. non-defective products. The visual properties of the inspected products are modelled using three different methods. The performance of the system and the different methods have been evaluated on four different datasets.
The second part of the thesis presents a surveillance system based on a single laser range scanner. The system is able to detect certain anomalous events based on the time, position and velocities of individual objects in the scene. The practical usefulness of the system is made plausible by a qualitative evaluation using unlabelled data.
@mastersthesis{diva2:855502,
author = {Thulin, Peter},
title = {{Anomaly Detection for Product Inspection and Surveillance Applications}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4889--SE}},
year = {2015},
address = {Sweden},
}
Integrated camera systems for increasing safety and maneuverability are becoming increasingly common for heavy vehicles. One problem with heavy vehicles today is that there are blind spots where the driver has no or very little view. There is a great demand for increasing safety and helping the driver to get a better view of the surroundings. This can be achieved by a sophisticated camera system, using cameras with a wide field of view, that covers dangerous blind spots.
This master thesis aims to investigate and develop a prototype solution for a camera system consisting of two fisheye cameras. The solution covers both hardware choices and software development including camera calibration and image stitching. Two different fisheye camera calibration toolboxes are compared and their results discussed, with the aim of finding the most suitable one for this application. The results from the two toolboxes differ in performance, and the result from only one of the toolboxes is sufficient for image stitching.
@mastersthesis{diva2:854521,
author = {Söderroos, Anna},
title = {{Fisheye Camera Calibration and Image Stitching for Automotive Applications}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4887--SE}},
year = {2015},
address = {Sweden},
}
The Kinect v2 is an RGB-D sensor manufactured as a gesture interaction tool for the entertainment console XBOX One. In this thesis we will use it to perform 3D reconstruction and investigate its ability to measure depth. In order to sense both color and depth the Kinect v2 has two cameras: one RGB camera and one infrared camera used to produce depth and near infrared images. These cameras need to be calibrated if we want to use them for 3D reconstruction. We present a calibration procedure for simultaneously calibrating the cameras and extracting their relative pose. This enables us to construct colored meshes of the environment. When we know the camera parameters of the infrared camera, the depth images can be used to perform the Kinect fusion algorithm. This produces well-formed meshes of the environment by combining many depth frames taken from several camera poses. The Kinect v2 uses a time-of-flight technology where the phase shifts are extracted from amplitude modulated infrared light signals produced by an emitter. The extracted phase shifts are then converted to depth values. However, the extraction of phase shifts includes a phase unwrapping procedure, which is sensitive to noise and can result in large depth errors. By utilizing the ability to access the raw phase measurements from the device we managed to modify the phase unwrapping procedure. This new procedure includes an extraction of several hypotheses for the unwrapped phase and a spatial propagation to select amongst them. The proposed method has been compared with the available drivers in the open source library libfreenect2 and the Microsoft Kinect SDK v2. Our experiments show that the depth images of the two available drivers have similar quality and that our proposed method improves over libfreenect2. The calculations in the proposed method are more expensive than those in libfreenect2, but it still runs at 2.5× real time. However, contrary to libfreenect2, the proposed method lacks a filter that removes outliers from the depth images. This turned out to be an important feature when performing Kinect fusion, and future work should thus be focused on adding an outlier filter.
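For intuition about the phase-to-depth relation underlying the unwrapping problem, a small sketch under simplifying assumptions (a single modulation frequency and an externally chosen wrap count; the actual Kinect v2 combines several modulation frequencies and calibration terms):

import numpy as np

C = 299_792_458.0                      # speed of light [m/s]

def phase_to_depth(phase, f_mod, n_wraps=0):
    """Convert a (wrapped) phase measurement to a distance for one modulation frequency.

    phase   : measured phase in [0, 2*pi)
    f_mod   : modulation frequency in Hz
    n_wraps : integer number of full 2*pi wraps chosen by the unwrapping procedure
    """
    unambiguous_range = C / (2.0 * f_mod)           # one wrap corresponds to this distance
    return (phase + 2.0 * np.pi * n_wraps) / (2.0 * np.pi) * unambiguous_range

# Example: a phase of pi at 16 MHz with no wraps is half of the ~9.4 m unambiguous range.
print(phase_to_depth(np.pi, 16e6, n_wraps=0))       # approx. 4.68 m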
@mastersthesis{diva2:854680,
author = {Järemo Lawin, Felix},
title = {{Depth Data Processing and 3D Reconstruction Using the Kinect v2}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4884--SE}},
year = {2015},
address = {Sweden},
}
Pedestrian detection is an important field with applications in active safety systems for cars as well as autonomous driving. Since autonomous driving and active safety are becoming technically feasible now, the interest for these applications has dramatically increased. The aim of this thesis is to investigate convolutional neural networks (CNN) for pedestrian detection. The reason for this is that CNN have recently been successfully applied to several different computer vision problems. The main applications of pedestrian detection are in real time systems. For this reason, this thesis investigates strategies for reducing the computational complexity of forward propagation for CNN. The approach used in this thesis for extracting pedestrians is to use a CNN to find a probability map of where pedestrians are located. From this probability map bounding boxes for pedestrians are generated. A method for handling scale invariance for the objects of interest has also been developed in this thesis. Experiments show that using this method gives significantly better results for the problem of pedestrian detection. The accuracy which this thesis has managed to achieve is similar to the accuracy of some other works which use CNN.
@mastersthesis{diva2:839692,
author = {Molin, David},
title = {{Pedestrian Detection Using Convolutional Neural Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4855--SE}},
year = {2015},
address = {Sweden},
}
The poaching of rhinoceros has increased dramatically in the last few years and the park rangers are often helpless against the militarised poachers. Linköping University is running several projects with the goal of aiding the park rangers in their work. This master thesis was produced at CybAero AB, which builds Remotely Piloted Aircraft Systems (RPAS). With their helicopters, high end cameras with a range sufficient to cover the whole area can be flown over the parks. The aim of this thesis is to investigate different methods to automatically find rhinos and humans, using airborne cameras. The system uses two cameras, one colour camera and one thermal camera. The latter is used to find interesting objects which are then extracted in the colour image. The object is then classified as either rhino, human or other. Several methods for classification have been evaluated. The results show that classifying solely on the thermal image gives nearly as high accuracy as classifying in combination with the colour image. This enables the system to be used at dusk and dawn or in bad light conditions. This is an important factor since most poaching occurs at dusk or dawn. As a conclusion, a system capable of running on low performance hardware and placeable on board the aircraft is presented.
@mastersthesis{diva2:843745,
author = {Karlsson Schmidt, Carl},
title = {{Rhino and Human Detection in Overlapping RGB and LWIR Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4837--SE}},
year = {2015},
address = {Sweden},
}
In this thesis a system for creating panoramic video has been developed. The panoramic video is formed by stitching several camera streams together. The system is designed as a vehicle mounted system, but can be applied to several other areas, such as surveillance. The system creates the video by finding features that correspond in the overlapping frames. By using cylindrical projection the problem is reduced to finding a translation between the images, and using algorithms such as ORB, matching features can be detected and described. The camera frames are stitched together by calculating the average translation of the matching features. To reduce artifacts such as ghosting, a simple but effective alpha blending technique has been used. The system has been implemented using C++ and the OpenCV library and the algorithm is capable of processing about 15 frames per second, making it close to real-time. With future improvements, such as parallel processing of the cameras, the system may be sped up even further and possibly include other types of image processing, e.g. object recognition and tracking.
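The thesis system was implemented in C++ with OpenCV; purely as an illustration of the ORB matching and average-translation idea, an equivalent sketch using the OpenCV Python bindings could look as follows (function name and parameters are illustrative, and cylindrical projection is assumed to have been applied beforehand):

import numpy as np
import cv2

def average_translation(img_left, img_right, max_matches=100):
    """Estimate the (dx, dy) shift between two overlapping frames from ORB matches."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img_left, None)
    kp2, des2 = orb.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:max_matches]
    shifts = np.array([np.array(kp2[m.trainIdx].pt) - np.array(kp1[m.queryIdx].pt)
                       for m in matches])
    return shifts.mean(axis=0)      # average translation used to place the right frame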
@mastersthesis{diva2:822602,
author = {Rydholm, Niklas},
title = {{Panoramic Video Stitching}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4858--SE}},
year = {2015},
address = {Sweden},
}
Machine learning can be utilized in many different ways in the field of automatic manufacturing and logistics. In this thesis supervised machine learning has been utilized to train classifiers for detection and recognition of objects in images. The techniques AdaBoost and Random forest have been examined, both of which are based on decision trees.
The thesis has considered two applications: barcode detection and optical character recognition (OCR). Supervised machine learning methods are highly appropriate in both applications since both barcodes and printed characters generally are rather distinguishable.
The first part of this thesis examines the use of machine learning for barcode detection in images, both traditional 1D-barcodes and the more recent Maxi-codes, which is a type of two-dimensional barcode. In this part the focus has been to train classifiers with the technique AdaBoost. The Maxi-code detection is mainly done with Local binary pattern features. For detection of 1D-codes, features are calculated from the structure tensor. The classifiers have been evaluated with around 200 real test images containing barcodes, and show promising results.
The second part of the thesis involves optical character recognition. The focus in this part has been to train a Random forest classifier using point pair features. The performance has also been compared with the more proven and widely used Haar features. The results show that Haar features are superior in terms of accuracy; nevertheless, the conclusion is that point pairs can be utilized as features for Random forest in OCR.
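As an illustrative sketch of structure tensor features of the kind mentioned for 1D-barcode detection (not the thesis feature set), the per-pixel tensor and its coherence measure can be computed with NumPy and SciPy:

import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_features(gray, sigma=2.0):
    """Per-pixel structure-tensor features; 1D barcodes give one dominant gradient direction."""
    gx = sobel(gray.astype(float), axis=1)
    gy = sobel(gray.astype(float), axis=0)
    Jxx = gaussian_filter(gx * gx, sigma)
    Jxy = gaussian_filter(gx * gy, sigma)
    Jyy = gaussian_filter(gy * gy, sigma)
    trace = Jxx + Jyy
    det = Jxx * Jyy - Jxy ** 2
    # Eigenvalues of the 2x2 tensor; their ratio separates line-like from isotropic texture.
    tmp = np.sqrt(np.maximum((trace / 2) ** 2 - det, 0))
    lam1, lam2 = trace / 2 + tmp, trace / 2 - tmp
    coherence = (lam1 - lam2) / (lam1 + lam2 + 1e-12)   # close to 1 inside a 1D barcode
    return np.dstack([trace, coherence])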
@mastersthesis{diva2:822575,
author = {Fridolfsson, Olle},
title = {{Machine Learning:
for Barcode Detection and OCR}},
school = {Linköping University},
type = {{LiTH-ISY-Ex--15/4842--SE}},
year = {2015},
address = {Sweden},
}
In a synchronized multi camera system it is imperative that the synchronization error between the different cameras is as close to zero as possible and that the jitter of the presumed frame rate is as small as possible. It is even more important when these systems are used in an autonomous vehicle trying to sense its surroundings. We would never hand over the control to an autonomous vehicle if we couldn't trust the data it is using for moving around.
The purpose of this thesis was to build a synchronization setup for a multi camera system using state-of-the-art RayTrix digital cameras that will be used in the iQMatic project involving autonomous heavy duty vehicles. The iQMatic project is a collaboration between several Swedish industrial partners and universities. Software development for the multi camera system was also involved. Different synchronization techniques were implemented and then analysed against the system requirements. The two techniques were a hardware trigger, i.e. an external trigger using a microcontroller, and a software trigger using the API of the digital cameras.
Experiments were conducted by testing the different trigger modes with the developed multi camera software. The conclusions show that the hardware trigger is preferable in this particular system, showing more stability and better statistics against the system requirements than the software trigger. However, the thesis also shows that additional experiments are needed for a more accurate analysis.
@mastersthesis{diva2:822340,
author = {Vibeck, Alexander},
title = {{Synchronization of a Multi Camera System}},
school = {Linköping University},
type = {{LiTH-ISY-EX-ET--15/0438--SE}},
year = {2015},
address = {Sweden},
}
In the field of industrial automation large savings can be realized if the position and orientation of an object is known. Knowledge about an object's position and orientation can be used by advanced robotic systems to be able to work with complex items. Specifically, 2D-objects are a big enough sub domain to motivate special attention. Traditionally this problem has been solved with large mechanical systems that force the objects into specific configurations. Besides being expensive, taking up a lot of space and having great difficulty handling fragile items, these mechanical systems have to be constructed for each particular type of object. This thesis explores the possibility of using registration algorithms from computer vision based on 3D-data to find flat objects. While systems for locating 3D objects already exist, they have issues with locating essentially flat objects since their positioning is mostly a function of their contour. The thesis consists of a brief examination of 2D-algorithms and their extension to 3D as well as results from the most suitable algorithm.
@mastersthesis{diva2:821158,
author = {Ingberg, Benjamin},
title = {{Registration of 2D Objects in 3D data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--15/4848--SE}},
year = {2015},
address = {Sweden},
}
Autonomous driving, or self driving vehicles, are concepts of vehicles knowing their environment and making driving manoeuvres without instructions from a driver. The concepts have been around for decades but have improved significantly in recent years as research in this area has made substantial progress. Benefits of autonomous driving include the possibility to decrease the number of accidents in traffic and thereby save lives.
A major challenge in autonomous driving is to acquire 3D information and relations between all objects in surrounding traffic. This is referred to as spatial perception. Stereo camera systems have become a central sensor module for advanced driver assistance systems and autonomous driving. For object detection and measurements at large distances, stereo vision encounters difficulties. These include objects being small, having low contrast and the presence of image noise. Having an accurate perception of the environment at large distances is however of high interest for many applications, especially autonomous driving.
This thesis proposes a method which tries to increase the range to where generic objects are first detected using a given stereo camera setup. Objects are represented by planes in 3D space. The input image is segmented into the various objects and the 3D plane parameters are estimated jointly. The 3D plane parameters are estimated directly from the stereo image pairs. In particular, this thesis investigates methods to introduce geometric constraints to the segmentation or labeling task, i.e assigning each considered pixel in the image to a plane.
The methods provided in this thesis show that despite the difficulties at large distances it is possible to exploit planar primitives in 3D space for obstacle detection at distances where other methods fail.
@mastersthesis{diva2:778457,
author = {Hillgren, Patrik},
title = {{Geometric Scene Labeling for Long-Range Obstacle Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4819--SE}},
year = {2015},
address = {Sweden},
}
A classic computer vision task is the estimation of a 3D map from a collection of images. This thesis explores the online simultaneous estimation of camera poses and map points, often called visual simultaneous localisation and mapping (VSLAM). In the near future the use of visual information by autonomous cars is likely, since driving is a vision dominated process. For example, VSLAM could be used to estimate the position of the car in relation to objects of interest, such as the road, other cars and pedestrians. Aimed at the creation of a real-time, robust, loop closing, single camera SLAM system, the properties of several state-of-the-art VSLAM systems and related techniques are studied. The system goals cover several important, if difficult, problems, which makes a solution widely applicable. This thesis makes two contributions: a rigorous qualitative analysis of VSLAM methods and a system designed accordingly. A novel tracking by matching scheme is proposed, which, unlike the trackers used by many similar systems, is able to deal better with forward camera motion. The system estimates general motion with loop closure in real time. The system is compared to a state-of-the-art monocular VSLAM algorithm and found to be similar in speed and performance.
@mastersthesis{diva2:771912,
author = {Persson, Mikael},
title = {{Online Monocular SLAM:
Rittums}},
school = {Linköping University},
type = {{Lith-ISY-EX--13/4741-SE}},
year = {2014},
address = {Sweden},
}
In this thesis we study the problem of multi-session dense RGB-D SLAM for 3D reconstruction. Multi-session reconstruction can allow users to capture parts of an object that could not easily be captured in one session, due for instance to poor accessibility or user mistakes. We first present a thorough overview of single-session dense RGB-D SLAM and describe the multi-session problem as a loosening of the incremental camera movement and static scene assumptions commonly held in the single-session case. We then implement and evaluate several variations on a system for doing two-session reconstruction as an extension to a single-session dense RGB-D SLAM system.
The extension from one to several sessions is divided into registering separate sessions into a single reference frame, re-optimizing the camera trajectories, and fusing together the data to generate a final 3D model. Registration is done by matching reconstructed models from the separate sessions using one of two adaptations of a 3D object detection pipeline. The registration pipelines are evaluated with many different sub-steps on a challenging dataset and it is found that robust registration can be achieved using the proposed methods on scenes without degenerate shape symmetry. In particular we find that using plane matches between two sessions as constraints for as much as possible of the registration pipeline improves results.
Several different strategies for re-optimizing camera trajectories using data from both sessions are implemented and evaluated. The re-optimization strategies are based on re-tracking the camera poses from all sessions together, and then optionally optimizing over the full problem as represented on a pose-graph. The camera tracking is done by incrementally building and tracking against a TSDF volume, from which a final 3D mesh model is extracted. The whole system is qualitatively evaluated against a realistic dataset for multi-session reconstruction. It is concluded that the overall approach is successful in reconstructing objects from several sessions, but that other fine grained registration methods would be required in order to achieve multi-session reconstructions that are indistinguishable from single-session results in terms of reconstruction quality.
@mastersthesis{diva2:772448,
author = {Widebäck West, Nikolaus},
title = {{Multiple Session 3D Reconstruction using RGB-D Cameras}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4814--SE}},
year = {2014},
address = {Sweden},
}
The interest in using GPUs as general processing units for heavy computations (GPGPU) has increased in the last couple of years. Manufacturers such as Nvidia and AMD make GPUs powerful enough to outperform CPUs by an order of magnitude for suitable algorithms. For embedded systems, GPUs are not as popular yet. The embedded GPUs available on the market have often not been able to justify hardware changes from the current systems (CPUs and FPGAs) to systems using embedded GPUs. They have been too hard to get, too energy consuming and not suitable for some algorithms. At SICK IVP, advanced computer vision algorithms run on FPGAs. This master thesis optimizes two such algorithms for embedded GPUs and evaluates the result. It also evaluates the status of the embedded GPUs on the market today. The results indicate that embedded GPUs perform well enough to run the evaluated algorithms as fast as needed. The implementations are also easy to understand compared to implementations for the competing FPGA hardware.
@mastersthesis{diva2:768419,
author = {Nilsson, Mattias},
title = {{Evaluation of Computer Vision Algorithms Optimized for Embedded GPU:s.}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4816--SE}},
year = {2014},
address = {Sweden},
}
Visual simultaneous localization and mapping (SLAM) as a field has been researched for ten years, but with recent advances in mobile performance visual SLAM is entering the consumer market in a completely new way. A visual SLAM system will however be sensitive to incautious use that may result in severe motion, occlusion or poor surroundings in terms of visual features, which will cause the system to temporarily fail. The procedure of recovering from such a failure is called relocalization. Together with two similar problems, localization (finding your position in an existing SLAM session) and loop closing (the online repair and refinement of the map in an active SLAM session), these can be grouped as visual location recognition (VLR).
This thesis presents novel results by combining the scalability of FabMap and the precision of 13th Lab's tracking, yielding high-precision VLR (+/- 10 cm) while maintaining above 99 % precision and 60 % recall for sessions containing thousands of images, all running entirely on a normal mobile phone.
The applications of VLR are many. Indoors, where GPS is not functioning, VLR can still provide positional information and navigate you through big complexes like airports and museums. Outdoors, VLR can improve the precision of GPS tenfold yielding a new level of navigational experience. Virtual and augmented reality applications are other areas that benefit from improved positioning and localization.
@mastersthesis{diva2:767444,
author = {Sjöholm, Alexander},
title = {{Closing the Loop:
Mobile Visual Location Recognition}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4813--SE}},
year = {2014},
address = {Sweden},
}
The usage of 3D-modeling is expanding rapidly. Modeling from aerial imagery has become very popular due to its increasing number of both civilian and military applications like urban planning, navigation and target acquisition.
This master thesis project was carried out at Vricon Systems at SAAB. The Vricon system produces high resolution geospatial 3D data based on aerial imagery from manned aircraft, unmanned aerial vehicles (UAV) and satellites.
The aim of this work was to investigate to what degree superpixel segmentation and supervised learning can be applied to a terrain classification problem using imagery and digital surface models (DSM). The aim was also to investigate how the height information from the digital surface model may contribute compared to the information from the grayscale values. The goal was to identify buildings, trees and ground. Another task was to evaluate existing methods, and compare results.
The approach for solving the stated goal was divided into several parts. The first part was to segment the image using superpixel segmentation, after that features were extracted. Then the classifiers were created and trained and finally the classifiers were evaluated.
The classification method that obtained the best results in this thesis had approximately 90 % correctly labeled superpixels. The result was equal, if not better, compared to other solutions available on the market.
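A hedged sketch of the superpixel-plus-classifier idea described above, using SLIC from scikit-image and simple per-segment means of intensity and DSM height as features; the feature choice and parameters are illustrative, not those of the thesis:

import numpy as np
from skimage.segmentation import slic
from sklearn.ensemble import RandomForestClassifier

def superpixel_features(gray, dsm, n_segments=500):
    """Segment into superpixels and compute per-segment intensity and height statistics."""
    # channel_axis=None marks the input as single-channel (scikit-image >= 0.19).
    labels = slic(gray, n_segments=n_segments, compactness=0.1, channel_axis=None)
    feats = []
    for s in np.unique(labels):
        m = labels == s
        feats.append([gray[m].mean(), dsm[m].mean(), dsm[m].std()])
    return labels, np.array(feats)

# With per-superpixel ground truth (building / tree / ground), a classifier can then be trained:
# clf = RandomForestClassifier(n_estimators=200).fit(train_feats, train_classes)
# predicted = clf.predict(test_feats)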
@mastersthesis{diva2:767120,
author = {Ringqvist, Sanna},
title = {{Classification of terrain using superpixel segmentation and supervised learning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4752--SE}},
year = {2014},
address = {Sweden},
}
The Next-Best-View (NBV) problem plays an important part in automatic 3D object reconstruction and exploration applications. This thesis presents a novel approach to ray-casting in Occupancy Grid Maps (OGM) in the context of solving the NBV problem in a 3D exploration setting. The proposed approach utilizes the structure of an octree-based OGM to perform calculations of potential information gain. The computations are significantly faster than with current methods, without decreasing mapping quality. Performance, in terms of mapping quality, coverage and computational complexity, is experimentally verified through a comparison with existing state-of-the-art methods using high-resolution point cloud data generated by time-of-flight laser range scanners.
Current methods for viewpoint ranking focus either heavily on mapping performance or on computation speed. The results presented in this thesis indicate that the proposed method is able to achieve a mapping performance similar to that of the performance-oriented approaches while keeping computation times as low as those of more approximative methods.
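The sketch below illustrates the general idea of scoring candidate viewpoints by ray-cast information gain. It is a simplified, hypothetical example: it uses a dense numpy occupancy grid rather than the octree-based OGM the thesis works with, and approximates information gain by counting unknown voxels along each ray.

```python
# Simplified sketch of ray-cast information gain for Next-Best-View scoring.
# A dense numpy grid stands in for the octree OGM, and "information gain" is
# approximated by the number of unknown voxels a ray traverses before hitting
# an occupied one.
import numpy as np

UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def ray_gain(grid, origin, direction, max_range, step=0.5):
    gain = 0
    pos = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d /= np.linalg.norm(d)
    for _ in range(int(max_range / step)):
        pos = pos + d * step
        idx = tuple(pos.astype(int))
        if not all(0 <= idx[i] < grid.shape[i] for i in range(3)):
            break                       # ray left the map
        if grid[idx] == OCCUPIED:
            break                       # ray is blocked
        if grid[idx] == UNKNOWN:
            gain += 1                   # this voxel would be observed
    return gain

def view_gain(grid, origin, directions, max_range=20.0):
    """Score a candidate viewpoint by summing gain over its sensor rays."""
    return sum(ray_gain(grid, origin, d, max_range) for d in directions)

grid = np.full((32, 32, 32), UNKNOWN, dtype=int)
grid[:, :, :8] = FREE                               # an already explored slab
rays = [np.array([np.cos(a), np.sin(a), 0.3]) for a in np.linspace(0, 2 * np.pi, 36)]
print("candidate score:", view_gain(grid, (16.0, 16.0, 4.0), rays))
```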
@mastersthesis{diva2:761834,
author = {Svensson, Martin},
title = {{Accelerated Volumetric Next-Best-View Planning in 3D Mapping}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4801--SE}},
year = {2014},
address = {Sweden},
}
Many methods have been developed for visual tracking of generic objects. The vast majority of these assume the world is two-dimensional, either ignoring the third dimension or only dealing with it indirectly. This causes difficulties for the tracker when the target approaches or moves away from the camera, is occluded or moves out of the camera frame.
Unmanned aerial vehicles (UAVs) are increasingly used in civilian applications and some of these will undoubtedly carry tracking systems in the future. As they move around, these trackers will encounter both scale changes and occlusions. To improve the tracking performance in these cases, the third dimension should be taken into account.
This thesis extends the capabilities of a 2D tracker to three dimensions, with the assumption that the target moves on a ground plane.
The position of the tracker camera is established by matching the video it produces to a sparse point-cloud map built with off-the-shelf structure-from-motion software. A target is tracked with a generic 2D tracker and subsequently positioned on the ground. Should the target disappear from view, its motion on the ground is predicted. In combination, these simple techniques are shown to improve the robustness of a tracking system on a moving platform under target scale changes and occlusions.
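As a small illustration of how a 2D track can be positioned on a ground plane, the sketch below intersects a pixel's viewing ray with the plane z = 0, assuming known camera intrinsics and a camera pose obtained, for example, from the structure-from-motion map. The intrinsics and pose values are placeholders, not those used in the thesis.

```python
# Hedged sketch of placing a 2D-tracked target on the ground plane (z = 0),
# assuming known intrinsics K, camera-to-world rotation R_wc and camera
# center C from the SfM map. All values below are placeholders.
import numpy as np

def pixel_to_ground(u, v, K, R_wc, C):
    """Intersect the viewing ray of pixel (u, v) with the ground plane z = 0."""
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray in camera coords
    ray_world = R_wc @ ray_cam                            # rotate into world frame
    s = -C[2] / ray_world[2]                              # scale factor reaching z = 0
    return C + s * ray_world

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R_wc = np.diag([1.0, -1.0, -1.0])      # camera looking straight down (placeholder)
C = np.array([0.0, 0.0, 5.0])          # camera 5 m above the ground
print(pixel_to_ground(400, 300, K, R_wc, C))   # ground point under that pixel
```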
@mastersthesis{diva2:761603,
author = {Robinson, Andreas},
title = {{Implementation and evaluation of a 3D tracker}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4800--SE}},
year = {2014},
address = {Sweden},
}
A lane position system, and enhancement techniques for increasing its robustness and availability, are investigated. The enhancements are performed by using additional sensor sources such as map data and GPS. The thesis contains a description of the system, two models of the system and two implemented filters for the system. The thesis also contains conclusions and results of theoretical and experimental tests of the increased robustness and availability of the system. The system can be integrated with an existing system that investigates driver behavior, developed for fatigue detection. That system was developed in a project named Drowsi, in which among others Volvo Technology participated.
@mastersthesis{diva2:749036,
author = {Landberg, Markus},
title = {{Enhancement Techniques for Lane Position Adaptation (Estimation) using GPS- and Map Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4788--SE}},
year = {2014},
address = {Sweden},
}
High-resolution 3D images are of high interest in military operations, where data can be used to classify and identify targets. The Swedish Defence Research Agency (FOI) is interested in the latest research and technologies in this area. A drawback with normal 3D laser systems is the lack of high resolution for long-range measurements. One technique for high long-range resolution laser radar is based on time-correlated single photon counting (TCSPC). By repetitively sending out short laser pulses and measuring the time of flight (TOF) of single reflected photons, extremely accurate range measurements can be made. A drawback with this method is that it is hard to create single-photon detectors with many pixels and high temporal resolution; hence a single detector is used. Scanning an entire scene with one detector is very time-consuming, and instead, as this thesis explores, the entire scene can be measured with fewer measurements than the number of pixels. To do this, a technique called compressed sensing (CS) is introduced. CS exploits the fact that signals normally are compressible and can be represented sparsely in some basis. CS sets other requirements on the sampling than the normal Shannon-Nyquist sampling theorem. With a digital micromirror device (DMD), linear combinations of the scene can be reflected onto the single-photon detector, creating scalar intensity values as measurements. This means that fewer DMD patterns than the number of pixels can reconstruct the entire 3D scene. In this thesis a computer model of the laser system helps to evaluate different CS reconstruction methods with different scenarios of the laser system and the scene. The results show how many measurements are required to reconstruct scenes properly and how the DMD patterns affect the results. CS proves to enable a great reduction, 85-95 %, of the required measurements compared to a pixel-by-pixel scanning system. Total variation minimization proves to be the best choice of reconstruction method.
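The sketch below only illustrates the core compressed-sensing idea of recovering a signal from far fewer measurements than unknowns. It is a hedged, simplified example: a 1-D sparse signal is recovered with l1-regularized ISTA, whereas the thesis reconstructs 3D scenes and finds total variation minimization to work best; the rows of the random matrix play the role of DMD patterns.

```python
# Simplified compressed-sensing sketch: recover a sparse signal from far fewer
# measurements than unknowns using ISTA (l1 regularization). This is not the
# TV-minimization reconstruction the thesis favours; it only shows the
# M << N recovery principle.
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 256, 40, 5                        # unknowns, measurements, nonzeros
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.normal(size=K)

A = rng.normal(size=(M, N)) / np.sqrt(M)    # random "patterns" (placeholder)
y = A @ x_true                               # scalar detector measurements

def ista(A, y, lam=0.01, iters=500):
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    for _ in range(iters):
        g = A.T @ (A @ x - y)                # gradient of 0.5 * ||Ax - y||^2
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

x_hat = ista(A, y)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```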
@mastersthesis{diva2:722826,
author = {Fall, Erik},
title = {{Compressed Sensing for 3D Laser Radar}},
school = {Linköping University},
type = {{LiTH-ISY-EX---14/4767---SE}},
year = {2014},
address = {Sweden},
}
Computer vision is a rapidly growing, interdisciplinary research field whose applications are taking an ever more prominent role in today's society. With the increased interest in computer vision comes an increased need to control the cameras connected to computer vision systems.
At Linköping University's Institute of Technology, the Computer Vision Laboratory has developed the framework EDSDK++ for remote control of digital cameras manufactured by Canon Inc. The framework is very extensive and contains a large number of functions and settings, and the system is therefore still largely untested. This thesis aims to develop a demonstrator system for EDSDK++ in the form of a simple active vision system that uses real-time face detection to steer a camera tilt unit, and a camera mounted on the tilt, to follow, zoom in on and focus on a face or a group of faces. One requirement was that the OpenCV library should be used for the face detection and that EDSDK++ should be used to control the camera. In addition, an API for controlling the camera tilt unit was to be developed.
During the development work, different methods for face detection were investigated, among other things. To improve performance, multiple face detectors were used, scanning an image in parallel from different angles with the help of multithreading. Both experimental and theoretical approaches were taken to determine the parameters needed to control the camera and the camera tilt unit. The result of the work is a demonstrator that fulfilled all requirements.
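As a minimal illustration of the face-detection part, the sketch below runs an OpenCV Haar-cascade detector on webcam frames. It is a simplified stand-in: the thesis uses multiple detectors scanning the image in parallel from different angles, and converts detections into pan/tilt and zoom commands via EDSDK++ and the tilt API, none of which is shown here.

```python
# Minimal face-detection loop with OpenCV, as a simplified stand-in for the
# multi-threaded, multi-angle detector described above. Control of the Canon
# camera through EDSDK++ and of the tilt unit is omitted.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)                      # any webcam stands in for the rig

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # A real system would convert the face centroid into pan/tilt commands here.
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```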
@mastersthesis{diva2:722871,
author = {Karg\'{e}n, Rolf},
title = {{Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende}},
school = {Linköping University},
type = {{LiTH-ISY-EX-ET--14/0419--SE}},
year = {2014},
address = {Sweden},
}
In recent years several depth cameras have emerged on the consumer market, creating many interesting possibilities for both professional and recreational usage. One example of such a camera is the Microsoft Kinect sensor, originally used with the Microsoft Xbox 360 game console. In this master thesis a system is presented that utilizes this device in order to create as accurate a 3D reconstruction of an indoor environment as possible. The major novelty of the presented system is the data structure, based on signed distance fields and voxel octrees, used to represent the observed environment.
@mastersthesis{diva2:716061,
author = {Bengtsson, Morgan},
title = {{Indoor 3D Mapping using Kinect}},
school = {Linköping University},
type = {{LiTH-ISY-EX--14/4753--SE}},
year = {2014},
address = {Sweden},
}
Visual tracking is a classical computer vision problem with many important applications in areas such as robotics, surveillance and driver assistance. The task is to follow a target in an image sequence. The target can be any object of interest, for example a human, a car or a football. Humans perform accurate visual tracking with little effort, while it remains a difficult computer vision problem. It imposes major challenges, such as appearance changes, occlusions and background clutter. Visual tracking is thus an open research topic, but significant progress has been made in the last few years.
The first part of this thesis explores generic tracking, where nothing is known about the target except for its initial location in the sequence. A specific family of generic trackers that exploit the FFT for faster tracking-by-detection is studied. Among these, the CSK tracker has recently been shown to obtain competitive performance at extraordinarily low computational cost. Three contributions are made to this type of tracker. Firstly, a new method for learning the target appearance is proposed and shown to outperform the original method. Secondly, different color descriptors are investigated for the tracking purpose. Evaluations show that the best descriptor greatly improves the tracking performance. Thirdly, an adaptive dimensionality reduction technique is proposed, which adaptively chooses the most important feature combinations to use. This technique significantly reduces the computational cost of the tracking task. Extensive evaluations show that the proposed tracker outperforms state-of-the-art methods in the literature, while operating at a several times higher frame rate.
In the second part of this thesis, the proposed generic tracking method is applied to human tracking in surveillance applications. A causal framework is constructed, that automatically detects and tracks humans in the scene. The system fuses information from generic tracking and state-of-the-art object detection in a Bayesian filtering framework. In addition, the system incorporates the identification and tracking of specific human parts to achieve better robustness and performance. Tracking results are demonstrated on a real-world benchmark sequence.
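The sketch below illustrates the FFT-based tracking-by-detection idea behind this family of trackers: a correlation filter is learned in the Fourier domain from a template patch and a desired Gaussian response, and the target is then located as the peak of the filter response on a new patch. It is a hedged, MOSSE-flavoured simplification rather than the CSK formulation or the methods proposed in the thesis; windowing, feature extraction and online updates are omitted.

```python
# Hedged sketch of FFT-based tracking-by-detection (MOSSE-flavoured):
# learn a correlation filter in the Fourier domain from one patch and a
# Gaussian desired response, then locate the target as the response peak
# on a new patch. Not the thesis' actual tracker.
import numpy as np

def gaussian_response(shape, sigma=2.0):
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(np.roll(g, -h // 2, axis=0), -w // 2, axis=1)   # peak at (0, 0)

def train_filter(patch, lam=1e-2):
    G = np.fft.fft2(gaussian_response(patch.shape))
    F = np.fft.fft2(patch)
    return G * np.conj(F) / (F * np.conj(F) + lam)   # filter in the Fourier domain

def detect(H, patch):
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return dy, dx

template = np.random.rand(64, 64)                    # placeholder appearance patch
H = train_filter(template)
shifted = np.roll(template, (3, 5), axis=(0, 1))     # target moved by (3, 5) pixels
print(detect(H, shifted))                            # should recover the shift (3, 5)
```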
@mastersthesis{diva2:709327,
author = {Danelljan, Martin},
title = {{Visual Tracking}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4736--SE}},
year = {2013},
address = {Sweden},
}
This thesis has investigated whether it is possible to compare photographs of the seabed, taken with a camera mounted on SAAB Dynamics' vehicle AUV-62 (here called Sapphires), with SONAR images taken from the same vehicle on another occasion. Items imaged with a camera and with side-looking SONARs do not normally share visual appearance and are therefore hard to compare. For this reason, the method chosen for comparing the camera and SONAR images is not based on the individual appearance of items but on patterns created by several of them. Items in the images are identified as objects, each described by a position in longitude and latitude and a radius. In the camera images, objects are identified by segmenting the images with MSER, where stones and other items have an appearance deviating from the sandy background. In the SONAR image, regions containing objects are identified by studying high echo intensities, corresponding to items that reflect the sound pulses well, and objects are then created by applying MSER to these regions. The two sets of objects, from the camera and SONAR images, are then compared by matching every object in the camera image against every object in the SONAR image: translating under the hypothesis that they are the same object and counting how many of the remaining objects fit that assumption.
@mastersthesis{diva2:680896,
author = {Ekblad, Richard},
title = {{Korrelering mellan optiskt och akustiskt avbildade objekt på havsbotten}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4742--SE}},
year = {2013},
address = {Sweden},
}
Recording a video sequence with a camera during movement often produces blurred results. This is mainly due to motion blur which is caused by rapid movement of objects in the scene or the camera during recording. By correcting for changes in the orientation of the camera, caused by e.g. uneven terrain, it is possible to minimize the motion blur and thus, produce a stabilized video.
In order to do this, data gathered from a gyroscope and the camera itself can be used to measure the orientation of the camera. The raw data needs to be processed, synchronized and filtered to produce a robust estimate of the orientation. This estimate can then be used as input to an automatic control system in order to correct for changes in the orientation.
This thesis focuses on examining the possibility of such a stabilization. The actual stabilization is left for future work. An evaluation of the hardware as well as the implemented methods is done with emphasis on speed, which is crucial in real-time computing.
@mastersthesis{diva2:656064,
author = {Gratorp, Eric},
title = {{Evaluation of online hardware video stabilization on a moving platform}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4723--SE}},
year = {2013},
address = {Sweden},
}
In most cases today, when a specific person's whereabouts are monitored through video surveillance, it is done manually, and his or her location when not seen is based on assumptions about how fast he or she can move. Since humans are good at recognizing people this can be done accurately, given good video data, but the time needed to go through all the data is extensive and therefore expensive. Because of rapid technical development, computers are getting cheaper to use and therefore more attractive for tedious work.
This thesis is part of a larger project that aims to see to what extent it is possible to estimate a person of interest's time-dependent 3D position when seen in surveillance videos. The surveillance videos are recorded with non-overlapping monocular cameras. Furthermore, the project aims to see whether the person of interest's movement, when position data is unavailable, can be predicted. The outcome of the project is software capable of following a person of interest's movement with an error estimate visualized as an area indicating where the person of interest might be at a specific time.
The main focus of this thesis is to implement and evaluate a people detector meant to be used in the project, reduce noise in the position measurements, predict the position when the person of interest's location is unknown, and evaluate the complete project.
The project combines known methods in computer vision and signal processing, and the outcome is software that can be used on a normal PC running a Windows operating system. The software implemented in the thesis uses a Hough-transform-based people detector and a Kalman filter for one-step-ahead prediction. The detector is evaluated with known methods such as miss rate vs. false positives per window or image (FPPW and FPPI, respectively) and recall vs. 1-precision.
The results indicate that it is possible to estimate a person of interest's 3D position with single monocular cameras. It is also possible to follow the movement, to some extent, where position data is unavailable. However, the software needs more work in order to be robust enough to handle the diversity that may appear in different environments and to handle large-scale sensor networks.
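The sketch below shows a minimal constant-velocity Kalman filter of the kind described above, used both to smooth noisy position measurements and to predict one step ahead when no detection is available. All matrices and noise levels are placeholder assumptions, not the values used in the thesis.

```python
# Minimal constant-velocity Kalman filter as a hedged stand-in for the
# measurement smoothing and one-step-ahead prediction described above.
# State is [x, y, vx, vy]; the noise levels are placeholders.
import numpy as np

dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
Hm = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 0.01            # process noise (placeholder)
R = np.eye(2) * 0.5             # measurement noise (placeholder)

x, P = np.zeros(4), np.eye(4)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = Hm @ P @ Hm.T + R
    K = P @ Hm.T @ np.linalg.inv(S)
    x = x + K @ (z - Hm @ x)
    P = (np.eye(4) - K @ Hm) @ P
    return x, P

# Two detections followed by two frames without any detection.
for z in [np.array([0.0, 0.0]), np.array([1.1, 0.9]), None, None]:
    x, P = predict(x, P)            # one-step-ahead prediction
    if z is not None:               # detection available: correct the estimate
        x, P = update(x, P, z)
    print(x[:2])                    # estimated (possibly predicted) position
```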
@mastersthesis{diva2:652387,
author = {Markström, Johannes},
title = {{3D Position Estimation of a Person of Interest in Multiple Video Sequences:
People Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4721--SE}},
year = {2013},
address = {Sweden},
}
Because of the increase in the number of security cameras, there is more video footage available than a human could efficiently process. In combination with the fact that computers are getting more efficient, it is getting more and more interesting to solve the problem of detecting and recognizing people automatically.
Therefore a method is proposed for estimating the 3D path of a person of interest in multiple, non-overlapping, monocular cameras. This project is a collaboration between two master theses. This thesis focuses on recognizing a person of interest from several possible candidates, as well as estimating the 3D position of a person and providing a graphical user interface for the system. The recognition of the person of interest includes keeping track of said person frame by frame, and identifying said person in video sequences where the person of interest has not been seen before.
The final product is able to both detect and recognize people in video, as well as estimating their 3D-position relative to the camera. The product is modular and any part can be improved or changed completely, without changing the rest of the product. This results in a highly versatile product which can be tailored for any given situation.
@mastersthesis{diva2:650889,
author = {Johansson, Victor},
title = {{3D Position Estimation of a Person of Interest in Multiple Video Sequences:
Person of Interest Recognition}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4718--SE}},
year = {2013},
address = {Sweden},
}
A fully automatic de-weathering system to increase the visibility and stability in surveillance applications during bad weather has been developed. Rain, snow and haze during daylight are handled in real time with acceleration from CUDA-implemented algorithms. Video from fixed cameras is processed on a PC with no need for special hardware except an NVIDIA GPU. The system does not use any background model and does not require any precalibration. An increase in contrast is obtained in all haze/rain/snow cases, while the system lags by at most one frame during rain or snow removal. De-hazing can be obtained for any distance to simplify tracking or other algorithms operating on a surveillance system.
@mastersthesis{diva2:647937,
author = {Pettersson, Niklas},
title = {{GPU-Accelerated Real-Time Surveillance De-Weathering}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4677--SE}},
year = {2013},
address = {Sweden},
}
In Sweden and many other northern countries, it is common for heat to be distributed to homes and industries through district heating networks. Such networks consist of pipes buried underground carrying hot water or steam with temperatures in the range of 90-150 °C. Due to bad insulation or cracks, heat or water leakages might appear.
A system for large-scale monitoring of district heating networks through remote thermography has been developed and is in use at the company Termisk Systemteknik AB. Infrared images are captured from an aircraft and analysed, finding and indicating the areas where the ground temperature is higher than normal. During the analysis, however, many warm areas other than true water or energy leakages are marked as detections. Objects or phenomena that can cause false alarms are those that, for some reason, are warmer than their surroundings, for example chimneys, cars and heat leakages from buildings.
During the last couple of years, the system has been used in a number of cities. Therefore, there exists a fair number of examples of different types of detections. The purpose of the present master's thesis is to evaluate the reduction of false alarms of the existing analysis that can be achieved with the use of a learning system, i.e. a system which can learn how to recognize different types of detections.
A labelled data set for training and testing was acquired by contact with customers. Furthermore, a number of features describing the intensity difference within the detection, its shape and propagation as well as proximity information were found, implemented and evaluated. Finally, four different classifiers and other methods for classification were evaluated.
The method that obtained the best results consists of two steps. In the initial step, all detections which lie on top of a building are removed from the data set of labelled detections. The second step consists of classification using a Random forest classifier. Using this two-step method, the number of false alarms is reduced by 43% while the percentage of water and energy detections correctly classified is 99%.
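The sketch below mirrors the structure of the two-step method described above: detections lying on a building footprint are discarded first, and the remaining detections are classified with a random forest. The features, labels and building mask are synthetic placeholders, not the thesis' actual descriptors or data.

```python
# Hedged sketch of the two-step false-alarm reduction described above:
# (1) discard detections that lie on a building, (2) classify the remainder
# with a random forest. All data below are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 4))                 # e.g. intensity diff., shape, proximity
on_building = rng.random(n) < 0.2           # from a building-footprint overlay
y = rng.integers(0, 2, n)                   # 1 = true leakage, 0 = false alarm

# Step 1: remove detections that lie on top of a building.
keep = ~on_building
X_kept, y_kept = X[keep], y[keep]

# Step 2: random forest classification of the remaining detections.
clf = RandomForestClassifier(n_estimators=200).fit(X_kept, y_kept)
print("kept detections:", keep.sum(), "of", n)
print("training accuracy:", clf.score(X_kept, y_kept))
```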
@mastersthesis{diva2:640093,
author = {Berg, Amanda},
title = {{Classification of leakage detections acquired by airborne thermography of district heating networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4678--SE}},
year = {2013},
address = {Sweden},
}
Identification of individuals has been solved with many different solutions around the world, either using biometric data or external means of verification such as id cards or RFID tags. The advantage of using biometric measurements is that they are directly tied to the individual and are usually unalterable. Acquiring dependable measurements is however challenging when the individuals are uncooperative. A dependable system should be able to deal with this and produce reliable identifications.
The system proposed in this thesis can autonomously classify uncooperative specimens from depth data. The data is acquired from a depth camera mounted in an uncontrolled environment, where it was allowed to record continuously for two weeks. This requires stable data extraction and normalization algorithms to produce good representations of the specimens. Robust descriptors can therefore be extracted from each sample of a specimen, and together with different classification algorithms the system can be trained or validated. Even with as many as 138 different classes, the system achieves high recognition rates. Inspired by the research field of face recognition, the best classification algorithm, the method of fisherfaces, was able to accurately recognize 99.6% of the validation samples, followed by two variations of the method of eigenfaces achieving recognition rates of 98.8% and 97.9%, respectively. These results affirm that the capabilities of the system are adequate for a commercial implementation.
@mastersthesis{diva2:635227,
author = {Björkeson, Felix},
title = {{Autonomous Morphometrics using Depth Cameras for Object Classification and Identification}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4680--SE}},
year = {2013},
address = {Sweden},
}
Modern cars are often equipped with a vision system that collects information about the car and its surroundings. Camera calibration is extremely important in order to maintain high accuracy in automotive safety applications. The cameras are calibrated offline in the factory; however, the mounting of the camera may change slowly over time. If the angles of the actual mounting of the camera are known, compensation for the angles can be done in software. Therefore, online calibration is desirable.
This master's thesis describes how to dynamically calibrate the roll angle. Two different methods have been implemented and compared. The first detects vertical edges in the image, such as houses and lamp posts. The second detects license plates on cars in front of the camera in order to calculate the roll angle.
The two methods are evaluated and the results are discussed. The results of the two methods vary considerably, and the method that turned out to give the best results was the one that detects vertical edges.
@mastersthesis{diva2:630415,
author = {de Laval, Astrid},
title = {{Online Calibration of Camera Roll Angle}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4688--SE}},
year = {2013},
address = {Sweden},
}
A laser triangulating camera system projects a laser line onto an object to create height curves on the object surface. By moving the object, height curves from different parts of the object can be observed and combined to produce a three-dimensional representation of the object. The calibration of such a camera system involves transforming the received data to obtain real-world measurements instead of pixel-based measurements.
The calibration method presented in this thesis focuses specifically on small fields of view. The goal is to provide an easy-to-use and robust calibration method that can complement already existing calibration methods. The tool should yield measurements in metric units that are as good as possible, while still keeping the complexity and production cost of the calibration object low. The implementation uses only data from the laser plane itself, making it usable also in environments where no external light exists.
The proposed implementation utilises a complete scan of a three-dimensional calibration object and returns a calibration for three dimensions. The results of the calibration have been evaluated against synthetic and real data.
@mastersthesis{diva2:630377,
author = {Rydström, Daniel},
title = {{Calibration of Laser Triangulating Cameras in Small Fields of View}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4669--SE}},
year = {2013},
address = {Sweden},
}
Automatic tracking of an object of interest in a video sequence is a task that has been much researched. Difficulties include varying scale of the object, rotation, and object appearance changing over time, all of which can lead to tracking failures. Tracking methods such as short-term tracking often fail if the object moves out of the camera's field of view or changes shape rapidly. Also, small inaccuracies in the tracking method can accumulate over time, which can lead to tracking drift. Long-term tracking is also problematic, partly due to updating and degradation of the object model, leading to incorrectly classified and tracked objects.
This master's thesis implements a long-term tracking framework called Tracking-Learning-Detection, which can learn and adapt, using so-called P/N-learning, to changing object appearance over time, thus making it more robust to tracking failures. The framework consists of three parts: a tracking module which follows the object from frame to frame, a learning module that learns new appearances of the object, and a detection module which can detect learned appearances of the object and correct the tracking module if necessary.
This tracking framework is evaluated on thermal infrared videos and the results are compared to the results obtained from videos captured within the visible spectrum. Several important differences between visual and thermal infrared tracking are presented, and the effect these have on the tracking performance is evaluated.
In conclusion, the results are analyzed to evaluate which differences matter the most and how they affect tracking, and a number of different ways to improve the tracking are proposed.
@mastersthesis{diva2:627964,
author = {Stigson, Magnus},
title = {{Object Tracking Using Tracking-Learning-Detection in Thermal Infrared Video}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4668--SE}},
year = {2013},
address = {Sweden},
}
In factory automation, cameras and image processing algorithms can be used to inspect objects. This can decrease the number of faulty objects that leave the factory and reduce the manual labour needed. A vision sensor is a system where the camera and image processing are delivered together and only need to be configured for the application at hand; thus, no programming knowledge is required of the customer. In this master's thesis, a way to make the configuration of a vision sensor even easier is developed and evaluated.
The idea is that the customer knows his or her product much better than he or she knows image processing. The customer could take images of positive and negative samples of the object that is to be inspected. The algorithm should then, given these images, configure the vision sensor automatically.
The algorithm that is developed to solve this problem is described step by step with examples to illustrate the problems that needed to be solved. Much of the focus is on how to compare two configurations to each other, in order to find the best one. The resulting configuration from the algorithm is then evaluated with respect to types of applications, computation time and representativeness of the input images.
@mastersthesis{diva2:624443,
author = {Ollesson, Niklas},
title = {{Automatic Configuration of Vision Sensor}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4666--SE}},
year = {2013},
address = {Sweden},
}
In certain industries, quality testing is crucial to make sure that the components being manufactured do not contain any defects. One method to detect these defects is to heat the specimen being inspected and then study the cooling process using infrared thermography. The exploration of non-destructive testing using thermography is at an early stage, and the purpose of this thesis is therefore to analyse some of the existing techniques and to propose improvements.
A test specimen containing several different defects was designed specifically for this thesis. A flash lamp was used to heat the specimen, and a high-speed infrared camera was used to study both the spatial and temporal features of the cooling process. An algorithm was implemented to detect anomalies, and different parameter settings were evaluated. The results show that the proposed method is successful at finding the sought-after defects, and also outperforms one of the older methods.
@mastersthesis{diva2:610166,
author = {Höglund, Kristofer},
title = {{Non-destructive Testing Using Thermographic Image Processing}},
school = {Linköping University},
type = {{LiTH-ISY-EX--13/4655--SE}},
year = {2013},
address = {Sweden},
}
This is a master thesis of the Master of Science degree program in Applied Physics and Electrical Engineering at Linköping University. The goal of this thesis is to find out how the Microsoft Kinect can be used as part of a camera rig to create accurate 3D models of an indoor environment. The Microsoft Kinect is marketed as a touch-free game controller for the Microsoft Xbox 360 game console. The Kinect contains a color camera and a depth camera. The depth camera works by constantly projecting a near-infrared dot pattern that is observed with a near-infrared camera. This thesis describes how to model the near-infrared projector pattern so that external near-infrared cameras can be used to improve the measurement precision. The depth data that the Kinect outputs has been studied to determine what types of errors it contains. One finding was that the Kinect uses an online calibration algorithm that changes the depth data.
@mastersthesis{diva2:566581,
author = {Nordmark, Anton},
title = {{Kinect 3D Mapping}},
school = {Linköping University},
type = {{LiTH-ISY-EX--12/4636--SE}},
year = {2012},
address = {Sweden},
}
The introduction of dual energy CT (DECT) in the field of medical healthcare has made it possible to extract more information about the scanned objects. This in turn has the potential to improve the accuracy of radiation therapy dose planning. One problem that remains before successful material decomposition can be achieved, however, is the presence of beam hardening and scatter artifacts that arise in a scan. Methods currently in clinical use for removal of beam hardening often bias the CT numbers. Hence, the possibility of an appropriate tissue decomposition is limited.
Here a method for successful decomposition as well as removal of the beam hardening artifact is presented. The method uses effective linear attenuations for the five base materials (water, protein, adipose, cortical bone and marrow) to perform the decomposition on reconstructed simulated data. This is performed inside an iterative loop, together with the polychromatic x-ray spectra, to remove the beam hardening.
@mastersthesis{diva2:549562,
author = {Grandell, Oscar},
title = {{An iterative reconstruction algorithm for quantitative tissue decomposition using DECT}},
school = {Linköping University},
type = {{LiTH-ISY-EX--12/4617--SE}},
year = {2012},
address = {Sweden},
}
In this master thesis a visual odometry system is implemented and explained. Visual odometry is a technique that can be used on autonomous vehicles to determine their current position, and it is preferably used indoors where GPS does not work. The only input to the system is the images from a stereo camera, and the output is the current location given as a relative position.
In the C++ implementation, image features are found and matched between the stereo images and the previous stereo pair, which gives in the range of 150-250 verified feature matches. The image coordinates are triangulated into a 3D point cloud. The distance between two subsequent point clouds is minimized with respect to rigid transformations, which gives the motion described with six parameters, three for the translation and three for the rotation.
Noise in the image coordinates gives reconstruction errors which make the motion estimation very sensitive. The results from six experiments show that the weakness of the system is its ability to distinguish rotations from translations. However, if the system has additional knowledge of how it is moving, the minimization can be done with only three parameters and the system can estimate its position with less than 5 % error.
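The sketch below illustrates the motion-estimation step: given matched 3D points from two successive stereo frames, the rigid transform minimizing point-to-point distances has a closed-form SVD (Kabsch) solution. This is a common, hedged stand-in for the six-parameter minimization described above, shown here on synthetic noise-free correspondences.

```python
# Hedged sketch of rigid-motion estimation between two matched point clouds
# using the closed-form Kabsch/SVD solution; the thesis' own six-parameter
# minimization may differ. Data below are synthetic.
import numpy as np

def rigid_transform(P, Q):
    """Find R, t such that Q ~= R @ P + t for matched 3 x N point sets."""
    cp, cq = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    H = (P - cp) @ (Q - cq).T
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])   # avoid reflections
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t

rng = np.random.default_rng(0)
P = rng.normal(size=(3, 200))                       # previous-frame point cloud
angle = np.deg2rad(5)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
Q = R_true @ P + np.array([[0.1], [0.0], [0.3]])    # current-frame point cloud
R, t = rigid_transform(P, Q)
print(np.allclose(R, R_true), t.ravel())            # recovers the simulated motion
```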
@mastersthesis{diva2:550998,
author = {Johansson, Fredrik},
title = {{Visual Stereo Odometry for Indoor Positioning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--12/4621--SE}},
year = {2012},
address = {Sweden},
}
Functional Magnetic Resonance Imaging (fMRI) is one of the best techniques for neuroimaging and has revolutionized the way we understand brain function. It measures changes in the blood oxygen level-dependent (BOLD) signal, which is related to neuronal activity. The complexity of the data, the presence of different types of noise and the massive amount of data make fMRI data analysis challenging. It demands efficient signal processing and statistical analysis methods. The inferences from the analysis are used by physicians, neurologists and researchers for a better understanding of brain function.
The purpose of this study is to design a toolbox for fMRI data analysis. It includes methods to detect brain activity maps, estimate the hemodynamic response (HDR) and assess the connectivity of brain structures. The toolbox provides methods for detection of activated brain regions measured with a Bayesian estimator. Results are compared with conventional methods such as the t-test, ordinary least squares (OLS) and weighted least squares (WLS). Brain activation and the HDR are estimated with a linear adaptive model and a nonlinear method based on a radial basis function (RBF) neural network. A nonlinear autoregressive with exogenous inputs (NARX) neural network is developed to model the dynamics of the fMRI data. The toolbox also provides methods for brain connectivity, such as functional connectivity and effective connectivity. These methods are examined on simulated and real fMRI datasets.
@mastersthesis{diva2:551505,
author = {Budde, Kiran Kumar},
title = {{A Matlab Toolbox for fMRI Data Analysis: Detection, Estimation and Brain Connectivity}},
school = {Linköping University},
type = {{LiTH-ISY-EX--12/4600--SE}},
year = {2012},
address = {Sweden},
}
Autonomous vehicles have many possible applications in different fields, such as rescue missions, the exploration of unknown environments and unmanned vehicles. For such a system to navigate in a safe manner, high requirements on reliability and security must be fulfilled.
This master's thesis explores the possibility of using a convolutional network, a machine learning algorithm, on a robotic platform for autonomous path following. The only input used to predict the steering signal is a monochromatic image taken by a camera mounted on the robotic car, pointing in the steering direction. The convolutional network learns from demonstrations in a supervised manner.
In this thesis three different preprocessing options are evaluated. The evaluation is based on the quadratic error and the number of correctly predicted classes. The results show that the convolutional network has no problem learning a correct behaviour and achieves good results when evaluated on data similar to that which it has been trained on. The results also show that the preprocessing options are not enough to make the system independent of the environment.
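The sketch below is a modern PyTorch stand-in for the supervised behaviour-cloning setup described above: a small convolutional network maps a monochrome camera image to a discrete steering class and is trained from demonstrated steering commands. The architecture, input size and number of classes are placeholder assumptions, not the network used in the thesis.

```python
# Hedged PyTorch stand-in for the behaviour-cloning setup described above:
# a small CNN maps a monochrome image to a discrete steering class.
# Layer sizes and the number of classes are placeholders.
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(16 * 4 * 4, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SteeringNet()
images = torch.rand(8, 1, 96, 96)              # a batch of monochrome frames
targets = torch.randint(0, 5, (8,))            # demonstrated steering classes
loss = nn.CrossEntropyLoss()(model(images), targets)
loss.backward()                                # supervised learning from demonstrations
print(float(loss))
```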
@mastersthesis{diva2:534610,
author = {Schmiterlöw, Maria},
title = {{Autonomous Path Following Using Convolutional Networks}},
school = {Linköping University},
type = {{LiTH-ISY-EX--12/4577--SE}},
year = {2012},
address = {Sweden},
}
This is a master thesis of the Master of Science degree program in Applied Physics and Electrical Engineering (Y) at Linköping University. The goal of the project is to develop an application for creating a map in real time from a video camera on a miniature unmanned aerial vehicle. This thesis project and report is a first exploratory study for this application. It implements a prototype method and evaluates it on sample sequences from an on-board video camera. The method first looks for good points to follow in the image and then tracks them in a sequence. The image is then pasted, or merged, together with previous images so that points from the different images align.
Two methods to find good points to follow are examined, with focus on real-time performance. The result is that the much faster FAST detector yielded results good enough to replace the slower standard Harris-Stephens corner detector.
It is also examined whether it is possible to assume that the ground is a flat surface in this application, or if a computationally more expensive method estimating altitude information has to be used. The result is that at high altitudes, or when the ground is close to flat in reality and the camera points straight downwards, a two-dimensional method will do. When flying lower or with tall objects in the picture, which is often the case in this application, it must be taken into account that the points really are at different heights; hence the ground cannot be assumed to be flat.
@mastersthesis{diva2:514063,
author = {Wolkesson, Henrik},
title = {{Realtime Mosaicing of Video Stream from $\mu$UAV}},
school = {Linköping University},
type = {{LiTH-ISY-EX--07/4140--SE}},
year = {2012},
address = {Sweden},
}
In today's industry 3D cameras are often used to inspect products. The camera produces both a 3D model and an intensity image by capturing a series of profiles of the object using laser triangulation. In many of these setups a physical encoder is attached to, for example, the conveyor belt that the product is travelling on. The encoder is used to get an accurate reading of the speed that the product has when it passes through the laser. Without this, the output image from the camera can be distorted due to a variation in velocity.
In this master thesis a method for integrating the functionality of this physical encoder into the software of the camera is proposed. The object is scanned together with a pattern; with the help of this pattern, the object can be restored to its original proportions.
@mastersthesis{diva2:455669,
author = {Josefsson, Mattias},
title = {{3D camera with built-in velocity measurement}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4523--SE}},
year = {2011},
address = {Sweden},
}
This report has investigated the possibility of automatically identifying ditches from airborne LiDAR data. The chosen identification method first creates a height image from the LiDAR data. It then extracts ditch candidates by vectorizing the result of a line detection. The properties of the ditch candidates are then computed through an analysis of height profiles for each individual candidate, where the height profiles are created from the original data. By filtering the candidates on their properties, ditch maps with user-specified ditch dimensions can be presented in a vector format that facilitates further use. The report describes how the algorithm has been implemented and also presents example results. After an analysis of the algorithm and suggestions for improvements, the main conclusion of the report is presented: automatic detection of ditches is possible.
@mastersthesis{diva2:456702,
author = {Wasell, Richard},
title = {{Automatisk detektering av diken i LiDAR-data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4524--SE}},
year = {2011},
address = {Sweden},
}
When patients move during an MRI examination, severe artifacts arise in the reconstructed image and motion correction is therefore often desired. An in-plane motion correction algorithm suitable for PRESTO-CAN, a new 3D functional MRI method where sampling of k-space is radial in kx-direction and kz-direction and Cartesian in ky-direction, was implemented in this thesis work.
Rotation and translation movements can be estimated and corrected for separately, since the magnitude of the data is only affected by the rotation. The data were sampled in a radial pattern, and the rotation was estimated by finding the translation in the angular direction using circular correlation. Correlation was also used when finding the translations in the x- and z-directions.
The motion correction algorithm was evaluated on computer simulated data, the motion was detected and corrected for, and this resulted in images with greatly reduced artifacts due to patient movements.
@mastersthesis{diva2:456354,
author = {Karlsson, Anette},
title = {{In-Plane Motion Correction in Reconstruction of non-Cartesian 3D-functional MRI}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4480--SE}},
year = {2011},
address = {Sweden},
}
MRI (Magnetic Resonance Imaging) is a medical imaging method that uses magnetic fields in order to retrieve images of the human body. This thesis revolves around a novel acquisition method for 3D fMRI (functional Magnetic Resonance Imaging) called PRESTO-CAN, which uses a radial pattern to sample the (kx,kz)-plane of k-space (the frequency domain) and a Cartesian sample pattern in the ky-direction. The radial sample pattern allows for a denser sampling of the central parts of k-space, which contain the most basic frequency information about the structure of the recorded object. This allows higher temporal resolution to be achieved compared with other sampling methods, since fewer total samples are needed in order to retrieve enough information about how the object has changed over time. Since fMRI is mainly used for monitoring blood flow in the brain, increased temporal resolution means that fast changes in brain activity can be tracked more efficiently.
The temporal resolution can be further improved by reducing the time needed for scanning, which in turn can be achieved by applying parallel imaging. One such parallel imaging method is SENSE (SENSitivity Encoding). The scan time is reduced by decreasing the sampling density, which causes aliasing in the recorded images. The aliasing is removed by the SENSE method by utilizing the extra information provided by the fact that multiple receiver coils with differing sensitivities are used during the acquisition. By measuring the sensitivities of the respective receiver coils and solving an equation system with the aliased images, it is possible to calculate how they would have looked without aliasing.
In this master thesis, SENSE has been successfully implemented in PRESTO-CAN. By using normalized convolution to refine the sensitivity maps of the receiver coils, images of satisfying quality could be reconstructed when reducing the k-space sample rate by a factor of 2, and images of relatively good quality also when the sample rate was reduced by a factor of 4. In this way, this thesis has contributed to the improvement of the temporal resolution of the PRESTO-CAN method.
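The sketch below illustrates the SENSE unfolding step for a reduction factor of 2: pairs of pixels that alias onto each other are separated by solving, per pixel, a small least-squares system built from the coil sensitivities. The sensitivity maps and image are synthetic placeholders, and k-space simulation, normalized convolution and the PRESTO-CAN sampling pattern are all omitted.

```python
# Hedged sketch of SENSE unfolding for reduction factor R = 2: two pixels that
# alias onto the same location are separated per pixel by a least-squares
# system built from the coil sensitivities. All data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
ny, nx, ncoils, R = 64, 64, 4, 2
truth = rng.random((ny, nx))                          # fully sampled image
sens = rng.random((ncoils, ny, nx)) + 0.1             # coil sensitivities (smooth in reality)

# Simulate aliased coil images: rows y and y + ny/R fold on top of each other.
aliased = np.zeros((ncoils, ny // R, nx))
for c in range(ncoils):
    coil_img = sens[c] * truth
    aliased[c] = coil_img[: ny // R] + coil_img[ny // R :]

# Unfold: for every aliased pixel, solve an (ncoils x R) least-squares system.
recon = np.zeros((ny, nx))
for y in range(ny // R):
    for x in range(nx):
        A = np.stack([sens[:, y, x], sens[:, y + ny // R, x]], axis=1)
        b = aliased[:, y, x]
        sol, *_ = np.linalg.lstsq(A, b, rcond=None)
        recon[y, x], recon[y + ny // R, x] = sol
print("max reconstruction error:", np.abs(recon - truth).max())
```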
@mastersthesis{diva2:423964,
author = {Ahlman, Gustav},
title = {{Improved Temporal Resolution Using Parallel Imaging in Radial-Cartesian 3D functional MRI}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4470--SE}},
year = {2011},
address = {Sweden},
}
Today, 3D models of cities are created from aerial images using a camera rig. Images, together with sensor data from the flights, are stored for further processing when building 3D models. However, there is a market demand for a more mobile solution of satisfactory quality. If the camera position can be calculated for each image, there is an existing algorithm available for the creation of 3D models.
This master thesis project aims to investigate whether the iPhone 4 offers image and sensor data of good enough quality for 3D models to be created from them. Calculations of movements and rotations from sensor data form the foundation of the image processing and should refine the camera position estimations.
The 3D models are built from image processing only, since the sensor data cannot be used due to poor accuracy. Because of that, the scaling of the 3D models is unknown, and a measurement of the real objects is needed to make scaling possible. Compared to a test algorithm that calculates 3D models from images only, already available in the SBD's system, the quality of the 3D model in this master thesis project is, judged by the human eye, almost the same or in some respects even better.
@mastersthesis{diva2:452945,
author = {Lundqvist, Tobias},
title = {{3D mapping with iPhone}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4517--SE}},
year = {2011},
address = {Sweden},
}
In this thesis, an investigation was performed to find ways of differentiating between fires and vehicles at waste stations, in the hope of removing vehicles as a source of error during early fire detection. The existing system makes use of a heat camera, which rotates through 48 different angles (also known as zones) in a fixed position. If the heat is above a certain value within a zone, the system sounds the fire alarm. The rotation of the camera results in an unwanted displacement between two successive frames within the same zone. By use of image registration, this displacement was removed. After the registration of an image, segmentation was performed, in which cold objects are eliminated as an error source. Lastly, an analysis was performed on the warm objects. In the end, it was shown that the image registration was a successful improvement of the existing system. It was also shown that vehicles can, to some extent, be eliminated as an error source.
@mastersthesis{diva2:446792,
author = {Söderström, Rikard},
title = {{An early fire detection system through registration and analysis of waste station IR-images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4354--SE}},
year = {2011},
address = {Sweden},
}
Medical imaging is an important tool for diagnosis and treatment planning today. However, as the demand for efficiency increases at the same time as data volumes grow immensely, the need for computer-assisted analysis, such as image segmentation, to help and guide the practitioner increases.
Medical image segmentation can be used for various tasks; the localization and delineation of pathologies such as cancer tumors is just one example. Numerous problems with noise and image artifacts in the generated images make segmentation a difficult task, and the developer is forced to choose between speed and performance. In clinical practice, however, this is impossible as both speed and performance are crucial. One solution to this problem might be to involve the user more in the segmentation, using interactive algorithms where the user can influence the segmentation for an improved result.
This thesis has concentrated on finding a fast and interactive segmentation method for liver tumor segmentation. Various different methods were explored, and a few were chosen for implementation and further development. Two methods appeared to be the most promising, Bayesian Region Growing (BRG) and Level Set.
An interactive Level Set algorithm emerged as the best alternative for the interactivity of the algorithm, and could be used in combination with both BRG and Level Set. A new data term based on a probability model instead of image edges was also explored for the Level Set method, and proved to be more promising than the original one. The probability-based Level Set and the BRG method both provided good-quality results, but the faster of the two was the BRG method, which could segment a tumor present in 25 CT image slices in less than 10 seconds when implemented in Matlab and mex-C++ code on an ACPI x64-based PC with two 2.4 GHz Intel(R) Core(TM)2 CPUs and 8 GB of RAM. The interactive Level Set could be successfully used as an interactive addition to the automatic method, but its usefulness was somewhat reduced by its slow processing time (about 1.5 s/slice) and the relative complexity of the needed user interactions.
@mastersthesis{diva2:438557,
author = {Thomasson, Viola},
title = {{Liver Tumor Segmentation Using Level Sets and Region Growing}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4485--SE}},
year = {2011},
address = {Sweden},
}
Most mobile video-recording devices of today, e.g. cell phones and music players, make use of a rolling shutter camera. A rolling shutter camera captures video by recording every frame line-by-line from top to bottom of the image, leading to image distortions in situations where either the device or the target is moving. Recording video by hand also leads to visible frame-to-frame jitter.
In this thesis, methods to decrease distortion caused by the motion of a video-recording device with a rolling shutter camera are presented. The methods are based on estimating the orientation of the camera from gyroscope and accelerometer measurements.
The algorithms are implemented on the iPod Touch 4, and the resulting videos are compared to those of competing stabilization software, both commercial and free, in a series of blind experiments. The results from this user study show that the methods presented in the thesis perform as well as or better than the others.
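The sketch below shows the core of such an approach: integrating gyroscope angular rates into a unit quaternion that represents the camera orientation over time, which can then drive rolling-shutter correction and frame-to-frame stabilization. It is a hedged, minimal example; accelerometer fusion, bias estimation and camera/gyro time synchronization, which the thesis also has to handle, are omitted, and the sample data are placeholders.

```python
# Hedged sketch of the core orientation estimate: gyroscope angular rates are
# integrated into a unit quaternion. Accelerometer fusion, bias handling and
# camera/gyro synchronization are omitted; the gyro samples are placeholders.
import numpy as np

def quat_mult(q, r):
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return np.array([
        w0*w1 - x0*x1 - y0*y1 - z0*z1,
        w0*x1 + x0*w1 + y0*z1 - z0*y1,
        w0*y1 - x0*z1 + y0*w1 + z0*x1,
        w0*z1 + x0*y1 - y0*x1 + z0*w1])

def integrate_gyro(q, omega, dt):
    """Advance orientation q by angular rate omega (rad/s, body frame) over dt."""
    dq = np.concatenate(([0.0], omega))
    q = q + 0.5 * dt * quat_mult(q, dq)
    return q / np.linalg.norm(q)

q = np.array([1.0, 0.0, 0.0, 0.0])                 # identity orientation
gyro_samples = [np.array([0.0, 0.0, 0.5])] * 200   # 0.5 rad/s yaw, placeholder data
for omega in gyro_samples:
    q = integrate_gyro(q, omega, dt=0.005)
print(q)   # ~1 s at 0.5 rad/s yaw: roughly [cos(0.25), 0, 0, sin(0.25)]
```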
@mastersthesis{diva2:420914,
author = {Hanning, Gustav},
title = {{Video Stabilization and Rolling Shutter Correction using Inertial Measurement Sensors}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4464--SE}},
year = {2011},
address = {Sweden},
}
Monitoring wear particles in lubricating oils allows specialists to evaluate the health and functionality of a mechanical system. The main analysis techniques available today are manual particle analysis and automatic optical analysis. Manual particle analysis is effective and reliable since the analyst continuously sees what is being counted. The drawback is that the technique is quite time-demanding and dependent on the skills of the analyst. Automatic optical particle counting constitutes a closed system that does not allow the counted objects to be observed in real time. This has resulted in a number of sources of error for the instrument. In this thesis a new method for counting particles, based on light microscopy with image analysis, is proposed. It has proven to be a fast and effective method that eliminates the sources of error of the previously described methods. The new method correlates very well with manual analysis, which is used as a reference method throughout this study. Size estimation of particles and detection of metallic particles have also been shown to be possible with the current image analysis setup. With more advanced software and analysis instrumentation, the image analysis method could be further developed into a decision-based machine allowing for declarations about which wear mode is occurring in a mechanical system.
@mastersthesis{diva2:420518,
author = {Ceco, Ema},
title = {{Image Analysis in the Field of Oil Contamination Monitoring}},
school = {Linköping University},
type = {{LITH-ISY-EX--11/4467--SE}},
year = {2011},
address = {Sweden},
}
In this master thesis, a model-based video coding algorithm has been developed that uses input from a colour and depth camera, such as the Microsoft Kinect. Using a model-based representation of a video has several advantages over the commonly used block-based approach, used by the H.264 standard. For example, videos can be rendered in 3D, be viewed from alternative views, and have objects inserted into them for augmented reality and user interaction.
This master thesis demonstrates a very efficient way of encoding the geometry of a scene. The results of the proposed algorithm show that it can reach very low bitrates with comparable results to the H.264 standard.
@mastersthesis{diva2:420400,
author = {Sandberg, David},
title = {{Model-Based Video Coding Using a Colour and Depth Camera}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4463--SE}},
year = {2011},
address = {Sweden},
}
In this master thesis the possibility of detecting and tracking objects in multispectral infrared video sequences is investigated. The current method, with fixed-size rectangles, has significant disadvantages. These disadvantages are addressed using image segmentation to estimate the shape of the object. The result of the image segmentation is used to determine the infrared contrast of the object. Our results show that some objects give very good segmentation, tracking and shape detection. The objects that perform best are the flares and countermeasures, but helicopters seen from the side, with significant movement, are in particular better detected with our method. The motion of the object is very important, since movement is the main component in successful shape detection; this is because helicopters are much colder than flares and engines. Detecting the presence and position of moving objects is easier and can be done quite successfully even for helicopters. Using structure tensors we can also detect the presence and estimate the position of stationary objects.
@mastersthesis{diva2:415941,
author = {Möller, Sebastian},
title = {{Image Segmentation and Target Tracking using Computer Vision}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4424--SE}},
year = {2011},
address = {Sweden},
}
3D cameras delivering height data can be used for quality inspection of goods on a conveyor.
It is then of interest to distinguish the important parts of the image from background and noise and further to divide these interesting parts into segments that have a strong correlation to objects on the conveyor belt.
Segmentation can easily be done by thresholding in the simple case. However, in more complex situations, for example when objects touch or overlap, this does not work well.
In this thesis, research and evaluation of a few different methods for segmentation of height image data are presented. The focus is to find an accurate method for segmentation of smooth irregularly shaped organic objects such as vegetables or shellfish.
For evaluative purposes a database consisting of height images depicting a variety of such organic objects has been collected.
We show in the thesis that a conventional gradient magnitude method is hard to beat in the general case. If, however, the objects to be segmented are heavily non-convex with a lot of crests and valleys within themselves one could be better off choosing a normalized least squares method.
@mastersthesis{diva2:393236,
author = {Schöndell, Andreas},
title = {{Evaluation of methods for segmentation of 3D range image data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--11/4346--SE}},
year = {2011},
address = {Sweden},
}
This thesis describes the development of a robotic platform for evaluation of gaze stabilization algorithms, built for the Sensorimotor Systems Laboratory at the University of British Columbia. The primary focus of the work was to measure the performance of a biomimetic vestibulo-ocular reflex controller for gaze stabilization using cerebellar feedback. A flexible robotic system was designed and built in order to run reproducible test sequences at high speeds, featuring three-dimensional linear movement and rotation around the vertical axis. On top of the robot head, a 1 DOF camera head can be independently controlled by a stabilization algorithm implemented in Simulink. Vestibular input is provided by a 3-axis accelerometer and a 3-axis gyroscope. The video feed from the camera head is fed into a workstation computer running a custom image processing program which evaluates both the absolute and the relative movement of the images in the sequence. The absolute angles of tracked regions in the image are continuously returned, as well as the movement of the image sequence across the sensor in full 3 DOF camera rotation. Due to dynamic downsampling and noise suppression algorithms, very good performance was reached, enabling retinal slip estimation at 720 degrees per second. Two different controllers were implemented: one adaptive open-loop controller similar to Dean et al.'s work [12], and one reference implementation using closed-loop control and optimal linear estimation of reference angles. A sequence of tests was run in order to evaluate the performance of the two algorithms. The adaptive controller was shown to offer superior performance, dramatically reducing the movement of the image for all test sequences, and it also improved further as it was tuned over time.
@mastersthesis{diva2:359452,
author = {Landgren, Axel},
title = {{A robotic camera platform for evaluation of biomimetic gaze stabilization using adaptive cerebellar feedback}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4351--SE}},
year = {2010},
address = {Sweden},
}
This master thesis investigates the difficulties of constructing a depth map using one low resolution grayscale camera mounted in the front of a car. The goal is to produce a depth map in real-time to assist other algorithms in the safety system of a car. This has been shown to be difficult using the evaluated combination of camera position and choice of algorithms.
The main problem is to estimate an accurate optical flow. Another problem is to handle moving objects. The conclusion is that the implementations, mainly triangulation of corresponding points tracked using a Lucas Kanade tracker, provide information of too poor quality to be useful for the safety system of a car.
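For illustration only, point correspondences of the kind used above can be obtained with a pyramidal Lucas-Kanade tracker; the OpenCV sketch below is a generic stand-in, not the thesis implementation, and all parameter values are assumptions.

import cv2

def track_points(prev_gray, next_gray):
    # Corners that are easy to track in the first frame.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    # Pyramidal Lucas-Kanade flow from the first frame to the second.
    nxt, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None,
                                                winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return pts[ok].reshape(-1, 2), nxt[ok].reshape(-1, 2)

Given the camera motion between the frames, such correspondences could then be triangulated to obtain depth estimates.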
@mastersthesis{diva2:355971,
author = {Svensson, Fredrik},
title = {{Structure from Forward Motion}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4364--SE}},
year = {2010},
address = {Sweden},
}
This thesis treats topics within the area of object recognition. A real-time view matching method has been developed to compute the transformation between two different images of the same scene. This method uses a color based region detector called MSCR and affine transformations of these regions to create affine-invariant patches that are used as input to the SIFT algorithm. A parallel method to compute the SIFT descriptor has been created with relaxed constraints, so that the descriptor size and the number of histogram bins can be adjusted. Additionally, a matching step to deduce correspondences and a parallel RANSAC method have been created to estimate the transformation between the images from these descriptors. To achieve real-time performance, the implementation has been targeted to use the parallel nature of the GPU with CUDA as the programming language. Focus has been put on the architecture of the GPU to find the best way to parallelize the different processing steps. CUDA has also been combined with OpenGL to be able to use the hardware accelerated anisotropic sampling method for affine transformations of regions. Parts of the implementation can also be used individually, either from Matlab or by using the provided C++ library directly. The method was also evaluated in terms of accuracy and speed. It was shown that our algorithm has similar or better accuracy at finding correspondences than SIFT when the 3D geometry changes are large, but slightly worse results on images with flat surfaces.
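As a rough CPU-side reference for the matching and RANSAC steps (the thesis's CUDA/MSCR pipeline is not reproduced here), a standard OpenCV sketch might look as follows; the ratio-test threshold and reprojection tolerance are assumptions.

import cv2
import numpy as np

def match_and_estimate(img1, img2):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    # Ratio test on the two nearest neighbours keeps distinctive matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
            if m.distance < 0.75 * n.distance]
    src = np.float32([k1[m.queryIdx].pt for m in good])
    dst = np.float32([k2[m.trainIdx].pt for m in good])
    # RANSAC rejects outliers while estimating the image-to-image mapping.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, inliers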
@mastersthesis{diva2:345932,
author = {Lind, Anders},
title = {{High-speed View Matching using Region Descriptors}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4356--SE}},
year = {2010},
address = {Sweden},
}
The thesis presents an investigation of the potential of measuring plant condition from hyperspectral reflectance data. To do this, some linear methods for embedding the high dimensional hyperspectral data and performing regression to a plant condition space have been compared. A preprocessing step that aims at normalized illumination intensity in the hyperspectral images has been conducted, and some different methods for this purpose have also been compared. A large scale experiment has been conducted where tobacco plants have been grown and treated differently with respect to watering and nutrition. The treatment of the plants has served as ground truth for the plant condition. Four sets of plants have been grown one week apart and the plants have been measured at different ages up to the age of about five weeks. The thesis concludes that there is a relationship between the treatment of the plants and the spectral reflectance of their leaves, but the treatment has to be somewhat extreme to enable a useful treatment approximation from the spectrum. CCA is the proposed method for calculating the hyperspectral basis that is used to embed the hyperspectral data into the plant condition (treatment) space. A preprocessing method that uses a weighted normalization of the spectra for illumination intensity normalization is concluded to be the most powerful of the compared methods.
@mastersthesis{diva2:350907,
author = {Johansson, Peter},
title = {{Plant Condition Measurement from Spectral Reflectance Data}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4369--SE}},
year = {2010},
address = {Sweden},
}
Man portable air defence systems, MANPADS, pose a big threat to civilian and military aircraft. This thesis aims to find methods that could be used in a missile approach warning system based on infrared cameras.
The two main tasks of the completed system are to classify the type of missile, and also to estimate its position and velocity from a sequence of images.
The classification is based on hidden Markov models, one-class classifiers, and multi-class classifiers.
Position and velocity estimation uses a model of the observed intensity as a function of real intensity, image coordinates, distance and missile orientation. The estimation is made by an extended Kalman filter.
We show that fast classification of missiles based on radiometric data and a hidden Markov model is possible and works well, although more data would be needed to verify the results.
Estimating the position and velocity works fairly well if the initial parameters are known. Unfortunately, some of these parameters cannot be computed using the available sensor data.
@mastersthesis{diva2:323455,
author = {Holm Ovr\'{e}n, Hannes and Emilsson, Erika},
title = {{Missile approach warning using multi-spectral imagery}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4329--SE}},
year = {2010},
address = {Sweden},
}
Most people are familiar with the BRIO labyrinth game and the challenge of guiding the ball through the maze. The goal of this project was to use this game to create a platform for evaluation of control algorithms. The platform was used to evaluate a few different control algorithms, both traditional automatic control algorithms and algorithms based on online incremental learning.
The game was fitted with servo actuators for tilting the maze. A camera together with computer vision algorithms was used to estimate the state of the game. The evaluated control algorithm had the task of calculating a proper control signal, given the estimated state of the game.
The evaluated learning systems used traditional control algorithms to provide initial training data. After initial training, the systems learned from their own actions and after a while they outperformed the controller used to provide initial training.
@mastersthesis{diva2:322572,
author = {Öfjäll, Kristoffer},
title = {{LEAP, A Platform for Evaluation of Control Algorithms}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4370--SE}},
year = {2010},
address = {Sweden},
}
In this thesis an algorithm for producing saliency maps as well as an algorithm for detecting salient regions based on the saliency map was developed. The saliency values are computed as center-surround differences and a local descriptor called the region p-channel is used to represent center and surround respectively. An integral image representation called the integral p-channel is used to speed up extraction of the local descriptor for any given image region. The center-surround difference is calculated as either histogram or p-channel dissimilarities.
Ground truth was collected using human subjects and the algorithm’s ability to detect salient regions was evaluated against this ground truth. The algorithm was also compared to another saliency algorithm.
Two different center-surround interpretations are tested, as well as several p-channel and histogram dissimilarity measures. The results show that for all tested settings the best performing dissimilarity measure is the so-called diffusion distance. The performance comparison showed that the algorithm developed in this thesis outperforms the algorithm against which it was compared, both with respect to region detection and saliency ranking of regions. It can be concluded that the algorithm shows promising results and further investigation of the algorithm is recommended. A list of suggested approaches for further research is provided.
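For illustration, a center-surround saliency value computed from plain gray-level histograms with a chi-square dissimilarity could look like the sketch below; the p-channel machinery, the integral-image speed-up and the diffusion distance favoured in the thesis are deliberately not reproduced, and the radii are assumptions.

import numpy as np

def histogram(patch, bins=16):
    h, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def center_surround_saliency(gray, y, x, r_in=8, r_out=24):
    center = gray[y - r_in:y + r_in, x - r_in:x + r_in]
    surround = gray[y - r_out:y + r_out, x - r_out:x + r_out]
    hc, hs = histogram(center), histogram(surround)
    # Chi-square dissimilarity between center and surround distributions.
    return 0.5 * np.sum((hc - hs) ** 2 / (hc + hs + 1e-9))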
@mastersthesis{diva2:291472,
author = {Tuttle, Alexander},
title = {{Saliency Maps using Channel Representations}},
school = {Linköping University},
type = {{LITH-ISY-EX--10/4169--SE}},
year = {2010},
address = {Sweden},
}
Foreground segmentation is a common first step in tracking and surveillance applications. The purpose of foreground segmentation is to provide later stages of image processing with an indication of where interesting data can be found. This thesis is an investigation of how foreground segmentation can be performed in two contexts: as a pre-step to trajectory tracking and as a pre-step in indoor surveillance applications.
Three methods are selected and detailed: a single Gaussian method, a Gaussian mixture model method, and a codebook method. Experiments are then performed on typical input video using the methods. It is concluded that the Gaussian mixture model produces the output which yields the best trajectories when used as input to the trajectory tracker. An extension is proposed to the Gaussian mixture model which reduces shadows, improving the performance of foreground segmentation in the surveillance context.
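As an aside, a Gaussian mixture background subtractor with built-in shadow labelling is readily available in OpenCV; the sketch below is only a generic illustration of the technique, not the extension proposed in the thesis.

import cv2

def foreground_masks(frames):
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                    varThreshold=16,
                                                    detectShadows=True)
    for frame in frames:
        mask = subtractor.apply(frame)
        # MOG2 labels shadow pixels 127; keep only confident foreground (255).
        yield (mask == 255).astype('uint8') * 255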
@mastersthesis{diva2:285807,
author = {Molin, Joel},
title = {{Foreground Segmentation of Moving Objects}},
school = {Linköping University},
type = {{LiTH-ISY-EX--10/4299--SE}},
year = {2010},
address = {Sweden},
}
Within this thesis an algorithm for object recognition called Cluster Matching has been developed, implemented and evaluated. The image information is sampled at arbitrary sample points, instead of interest points, and local image features are extracted. These sample points are used as a compact representation of the image data and can quickly be searched for prior known objects. The algorithm is evaluated on a test set of images and the result is surprisingly reliable and time efficient.
@mastersthesis{diva2:284633,
author = {Lennartsson, Mattias},
title = {{Object Recognition with Cluster Matching}},
school = {Linköping University},
type = {{LITH-ISY-EX--09/4152--SE}},
year = {2009},
address = {Sweden},
}
Time of flight (ToF) is an imaging technique that uses depth information to capture 3D information in a scene. Recent developments in the technology have made ToF cameras more widely available and practical to work with. The cameras now enable real-time 3D imaging and positioning in a compact unit, making the technology suitable for a variety of object recognition tasks.
An object recognition system for locating teats is at the center of the DeLaval VMS, which is a fully automated system for milking cows. By implementing ToF technology as part of the visual detection procedure, it would be possible to locate and track the positions of all four teats in real time and potentially provide an improvement compared with the current system.
The developed algorithm for teat detection is able to locate teat-shaped objects in scenes and extract information about their position, width and orientation. These parameters are determined with an accuracy of millimeters. The algorithm also shows promising results when tested on real cows. Although it detects many false positives, the algorithm correctly detected 171 out of 232 visible teats in a test set of real cow images. This result is a satisfying proof of concept and shows the potential of ToF technology in the field of automated milking.
@mastersthesis{diva2:224321,
author = {Westberg, Michael},
title = {{Time of Flight Based Teat Detection}},
school = {Linköping University},
type = {{LiTH-ISY-EX--09/4154 --SE}},
year = {2009},
address = {Sweden},
}
This thesis is about improving the image quality of image sequences scanned by the film scanner GoldenEye. Film grain is often seen as an artistic effect in film sequences, but scanned images can be grainier or noisier than intended. To remove grain and noise as well as sharpen the images, a few known image enhancement methods have been implemented, tested and evaluated. A thresholding method of our own, based on the dyadic wavelet transform, has also been tested. MATLAB has been used as the benchmark environment, but one method has also been implemented in C/C++. Some of the methods perform satisfactorily in terms of image quality, but none of them is satisfactory in terms of time consumption. To address this, a few speed-up ideas are suggested at the end of the thesis. A method to correct the color of the sequences has also been suggested.
@mastersthesis{diva2:210478,
author = {Stuhr, Lina},
title = {{Grain Reduction in Scanned Image Sequences under Time Constraints}},
school = {Linköping University},
type = {{LiTH-ISY-EX--09/4203--SE}},
year = {2009},
address = {Sweden},
}
Gaze tracking is the estimation of the point in space a person is “looking at”. This is widely used in both diagnostic and interactive applications, such as visual attention studies and human-computer interaction. The most common commercial solution used to track gaze today uses a combination of infrared illumination and one or more cameras. These commercial solutions are reliable and accurate, but often expensive. The aim of this thesis is to construct a simple single-camera gaze tracker from off-the-shelf components. The method used for gaze tracking is based on infrared illumination and a schematic model of the human eye. The user’s gaze point is estimated from images of the reflections of specific light sources in the surfaces of the eye. Evaluation is performed on the software and hardware components separately, and on the system as a whole. Accuracy is measured in spatial and angular deviation, and the result is an average accuracy of approximately one degree on synthetic data and 0.24 to 1.5 degrees on real images at a range of 600 mm.
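As a small illustration of how such angular accuracy figures can be computed, the helper below measures the angle between an estimated and a true gaze direction; the example vectors are assumptions, chosen so that a 10 mm offset at 600 mm corresponds to roughly one degree.

import numpy as np

def angular_error_deg(gaze_est, gaze_true):
    a = np.asarray(gaze_est, float) / np.linalg.norm(gaze_est)
    b = np.asarray(gaze_true, float) / np.linalg.norm(gaze_true)
    return np.degrees(np.arccos(np.clip(np.dot(a, b), -1.0, 1.0)))

print(angular_error_deg([0.0, 0.0, 600.0], [10.0, 0.0, 600.0]))  # about 0.95 degrees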
@mastersthesis{diva2:209626,
author = {Wallenberg, Marcus},
title = {{A Single-Camera Gaze Tracker using Controlled Infrared Illumination}},
school = {Linköping University},
type = {{LITH-ISY-EX--09/4199--SE}},
year = {2009},
address = {Sweden},
}
The PRESTO sequence is a well-known 3-D fMRI imaging sequence. In this sequence the echo planar imaging technique is merged with the echo-shift technique. This combination results in a very fast image acquisition, which is required for fMRI examinations of neural activation in the human brain. The aim of this work was to use the basic Cartesian PRESTO sequence as a framework when developing a novel trajectory using a non-Cartesian grid.
Our new pulse sequence, PRESTO CAN, rotates the k-space profiles around the ky-axis in a non-Cartesian manner. This results in a high sampling density close to the centre of the k-space, and at the same time it provides sparser data collection in the parts of the k-space that contain less useful information. This "can- or cylinder-like" pattern is expected to result in a much faster k-space acquisition without losing important spatial information.
A new reconstruction algorithm was also developed. The purpose was to be able to construct an image volume from data obtained using the novel PRESTO CAN sequence. This reconstruction algorithm was based on the gridding technique, and a Kaiser-Bessel window was used in order to re-sample the data onto a Cartesian grid. This was required to make 3-D Fourier transformation possible. In addition, simulations were performed in order to verify the function of the reconstruction algorithm. Furthermore, in vitro tests showed that the development of the PRESTO CAN sequence and the corresponding reconstruction algorithm was highly successful.
In the future, the results can relatively easily be extended and generalized for in vivo investigations. In addition, there are numerous exciting possibilities for extending the basic techniques described in this thesis.
@mastersthesis{diva2:397232,
author = {Thyr, Per},
title = {{Method for Acquisition and Reconstruction of non-Cartesian 3-D fMRI}},
school = {Linköping University},
type = {{LITH-ISY-EX--08/4058--SE}},
year = {2008},
address = {Sweden},
}
Navigation is normally based on an Internal Navigational System and a Global Navigational Satellite System (GNSS). In navigational warfare the GNSS can be jammed, so a third navigational system is needed. The system evaluated in this thesis is camera based navigation, in which the position is determined from a video camera and a sensor reference. This thesis deals with the matching between the sensor reference and the video image.
Two methods have been implemented: normalized cross correlation and position determination through a homography. Normalized cross correlation produces a correlation matrix between the sensor reference and the video image. The other method uses point correspondences between the images to estimate a homography, from which a position is obtained; the more point correspondences that are available, the better the position determination becomes.
The results have been quite good: the methods find the correct position when the Euler angles of the UAV are known, and normalized cross correlation is the best of the tested methods.
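For illustration only, the normalized cross correlation step can be prototyped with OpenCV's template matching; the function below is a generic stand-in for matching a camera frame against a larger sensor reference, and the naming is an assumption.

import cv2

def locate_in_reference(reference_gray, frame_gray):
    # Normalized cross correlation surface: one score per candidate position.
    scores = cv2.matchTemplate(reference_gray, frame_gray, cv2.TM_CCORR_NORMED)
    _, best, _, (x, y) = cv2.minMaxLoc(scores)
    return (x, y), best  # top-left corner of the best match and its score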
@mastersthesis{diva2:128466,
author = {Olgemar, Markus},
title = {{Camera Based Navigation:
Matching between Sensor reference and Video image}},
school = {Linköping University},
type = {{LITH-ISY-EX--08/4170--SE}},
year = {2008},
address = {Sweden},
}
The graphics processor has progressed rapidly in recent years, largely because of the demands from computer games for speed and image quality. Because of its special architecture, the graphics processor is much faster at solving parallel problems than a conventional processor, and its increasing programmability makes it possible to use it for tasks other than those it was originally designed for.
Even though graphics processors have been programmable for some time, it has been quite difficult to learn how to use them. CUDA enables the programmer to use C-code, with a few extensions, to program NVIDIA’s graphics processor and completely skip the traditional programming models. This thesis investigates if the graphics processor can be used for calculations without knowledge of how the hardware mechanisms work. An image processing algorithm calculating the optical flow has been implemented. The result shows that it is rather easy to implement programs using CUDA, but some knowledge of how the graphics processor works is required to achieve high performance.
@mastersthesis{diva2:127132,
author = {Ringaby, Erik},
title = {{Optical Flow Computation on Compute Unified Device Architecture}},
school = {Linköping University},
type = {{LiTH-ISY-EX--08/4043--SE}},
year = {2008},
address = {Sweden},
}
This master thesis has been conducted at the National Laboratory of Forensic Science (SKL) in Linköping. When images to be analyzed at SKL, showing an object of interest, are of poor quality, there may be a need to enhance them. If several images of the object are available, the combined information can be used to estimate a single enhanced image. A program to do this has been developed by studying methods for image registration and high-resolution image estimation, and tests of important parts of the procedure have been conducted. The final results are satisfying, and the key to a good high-resolution image seems to be the precision of the image registration; improvements of this part may lead to even better results. Further suggestions for improvement have also been proposed.
@mastersthesis{diva2:390,
author = {Karelid, Mikael},
title = {{Image Enhancement over a Sequence of Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--08/4013--SE}},
year = {2008},
address = {Sweden},
}
The purpose of this master thesis was to study computer vision algorithms for vehicle detection in monochrome images captured by a mono camera. The work has mainly been focused on detecting rear-view cars in daylight conditions. Previous work in the literature has been reviewed, and algorithms based on edges, shadows and motion as vehicle cues have been modified, implemented and evaluated. This work presents a combination of a multiscale edge based detection and a shadow based detection as the most promising algorithm, with a positive detection rate of 96.4% on vehicles at distances between 5 m and 30 m. For the algorithm to work in a complete system for vehicle detection, future work should be focused on developing a vehicle classifier to reject false detections.
@mastersthesis{diva2:18234,
author = {Lundagårds, Marcus},
title = {{Vehicle Detection in Monochrome Images}},
school = {Linköping University},
type = {{LiTH-ISY-EX--08/4148--SE}},
year = {2008},
address = {Sweden},
}
In this thesis it is examined whether the pose of an object can be determined by a system trained with a synthetic 3D model of said object. A number of variations of methods using P-channel representation are examined. Reference images are rendered from the 3D model, and features such as gradient orientation and color information are extracted and encoded into P-channels. The P-channel representation is then used to estimate an overlapping channel representation, using B1-spline functions, to estimate a density function for the feature set. Experiments were conducted with this representation as well as the raw P-channel representation in conjunction with a number of distance measures and estimation methods.
It is shown that, with correct preprocessing and choice of parameters, the pose can be detected with some accuracy and, if not in real-time, fast enough to be useful in a tracker initialization scenario. It is also concluded that the success rate of the estimation depends heavily on the nature of the object.
@mastersthesis{diva2:17521,
author = {Berg, Martin},
title = {{Pose Recognition for Tracker Initialization Using 3D Models}},
school = {Linköping University},
type = {{LiTH-ISY-EX--07/4076--SE}},
year = {2008},
address = {Sweden},
}
In this thesis spacetime analysis is applied to laser triangulation in an attempt to eliminate certain artifacts caused mainly by reflectance variations of the surface being measured. It is shown that spacetime analysis does eliminate these artifacts almost completely; it is also shown that, thanks to the spacetime analysis, the shape of the laser beam used is no longer critical, and that in some cases the laser could probably even be exchanged for a non-coherent light source. Furthermore, experiments of running the derived algorithm on a GPU (Graphics Processing Unit) are conducted with very promising results.
The thesis starts by deriving the theory needed for doing spacetime analysis in a laser triangulation setup, taking perspective distortions into account; then several experiments evaluating the method are conducted.
@mastersthesis{diva2:17262,
author = {Benderius, Björn},
title = {{Laser Triangulation Using Spacetime Analysis}},
school = {Linköping University},
type = {{LITH-ISY-EX--07/4047--SE}},
year = {2007},
address = {Sweden},
}
In this thesis, two real-time stereo methods have been implemented and evaluated. The first one is based on blockmatching and the second one is based on local phase. The goal was to be able to run the algorithms in real time and to examine which one performs best. The blockmatching method performed better than the phase based method, both in speed and accuracy. SIMD operations (Single Instruction Multiple Data) have been used in the processor, giving a speed boost by a factor of two.
@mastersthesis{diva2:16992,
author = {Arvidsson, Lars},
title = {{Stereoseende i realtid}},
school = {Linköping University},
type = {{LITH-ISY-EX--07/3944--SE}},
year = {2007},
address = {Sweden},
}
Today, tool center point calibration is mostly done by a manual procedure. The method is very time consuming and the result may vary depending on how skilled the operators are.
This thesis proposes a new automated iterative method for tool center point calibration of industrial robots, by making use of computer vision and image processing techniques. The new method has several advantages over the manual calibration method. Experimental verifications have shown that the proposed method is much faster, still delivering a comparable or even better accuracy. The setup of the proposed method is very easy, only one USB camera connected to a laptop computer is needed and no contact with the robot tool is necessary during the calibration procedure.
The method can be split into three different parts. Initially, the transformation between the robot wrist and the tool is determined by solving a closed loop of homogeneous transformations. Second, an image segmentation procedure is described for finding point correspondences on a rotation symmetric robot tool. The image segmentation part is necessary for performing a measurement with six degrees of freedom of the camera-to-tool transformation. The last part of the proposed method is an iterative procedure which automates an ordinary four point tool center point calibration algorithm. The iterative procedure ensures that the accuracy of the tool center point calibration only depends on the accuracy of the camera when registering a movement between two positions.
@mastersthesis{diva2:23964,
author = {Hallenberg, Johan},
title = {{Robot Tool Center Point Calibration using Computer Vision}},
school = {Linköping University},
type = {{LiTH-ISY-EX-- 07/3943--SE}},
year = {2007},
address = {Sweden},
}
A common problem when using background models to segment moving objects from video sequences is that objects' cast shadows usually differ significantly from the background and therefore get detected as foreground. This causes several problems when extracting and labeling objects, such as object shape distortion and several objects merging together. The purpose of this thesis is to explore various possibilities to handle this problem.
Three methods for statistical background modeling are reviewed. All methods work on a per pixel basis, the first is based on approximating the median, the next on using Gaussian mixture models, and the last one is based on channel representation. It is concluded that all methods detect cast shadows as foreground.
A study of existing methods to handle cast shadows has been carried out in order to gain knowledge on the subject and get ideas. A common approach is to transform the RGB-color representation into a representation that separates color into intensity and chromatic components in order to determine whether or not newly sampled pixel-values are related to the background. The color spaces HSV, IHSL, CIELAB, YCbCr, and a color model proposed in the literature (Horprasert et al.) are discussed and compared for the purpose of shadow detection. It is concluded that Horprasert's color model is the most suitable for this purpose.
The thesis ends with a proposal of a method to combine background modeling using Gaussian mixture models with shadow detection using Horprasert's color model. It is concluded that, while not perfect, such a combination can be very helpful in segmenting objects and detecting their cast shadow.
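A much simplified sketch of the brightness/chromaticity decomposition in Horprasert et al.'s colour model is given below; the per-channel normalisation and all thresholds are assumptions, and the full statistical normalisation of the original paper is omitted.

import numpy as np

def classify_pixel(I, E, sigma, cd_thresh=3.0, a_lo=0.5, a_hi=1.1):
    I, E, s = (np.asarray(v, float) for v in (I, E, sigma))
    # Brightness distortion: the scalar that best scales the expected
    # background colour E towards the observation I (least squares).
    alpha = np.sum(I * E / s**2) / np.sum((E / s)**2)
    # Chromaticity distortion: residual orthogonal to the brightness axis.
    cd = np.linalg.norm((I - alpha * E) / s)
    if cd > cd_thresh or alpha < a_lo:
        return 'foreground'
    if alpha < 1.0:
        return 'shadow'      # same chromaticity as the background, but darker
    if alpha > a_hi:
        return 'highlight'
    return 'background'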
@mastersthesis{diva2:23393,
author = {Wood, John},
title = {{Statistical Background Models with Shadow Detection for Video Based Tracking}},
school = {Linköping University},
type = {{LITH-ISY-EX--07/3921--SE}},
year = {2007},
address = {Sweden},
}
The objective of this thesis is to investigate if it is possible to use stereo vision to find and track the players and the ball during a football game.
The thesis shows that it is possible to detect all players that are not too occluded by another player. Situations where a player is occluded by another player are handled by tracking the players from frame to frame.
The ball is also detected in most frames by looking for ball-like features. As with the players, the ball is tracked from frame to frame, so that when the ball is occluded its position is estimated by the tracker.
@mastersthesis{diva2:23152,
author = {Borg, Johan},
title = {{Detecting and Tracking Players in Football Using Stereo Vision}},
school = {Linköping University},
type = {{LiTH-ISY-EX--07/3535--SE}},
year = {2007},
address = {Sweden},
}
The increased usage of infrared sensors by pilots has created a growing demand for simulated environments based on infrared radiation. This has led to an increased need for Saab to refine their existing model for simulating real-time infrared imagery, which led to this thesis being carried out. Saab develops the Gripen aircraft, and they provide training simulators where pilots can train in a realistic environment. The new model is required to be based on the real-world behavior of infrared radiation and, furthermore, unlike Saab's existing model, to have dynamically changeable attributes.
This thesis seeks to develop a simulation model compliant with the requirements presented by Saab, and to develop the implementation of a test environment demonstrating the features and capabilities of the proposed model. All through the development of the model, the pilot training value has been kept in mind.
The first part of the thesis consists of a literature study to build a theoretical base for the rest of the work. This is followed by the development of the simulation model itself and a subsequent implementation thereof. The simulation model and the test implementation are evaluated as the final step conducted within the framework of this thesis.
The main conclusions of this thesis are, first of all, that the proposed simulation model does in fact have its foundation in physics. It is further concluded that certain attributes of the model, such as time of day, are dynamically changeable as requested. Furthermore, the test implementation is considered to have been feasibly integrated with the current simulation environment.
A plan concluding how to proceed has also been developed. The plan suggests future work with the proposed simulation model, since the evaluation shows that it performs well in comparison to the existing model as well as other products on the market.
@mastersthesis{diva2:22896,
author = {Dehlin, Jonas and Löf, Joakim},
title = {{Dynamic Infrared Simulation:
A Feasibility Study of a Physically Based Infrared Simulation Model}},
school = {Linköping University},
type = {{LITH-ISY-EX--06/3815--SE}},
year = {2006},
address = {Sweden},
}
Fluoroscopy is the term for continuous X-ray imaging of a patient. Since the patient, and also the physician, is then exposed to continuous X-ray radiation, the radiation dose must be kept low, which leads to noisy images. It is therefore desirable to improve the images through image processing. The image enhancement must, however, be performed in real time, so conventional methods cannot be used.
This thesis investigates how orthogonal so-called derivative operators can be used to improve the readability of fluoroscopy images by means of noise suppression and edge enhancement. Derivative operators are separable, which makes them extremely cheap to compute and easy to insert into a scale pyramid. The scale pyramid makes it possible to process structures and details of different sizes separately, while the downsampling mechanism ensures that this decomposition does not noticeably increase the computational burden. In the complete solution, structure/noise separation is also introduced in order to prevent amplification of, and to suppress contributions from, the frequency bands where a pixel is dominated by noise.
The results show that noise can indeed be suppressed while edges and lines are well preserved, or enhanced if desired. The oriented filtering does, however, easily give rise to worm-like structures in the noise, but this can be avoided with proper parameter settings for the structure/noise separation. The balance between oriented and non-oriented filtering is likewise controllable via a parameter that can be optimized with respect to the needs and preferences of each application.
@mastersthesis{diva2:21733,
author = {Brolund, Hans},
title = {{Förbättring av fluoroskopibilder}},
school = {Linköping University},
type = {{LITH-ISY-EX-06/3823-SE}},
year = {2006},
address = {Sweden},
}
The objective of this master thesis was to study the performance of an active triangulation system for 3-D imaging in underwater applications. Structured light from a 20 mW laser and a conventional video camera was used to collect data for generation of 3-D images. Different techniques to locate the laser line and transform it into spatial coordinates were developed and evaluated. A field trial and a laboratory trial were performed.
From the trials we can conclude that the distance resolution is much higher than the lateral and longitudinal resolution. The lateral resolution can be improved either by using a high frame rate camera or simply by using a low scanning speed. It is possible to obtain a range resolution of less than a millimeter. The maximum range of vision was 5 meters under water measured on a white target, and 3 meters for a black target, in clear sea water. These results are however dependent on environmental and system parameters such as laser power, laser beam divergence and water turbidity. A higher laser power would for example increase the maximum range.
@mastersthesis{diva2:21659,
author = {Norström, Christer},
title = {{Underwater 3-D imaging with laser triangulation}},
school = {Linköping University},
type = {{LiTH-ISY-EX--06/3851--SE}},
year = {2006},
address = {Sweden},
}
To improve the control of a steel casting process, ABB has developed an Electro Magnetic Brake (EMBR). This product is designed to improve steel quality, i.e. reduce non-metallic inclusions and blisters as well as the risk of surface cracks. There is a demand for increased steel quality, and simulations and experiments play an important role in optimizing the steel casting. An advanced CFD simulation model has been created to carry out this task.
The validation of the simulation model is performed on a water model that has been built for this purpose. This water model also makes experiments possible. One step in validating the simulation model is to measure the velocity and motion pattern of the seeding particles and the air bubbles in the water model, to see whether they correspond to the simulation results.
Since the water is transparent, seeding particles have been added to the liquid in order to observe the motion of the water. They have the same density as water, hence the particles will follow the flow accurately. The motions of the air bubbles that are added to the water model also need to be observed, since they influence the flow pattern.
An algorithm - ”Transparent motions” - is thoroughly inspected and implemented. ”Transparent motions” was originally designed to post process x-ray images. However in this thesis, it is investigated whether the algorithm might be applicable to the water model and the image sequences containing seeding particles and air bubbles that are going to be used for motion estimation.
The results are satisfying for image sequences containing particles only; with a camera with a higher sampling rate, these results would improve further. For image sequences with both bubbles and particles, no results have been achieved.
@mastersthesis{diva2:21306,
author = {Gustafsson, Gabriella},
title = {{Multiphase Motion Estimation in a Two Phase Flow}},
school = {Linköping University},
type = {{LITH-ISY-EX--05/3723--SE}},
year = {2005},
address = {Sweden},
}
This thesis describes and evaluates a number of approaches and algorithms for non-uniformity correction (NUC) and suppression of fixed pattern noise in an image sequence. The main task for this thesis work was to create a general NUC for infrared focal plane arrays. To create a radiometrically correct NUC, reference based methods using polynomial approximation are used instead of the more common scene based methods, which create a cosmetic NUC.
The pixels that cannot be adjusted to give a correct value for the incoming radiation are defined as dead. Four separate methods of identifying dead pixels are used to find these pixels. Both the scene sequence and calibration data are used in these identification methods.
The algorithms and methods have all been tested by using real image sequences. A graphical user interface using the presented algorithms has been created in Matlab to simplify the correction of image sequences. An implementation to convert the corrected values from the images to radiance and temperature is also performed.
@mastersthesis{diva2:21133,
author = {Isoz, Wilhelm},
title = {{Calibration of Multispectral Sensors}},
school = {Linköping University},
type = {{LiTH-ISY-EX--05/3651--SE}},
year = {2005},
address = {Sweden},
}
This thesis aims to investigate the usefulness of the method Independent Component Analysis (ICA) for noise reduction of images taken by infrared cameras. Special focus lies on reducing additive noise. The noise is divided into two parts: Gaussian noise and the sensor-specific fixed pattern noise. To reduce the Gaussian noise, a popular ICA-based method called sparse code shrinkage is used. A new method, also based on ICA, is developed to reduce the pattern noise. In the new method, an analysis of image data is performed for each sensor in order to manually identify typical pattern-noise components. These components are then used to reduce the pattern noise in images taken by that sensor. It is shown that the methods give good results on infrared images. The algorithms are tested on both synthetic and real images, and the results are presented and compared with other algorithms.
@mastersthesis{diva2:20831,
author = {Björling, Robin},
title = {{Denoising of Infrared Images Using Independent Component Analysis}},
school = {Linköping University},
type = {{LiTH-ISY-EX--05/3726--SE}},
year = {2005},
address = {Sweden},
}
This master thesis investigates distance estimation using image processing and stereo vision for a known camera setup.
Today, a large number of computational methods exist for obtaining the distance to objects, but the performance of these methods has hardly been measured. This work mainly looks at different block-based methods for distance estimation and examines the possibilities and limitations when established knowledge in image processing and stereo vision is used for distance estimation. The work was carried out at Bofors Defence AB in Karlskoga, Sweden, with the aim of eventual use in an optical sensor system. The thesis also surveys well-established ...
The results indicate that it is difficult to determine a full range map, i.e. the distance to all visible objects, but the tested methods should still be usable point-wise to compute distances. The best method is based on computing the minimum absolute error and keeping only the most reliable values.
@mastersthesis{diva2:20786,
author = {Hedlund, Gunnar},
title = {{Närmaskbestämning från stereoseende}},
school = {Linköping University},
type = {{LiTH-ISY-EX--05/3623--SE}},
year = {2005},
address = {Sweden},
}
This report develops a method for probabilistic conceptual sensor modeling. The idea is to generate probabilities for detection, recognition and identification based on a few simple factors. The focus lies on FLIR sensors and thermal radiation, even if other wavelength bands are also discussed. The model can be used as a whole, or one or several parts can be used to create a simpler model. The core of the model is based on the Johnson criteria, which uses resolution as the input parameter. Some extensions that model other factors are also implemented. Finally, there is a short discussion of the possibility to use this model for sensors other than FLIR.
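For orientation, Johnson-criteria style probability curves are often written as a target transfer probability function of the number of resolvable cycles across the target; the sketch below uses commonly quoted N50 values, which need not match the parameterisation chosen in the thesis.

N50 = {'detection': 1.0, 'recognition': 4.0, 'identification': 6.4}

def probability(task, cycles_on_target):
    # Target transfer probability function driven by resolvable cycles.
    n = cycles_on_target / N50[task]
    e = 2.7 + 0.7 * n
    return n**e / (1.0 + n**e)

# Eight resolvable cycles: high recognition probability, lower identification probability.
print(probability('recognition', 8.0), probability('identification', 8.0))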
@mastersthesis{diva2:20633,
author = {Sonesson, Mattias},
title = {{A Probabilistic Approach to Conceptual Sensor Modeling}},
school = {Linköping University},
type = {{LITH-ISY-EX-3428-2004}},
year = {2005},
address = {Sweden},
}
The purpose of this master thesis, performed at FOI, was to evaluate a range gated underwater camera for the application of identifying bottom objects. The master thesis was supported by FMV within the framework of “arbetsorder Systemstöd minjakt (Jan Andersson, KC Vapen)”. The central part has been field trials, which have been performed in both turbid and clear water. Conclusions about the performance of the camera system have been drawn, based on resolution and contrast measurements during the field trials. Laboratory testing has also been done to measure system-specific parameters, such as the effective gate profile and camera gate distances.
The field trials show that images can be acquired at significantly longer distances with the tested gated camera, compared to a conventional video camera. The distance where the target can be detected is increased by a factor of 2. For images suitable for mine identification, the increase is about 1.3. However, studies of the performance of other range gated systems show that the increase in range for mine identification can be about 1.6. Gated viewing has also been compared to other technical solutions for underwater imaging.
@mastersthesis{diva2:20570,
author = {Andersson, Adam},
title = {{Range Gated Viewing with Underwater Camera}},
school = {Linköping University},
type = {{LITH-ISY-EX--05/3718--SE}},
year = {2005},
address = {Sweden},
}
Just how far is it possible to make learning of new parts for recognition and robot picking autonomous? This thesis initially gives the prerequisites for the steps in learning and calibration that are to be automated. Among these tasks are to select a suitable part model from numerous candidates with the help of a new part segmenter, as well as computing the spatial extent of this part, facilitating robotic collision handling. Other tasks are to analyze the part model in order to highlight correct and suitable edge segments for increasing pattern matching certainty, and to choose appropriate acceptance levels for pattern matching. Furthermore, tasks deal with simplifying camera calibration by analyzing the calibration pattern, as well as compensating for differences in perspective at great depth variations, by calculating the centre of perspective of the image. The image processing algorithms created in order to solve the tasks are described and evaluated thoroughly. This thesis shows that simplification of steps of learning and calibration, by the help of advanced image processing, really is possible.
@mastersthesis{diva2:19024,
author = {Wernersson, Björn and Södergren, Mikael},
title = {{Automatiserad inlärning av detaljer för igenkänning och robotplockning}},
school = {Linköping University},
type = {{LiTH-ISY-EX--05/3755--SE}},
year = {2005},
address = {Sweden},
}
This report addresses the problem of software correction of spatially variant blur in digital images. The problem arises when the camera optics contains flaws, when the scene contains multiple moving objects with different relative motion, or when the camera itself is, e.g., rotated. Compensation through deconvolution is impossible due to the shift variance of the PSF, hence alternative methods are required. A number of methods have been published, and this report evaluates two of them.
@mastersthesis{diva2:20290,
author = {Andersson, Mathias},
title = {{Image processing algorithms for compensation of spatially variant blur}},
school = {Linköping University},
type = {{LITH-ISY-EX--05/3633--SE}},
year = {2005},
address = {Sweden},
}
This thesis describes new methods for automatic crack detection in pavements. Cracks in pavements can be used as an early indication for the need of reparation.
Automatic crack detection is preferable to manual inventory: the repeatability can be better, the inventory can be done at a higher speed, and it can be done without interrupting the traffic.
The automatic and semi-automatic crack detection systems that exist today use Image Analysis methods. There are today powerful methods available in the area of Computer Vision. These methods work in higher dimensions with greater complexity and generate measures of local signal properties, while Image Analysis methods for crack detection use morphological operations on binary images.
Methods for digitizing video data on VHS cassettes and stitching images from nearby frames have been developed.
Four methods for crack detection have been evaluated, and two of them have been used to form a crack detection and classification program implemented in the calculation program Matlab.
One image set was used during the implementation and another image set was used for validation. The crack detection system performed correct detections in 99.2 percent of the cases when analysing the images used during implementation. The result of the crack detection on the validation data was not very good. When the program is used on data from pavements other than the one used during implementation, information about the surface texture is required to calibrate the crack detection.
@mastersthesis{diva2:20160,
author = {Håkansson, Staffan},
title = {{Detektering av sprickor i vägytor med hjälp av Datorseende}},
school = {Linköping University},
type = {{LITH-ISY-EX--05/3699--SE}},
year = {2005},
address = {Sweden},
}
Contemporary algorithms employed for reconstruction of 3D volumes from helical cone beam projections are so-called non-exact algorithms. This means that the reconstructed volumes will contain artifacts irrespective of the detector resolution and the number of projection angles employed in the process.
It has been proposed that these artifacts can be suppressed using an iterative scheme which comprises computation of projections from the already reconstructed volume as well as the non-exact reconstruction itself.
The purpose of the present work is to examine whether the iterative scheme can be applied to the non-exact reconstruction method PI-original in order to improve the reconstruction result. An important part of this implementation is a careful design of the projection operator, as a poorly designed projection operator may result in aliasing and/or other artifacts in the reconstruction result. Since the projection data is truncated, special care must be taken along the boundaries of the detector. Three different ways of handling this interpolation problem are proposed and examined.
The results show that artifacts caused by the PI-original method can indeed be reduced by the iterative scheme. However, each iteration requires at least three times more processing time than the initial reconstruction, which may call for certain compromises, smartness and/or parallelization in the innermost loops. Furthermore, at higher cone angles certain types of artifacts seem to grow by each iteration instead of being suppressed.
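The iterative scheme itself can be summarised in a few lines; in the sketch below, project and reconstruct stand for the forward projection operator and the non-exact (e.g. PI-original) reconstruction, both of which are assumed to be given, so the snippet only illustrates the structure of the loop.

def iterative_enhancement(projections, project, reconstruct, n_iter=3):
    volume = reconstruct(projections)             # initial non-exact reconstruction
    for _ in range(n_iter):
        residual = projections - project(volume)  # compare with the measured data
        volume = volume + reconstruct(residual)   # correct with the reconstructed residual
    return volume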
@mastersthesis{diva2:19912,
author = {Sunnegårdh, Johan},
title = {{Iterative Enhancement of Non-Exact Reconstruction in Cone Beam CT}},
school = {Linköping University},
type = {{LITH-ISY-EX--04/3646--SE}},
year = {2004},
address = {Sweden},
}
This report describes and evaluates a number of algorithms for multi-sensor data fusion of radar and IR/TV data at the raw data level. Raw data fusion means that the fusion takes place before attribute or object extraction. Attribute extraction may cause information to be lost that could otherwise improve the fusion. If the fusion is performed at the raw data level, more information is available, which could lead to improved attribute extraction in a later step. Two approaches are presented. The first method projects the radar image into the IR view and vice versa; the fusion is then performed on the pairs of images with the same dimensions. The second method fuses the two original images into a volume, which is spanned by the three dimensions represented in the original images. The method is also extended by exploiting stereo vision. The results show that exploiting stereo vision can be worthwhile, since the extra information facilitates the fusion and gives a more general solution to the problem.
@mastersthesis{diva2:19523,
author = {Schultz, Johan},
title = {{Sensordatafusion av IR- och radarbilder}},
school = {Linköping University},
type = {{}},
year = {2004},
address = {Sweden},
}
By analyzing ISAR images, the characteristics of military platforms with respect to radar visibility can be evaluated. The method currently used to calculate the ISAR images, which is based on the Discrete-Time Fourier Transform (DTFT), requires a large computational effort. This thesis investigates the possibility of replacing the DTFT with the Fast Fourier Transform (FFT). Such a replacement is not trivial, since the DTFT is able to compute a contribution anywhere along the spatial axis while the FFT delivers output data at a fixed sampling, which requires subsequent interpolation. The interpolation leads to a difference in the ISAR image compared to the ISAR image obtained by the DTFT. On the other hand, the FFT is much faster. In this quality-and-time trade-off, the objective is to minimize the error while keeping high computational efficiency.
The FFT approach is evaluated by studying execution time and image error when generating ISAR images for an aircraft model in a controlled environment. The FFT method shows good results. The execution speed is increased significantly without any visible differences in the ISAR images. The speed-up factor depends on several parameters: image size, degree of zero-padding when calculating the FFT, and the number of frequencies in the input data.
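To make the trade-off concrete, the difference between a direct DTFT evaluation and a zero-padded FFT followed by interpolation can be sketched as follows; the padding factor and the use of simple linear interpolation are assumptions made for illustration.

import numpy as np

def dtft(x, omega):
    # Direct evaluation at arbitrary angular frequencies omega (in [0, 2*pi)).
    n = np.arange(len(x))
    return np.array([np.sum(x * np.exp(-1j * w * n)) for w in omega])

def fft_interpolated(x, omega, pad_factor=8):
    # Zero-padded FFT gives a dense but fixed frequency grid ...
    N = pad_factor * len(x)
    X = np.fft.fft(x, n=N)
    grid = 2 * np.pi * np.arange(N) / N
    # ... which must then be interpolated at the wanted frequencies.
    return np.interp(omega, grid, X.real) + 1j * np.interp(omega, grid, X.imag)

Increasing the padding factor reduces the interpolation error at the cost of a larger FFT, which is exactly the quality-and-time trade-off described above.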
@mastersthesis{diva2:19402,
author = {Dahlbäck, Niklas},
title = {{Implementation of a fast method for reconstruction of ISAR images}},
school = {Linköping University},
type = {{}},
year = {2003},
address = {Sweden},
}
A complete prototype system for measuring vehicle lateral position has been set up during the course of this master's thesis project. In the development of the software, images acquired from a backward-looking video camera mounted on the roof of the vehicle were used.
The problem of using computer vision to measure lateral position can be divided into road marking detection and lateral position extraction. Since the strongest characteristic of a road marking image is the edges of the road markings, the road marking detection step is based on edge detection. For the detection of the straight edge lines, a Hough based method was chosen. Due to peak spreading, the difficulty of detecting the correct peak in Hough space was encountered. A flexible Hough peak detection algorithm was developed, based on an adaptive window that takes peak spreading into account. The road marking candidate found by the system is verified before the lateral position data is generated. A good performance of the road marking tracking algorithm was obtained by exploiting temporal correlation to update a search region within the image. A camera calibration made the extraction of real-world lateral position information and yaw angle data possible.
This vision-based method proved to be very accurate. The standard deviation of the error in the position detection is 0.012 m within an operating range of ±2 m from the image centre. During continuous road markings the rate of valid data is on average 96 %, whereas it drops to around 56 % for sections with intermittent road markings. The system performs well during lane change manoeuvres, which is an indication that the system tracks the correct road marking. This prototype system is a robust and automatic measurement system, which will benefit VTI in its many driving behaviour research programs.
@mastersthesis{diva2:19311,
author = {Ågren, Elisabeth},
title = {{Lateral Position Detection Using a Vehicle-Mounted Camera}},
school = {Linköping University},
type = {{}},
year = {2003},
address = {Sweden},
}
This thesis describes and evaluates a number of algorithms for reducing fixed pattern noise in image sequences. Fixed pattern noise is the dominant noise component for many infrared detector systems, perceived as a superimposed pattern that is approximately constant for all image frames.
Primarily, methods based on estimation of the movement between individual image frames are studied. Using scene-matching techniques, global motion between frames can be successfully registered with sub-pixel accuracy. This allows each scene pixel to be traced along a path of individual detector elements. Assuming a static scene, differences in pixel intensities are caused by fixed pattern noise that can be estimated and removed.
The algorithms have been tested by using real image data from existing infrared imaging systems with good results. The tests include both a two-dimensional focal plane array detector and a linear scanning one-dimensional detector, in different scene conditions.
@mastersthesis{diva2:19078,
author = {Torle, Petter},
title = {{Scene-based correction of image sensor deficiencies}},
school = {Linköping University},
type = {{}},
year = {2003},
address = {Sweden},
}
This master's thesis develops an algorithm for tracking cars that is robust enough to handle turning cars. It is implemented in the image processing environment Image Processing Application Programming Interface (IPAPI) for use within the WITAS project.
Firstly, algorithms comparable with the one currently used in the WITAS project are studied. The focus is on how rotation, which originates from the turning of the cars, affects tracking performance. The algorithms studied all perform an exhaustive search over a region close to the last known position of the object being tracked to find a match. After this, an iterative algorithm, based on the idea that a car can only rotate, translate and change scale, is introduced. The algorithm iteratively estimates the parameters describing this rotation, translation and change of scale. The iterative process needs an initial parameter estimate that is accurate enough for the algorithm to converge. The developed algorithm is based on an earlier publication on the subject; however, the mathematical description and derivation are taken one step further than in that publication.
The iterative algorithm performs well under the assumption that the data used fulfills some basic criteria. These demands comprise the placement of the camera, the template size, and how much the parameters may vary between two observations. The iterative algorithm is also potentially faster than exhaustive search methods, because few iterations are needed when the parameters change slowly. Better initial parameters should improve the stability and speed of convergence. Other suggestions that could give better performance are also discussed, e.g., methods to better extract the target from the surroundings.
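If point correspondences between the template and the new frame are available, a 4-DOF similarity transform (rotation, uniform scale and translation) can also be estimated robustly with OpenCV, as in the sketch below; this is a generic alternative shown for illustration, not the intensity-based iterative scheme developed in the thesis.

import cv2
import numpy as np

def similarity_from_points(src_pts, dst_pts):
    src = np.asarray(src_pts, np.float32).reshape(-1, 1, 2)
    dst = np.asarray(dst_pts, np.float32).reshape(-1, 1, 2)
    # 2x3 matrix constrained to rotation, uniform scale and translation.
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    scale = np.hypot(M[0, 0], M[1, 0])
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))
    return scale, angle, (M[0, 2], M[1, 2])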
@mastersthesis{diva2:19030,
author = {Öberg, Per},
title = {{Tracking by Image Processing in a Real Time System}},
school = {Linköping University},
type = {{}},
year = {2003},
address = {Sweden},
}
A Transaction Reproduction System (ARTSY) is a distributed system that enables secure transactions and reproductions of digital content over an insecure network. A field of application is reproductions of visual arts: A print workshop could for example use ARTSY to print a digital image that is located at a remote museum. The purpose of this master thesis project was to propose a specification for ARTSY and to show that it is technically feasible to implement it.
An analysis of the security threats in the ARTSY context was performed and a security model was developed. The security model was approved by a leading computer security expert. The security mechanisms that were chosen for the model were: Asymmetric cryptology, digital signatures, symmetric cryptology and a public key registry. A Software Requirements Specification was developed. It contains extra directives for image reproduction systems but it is possible to use it for an arbitrary type of reproduction system. A prototype of ARTSY was implemented using the Java programming language. The prototype uses XML to manage information and Java RMI to enable remote communication between its components. It was built as a platform independent system and it has been tested and proven to be operational on the Sun Solaris platform as well as the Win32 platform.
@mastersthesis{diva2:18935,
author = {Björk, Mårten and Max, Sofia},
title = {{ARTSY:
A Reproduction Transaction System}},
school = {Linköping University},
type = {{}},
year = {2003},
address = {Sweden},
}
This Master’s thesis studies the possibility of using image processing as a tool to facilitate vine management, in particular shoot counting and assessment of the grapevine canopy. Both are areas where manual inspection is done today. The thesis presents methods of capturing images and segmenting different parts of a vine. It also presents and evaluates different approaches on how shoot counting can be done. Within canopy assessment, the emphasis is on methods to estimate canopy density. Other possible assessment areas are also discussed, such as canopy colour and measurement of canopy gaps and fruit exposure. An example of a vine assessment system is given.
@mastersthesis{diva2:18665,
author = {Bjurström, Håkan and Svensson, Jon},
title = {{Assessment of Grapevine Vigour Using Image Processing}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
This is a thesis written for a master's degree at the Computer Vision Laboratory, Linköping University. An abstract outer product is defined and used as a bridge to reach 2nd and 4th order tensors. Some applications of these in geometric analysis of range data are discussed and illustrated. In idealized setups, simple geometric objects, like spheres or polygons, are successfully detected. Finally, the generalization to nth order tensors for storing and analysing geometric information is discussed.
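As a minimal sketch of the outer-product idea (not code from the thesis), a 2nd-order tensor can be accumulated from outer products of surface normals estimated from range data; its eigenstructure then distinguishes, for instance, a planar neighbourhood from an isotropic one. The random normals below are placeholders for measured data.

    import numpy as np

    # placeholder surface normals; for a planar patch they would cluster around
    # the plane normal, so the largest eigenvector of T recovers the orientation
    normals = np.random.randn(500, 3)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)

    # 2nd-order tensor as the mean of outer products n n^T
    T = sum(np.outer(n, n) for n in normals) / len(normals)

    eigvals, eigvecs = np.linalg.eigh(T)   # eigenvalue spread indicates the local geometry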
@mastersthesis{diva2:18558,
author = {Eidehall, Andreas},
title = {{Tensor representation of 3D structures}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
The purpose of this thesis is to investigate the applicability of a certain model-based classification algorithm. The algorithm is centered around a flexible wireframe prototype that can instantiate a number of different vehicle classes, such as a hatchback, a pickup or a bus, to mention a few. The parameters of the model are fitted using Newton minimization of the errors between model line segments and observed line segments. Furthermore, a number of methods for object detection based on motion are described and evaluated. Results from both experimental and real-world data are presented.
@mastersthesis{diva2:18561,
author = {Böckert, Andreas},
title = {{Vehicle detection and classification in video sequences}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
The purpose of this master's thesis is to evaluate whether it is feasible to use the panchromatic band of Landsat 7 in order to improve the spatial resolution of colour images. The images are to be used as texture in visual databases for flight simulators and for this reason it is important that the fusion preserves natural colours.
A number of methods for fusing panchromatic and multispectral images are discussed. Four of them are implemented and evaluated. The result is that standard methods such as HSI substitution are not suitable for this purpose, since they do not preserve natural colours. However, if only the high frequencies of the panchromatic image are used, the resolution can be improved without noticeable colour distortion.
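A hedged sketch of the high-frequency idea (not necessarily one of the four methods evaluated in the thesis): inject only the high-pass part of the panchromatic band into the upsampled multispectral bands, so the low-frequency colour content is left untouched. The band layout, Gaussian cut-off and interpolation order are assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def hpf_fusion(ms, pan, sigma=2.0):
        """High-pass-filter fusion.
        ms:  (rows, cols, bands) multispectral image at coarse resolution
        pan: (R, C) panchromatic band, where R and C are multiples of rows and cols"""
        sr = pan.shape[0] / ms.shape[0]
        sc = pan.shape[1] / ms.shape[1]
        ms_up = zoom(ms.astype(float), (sr, sc, 1), order=1)   # upsample the colour bands
        pan_high = pan.astype(float) - gaussian_filter(pan.astype(float), sigma)
        return ms_up + pan_high[..., None]                     # add only high-frequency detail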
@mastersthesis{diva2:17912,
author = {Molin, Sara},
title = {{Förbättring av upplösningen i Landsat 7-bilder med hjälp av bildfusion}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
Tracking solar collectors, heliostats, are certainly not a new idea; they have been explored for at least two decades. Projects on this subject have resulted in constructions that are more or less realistic from a commercial point of view. Far too often the technical goals have had higher priority than the economic ones, with the result that few constructions can compete with conventional, fixed solar collectors. Economic issues have been given high priority in this project, without lowering the demands on reliability. The system has been given the following mechanical and electronic properties: one-axis movement, a fixed heat-carrying fluid system, microcomputer-controlled movement and automatic protection from overheating. Given the development in digital technology, with falling prices for advanced semiconductors as a consequence, the conclusion is that the prerequisites for this concept will be even better in the future. The result of this thesis is a heliostat function that increases the energy gain by up to 40% compared to a field of MaReCo collectors without this function, while the cost increases by only 13%.
@mastersthesis{diva2:17448,
author = {Svensson, Mikael},
title = {{Utveckling av styrning till solföljande MaReCo-hybrid i Hammarby Sjöstad}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
This thesis presents a 3D semi-automatic segmentation technique for extracting the lumen surface of the Carotid arteries including the bifurcation from 3D and 4D ultrasound examinations.
Ultrasound images are inherently noisy. Therefore, to aid the inspection of the acquired data, an adaptive edge-preserving filtering technique is used to reduce the generally high noise level. The segmentation process starts with edge detection using a recursive and separable 3D Monga-Deriche-Canny operator. To reduce the computation time needed for the segmentation, a seeded region-growing technique is used to create an initial model of the artery. The final segmentation is based on the inflatable balloon model, which deforms the initial model to fit the ultrasound data. The balloon model is implemented with the finite element method.
The segmentation technique produces 3D models that are intended as pre-planning tools for surgeons. The results from a healthy person are satisfactory and the results from a patient with stenosis seem rather promising. A novel 4D model of wall motion of the Carotid vessels has also been obtained. From this model, 3D compliance measures can easily be obtained.
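For illustration, a minimal seeded region-growing step of the kind used to build the initial model might look like the sketch below; the simple intensity-difference criterion and 6-connectivity are simplifying assumptions, and the balloon-model refinement is not shown.

    import numpy as np
    from collections import deque

    def region_grow(volume, seed, tol):
        """Seeded region growing in a 3D volume: voxels are added while their
        intensity stays within `tol` of the seed intensity."""
        grown = np.zeros(volume.shape, dtype=bool)
        ref = volume[seed]
        queue = deque([seed])
        grown[seed] = True
        offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in offsets:
                n = (z + dz, y + dy, x + dx)
                inside = all(0 <= n[i] < volume.shape[i] for i in range(3))
                if inside and not grown[n] and abs(volume[n] - ref) <= tol:
                    grown[n] = True
                    queue.append(n)
        return grown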
@mastersthesis{diva2:17818,
author = {Mattsson, Per and Eriksson, Andreas},
title = {{Segmentation of Carotid Arteries from 3D and 4D Ultrasound Images}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
Face detection and pose estimation are two widely studied problems - mainly because of their use as subcomponents in important applications, e.g. face recognition. In this thesis I investigate a new approach to the general problem of object detection and pose estimation and apply it to faces. Face detection can be considered a special case of this general problem, but is complicated by the fact that faces are non-rigid objects. The basis of the new approach is the use of scale and orientation invariant feature structures - feature triplets - extracted from the image, as well as a biologically inspired associative structure which maps from feature triplets to desired responses (position, pose, etc.). The feature triplets are constructed from curvature features in the image and coded in a way to represent distances between major facial features (eyes, nose and mouth). The final system has been evaluated on different sets of face images.
@mastersthesis{diva2:17324,
author = {Isaksson, Marcus},
title = {{Face Detection and Pose Estimation using Triplet Invariants}},
school = {Linköping University},
type = {{}},
year = {2002},
address = {Sweden},
}
The aim of this master's thesis is to determine the tree class from an image of a leaf using a computer vision classification system. We compare different descriptors that describe different features of the leaves. We also look at different classification models and combine them with the descriptors to build a system that can classify the different tree classes.
@mastersthesis{diva2:303038,
author = {Söderkvist, Oskar},
title = {{Computer Vision Classification of Leaves from Swedish Trees}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 3132}},
year = {2001},
address = {Sweden},
}
This Master's Thesis discusses the different trade-offs a programmer needs to consider when constructing image processing systems. First, an overview of the different alternatives available is given followed by a focus on systems based on general hardware. General, in this case, means mass-market with a low price-performance-ratio. The software environment is focused on UNIX, sometimes restricted to Linux, together with C, C++ and ANSI-standardized APIs.
@mastersthesis{diva2:303037,
author = {Nordlöv, Per},
title = {{Implementation Aspects of Image Processing}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 3088}},
year = {2001},
address = {Sweden},
}
The purpose of this master’s thesis was to study the possibility of using computer vision methods to detect and classify objects in the front passenger seat of a car. This work presents different approaches to solving this problem and evaluates the usefulness of each technique. The classification information should later be used to modulate the speed and force of the airbag, in order to provide each occupant with optimal protection and safety.
This work shows that computer vision has great potential to provide data that may be used to perform reliable occupant classification. The future choice of method depends on many factors, for example cost and the requirements placed on the system by legislation and car manufacturers. Further evaluation and testing of the methods in this thesis, of other methods, of the ABE approach, and of post-processing of the results should also be carried out before a reliable classification algorithm can be written.
@mastersthesis{diva2:303034,
author = {Klomark, Marcus},
title = {{Occupant Detection using Computer Vision}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 3026}},
year = {2000},
address = {Sweden},
}
We explore the use of colour for interpretation of unstructured off-road scenes. The aim is to extract driveable areas for use in an autonomous off-road vehicle in real-time. The terrain is an unstructured tropical jungle area with vegetation, water and red mud roads.
We show that hue is both robust to changing lighting conditions and an important feature for correctly interpreting this type of scene. We believe that our method can also be deployed in other types of terrain, with minor changes, as long as the terrain is coloured and well saturated.
Only 2D information is processed at the moment, but we aim to extend the method to also handle 3D information, using stereo vision or motion.
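A minimal sketch of hue-based pixel selection with OpenCV (the exact features and thresholds of the thesis are not reproduced; the file name, hue band and saturation threshold below are illustrative assumptions for a reddish road class):

    import cv2

    bgr = cv2.imread("scene.png")                 # hypothetical off-road scene
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)

    # hue is stored in [0, 179] for 8-bit images; red/brown wraps around 0,
    # and only well-saturated pixels are trusted, as argued above
    road_mask = ((h < 20) | (h > 170)) & (s > 60)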
@mastersthesis{diva2:303033,
author = {Bergquist, Urban},
title = {{Colour Vision and Hue for Autonomous Vehicle Guidance}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 2091}},
year = {1999},
address = {Sweden},
}
This thesis investigates the possibilities of using GIS (Geographic Information System) data with an airborne autonomous vehicle developed in the WITAS project. Available for the thesis are high resolution (0.16 meter sample interval) aerial photographs over Stockholm, and vector data in a common GIS format containing all roads in the Stockholm area.
A method for removing cars from aerial photographs is presented, using the filtering method normalized convolution, originally developed for filtering uncertain and incomplete data. By setting the certainty to zero over the cars, this data is disregarded in the filtering process, resulting in an image without cars. This method is further improved by choosing an anisotropic applicability function, resulting in a filtering that preserves structures oriented in certain directions.
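A hedged sketch of normalized convolution with a constant basis (normalized averaging): the image is filtered weighted by a certainty map, so pixels with certainty zero, e.g. those covered by cars, do not contribute and are instead filled in from their certain neighbours. The thesis' anisotropic applicability function would simply be a different kernel; the variable names below are illustrative.

    import numpy as np
    from scipy.ndimage import convolve

    def normalized_convolution(image, certainty, kernel):
        """Filter `image` with applicability `kernel`, weighted by `certainty`."""
        c = certainty.astype(float)
        num = convolve(image.astype(float) * c, kernel)
        den = convolve(c, kernel)
        return num / np.maximum(den, 1e-12)

    # example usage with an isotropic applicability function (hypothetical inputs):
    # kernel = np.ones((15, 15)) / 225.0
    # filled = normalized_convolution(aerial, car_mask == 0, kernel)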
The available vector data is investigated with regard to its use in a simulator for vehicle movement, and is found to be missing much of the essential information needed in such a simulator. A new data format better suited to these requirements is created, using the Extensible Markup Language (XML), which gives a human-readable data format and allows existing parsers to be used, simplifying the implementation. The result is a somewhat complex, but highly general, data format that can accurately express almost any type of road and intersection. Cars can follow arbitrary paths in the road database and move with a smooth motion suitable for use as input to image processing equipment. The simulator does not allow any dynamic behaviour such as changing speeds, starting or stopping, or interaction between cars such as overtaking or intelligent behaviour in intersections.
In the airborne vehicle, a mapping from pixels in a camera image (like the ones output from the simulator) to locations in the road database is needed. This is an inverse mapping with respect to the visualization described above. It gives a car-tracking system important information about the probable movement of cars and also makes it possible to determine whether a car breaks traffic regulations. A mapping of this kind is created using a simplified form of ray tracing known as ray casting, together with space partitioning methods that vastly improve efficiency.
All the above tasks are implemented using C++ and object-oriented methods, giving maintainable and extendable code suited to a quickly changing research area. The interface to the simulator is designed to be compatible with the existing simulation software used in the WITAS project. Visualization is done through the OpenGL graphics library, providing realistic effects such as lighting and shading.
@mastersthesis{diva2:303032,
author = {Langemark, Stefan},
title = {{GIS in a simulator environment and efficient inverse mapping of roads}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 2090}},
year = {1999},
address = {Sweden},
}
Automated storage systems often rely on the positions of the pallets being known with high precision. In this thesis, a turnable camera mounted on the robot has been used to handle the situation of only approximately known pallet positions. The robot is given the approximate location of a pallet, and its objective is to locate the pallet with a precision high enough to approach it from the correct direction and then lift it. For this, a precision of a few centimetres in each direction is needed.
A system for locating the pallet from single images, based on rotational symmetry filters, has been developed, and a simple program for controlling the robot has been implemented. These could very well be extended and improved, e.g. by considering multiple images and improving the path planning.
The main part of the thesis deals with the image processing part. Other parts of the project, apart from the controller, include implementation of servers controlling the camera and the frame grabber.
Some tests have been made, which show fairly promising results.
@mastersthesis{diva2:303029,
author = {Roll, Jakob},
title = {{A System for Visual-Based Automated Storage Robots}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 2053}},
year = {1999},
address = {Sweden},
}
Experience from earlier trials at Korsnäs AB shows that it is very difficult to predict mathematically what happens during the production of pulp in a continuous digester.
The goal of this master's thesis was to investigate the possibilities of using neural networks to facilitate process control by predicting the lignin content of the pulp three and a half hours before the wood chips in question are fully cooked.
Because the time lag between different sensor signals varies with the production rate, the problem was solved with one simple, local model per production rate. All models were minimized with respect to both the number of nodes in the hidden layer and the number of inputs, giving a final solution with four simple models built from feed-forward neural networks, each with one hidden layer containing three nodes.
The prediction of the lignin content finally showed good properties with respect to how well it follows the actual kappa number analyser.
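As a rough illustration only (the thesis predates this library and its exact inputs are not reproduced here), one of the small feed-forward models with a single hidden layer of three nodes could be set up as follows; X_train, y_train and the sensor matrix are hypothetical placeholders.

    from sklearn.neural_network import MLPRegressor
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    # one such model would be trained per production rate, predicting the
    # lignin content (kappa number) 3.5 hours ahead from selected sensor signals
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(3,), max_iter=5000, random_state=0),
    )
    # model.fit(X_train, y_train)
    # y_pred = model.predict(X_future_sensors)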
@mastersthesis{diva2:303022,
author = {Stewing, Robert},
title = {{Parameterprediktering med multipla sammansatta lokala neuronnätsbaserade modeller vid framställning av pappersmassa}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1991}},
year = {1999},
address = {Sweden},
}
In this report, three main problems are considered. The first is how to filter position data of vehicles. To do so, the vehicles have to be tracked; this is done with Kalman filters. The second problem is how to control a camera to keep a vehicle in the center of the image, under three different conditions. This is mainly solved with a Kalman filter. The last problem is how to use the color of the vehicles to make their classification more robust. Some suggestions on how this might be done are given; however, no really good method has been found.
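In the spirit of the first problem, a minimal constant-velocity Kalman filter for smoothing 2D vehicle positions is sketched below; the frame interval, noise covariances and initial state are assumptions rather than values from the report.

    import numpy as np

    dt = 0.04                                    # assumed frame interval
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)   # state transition for (x, y, vx, vy)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)    # only position is measured
    Q = 1e-2 * np.eye(4)                         # process noise (assumed)
    R = 4.0 * np.eye(2)                          # measurement noise (assumed)

    x = np.zeros(4)                              # initial state
    P = 100.0 * np.eye(4)                        # initial uncertainty

    def kalman_step(x, P, z):
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update with measured position z = (x, y)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        return x, P

    # usage per frame: x, P = kalman_step(x, P, np.array([412.0, 310.0]))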
@mastersthesis{diva2:530596,
author = {Moe, Anders},
title = {{Investigations in Tracking and Colour Classification}},
school = {Linköping University},
type = {{}},
year = {1998},
address = {Sweden},
}
Chapter 2 describes the concept of canonical correlation, which is needed to understand the discussion that follows.
Chapter 3 introduces the problem to be solved.
Chapters 4, 5 and 6 discuss three different suggestions for how to approach the problem. Each chapter begins with a section of experiments as motivation for the approach, followed by some theory and mathematical manipulations to structure the ideas. The last sections contain discussions and suggestions concerning the approach.
Finally, chapter 7 contains a summary and a comparative discussion of the approaches.
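As a quick illustration of canonical correlation itself (the concept from chapter 2, not the thesis' recognition method), the following sketch finds maximally correlated projections of two hypothetical feature matrices:

    import numpy as np
    from sklearn.cross_decomposition import CCA

    # X and Y are placeholder feature matrices (samples x dimensions)
    # computed from the two signals being compared
    X = np.random.randn(200, 10)
    Y = np.random.randn(200, 8)

    cca = CCA(n_components=2)
    Xc, Yc = cca.fit_transform(X, Y)

    # the canonical correlations are the correlations between paired projections
    corr = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(2)]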
@mastersthesis{diva2:303009,
author = {Johansson, Björn},
title = {{Multidimensional signal recognition, invariant to affine transformation and time-shift, using canonical correlation}},
school = {Linköping University},
type = {{LiTH-ISY-EX-1825}},
year = {1997},
address = {Sweden},
}
Segmentation is a process that separates objects in an image. In medical images, particularly image volumes, the field of application is wide. For example, 3D visualisations of the anatomy could benefit enormously from segmentation. The aim of this thesis is to construct a segmentation tool.
The project consists of three main parts. First, a survey of the actual need for segmentation in medical image volumes was carried out. Then a unique three-step model for a segmentation tool was implemented, tested and evaluated.
The first step of the segmentation tool is a seed-growing method that uses the intensity and an orientation tensor estimate to decide which voxels are part of the object. The second step uses an active contour, a deformable “balloon”. The contour is shrunk to fit the segmented border from the first step, yielding a surface suitable for visualisation. The last step consists of letting the contour reshape according to the orientation tensor estimate.
The user evaluation establishes the usefulness of the tool. The model is flexible and well adapted to the users’ requests. For unclear objects the segmentation may fail, but the cause is mostly poor image quality. Even though much work remains to be done on the second and third parts of the tool, the results are most promising.
@mastersthesis{diva2:303019,
author = {Lundström, Claes},
title = {{Segmentation of Medical Image Volumes}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1864}},
year = {1997},
address = {Sweden},
}
In this report, the principles of man-made object detection in satellite images are investigated. An overview of terminology and of how the detection problem is usually solved today is given. A three-level system to solve the detection problem is proposed. The main branches of this system handle road and city detection, respectively. To achieve data source flexibility, the Logical Sensor notion is used to model the low-level system components. Three Logical Sensors have been implemented and tested on Landsat TM and SPOT XS scenes. These are: BDT (Background Discriminant Transformation) to construct a man-made object property field; local orientation for texture estimation and road tracking; and texture estimation using local variance and variance of local orientation. A gradient magnitude measure for road seed generation has also been tested.
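One of the listed components, texture estimation using local variance, can be sketched as follows (the window size is an assumption, and the BDT and orientation-based sensors are not shown):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def local_variance(image, size=9):
        """Variance of pixel values in a local window; man-made areas such as
        cities and roads tend to differ from natural terrain in this measure."""
        img = image.astype(float)
        mean = uniform_filter(img, size)
        mean_sq = uniform_filter(img ** 2, size)
        return mean_sq - mean ** 2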
@mastersthesis{diva2:303014,
author = {Forss\'{e}n, Per-Erik},
title = {{Detection of Man-made Objects in Satellite Images}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1852}},
year = {1997},
address = {Sweden},
}
Artificial neural networks (ANN) are a technique that has matured over the last ten years and is now found in an increasing number of applications, such as reading of written text, linear programming, control engineering, expert systems, speech recognition and many different kinds of classification problems [Zurada, 1992]. In our master's thesis we wanted to try to use ANNs in an industrial process where standard methods have not worked satisfactorily or have been difficult to apply. We found such a process in the production of pulp.
Producing pulp from wood requires a long and complicated process divided into several steps. One of these steps is the so-called cooking, where wood chips are broken down into fibres using high pressure and hot liquor. The cooking process is complex, runs for a long time (about 8 hours) and is affected by a large number of parameters, so great experience and knowledge are required to control it. Kværner Pulping Technologies in Karlstad, which among other things designs digesters, has developed a simulator of the cooking process in order to gain better insight into how the process works and consequently be able to control the cooking in a better way. The behaviour of the simulator depends on a number of so-called hidden parameters, a subset of the parameters assumed to affect the cooking process. These hidden parameters are difficult or impossible to measure and are therefore set to estimated values in the simulation. The corresponding hidden parameters in the real process, however, vary in an unknown way. They are affected partly by internal processes in the digester and partly by external causes; for example, wood chips of a different quality may be fed into the digester. This means that the simulator gives good simulations only for rather short periods, while the hidden parameters are approximately constant.
If the changes in the hidden parameters of the process could somehow be detected and transferred to the simulator, it could run "in parallel" with the cooking process. The simulator would in that case be an excellent complementary tool for the person controlling the cooking process, since he or she would get a better idea of what is happening, or has happened, in the process and thereby a better basis for control decisions. This presupposes that the simulator is good enough to capture, with sufficient precision, the global development in the digester under stationary parameter conditions.
As a first step towards this goal, we investigate in this report whether changes in the hidden parameters of the simulator can be detected using feed-forward ANNs and the resilient propagation learning algorithm.
The report is divided into 7 chapters. Chapter 2 treats the problem in more detail. Chapters 3 and 4 are of a general nature, describing the paper manufacturing process and what artificial neural networks really are. Chapter 5 describes the different proposed solutions and the results we have achieved. Conclusions and results are summarized in chapter 6. There is much more we would like to try and investigate; this future work is described in chapter 7. At the end of the report, appendices 1 and 2 contain details that we find relevant but too bulky to include in the main part of the report. Appendix 3 contains the program code produced during the work.
@mastersthesis{diva2:302994,
author = {Andersson, Thord and Karlsson, Mikael},
title = {{Neuronnätsbaserad identifiering av processparametrar vid tillverkning av pappersmassa}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1709}},
year = {1997},
address = {Sweden},
}
To find a shape in an image, a technique called snakes or active contours can be used. An active contour is a curve that moves towards the sought-for shape in a way controlled by internal forces - such as rigidity and elasticity - and an image force. The image force should attract the contour to certain features, such as edges, in the image. This is done by creating an attractor image, which defines how strongly each point in the image should attract the contour.
In this thesis the extension to contours (surfaces) in three dimensional images is studied. Methods of representation of the contour and computation of the internal forces are treated.
Also, a new way of creating the attractor image, using the orientation tensor to detect planar structure in 3D images, is studied. The new method is not generally superior to those already existing, but still has its uses in specific applications.
During the project, it turned out that the main problem with active contours in 3D images was instability due to strong internal forces overriding the influence of the attractor image. The problem was solved satisfactorily by projecting the elasticity force onto the contour’s tangent plane, which was approximated efficiently using sphere fitting.
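A hedged 2D sketch of one explicit contour update is given below (the thesis works with surfaces in 3D and uses the tangent-plane projection of the elasticity force, which is omitted here); the weights and the nearest-neighbour sampling of the attractor gradient are assumptions.

    import numpy as np

    def snake_step(pts, attractor, alpha=0.1, beta=0.05, step=1.0):
        """One explicit update of a closed 2D active contour.
        pts: (N, 2) contour points (row, col); attractor: 2D image whose high
        values should pull the contour, e.g. edge strength."""
        # internal forces from finite differences along the contour
        d2 = np.roll(pts, -1, axis=0) - 2 * pts + np.roll(pts, 1, axis=0)      # elasticity
        d4 = (np.roll(pts, -2, axis=0) - 4 * np.roll(pts, -1, axis=0) + 6 * pts
              - 4 * np.roll(pts, 1, axis=0) + np.roll(pts, 2, axis=0))          # rigidity
        # external force: gradient of the attractor image, sampled at the points
        gy, gx = np.gradient(attractor.astype(float))
        idx = np.clip(np.round(pts).astype(int), 0, np.array(attractor.shape) - 1)
        ext = np.stack([gy[idx[:, 0], idx[:, 1]], gx[idx[:, 0], idx[:, 1]]], axis=1)
        return pts + step * (alpha * d2 - beta * d4 + ext)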
@mastersthesis{diva2:302987,
author = {Ahlberg, Jörgen},
title = {{Active Contours in Three Dimensions}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1708}},
year = {1996},
address = {Sweden},
}
This Master's Thesis addresses the problem of segmenting an image sequence with respect to the motion in the sequence. As a basis for the motion estimation, 3D orientation tensors are used. The goal of the segmentation is to partition the images into regions, characterized by having a coherent motion. The motion model is affine with respect to the image coordinates. A method to estimate the parameters of the motion model from the orientation tensors in a region is presented. This method can also be generalized to a large class of motion models.
Two segmentation algorithms are presented together with a postprocessing algorithm. All these algorithms are based on the competitive algorithm, a general method for distributing points between a number of regions, without relying on arbitrary threshold values. The first segmentation algorithm segments each image independently, while the second algorithm recursively takes advantage of the previous segmentation. The postprocessing algorithm stabilizes the segmentations of a whole sequence by imposing continuity constraints.
The algorithms have been implemented and the results of applying them to a test sequence are presented. Interesting properties of the algorithms are that they are robust to the aperture problem and that they do not require a dense velocity field.
It is finally discussed how the algorithms can be developed and improved. It is straightforward to extend the algorithms to base the segmentations on alternative or additional features, under not too restrictive conditions on the features.
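For illustration, the affine motion model itself can be fitted by least squares from sparse velocity samples as sketched below; the thesis instead estimates the parameters directly from orientation tensors, which is not reproduced here.

    import numpy as np

    def fit_affine_motion(xy, v):
        """Least-squares fit of an affine motion model, one set of coefficients
        per velocity component: v = [1, x, y] @ params.
        xy: (N, 2) image coordinates, v: (N, 2) measured velocities."""
        X = np.column_stack([np.ones(len(xy)), xy])      # design matrix [1, x, y]
        params, *_ = np.linalg.lstsq(X, v, rcond=None)   # shape (3, 2)
        return params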
@mastersthesis{diva2:302971,
author = {Farnebäck, Gunnar},
title = {{Motion-based segmentation of image sequences}},
school = {Linköping University},
type = {{LiTH-ISY-Ex No. 1596}},
year = {1996},
address = {Sweden},
}
Last updated: 2010-08-26