
Multimodal Scene Understanding
Algorithms, Applications and Deep Learning
Academic Press
Published on 17 July 2019
Book
Paperback/Softback
422 pages
978-0-12-817358-9 (ISBN)
Description
Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms.
Researchers collecting and analyzing multi-sensory data collections (for example, the KITTI benchmark with stereo and laser data) from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites, will find this book very useful.
More details
Language
English
Place of publication
San Diego
United States
Publishing group
Elsevier Science Publishing Co Inc
Target group
Professional and scholarly
Product notice
Paperback (trade)
Dimensions
Height: 235 mm
Width: 191 mm
Thickness: 22 mm
Weight
726 g
ISBN-13
978-0-12-817358-9 (9780128173589)
Other editions
Additional editions

Michael Ying Yang | Bodo Rosenhahn | Vittorio Murino
Multimodal Scene Understanding
Algorithms, Applications and Deep Learning
E-Book
07/2019
Academic Press
€122.00
Available for download
Persons
He is Assistant Professor at the University of Twente (the Netherlands), heading a group working on scene understanding. He received his PhD degree (summa cum laude) from the University of Bonn (Germany) in 2011. His research interests are in the fields of computer vision and photogrammetry, with specialization in scene understanding, deep learning, UAV vision, and multi-sensor fusion. He has published over 90 articles in international journals and conference proceedings. He serves as co-chair of ISPRS working group II/5 Dynamic Scene Analysis and is a recipient of the ISPRS President's Honorary Citation (2016) and the Best Science Paper Award at BMVC 2016. He has been a Senior Member of IEEE since 2016. He regularly serves as program committee member of conferences and reviewer for international journals.
His works received several awards, including a DAGM-Prize 2002, Dr.-Ing. Siegfried Werth Prize 2003, DAGM-Main Prize 2005, IVCNZ best student paper award, DAGM-Main Prize 2007, Olympus-Prize 2007, ICPRAM Best student paper award 2014, ICMC Best student paper award 2014, the WACV 2015 Challenge Award, the Guenter Enderle Award (Eurographics) 2017 and the CVPR 2017 Multi-Object Tracking Challenge. In 2011, the European Commission awarded Bodo Rosenhahn a 1.43 million Euro ERC Starting Grant and in 2013 a POC Grant. He has published more than 180 research papers, journal articles and book chapters, holds more than 10 patents and has edited several books.
Full professor at the University of Verona, Italy, and director of the PAVIS (Pattern Analysis and Computer Vision) department at the Istituto Italiano di Tecnologia. He received the Laurea degree in Electronic Engineering in 1989 and a Ph.D. in Electronic Engineering and Computer Science in 1993 from the University of Genova, Italy.
His main research interests include computer vision and pattern recognition/machine learning, in particular probabilistic techniques for image and video processing, with applications in video surveillance, biomedical image analysis and bioinformatics.
Editor
Scene Understanding Group, University of Twente, The Netherlands
Leibniz University Hannover, Germany
Professor, University of Verona, Italy, and Director, PAVIS (Pattern Analysis and Computer Vision), Istituto Italiano di Tecnologia
Content
1. Introduction to Multimodal Scene Understanding
Michael Ying Yang, Bodo Rosenhahn and Vittorio Murino
2. Multi-modal Deep Learning for Multi-sensory Data Fusion
Asako Kanezaki, Ryohei Kuga, Yusuke Sugano and Yasuyuki Matsushita
3. Multi-Modal Semantic Segmentation: Fusion of RGB and Depth Data in Convolutional Neural Networks
Zoltan Koppanyi, Dorota Iwaszczuk, Bing Zha, Can Jozef Saul, Charles K. Toth and Alper Yilmaz
4. Learning Convolutional Neural Networks for Object Detection with very little Training Data
Christoph Reinders, Hanno Ackermann, Michael Ying Yang and Bodo Rosenhahn
5. Multi-modal Fusion Architectures for Pedestrian Detection
Dayan Guan, Jiangxin Yang, Yanlong Cao, Michael Ying Yang and Yanpeng Cao
6. ThermalGAN: Multimodal Color-to-Thermal Image Translation for Person Re-Identification in Multispectral Dataset
Vladimir A. Knyaz and Vladimir V. Kniaz
7. A Review and Quantitative Evaluation of Direct Visual-Inertial Odometry
Lukas von Stumberg, Vladyslav Usenko and Daniel Cremers
8. Multimodal Localization for Embedded Systems: A Survey
Imane Salhi, Martyna Poreba, Erwan Piriou, Valerie Gouet-Brunet and Maroun Ojail
9. Self-Supervised Learning from Web Data for Multimodal Retrieval
Raul Gomez, Lluis Gomez, Jaume Gibert and Dimosthenis Karatzas
10. 3D Urban Scene Reconstruction and Interpretation from Multi-sensor Imagery
Hai Huang, Andreas Kuhn, Mario Michelini, Matthias Schmitz and Helmut Mayer
11. Decision Fusion of Remote Sensing Data for Land Cover Classification
Arnaud Le Bris, Nesrine Chehata, Walid Ouerghemmi, Cyril Wendl, Clement Mallet, Tristan Postadjian and Anne Puissant
12. Cross-modal learning by hallucinating missing modalities in RGB-D vision
Nuno Garcia, Pietro Morerio and Vittorio Murino