
Computer Vision - ECCV 2022 Workshops
Description
The 367 full papers included in this volume set were carefully reviewed and selected for inclusion in the ECCV 2022 workshop proceedings. They were organized in individual parts as follows:
Part I: W01 - AI for Space; W02 - Vision for Art; W03 - Adversarial Robustness in the Real World; W04 - Autonomous Vehicle Vision;
Part II: W05 - Learning With Limited and Imperfect Data; W06 - Advances in Image Manipulation;
Part III: W07 - Medical Computer Vision; W08 - Computer Vision for Metaverse; W09 - Self-Supervised Learning: What Is Next?;
Part IV: W10 - Self-Supervised Learning for Next-Generation Industry-Level Autonomous Driving; W11 - ISIC Skin Image Analysis; W12 - Cross-Modal Human-Robot Interaction; W13 - Text in Everything; W14 - BioImage Computing; W15 - Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications; W16 - AI for Creative Video Editing and Understanding; W17 - Visual Inductive Priors for Data-Efficient Deep Learning; W18 - Mobile Intelligent Photography and Imaging;
Part V: W19 - People Analysis: From Face, Body and Fashion to 3D Virtual Avatars; W20 - Safe Artificial Intelligence for Automated Driving; W21 - Real-World Surveillance: Applications and Challenges; W22 - Affective Behavior Analysis In-the-Wild;
Part VI: W23 - Visual Perception for Navigation in Human Environments: The JackRabbot Human Body Pose Dataset and Benchmark; W24 - Distributed Smart Cameras; W25 - Causality in Vision; W26 - In-Vehicle Sensing and Monitorization; W27 - Assistive Computer Vision and Robotics; W28 - Computational Aspects of Deep Learning;
Part VII: W29 - Computer Vision for Civil and Infrastructure Engineering; W30 - AI-Enabled Medical Image Analysis: Digital Pathology and Radiology/COVID19; W31 - Compositional and Multimodal Perception;
Part VIII: W32 - Uncertainty Quantification for Computer Vision; W33 - Recovering 6D Object Pose; W34 - Drawings and Abstract Imagery: Representation and Analysis; W35 - Sign Language Understanding; W36 - A Challenge for Out-of-Distribution Generalization in Computer Vision; W37 - Vision With Biased or Scarce Data; W38 - Visual Object Tracking Challenge.

Content
- Intro
- Foreword
- Preface
- Organization
- Contents - Part I
- W01 - AI for Space
- Globally Optimal Event-Based Divergence Estimation for Ventral Landing
- 1 Introduction
- 2 Related Work
- 3 Geometry of Ventral Landing
- 3.1 Continuous-Time Optic Flow
- 3.2 Continuous-Time Radial Flow and Divergence
- 3.3 Event-Based Radial Flow
- 4 Exact Algorithm for Divergence Estimation
- 4.1 Optimisation Domain and Retrieval of Divergence
- 4.2 Contrast Maximisation
- 4.3 Exact Algorithm
- 4.4 Divergence Estimation for Event Stream
- 5 Ventral Landing Event Dataset
- 6 Results
- 6.1 Qualitative Results
- 6.2 Quantitative Results on Full Resolution Sequences
- 6.3 Quantitative Results on Resized and Resampled Sequences
- 7 Conclusions and Future Work
- References
- Transfer Learning for On-Orbit Ship Segmentation
- 1 Introduction
- 2 Background
- 3 Related Work
- 4 Dataset
- 4.1 Dataset for Ground Sampling Distance Experiments
- 4.2 Dataset for Band Misalignment Experiments
- 5 Experiments
- 5.1 Ground Sampling Distance Robustness
- 5.2 Band Misalignment Robustness
- 6 Future Work and Conclusion
- References
- Spacecraft Pose Estimation Based on Unsupervised Domain Adaptation and on a 3D-Guided Loss Combination
- 1 Introduction
- 2 Related Work
- 2.1 Pose Estimation
- 2.2 Domain Adaptation
- 3 Pose Estimation
- 3.1 2D-2D: Key-Point Heatmap Loss
- 3.2 3D-2D: PnP Loss
- 3.3 3D-3D: Structure Loss
- 4 Domain Adaptation
- 5 Evaluation
- 5.1 Metrics
- 5.2 Challenge Results
- 5.3 Additional Experiments
- 6 Conclusions
- References
- MaRF: Representing Mars as Neural Radiance Fields
- 1 Introduction
- 2 Related Works
- 2.1 Mixed Reality for Space Applications
- 2.2 Neural Radiance Fields (NeRF)
- 2.3 NeRF Landscape
- 3 Mars Radiance Fields (MaRF)
- 4 Bootstrapping the Uncertainty
- 5 Conclusion and Discussion
- References
- Asynchronous Kalman Filter for Event-Based Star Tracking
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Method
- 4.1 Event Camera Calibration
- 4.2 Stars' Log Intensity Model
- 4.3 Star Tracking
- 5 Experiments
- 5.1 Event Camera Calibration
- 5.2 Gaussian Model for Star's Log Intensity
- 5.3 Simulated Star Tracking
- 5.4 Real Data Star Tracking
- 6 Conclusion
- References
- Using Moffat Profiles to Register Astronomical Images
- 1 Introduction
- 1.1 Image Registration
- 1.2 Moffat PSF
- 2 Related Work
- 2.1 AstroAlign and Astrometry.net
- 2.2 Use of Image Registration in Astronomical Super Resolution
- 2.3 Sequential Image Registration for Astronomical Images
- 3 Methodology
- 3.1 Image Synthesis
- 3.2 Feature Detection
- 3.3 Feature Matching
- 3.4 Recovery of Transformation Parameters
- 4 Experimental Design
- 4.1 Synthetic Images
- 4.2 Comparing Recovered Transformation Parameters to Generated Transformation Parameters
- 5 Results
- 5.1 Synthetic Images
- 5.2 Comparing Recovered Transformation Parameters to Generated Transformation Parameters with SDSS Imagery
- 6 Conclusion
- 7 Future Work
- References
- Mixed-Domain Training Improves Multi-mission Terrain Segmentation
- 1 Introduction
- 2 Related Work
- 2.1 Planetary Computer Vision
- 3 Methodology
- 3.1 Dataset Composition
- 3.2 Semi-supervised Finetuning
- 4 Results
- 4.1 Analysis of Dataset Composition
- 4.2 Segmentation with Fewer Labels
- 4.3 Ablation Studies
- 5 Conclusion
- References
- CubeSat-CDT: A Cross-Domain Dataset for 6-DoF Trajectory Estimation of a Symmetric Spacecraft
- 1 Introduction
- 2 Related Datasets
- 3 Proposed CubeSat-CDT Dataset
- 3.1 Zero-G Laboratory Data
- 3.2 Unity-Based SPARK Synthetic Data
- 3.3 Blender-Based Synthetic Data
- 3.4 Discussion
- 4 Proposed Baseline for Spacecraft Trajectory Estimation
- 4.1 Problem Formulation
- 4.2 Proposed Approach
- 4.3 Justification of the Proposed Approach
- 5 Experiments
- 5.1 Domain Gap Analysis
- 5.2 Impact of Temporal Information
- 6 Conclusions
- References
- Data Lifecycle Management in Evolving Input Distributions for Learning-based Aerospace Applications
- 1 Introduction
- 2 Background
- 2.1 Challenges for Data Lifecycle Management
- 2.2 Out-of-Distribution Detection
- 2.3 Bayesian Batch Active Learning
- 3 Problem Formulation and Framework
- 4 Evaluation Benchmark
- 5 Diverse Subsampling Using SCOD (DS-SCOD)
- 6 Experimental Results
- 7 Conclusion
- References
- Strong Gravitational Lensing Parameter Estimation with Vision Transformer
- 1 Introduction
- 2 Data and Models
- 2.1 Simulation and Datasets
- 2.2 Models
- 3 Results
- 4 Conclusion
- References
- End-to-end Neural Estimation of Spacecraft Pose with Intermediate Detection of Keypoints
- 1 Introduction
- 2 Method
- 2.1 Keypoint Detection Network
- 2.2 Pose Inference Network
- 2.3 Joint Training of KDN and PIN
- 3 Experiments
- 3.1 Dataset and Implementation Details
- 3.2 Comparisons with a PnP-based Solution and Prior Works
- 3.3 Complexity Reduction
- 3.4 Ablation Study
- 4 Conclusions
- References
- Improving Contrastive Learning on Visually Homogeneous Mars Rover Images
- 1 Introduction
- 2 Related Work
- 3 Datasets
- 3.1 Perseverance Rover Images
- 3.2 Curiosity Rover Images
- 3.3 Mars Reconnaissance Orbiter Images
- 4 Methods
- 4.1 Contrastive Learning
- 4.2 Cluster-Aware Contrastive Learning
- 4.3 Mixed-Domain Contrastive Learning
- 4.4 Benchmarking Learned Feature Extraction with Linear Evaluation
- 5 Results and Discussion
- 5.1 Adhering to the Contrastive Assumption Strengthens Performance
- 5.2 Learned Features Generalise to Out-of-domain Tasks
- 5.3 Mixed-Domain Approaches Increase Dataset Variability and Performance
- 6 Ablation Studies
- 6.1 10% of the Labels Is Sufficient to Transfer to Other Tasks
- 6.2 Cluster-Aware Contrastive Learning Is Sensitive to the Number of Clusters Chosen
- 7 Conclusion
- References
- Monocular 6-DoF Pose Estimation for Non-cooperative Spacecrafts Using Riemannian Regression Network
- 1 Introduction
- 2 Related Work
- 2.1 Template-Matching Monocular Pose Estimation
- 2.2 Appearance-Based Monocular Pose Estimation
- 2.3 CNN-Based Monocular Pose Estimation
- 3 Proposed Method
- 3.1 Backbone
- 3.2 Pose Prediction Heads and Loss Function
- 3.3 Data Pre-processing and Augmentation
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Comparative Analysis
- 4.3 Ablation Study
- 5 Conclusions
- References
- W02 - Vision for Art
- HyperNST: Hyper-Networks for Neural Style Transfer
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 HyperNST Architecture
- 3.2 Conditioning on Style, and Facial Semantic Regions
- 3.3 HyperNST Training Process
- 3.4 Stylized Target Generator
- 3.5 Region Mask Driven Patch Discriminator
- 4 Evaluation
- 4.1 Datasets
- 4.2 Baselines
- 4.3 Ablations
- 4.4 Downstream Experiments
- 5 Conclusion
- References
- DEArt: Dataset of European Art
- 1 Introduction
- 2 Related Work
- 3 Dataset
- 3.1 Object Categories
- 3.2 Pose Categories
- 3.3 Image Collection Process
- 3.4 Image Annotation
- 3.5 Dataset Statistics
- 4 Experiments
- 4.1 Object Detection
- 4.2 Pose Classification
- 5 Discussion
- 5.1 Poses
- 5.2 Generating New Object Labels Without Annotation
- 5.3 Limitations
- References
- How Well Do Vision Transformers (VTs) Transfer to the Non-natural Image Domain? An Empirical Study Involving Art Classification
- 1 Introduction
- 2 Preliminaries
- 2.1 Transfer Learning
- 2.2 Related Works
- 3 Methods
- 3.1 Data
- 3.2 Neural Architectures
- 3.3 Training Procedure
- 3.4 Hardware and Software
- 4 Results
- 4.1 Off-The-Shelf Learning
- 4.2 Fine-Tuning
- 5 Discussion
- 6 Additional Studies
- 6.1 Saliency Maps
- 6.2 Dealing with Small Artistic Collections
- 6.3 Training Times
- 7 Conclusion
- References
- On-the-Go Reflectance Transformation Imaging with Ordinary Smartphones
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Data Acquisition
- 3.2 The Reflectance Model
- 3.3 Neural Model
- 4 Experimental Results
- 4.1 Parameter Study
- 4.2 Comparisons
- 5 Conclusions
- References
- Is GPT-3 All You Need for Visual Question Answering in Cultural Heritage?
- 1 Introduction
- 2 Related Work
- 3 GPT-3
- 4 Method
- 5 Experiments
- 5.1 Dataset
- 5.2 Experimental Protocol
- 5.3 Experimental Results
- 6 Qualitative Analysis
- 7 Considerations on Complexity and Accessibility of GPT-3
- 8 Conclusions
- References
- Automatic Analysis of Human Body Representations in Western Art
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Average Contours of Body Segments
- 3.2 Visualising Joint Distributions
- 3.3 Hierarchical Clustering
- 4 The Artistic Pose (AP) Dataset
- 5 Experimental Results
- 5.1 Comparison of OpenPose and DensePose
- 5.2 Average Contours of Body Segments
- 5.3 Visualising Joint Distributions
- 5.4 Hierarchical Clustering
- 6 Conclusions
- References
- ArtFacePoints: High-Resolution Facial Landmark Detection in Paintings and Prints
- 1 Introduction
- 2 Related Work
- 3 Artistic Facial Landmarks Dataset Creation
- 3.1 Synthetic Image Generation Using Style Transfer and Geometric Augmentations
- 3.2 Semi-automatic Facial Landmarks Annotation
- 4 ArtFacePoints
- 4.1 Global Facial Landmark Detection
- 4.2 Regional Facial Landmarks Refinement
- 5 Experiments and Results
- 5.1 Datasets
- 5.2 Implementation and Experimental Details
- 5.3 Results
- 6 Applications
- 6.1 Image Registration Using Facial Landmarks
- 6.2 Facial Image and Contour Comparison
- 7 Conclusions
- References
- W03 - Adversarial Robustness in the Real World
- TransPatch: A Transformer-based Generator for Accelerating Transferable Patch Generation in Adversarial Attacks Against Object Detection Models
- 1 Introduction
- 2 Related Work
- 2.1 Adversarial Attack on Object Detection
- 2.2 Adversarial Attacks
- 2.3 Generative Models in Adversarial Attack
- 3 Methodology
- 3.1 Overall Pipeline
- 3.2 Generator
- 3.3 Design of Loss Function
- 4 Experiments
- 4.1 Datasets
- 4.2 Adversarial Hiding Attack
- 4.3 Ablation Study
- 5 Physical Attack
- 5.1 Addition Constrains
- 6 Conclusion
- References
- Feature-Level Augmentation to Improve Robustness of Deep Neural Networks to Affine Transformations
- 1 Introduction
- 2 Related Work
- 3 Method
- 4 Experiments
- 4.1 Data Sets
- 4.2 Evaluation Setup
- 4.3 Results
- 5 Conclusion
- References
- Benchmarking Robustness Beyond lp Norm Adversaries
- 1 Introduction
- 2 Related Work
- 3 Proposed Wide Angle Anomalies Dataset
- 3.1 Common Corruptions
- 3.2 Adversarial Examples
- 4 Experimental Results and Analysis
- 4.1 Results and Analysis
- 5 Conclusion
- References
- Masked Faces with Faced Masks
- 1 Introduction
- 2 Related Work
- 2.1 Face Recognition
- 2.2 Mask Detection
- 2.3 Adversarial Attack
- 3 Methodology
- 3.1 Delaunay-Based Masking and Motivation
- 3.2 Adversarial Noise Delaunay-Based Masking
- 3.3 Adversarial Filtering Delaunay-Based Masking
- 3.4 Algorithm for MF2M
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Comparison on White-Box Attack
- 4.3 Comparison on Black-Box Attack Transferability
- 4.4 Extension to Physical Attack
- 5 Conclusions
- References
- Adversarially Robust Panoptic Segmentation (ARPaS) Benchmark
- 1 Introduction
- 2 Related Work
- 3 Adversarially Robust Panoptic Segmentation Benchmark
- 3.1 Assessing Panoptic Segmentation Robustness
- 3.2 Adversarial Training
- 4 Experiments
- 4.1 Experimental Setting
- 4.2 Assessing the Robustness of Panoptic Segmentation Approaches
- 4.3 Quantifying the Effects of Common Corruptions
- 4.4 Diagnosing the Brittleness of Panoptic Segmentation Methods
- 4.5 Ablation Experiments
- 5 Conclusion
- References
- BadDet: Backdoor Attacks on Object Detection
- 1 Introduction
- 2 Related Work
- 3 Background
- 3.1 Notations of Object Detection
- 3.2 General Pipeline of Backdoor Attacks
- 3.3 Threat Model
- 4 Methodology
- 4.1 Backdoor Attack Settings
- 4.2 Evaluation Metrics
- 5 Experiments
- 6 Detector Cleanse
- 7 Conclusion
- References
- Universal, Transferable Adversarial Perturbations for Visual Object Trackers
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Overall Pipeline
- 3.2 Training the Generator
- 3.3 Universal Perturbations: Inference Time
- 4 Experiments
- 5 Results
- 6 Conclusion
- References
- Why Is the Video Analytics Accuracy Fluctuating, and What Can We Do About It?
- 1 Introduction
- 2 Motivation
- 2.1 Object Detection in Videos
- 2.2 Face Detection in Videos
- 3 Analysis and Control of External Factors
- 3.1 Control for Motion
- 3.2 Analysis and Control for Video Compression
- 3.3 Analysis and Control for Flicker
- 3.4 Impact of Different Camera Models
- 4 Camera as an Unintentional Adversary
- 4.1 Hypothesis
- 4.2 Hypothesis Validation
- 5 Implications
- 5.1 Retraining Image-Based Models with Transfer Learning
- 5.2 Calibrating Softmax Confidence Scores
- 6 Related Work
- 7 Conclusion
- References
- SkeleVision: Towards Adversarial Resiliency of Person Tracking with Multi-Task Learning
- 1 Introduction
- 2 Related Work
- 2.1 Multi-Task Learning
- 2.2 Adversarial Attacks
- 2.3 Adversarial Defenses in Tracking
- 3 Preliminaries
- 3.1 Tracking with SiamRPN
- 3.2 Multi-Task Learning with Shared Backbone
- 3.3 Adversarial Attacks
- 4 Experiment Setup
- 4.1 Architecture
- 4.2 Training Data
- 4.3 Multi-Task Training
- 4.4 Evaluation
- 5 Results
- 5.1 Varying MTL Weight
- 5.2 Increasing Depth of Keypoint Head
- 5.3 Pre-training the Keypoint Head
- 6 Conclusion
- References
- Unrestricted Black-Box Adversarial Attack Using GAN with Limited Queries
- 1 Introduction
- 2 Related Work
- 2.1 Adversarial Examples
- 2.2 Generative Adversarial Networks
- 2.3 Unrestricted Adversarial Attacks
- 3 Proposed Methods
- 3.1 Decision-Based Attack in Latent Space
- 3.2 Encoding Algorithm
- 4 Experiments
- 4.1 Experiment Settings
- 4.2 Gender Classification
- 4.3 Identity Recognition
- 4.4 Real-World Application
- 5 Discussion
- 6 Conclusion
- References
- Truth-Table Net: A New Convolutional Architecture Encodable by Design into SAT Formulas
- 1 Introduction
- 2 Related Work and Background
- 3 Truth Table Deep Convolution Neural Network (TTnet)
- 3.1 SAT Encoding of a One-Layer 2D-CNN
- 3.2 Learning Truth Table (LTT) Block
- 3.3 The TTnet Architecture
- 3.4 Experiments: Natural Accuracy Comparison
- 4 Application to Complete Robustness Verification
- 4.1 Post-tuning: Characterizing and Filtering Overfitting DNF Clauses
- 4.2 Tractability: Computing all the Possibilities of the Adversarial Setup Before Production
- 4.3 Formal and Complete Robustness Verification for Untargeted Attack
- 5 Limitation and Future Work
- 6 Conclusion
- References
- Attribution-Based Confidence Metric for Detection of Adversarial Attacks on Breast Histopathological Images
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 4 Experimental Results
- 4.1 Implementation Details
- 4.2 Classification Accuracy
- 4.3 Adversarial Images Generation
- 4.4 Detection of Adversarial Images
- 5 Conclusions
- References
- Improving Adversarial Robustness by Penalizing Natural Accuracy
- 1 Introduction
- 1.1 Additional Related Work
- 2 Preliminaries and Notation
- 3 NAP Loss
- 3.1 Comparison to Other Losses
- 4 Experiments
- 4.1 Experimental Details
- 4.2 Results
- 5 Ablations
- 5.1 Boosted Cross-Entropy
- 6 Discussion
- 7 Conclusions
- A Hyperparameter Search
- B Softmax Distributions
- References
- W04 - Autonomous Vehicle Vision
- 4D-StOP: Panoptic Segmentation of 4D LiDAR Using Spatio-Temporal Object Proposal Generation and Aggregation
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 4D Volume Formation
- 3.2 4D-StOP
- 3.3 Tracking
- 3.4 Training and Implementation Details
- 4 Experiments
- 4.1 Comparing with State-of-the-Art Methods
- 4.2 Analysis
- 4.3 Qualitative Results
- 5 Conclusion
- References
- BlindSpotNet: Seeing Where We Cannot See
- 1 Introduction
- 2 Related Work
- 3 Road Blind Spot Dataset
- 3.1 T-Frame Blind Spots
- 3.2 Visibility Mask
- 4 BlindSpotNet
- 4.1 Network Architecture
- 4.2 Knowledge Distillation
- 4.3 Loss Function
- 5 Experimental Results
- 5.1 RBS Dataset Evaluation
- 5.2 BlindSpotNet Evaluation
- 6 Conclusion
- References
- Gesture Recognition with Keypoint and Radar Stream Fusion for Automated Vehicles
- 1 Introduction
- 2 Related Work
- 3 Fusion Method
- 3.1 Neural Network Architecture
- 3.2 Model Training
- 4 Results
- 4.1 Dataset
- 4.2 Experimental Setup
- 4.3 Results
- 4.4 Results with Single Modality
- 5 Ablation Studies
- 5.1 Amount of Single-Modality Training Samples
- 5.2 Loss Function
- 6 Conclusion
- References
- An Improved Lightweight Network Based on YOLOv5s for Object Detection in Autonomous Driving
- 1 Introduction
- 2 Methods
- 2.1 YOLOv5s
- 2.2 CoordConv Module
- 2.3 Hierarchical-Split Block (HS Block)
- 2.4 Proposed Network
- 3 Experiment
- 3.1 Datasets
- 3.2 Evaluation Criteria
- 3.3 Implementation Details and Hardware for Training
- 4 Results
- 4.1 Qualitative Results
- 4.2 Quantitative Results
- 4.3 Detection Performance on Objects with Different Sizes
- 4.4 Model Evaluation in Extreme Weather Situations
- 5 Conclusions
- References
- Plausibility Verification for 3D Object Detectors Using Energy-Based Optimization
- 1 Introduction
- 2 Related Work
- 3 Concepts and Proposed Method
- 3.1 Chamfer Distance Energy Function
- 3.2 Silhouette Alignment Energy Function (SAEF)
- 3.3 Height over Ground Energy Function
- 3.4 Rotation Consistent Energy Function
- 4 Method
- 4.1 Metric
- 5 Experiments
- 6 Conclusion
- References
- Lane Change Classification and Prediction with Action Recognition Networks
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Problem Formulation
- 3.2 RGB+3DN: 3D Networks and RGB Video Data
- 3.3 RGB+BB+3DN: 3D Networks and Video Combined with Bounding Box Data
- 4 Experiments
- 4.1 Dataset
- 4.2 Evaluation Metrics
- 4.3 Lane Change Classification and Prediction with RGB+3DN
- 4.4 Lane Change Classification and Prediction with RGB+BB+3DN
- 4.5 Comparison to Previous Methods
- 4.6 Class Activation Maps
- 4.7 Optimizing Temporal Information Extraction
- 5 Conclusions
- References
- Joint Prediction of Amodal and Visible Semantic Segmentation for Automated Driving
- 1 Introduction
- 2 Related Work
- 2.1 Amodal Segmentation
- 2.2 Joint Amodal and Visible Semantic Segmentation
- 3 Proposed Method for Joint Amodal and Visible Semantic Segmentation
- 4 Experimental Validation and Discussion
- 4.1 Datasets and Metrics
- 4.2 Training Details
- 4.3 Quantitative Results
- 4.4 Qualitative Results
- 5 Conclusion
- References
- Human-Vehicle Cooperative Visual Perception for Autonomous Driving Under Complex Traffic Environments
- 1 Introduction
- 1.1 Visual Perception of Autonomous Driving
- 1.2 Cooperative Visual Perception Methods
- 2 Methodology
- 2.1 Data Acquisition Under Complex Traffic Environments
- 2.2 Data Preprocessing
- 2.3 Data Fusion Algorithm
- 2.4 Data Postprocessing
- 3 Experiment
- 3.1 Setup
- 3.2 Object Detection of Vehicle's Visual Perception
- 3.3 Gaze Point Fusion
- 3.4 Ground-Truth Processing
- 3.5 Validation of Human-Vehicle Cooperative Visual Perception
- 4 Conclusion
- References
- MCIP: Multi-Stream Network for Pedestrian Crossing Intention Prediction
- 1 Introduction
- 2 Related Work
- 3 MCIP: Multi-Stream Network for Pedestrian Crossing Intention Prediction
- 3.1 Model Architecture
- 3.2 Input Acquisition
- 3.3 Implementation Unit
- 4 Experiments
- 4.1 Dataset
- 4.2 Experimental Settings
- 5 Results
- 5.1 Qualitative Results
- 5.2 Quantitative Results
- 6 Conclusions
- References
- SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking
- 1 Introduction
- 2 Related Work
- 2.1 3D MOT
- 2.2 2D MOT
- 3 3D MOT Pipeline
- 4 Analyzing and Improving 3D MOT
- 4.1 Pre-processing
- 4.2 Motion Model
- 4.3 Association
- 4.4 Life Cycle Management
- 4.5 Integration of SimpleTrack
- 5 Rethinking NuScenes
- 5.1 Detection Frequencies
- 5.2 Tracklet Interpolation
- 6 Error Analyses
- 6.1 Upper Bound Experiment Settings
- 6.2 Analyses for "Tracking by Detection"
- 7 Conclusions and Future Work
- References
- Ego-Motion Compensation of Range-Beam-Doppler Radar Data for Object Detection
- 1 Introduction
- 2 Radar Processing Theory
- 2.1 Radar Foundations
- 3 Methods and Experiments
- 3.1 Ego-Motion Compensation
- 3.2 Radar Object Detection
- 3.3 Dataset
- 4 Results
- 5 Discussion
- 6 Conclusion
- References
- RPR-Net: A Point Cloud-Based Rotation-Aware Large Scale Place Recognition Network
- 1 Introduction
- 2 Related Works
- 2.1 Rotation-Invariant Convolution
- 2.2 Large Scale Place Recognition
- 3 Methodology
- 3.1 Overview
- 3.2 Low Level Rotation Invariant Features
- 3.3 Attentive Rotation-Invariant Convolution Operation
- 4 Experiments
- 4.1 Dataset
- 4.2 Implementation Details
- 4.3 Results
- 4.4 Ablation Study
- 5 Conclusion
- References
- Learning 3D Semantics From Pose-Noisy 2D Images with Hierarchical Full Attention Network
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Hierarchical Full Attention Network
- 4.1 Patch Attention for Patch Aggregation
- 4.2 Instance Attention for Image Aggregation
- 4.3 Inter-point Attention for 3D Points Aggregation
- 5 Discussion on HiFANet Design Motivation
- 5.1 Pose Noise and Patch Observation
- 5.2 View-Angle and Void Projection
- 6 Experiments and Results
- 6.1 More Experimental Result
- 6.2 Discussion on Pose Noise
- 7 Conclusion
- References
- SIM2E: Benchmarking the Group Equivariant Capability of Correspondence Matching Algorithms
- 1 Introduction
- 2 SIM2E Dataset
- 2.1 Data Collection and Augmentation
- 2.2 SIM2E-SO2S, SIM2E-Sim2S, and SIM2E-PersS
- 2.3 Comparison with Other Public Correspondence Matching Datasets
- 3 Experiments
- 3.1 Experimental Setup
- 3.2 Comparison of the SoTA Approaches on the Rotated-HPatches Dataset
- 3.3 Comparison of the SoTA Approaches on Our SIM2E Dataset
- 3.4 Discussion
- 4 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (only limited: Kindle).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; in the event of misuse, it can be used to identify the purchaser and to provide evidence for legal purposes.
For more information, see our eBook Help page.