
Computer Vision - ECCV 2020
Description
The 1360 revised papers presented in these proceedings were carefully reviewed and selected from a total of 5025 submissions. The papers deal with topics such as computer vision; machine learning; deep neural networks; reinforcement learning; object recognition; image classification; image processing; object detection; semantic segmentation; human pose estimation; 3D reconstruction; stereo vision; computational photography; neural networks; image coding; image reconstruction; motion estimation.

Content
- Intro
- Foreword
- Preface
- Organization
- Contents - Part XXII
- Object Tracking Using Spatio-Temporal Networks for Future Prediction Location
- 1 Introduction
- 2 Related Works
- 3 Proposed Method
- 3.1 Tracker Module
- 3.2 Background Motion
- 3.3 Trajectory Prediction
- 4 Implementation Details
- 5 Experiment Results
- 5.1 Comparison with the State-of-the-Art
- 5.2 Attributes Analysis
- 5.3 Ablation Studies
- 6 Conclusions
- References
- Pillar-Based Object Detection for Autonomous Driving
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Preliminaries
- 3.2 Overall Architecture
- 3.3 Cylindrical View
- 3.4 Pillar-Based Prediction
- 3.5 Bilinear Interpolation
- 3.6 Loss Function
- 4 Experiments
- 4.1 Results Compared to State-of-the-Art
- 4.2 Comparing Anchor-Based, Point-Based, and Pillar-Based Prediction
- 4.3 View Combinations
- 4.4 Bilinear Interpolation or Nearest Neighbor Interpolation?
- 5 Discussion
- References
- Sparse Adversarial Attack via Perturbation Factorization
- 1 Introduction
- 2 Related Work
- 3 Sparse Adversarial Attack
- 3.1 Preliminaries of Adversarial Attack
- 3.2 Sparse Adversarial Attack via Perturbation Factorization
- 3.3 Continuous Optimization for the MIP Problem
- 3.4 Two Extensions of SAPF
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Experimental Comparisons Between SAPF and Other Methods
- 4.3 Results of Group-Wise Sparsity and Visual Imperceptibility
- 4.4 Supplementary Materials
- 5 Conclusion
- References
- 3D Scene Reconstruction from a Single Viewport
- 1 Introduction
- 2 Related Work
- 3 Problem Description and General Approach
- 4 Generating Synthetic 3D Training Data
- 4.1 Viewport Alignment
- 4.2 Fast Generation of TSDF Voxel Data
- 4.3 Spatial Compression
- 5 Proposed Network Architecture
- 5.1 Tree Network
- 5.2 Multipath
- 5.3 General Architecture
- 6 Loss Shaping
- 6.1 Output Loss Shaping
- 6.2 Tree Loss Shaping
- 7 Experiments
- 7.1 Test Setup
- 7.2 Qualitative Results
- 7.3 Quantitative Results
- 8 Conclusion
- References
- Learning to Optimize Domain Specific Normalization for Domain Generalization
- 1 Introduction
- 2 Related Work
- 2.1 Domain Generalization
- 2.2 Multi-source Domain Adaptation
- 2.3 Normalization in Neural Networks
- 3 Domain-Specific Optimized Normalization for Domain Generalization
- 3.1 Overview
- 3.2 Instance Normalization for Domain Generalization
- 3.3 Optimization for Domain-Specific Normalization
- 3.4 Inference
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Comparison with Other Methods
- 4.3 Ablation Study
- 4.4 Additional Experiments
- 4.5 Analysis
- 5 Conclusion
- References
- Self-supervised Outdoor Scene Relighting
- 1 Introduction
- 2 Related Work
- 3 Overview
- 4 Inverse Rendering
- 5 Neural Rendering
- 5.1 Losses
- 5.2 Shadow Prediction Network
- 5.3 Sky GAN
- 5.4 Training
- 6 Results
- 6.1 Outdoor Relighting Benchmarking Dataset
- 6.2 Qualitative Evaluation
- 6.3 Quantitative Evaluation
- 7 Discussion
- 8 Conclusion
- References
- Privacy Preserving Visual SLAM
- 1 Introduction
- 2 Related Works
- 2.1 Visual SLAM
- 2.2 Map Representation with Line Cloud
- 2.3 Bundle Adjustment for Map Optimization
- 3 Proposed Method
- 3.1 System Overview
- 3.2 Relocalization and Loop Detection with a Line Cloud
- 3.3 2D-3D Matching with 3D Lines and Points
- 3.4 Bundle Adjustments with a Line Cloud
- 4 Experiments
- 4.1 Experimental Setting
- 4.2 Implementation Details
- 4.3 Dataset and Prebuilt Map Creation
- 4.4 Quantitative Evaluation
- 4.5 Qualitative Evaluation
- 5 Conclusions
- References
- Leveraging Acoustic Images for Effective Self-supervised Audio Representation Learning
- 1 Introduction
- 2 Related Works
- 3 ACIVW: ACoustic Images and Videos in the Wild
- 4 The Method
- 4.1 Input Data
- 4.2 Single Data Stream Models
- 4.3 Pretext Task
- 4.4 Knowledge Distillation
- 5 Experiments
- 5.1 Cross-Modal Retrieval
- 5.2 Classification
- 6 Conclusions
- References
- Learning Joint Visual Semantic Matching Embeddings for Language-Guided Retrieval
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 3.1 Visual Semantic Embedding
- 3.2 Image-Text Compositional Embedding
- 4 Experiments
- 4.1 Implement Details
- 4.2 Fashion-200k
- 4.3 UT-Zap50K
- 4.4 Fashion-IQ
- 4.5 Ablation Study and Other Tasks
- 5 Conclusion
- References
- Globally Optimal and Efficient Vanishing Point Estimation in Atlanta World
- 1 Introduction
- 2 Related Work
- 3 Algorithm Overview
- 4 Simplified Case Without Perturbation
- 4.1 Defining Dominant Plane and Candidate Region
- 4.2 Mining Candidate Interval
- 4.3 Stabbing Candidate Intervals by Probes
- 5 Practical Case with Perturbation
- 5.1 Bounds of Number of Inliers
- 5.2 Collaboration Between BnB and MnS
- 6 Experiments
- 6.1 Synthetic Dataset
- 6.2 Real-World Dataset
- 7 Conclusions
- References
- StyleGAN2 Distillation for Feed-Forward Image Manipulation
- 1 Introduction
- 2 Related Work
- 3 Method Overview
- 3.1 Data Collection
- 3.2 Training Process
- 4 Experiments
- 4.1 Evaluation Protocol
- 4.2 Distillation of Image-to-image Translation
- 4.3 Distillation of Style Mixing
- 5 Conclusions
- References
- Self-Prediction for Joint Instance and Semantic Segmentation of Point Clouds
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Self-Prediction
- 3.2 Associated Learning Framework
- 3.3 Optimization Objectives
- 4 Experiments
- 4.1 Experiment Settings
- 4.2 Segmentation Results on S3DIS
- 4.3 Segmentation Results on ShapeNet
- 4.4 Ablation Study
- 5 Conclusion
- References
- Learning Disentangled Representations via Mutual Information Estimation
- 1 Introduction
- 2 Related Work
- 3 Mutual Information
- 4 Method
- 4.1 Shared Representation Learning
- 4.2 Exclusive Representation Learning
- 4.3 Implementation Details
- 5 Experiments
- 5.1 Datasets
- 5.2 Representation Disentanglement Evaluation
- 5.3 Analysis of the Objective Function
- 5.4 Satellite Applications
- 6 Conclusions
- References
- Challenge-Aware RGBT Tracking
- 1 Introduction
- 2 Related Work
- 2.1 RGBT Tracking Methods
- 2.2 Multi-task Learning
- 3 Challenge-Aware RGBT Tracker
- 3.1 Challenge-Aware Neural Network
- 3.2 Training Algorithm
- 3.3 Online Tracking
- 4 Performance Evaluation
- 4.1 Experimental Setting
- 4.2 Quantitative Comparison
- 4.3 In-Depth Analysis of the Proposed CAT
- 5 Conclusion
- References
- Fully Trainable and Interpretable Non-local Sparse Models for Image Restoration
- 1 Introduction
- 2 Preliminaries and Related Work
- 3 Proposed Approach
- 3.1 Trainable Sparse Coding (without Self-similarities)
- 3.2 Differentiable Relaxation for Non-local Sparse Priors
- 3.3 Similarity Metrics
- 3.4 Extension to Blind Denoising and Parameter Sharing
- 3.5 Extension to Demosaicking
- 3.6 Practical Variants and Implementation
- 4 Experiments
- 5 Conclusion
- References
- AutoSimulate: (Quickly) Learning Synthetic Data Generation
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 AutoSimulate
- 4.1 Stochastic Simulator (Data Generating Distribution)
- 4.2 Efficient Numerical Computation
- 5 Experiments
- 5.1 CLEVR Blender
- 5.2 Photorealistic Renderer Arnold
- 5.3 Additional Studies
- 6 Conclusion
- References
- LatticeNet: Towards Lightweight Image Super-Resolution with Lattice Block
- 1 Introduction
- 2 Related Work
- 2.1 Deep SR Models
- 2.2 Lightweight SR Models
- 2.3 Attention Mechanism
- 3 Proposed Method
- 3.1 From Lattice Filter to Lattice Block
- 3.2 Network Architecture
- 3.3 Backward Fusion Module
- 3.4 Loss Function
- 3.5 Discussions
- 4 Experiments
- 4.1 Datasets
- 4.2 Implementation Details
- 4.3 The Contribution of Lattice Block for Lightweight
- 4.4 Ablation Analysis
- 4.5 Comparisons with the State-of-the-arts
- 5 Conclusions
- References
- Learning from Scale-Invariant Examples for Domain Adaptation in Semantic Segmentation
- 1 Introduction
- 2 Related Works
- 3 Methodology
- 3.1 Preliminaries
- 3.2 Class-Based Sorting for Target Subset Selection
- 3.3 Dynamic Entropy Threshold for Class Dependent Filter Selection
- 3.4 Self-generated Scale-Invariant Examples
- 3.5 Leveraging Focal Loss for Class-Imbalance
- 3.6 Adaptation
- 4 Experiments and Results
- 4.1 Experimental Details
- 4.2 Comparisons with State-of-the-art Methods
- 4.3 Analysis
- 5 Conclusion
- References
- Active Visual Information Gathering for Vision-Language Navigation
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 A Naïve Model with a Simple Exploration Ability
- 3.2 Where to Explore
- 3.3 Deeper Exploration
- 3.4 Training
- 4 Experiment
- 4.1 Experimental Setup
- 4.2 Comparison Results
- 4.3 Diagnostic Experiments
- 5 Conclusion
- References
- Deep Hough-Transform Line Priors
- 1 Introduction
- 2 Related Work
- 3 Hough Transform Block for Global Line Priors
- 3.1 HT: From Image Domain to Hough Domain
- 3.2 IHT: From Hough Domain to Image Domain
- 3.3 Convolution in Hough Transform Space
- 4 Experiments
- 4.1 Exp 1: Local and Global Information for Line Detection
- 4.2 Exp 2: The Effect of Convolution in the Hough Domain
- 4.3 Exp 3: HT-IHT Block for Line Segment Detection
- 5 Conclusion
- References
- Unsupervised Shape and Pose Disentanglement for 3D Meshes
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Overview
- 3.2 Cross-Consistency
- 3.3 Self-consistency
- 3.4 Loss Terms and Objective Function
- 3.5 Implementation Details
- 4 Experiments
- 4.1 Datasets
- 4.2 Quantitative Evaluation
- 4.3 Qualitative Evaluation
- 5 Conclusion and Future Work
- References
- CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection
- 1 Introduction
- 2 Related Work
- 3 Proposed Architecture
- 3.1 Training Data Organization
- 3.2 Backbone Network
- 3.3 Normalcy Suppression
- 3.4 Clustering Loss Module
- 3.5 Training Losses of the Proposed Algorithm
- 4 Experiments
- 4.1 Datasets
- 4.2 Evaluation Metric
- 4.3 Experimental Settings
- 4.4 Experiments on UCF-Crime Dataset
- 4.5 Experiments on ShanghaiTech
- 4.6 Ablation
- 4.7 Qualitative Analysis
- 5 Conclusions
- References
- Inclusive GAN: Improving Data and Minority Coverage in Generative Models
- 1 Introduction
- 2 Related Work
- 3 Inclusive GAN for Data and Minority Coverage
- 3.1 Adversarial Generation: GANs
- 3.2 Reconstructive Generation: IMLE
- 3.3 Harmonizing Adversarial and Reconstructive Generation: IMLE-GAN
- 3.4 Minority Coverage in IMLE-GAN
- 4 Experiments
- 4.1 Setup
- 4.2 Preliminary Study on Stacked MNIST
- 4.3 Empirical Study on Data and Model Biases
- 4.4 Comparisons on CelebA
- 4.5 Extension to Minority Inclusion
- 5 Conclusion
- References
- SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects
- 1 Introduction
- 2 Related Work
- 3 SESAME
- 4 Experiments
- 5 Conclusion
- References
- Dive Deeper into Box for Object Detection
- 1 Introduction
- 2 Related Work
- 3 Our Approach
- 3.1 Box Decomposition and Recombination
- 3.2 Semantic Consistency Module
- 4 Experiments
- 4.1 Experimental Setting
- 4.2 Overall Performance
- 4.3 Ablation Study
- 5 Conclusion
- References
- PG-Net: Pixel to Global Matching Network for Visual Tracking
- 1 Introduction
- 2 Related Works
- 3 Pixel to Global Matching Network
- 3.1 Overview
- 3.2 Pixel to Global Matching Module
- 3.3 Shared Correlation Architecture
- 3.4 Multiple Losses Mechanism
- 4 Experiment
- 4.1 Implementation Details
- 4.2 Ablation Experiments
- 4.3 Evaluation on VOT2018
- 4.4 Evaluation on VOT2018-LT
- 4.5 Evaluation on LaSOT Dataset
- 4.6 Evaluation on OTB2015
- 5 Conclusion
- References
- Why Are Deep Representations Good Perceptual Quality Features?
- 1 Introduction
- 2 Deep CNN Representations as Perceptual Quality Features
- 3 Problem Formulation
- 4 Perceptual Efficacy of Deep Features
- 4.1 Inputs
- 4.2 Measurement of the Spatial Frequency Sensitivity
- 4.3 Measurement of the Orientation Selectivity
- 4.4 Perceptual Efficacy (PE)
- 5 Experiments
- 5.1 Quality Assessment (QA) Tests
- 5.2 Just Noticeable Difference (JND) Test
- 5.3 2AFC Similarity Tests
- 5.4 Visual Evaluation of the Features
- 5.5 Super-Resolution
- 5.6 Discussion
- 6 Comparison with LPIPS and Other Metrics
- 7 Conclusion
- References
- Geometric Estimation via Robust Subspace Recovery
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Preliminaries on DLT
- 3.2 Robust Generalization
- 3.3 Extended Exploration of Linear Structure
- 3.4 Implementation Details
- 4 Experimental Results
- 4.1 Qualitative Analysis of Linear Embedding
- 4.2 Fundamental and Homography Estimation
- 4.3 Sensitivity to Outlier Rate
- 5 Conclusion
- References
- Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification
- 1 Introduction
- 1.1 Contributions
- 2 Related Work
- 3 Method
- 3.1 Preliminaries: f-VAEGAN
- 3.2 Overall Architecture
- 3.3 Semantic Embedding Decoder
- 3.4 Feedback Module
- 3.5 (Generalized) Zero-Shot Classification
- 4 Experiments
- 4.1 State-of-the-Art Comparison
- 4.2 Ablation Study
- 5 (Generalized) Zero-Shot Action Recognition
- 6 Conclusion
- References
- Human Correspondence Consensus for 3D Object Semantic Understanding
- 1 Introduction
- 2 Related Work
- 3 CorresPondenceNet
- 3.1 Dataset Collections
- 3.2 Annotation Process
- 3.3 Annotation Type
- 4 Learning Dense Semantic Embeddings
- 4.1 Problem Statement
- 4.2 Method Details
- 4.3 Mean Geodesic Error
- 4.4 Experiments
- 5 Other Applications
- 5.1 Cross-Object Registration
- 5.2 Partial Object Matching
- 6 Conclusion
- References
- Learning Memory Augmented Cascading Network for Compressed Sensing of Images
- 1 Introduction
- 2 Related Work
- 3 Memory Augmented Cascading Reconstruction
- 3.1 Network Architecture
- 3.2 Single Cascading Stage
- 3.3 Contextual Memory Augmentation
- 3.4 Network Loss and Learning
- 4 Experimental Results and Analysis
- 4.1 Ablation Studies
- 4.2 Results on Natural Images
- 4.3 Compressive MRI Reconstruction
- 5 Conclusions
- References
- Least Squares Surface Reconstruction on Arbitrary Domains
- 1 Introduction
- 1.1 Related Work
- 2 Linear Least Squares Height-from-Normals
- 2.1 Linear Equations
- 2.2 Discrete Formulation
- 3 Numerical Differentiation Kernels
- 3.1 2D Savitzky-Golay Filters
- 3.2 K-Nearest Pixels Kernel
- 3.3 3D K-Nearest Neighbours Kernel
- 4 Implementation
- 5 Evaluation
- 6 Conclusions
- References
- Task-Conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery
- 1 Introduction
- 2 Related Work
- 2.1 Pedestrian Detection in the Visible Spectrum
- 2.2 Multispectral Pedestrian Detection Approaches
- 2.3 Pedestrian Detection in Thermal Imagery
- 2.4 Task-Conditioned Networks
- 3 Task-Conditioned Domain Adaptation
- 3.1 Auxiliary Classification Network
- 3.2 Conditioning Layers
- 3.3 Conditioned Network Architectures
- 3.4 Adaptation Loss
- 4 Experimental Results
- 4.1 Dataset and Evaluation Metrics
- 4.2 Implementation and Training
- 4.3 Ablation Studies
- 4.4 Comparison with the State-of-the-Art
- 5 Conclusions
- References
- Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Gradient-Based Attack Methods
- 3.2 Observation Analyses
- 3.3 Resized-Diverse-Inputs Method
- 3.4 Diversity-Ensemble Method
- 3.5 Region Fitting
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 The Internal Relationship
- 4.3 Single-Model Attacks
- 4.4 Ensemble-Based Attacks
- 5 Conclusion
- References
- Differentiable Automatic Data Augmentation
- 1 Introduction
- 2 Related Work
- 3 Differentiable Automatic Data Augmentation (DADA)
- 3.1 Search Space
- 3.2 Policy Sampling from a Joint Distribution
- 3.3 Differentiable Relaxation with Gumbel-Softmax
- 3.4 RELAX Gradient Estimator
- 3.5 Bi-level Optimization
- 4 Experiments
- 4.1 Settings
- 4.2 Results
- 4.3 DADA for Object Detection
- 4.4 Further Analysis
- 5 Conclusion
- References
- SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
- 1 Introduction
- 2 Related Work
- 3 SceneCAD: Joint Object Alignment and Layout Estimation
- 3.1 Layout Prediction
- 3.2 CAD Model Alignment
- 3.3 Learning Object and Layout Relationships
- 4 Object+Layout Dataset
- 4.1 Extraction of Scene Layouts
- 4.2 Extraction of Object and Layout Relationships
- 4.3 Synthetic Data
- 5 Results
- 5.1 CAD Alignment Performance
- 5.2 Layout Prediction
- 6 Limitations
- 7 Conclusion
- References
- Kinship Identification Through Joint Learning Using Kinship Verification Ensembles
- 1 Introduction
- 2 Related Work
- 3 Kinship Identification Through Joint Learning with Kinship Verification
- 3.1 Definition of Kinship Verification, Kinship Identification and Kinship Classification
- 3.2 Relationship Between Kinship Verification and Kinship Identification and the Limitation of Existing Methods
- 4 Joint Learning of Kinship Identification and Kinship Verification
- 4.1 Architecture of the Proposed Joint Learning Network (JLNet)
- 4.2 Comparative Methods
- 5 Experiments
- 5.1 Unbias Dataset for Training and Testing
- 5.2 Experimental Design
- 5.3 Results and Evaluation
- 6 Conclusion
- References
- Kernelized Memory Network for Video Object Segmentation
- 1 Introduction
- 2 Related Work
- 3 Kernelized Memory Network
- 3.1 Architecture
- 3.2 Kernelized Memory Read
- 4 Pre-training by Hide-and-Seek
- 5 Experiments
- 5.1 Training Details
- 5.2 Inference Details
- 5.3 DAVIS 2016 and 2017
- 5.4 YouTube-VOS 2018
- 5.5 Qualitative Results
- 5.6 Analysis
- 6 Conclusion
- References
- A Single Stream Network for Robust and Real-Time RGB-D Salient Object Detection
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Single Stream Encoder Network
- 3.2 Depth-Enhanced Dual Attention Module
- 3.3 Pyramidally Attended Feature Extraction
- 4 Experiments
- 4.1 Dataset
- 4.2 Evaluation Metrics
- 4.3 Implementation Details
- 4.4 Comparison with State-of-the-Art Results
- 4.5 Ablation Studies
- 5 Conclusions
- References
- Splitting Vs. Merging: Mining Object Regions with Discrepancy and Intersection Loss for Weakly Supervised Semantic Segmentation
- 1 Introduction
- 2 Related Works
- 2.1 Weakly Supervised Semantic Segmentation
- 2.2 Region Mining
- 2.3 Co-training
- 3 Approach
- 3.1 Revisiting CAM
- 3.2 Splitting vs. Merging
- 3.3 Mask Generation
- 4 Experiments
- 4.1 Datasets and Implementation Details
- 4.2 Ablation Study
- 4.3 Segmentation Results
- 5 Conclusion
- References
- Temporal Keypoint Matching and Refinement Network for Pose Estimation and Tracking
- 1 Introduction
- 2 Related Work
- 2.1 Single-Image Pose Estimation
- 2.2 Multi-person Pose Tracking
- 2.3 Pose Estimation in Videos
- 3 Proposed Approach
- 3.1 Single-Frame Pose Estimation
- 3.2 Pose Tracking with Temporal Keypoint Matching
- 3.3 Pose Refinement with Temporal Context
- 3.4 Training
- 4 Experiments
- 4.1 Datasets and Evaluation
- 4.2 Implementation Details
- 4.3 Results on PoseTrack 2017
- 4.4 Results on PoseTrack 2018
- 5 Conclusion
- References
- Neural Point-Based Graphics
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Rendering
- 3.2 Model Creation
- 4 Experiments
- 5 Discussion
- References
- FHDe2Net: Full High Definition Demoireing Network
- 1 Introduction
- 2 Related Work
- 3 Full High Definition Moiré Image Dataset
- 4 Methodology
- 4.1 Cascaded Global to Local Moiré Pattern Removal
- 4.2 Frequency Based High-Resolution Content Separation
- 4.3 Layer Fusion and Refinement
- 4.4 Training Loss and Implementation Details
- 5 Experiments
- 5.1 Quantitative Evaluation
- 5.2 Qualitative Evaluation
- 5.3 Ablation Study
- 6 Conclusion
- References
- Learning Structural Similarity of User Interface Layouts Using Graph Networks
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Graph Representation
- 3.2 GCN-CNN Encoder-Decoder
- 3.3 Metric Learning via Triplet Training
- 4 Experiments and Discussion
- 4.1 Datasets
- 4.2 Evaluation Metrics
- 4.3 Baseline Comparisons
- 4.4 Ablation Studies
- 4.5 Cumulative Ablation Study
- 4.6 Searching Auto-parsed Layouts
- 5 Conclusion
- References
- NAS-Count: Counting-by-Density with Neural Architecture Search
- 1 Introduction
- 2 Related Work
- 2.1 Crowd Counting Literature
- 2.2 NAS Fundamentals
- 2.3 NAS Applications
- 3 NAS-Count Methodology
- 3.1 Automatic Multi-Scale Network
- 3.2 Scale Pyramid Pooling Loss
- 4 Experiments
- 4.1 Implementation Details
- 4.2 Search Result Analysis
- 4.3 Ablation Study on Searched Architectures
- 4.4 Hyper-parameter Study
- 4.5 Performance and Comparison
- 5 Conclusion
- References
- Towards Generalization Across Depth for Monocular 3D Object Detection
- 1 Introduction
- 2 Related Work
- 3 Problem Description
- 4 Details of Our Contributions
- 4.1 Proposed Virtual Views
- 4.2 Proposed Single-Stage Architecture
- 5 Experiments
- 5.1 Implementation Details
- 5.2 Dataset and Experimental Protocol
- 5.3 3D Detection
- 6 Conclusions
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.