
Computer Vision - ECCV 2020
Description
The 1360 revised papers presented in these proceedings were carefully reviewed and selected from a total of 5025 submissions. The papers cover topics such as computer vision; machine learning; deep neural networks; reinforcement learning; object recognition; image classification; image processing; object detection; semantic segmentation; human pose estimation; 3D reconstruction; stereo vision; computational photography; neural networks; image coding; image reconstruction; motion estimation.

Content
- Intro
- Foreword
- Preface
- Organization
- Contents - Part I
- Quaternion Equivariant Capsule Networks for 3D Point Clouds
- 1 Introduction
- 2 Related Work
- 3 Preliminaries and Technical Background
- 3.1 Equivariance
- 3.2 The Quaternion Group H1
- 3.3 3D Point Clouds
- 4 SO(3)-Equivariant Dynamic Routing
- 4.1 Equivariant Quaternion Mean
- 4.2 Equivariant Weiszfeld Dynamic Routing
- 5 Equivariant Capsule Network Architecture
- 5.1 QEC Module
- 5.2 Network Architecture
- 6 Experimental Evaluations
- 7 Conclusion and Discussion
- References
- DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
- 1 Introduction
- 2 Background and Related Work
- 2.1 Deep Learning for Unstructured 3D Point Clouds
- 2.2 Normal Vector and Principal Curvature Estimation
- 2.3 Jet Fitting Using Least Squares and Weighted Least Squares
- 3 DeepFit
- 3.1 Learning Point-Wise Weights
- 3.2 Geometric Quantities Estimation
- 3.3 Consistency Loss
- 3.4 Implementation Notes
- 4 Results
- 4.1 Dataset and Training Details
- 4.2 Normal Estimation Performance
- 4.3 Principal Curvature Estimation Performance
- 4.4 Surface Reconstruction and Noise Removal
- 5 Summary
- References
- NSGANetV2: Evolutionary Multi-objective Surrogate-Assisted Neural Architecture Search
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 3.1 Search Space
- 3.2 Overall Algorithm Description
- 3.3 Speeding Up Upper Level Optimization
- 3.4 Speeding Up Lower Level Optimization
- 4 Experiments and Results
- 4.1 Performance of the Surrogate Predictors
- 4.2 Search Efficiency
- 4.3 Results on Standard Datasets
- 5 Scalability of MSuNAS
- 5.1 Types of Datasets
- 5.2 Number of Objectives
- 6 Conclusion
- References
- Describing Textures Using Natural Language
- 1 Introduction
- 2 Related Work
- 3 Dataset and Tasks
- 3.1 Dataset Collection
- 3.2 Tasks and Evaluation Metrics
- 4 Methods
- 4.1 A Discriminative Approach
- 4.2 A Metric Learning Approach
- 4.3 A Generative Language Approach
- 5 Experiments and Analysis
- 5.1 Phrase and Image Retrieval
- 5.2 Description Generation
- 5.3 A Critical Analysis of Language Modeling
- 6 Applications
- 7 Conclusion
- References
- Empowering Relational Network by Self-attention Augmented Conditional Random Fields for Group Activity Recognition
- 1 Introduction
- 2 Related Works
- 3 Feature Extraction Network
- 4 CRF for Individual Action Recognition
- 5 Proposed Method
- 5.1 Temporal and Spatial Self-attention
- 5.2 Self-Attention Augmented Conditional Random Fields
- 5.3 Reformulation of Mean-Field Inference
- 5.4 Bidirectional UTE for Group Activity Recognition
- 6 Experimental Results
- 6.1 Experimental Settings
- 6.2 Ablation Studies
- 6.3 Comparison with the State-of-the-Art Works
- 7 Conclusions
- References
- AiR: Attention with Reasoning Capability
- 1 Introduction
- 2 Related Works
- 3 Method
- 3.1 Attention with Reasoning Capability
- 3.2 Measuring Attention Accuracy with ROIs
- 3.3 Reasoning-Aware Attention Supervision
- 3.4 Evaluation Benchmark and Human Attention Baseline
- 4 Experiments and Analyses
- 4.1 Do Machines or Humans Look at Places Important to Reasoning? How Does Attention Influence Task Performances?
- 4.2 How Does Attention Accuracy Evolve Throughout the Reasoning Process?
- 4.3 Does Progressive Attention Supervision Improve Attention and Task Performance?
- 5 Conclusion
- References
- Self6D: Self-supervised Monocular 6D Object Pose Estimation
- 1 Introduction
- 2 Related Work
- 2.1 Monocular 6D Pose Estimation
- 2.2 Neural Rendering
- 2.3 Recent Trends in Self-supervised Learning
- 2.4 Domain Adaptation for 6D Pose Estimation
- 3 Self-supervised 6D Pose Estimation
- 4 Evaluation
- 4.1 Analysis on the Quality of Predicted Masks
- 4.2 Ablation Study
- 4.3 Comparison with State-of-the-Art
- 5 Conclusion
- References
- Invertible Image Rescaling
- 1 Introduction
- 2 Related Work
- 2.1 Image Upscaling After Downscaling
- 2.2 Invertible Neural Network
- 2.3 Image Compression
- 3 Methods
- 3.1 Model Specification
- 3.2 Invertible Architecture
- 3.3 Training Objectives
- 4 Experiments
- 4.1 Dataset and Settings
- 4.2 Evaluation on Reconstructed HR Images
- 4.3 Evaluation on Downscaled LR Images
- 5 Conclusion
- References
- Synthesize Then Compare: Detecting Failures and Anomalies for Semantic Segmentation
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 General Framework
- 3.2 Failure Detection
- 3.3 Anomaly Segmentation
- 3.4 Conceptual Explanation
- 4 Experiments
- 4.1 Failure Detection
- 4.2 Anomaly Segmentation
- 5 Conclusions
- References
- House-GAN: Relational Generative Adversarial Networks for Graph-Constrained House Layout Generation
- 1 Introduction
- 2 Related Work
- 3 Graph-Constrained House Layout Generation Problem
- 4 House-GAN
- 4.1 House Layout Generator
- 4.2 House Layout Discriminator
- 5 Implementation Details
- 6 Experimental Results
- 7 Conclusion
- References
- Crowdsampling the Plenoptic Function
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Collecting Crowdsampled Data
- 3.2 The DeepMPI Scene Representation
- 3.3 Stage 1: Optimizing DeepMPI Color and Planes
- 3.4 Stage 2: Learning How Appearance Changes with Time
- 4 Experiments
- 5 Discussion and Conclusion
- References
- VoxelPose: Towards Multi-camera 3D Human Pose Estimation in Wild Environment
- 1 Introduction
- 2 Related Work
- 2.1 Single Person 3D Pose Estimation
- 2.2 Multiple Person 3D Pose Estimation
- 3 Cuboid Proposal Network
- 3.1 Feature Volume
- 3.2 Cuboid Proposals
- 3.3 Non-maximum Suppression
- 3.4 Network Structures of CPN
- 4 Pose Regression Network
- 4.1 Constructing Feature Volume
- 4.2 Regression of Human Poses
- 4.3 Training Strategies
- 5 Datasets and Metrics
- 6 Evaluation of CPN
- 7 Evaluation of PRN
- 7.1 2D Pose Estimation Accuracy
- 7.2 Ablation Study on 3D Pose Estimation
- 7.3 Comparison to the State-of-the-Arts
- 8 Conclusion
- References
- End-to-End Object Detection with Transformers
- 1 Introduction
- 2 Related Work
- 2.1 Set Prediction
- 2.2 Transformers and Parallel Decoding
- 2.3 Object Detection
- 3 The DETR Model
- 3.1 Object Detection Set Prediction Loss
- 3.2 DETR Architecture
- 4 Experiments
- 4.1 Comparison with Faster R-CNN and RetinaNet
- 4.2 Ablations
- 4.3 DETR for Panoptic Segmentation
- 5 Conclusion
- References
- DeepSFM: Structure from Motion via Deep Bundle Adjustment
- 1 Introduction
- 2 Related Work
- 3 Architecture
- 3.1 2D Feature Extraction
- 3.2 Depth Based Cost Volume (D-CV)
- 3.3 Pose Based Cost Volume (P-CV)
- 3.4 Cost Aggregation and Regression
- 3.5 Training
- 4 Experiments
- 4.1 Datasets
- 4.2 Evaluation
- 4.3 Model Analysis
- 5 Conclusions
- References
- Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
- 1 Introduction
- 1.1 Sampling Methods in Monte Carlo Integration
- 2 Our Approach
- 2.1 Preliminary
- 2.2 Sampling
- 2.3 Feature Fusion Based on Symmetry
- 3 Experiments
- 3.1 Data Processing
- 3.2 Network Details
- 3.3 Samplers Impact on Training
- 3.4 Effect of Feature Fusion Based on Symmetry
- 3.5 Comparison with Other Methods
- 4 Conclusion
- References
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Context-Aware Instance Embeddings Extraction
- 3.2 Instance Segmentation with Temporal Seed Consistency
- 4 Apollo MOTS Dataset
- 4.1 Overview
- 4.2 Annotation
- 5 Experiments
- 6 Conclusions
- References
- Conditional Convolutions for Instance Segmentation
- 1 Introduction
- 1.1 Related Work
- 2 Instance Segmentation with CondInst
- 2.1 Overall Architecture
- 2.2 Network Outputs and Training Targets
- 2.3 Loss Function
- 2.4 Inference
- 3 Experiments
- 3.1 Implementation Details
- 3.2 Architectures of the Mask Head
- 3.3 Design Choices of the Mask Branch
- 3.4 How Important to Upsample Mask Predictions?
- 3.5 CondInst without Bounding-Box Detection
- 3.6 Comparisons with State-of-the-Art Methods
- 4 Conclusions
- References
- MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Preliminary
- 3.2 Rethinking Efficient Network Design
- 3.3 Mutual Learning Framework
- 4 Experiments
- 4.1 Evaluation on ImageNet Classification
- 4.2 Ablation Study
- 4.3 Transfer Learning
- 4.4 Object Detection and Instance Segmentation
- 5 Conclusion and Future Work
- References
- Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
- 1 Introduction
- 2 Related Work
- 3 Dataset Specification and Collection
- 3.1 Ontology Specification
- 3.2 Image Collection and Annotation Pipeline
- 4 Dataset Analysis
- 4.1 Image Analysis
- 4.2 Mask Analysis
- 4.3 Category and Attributes Analysis
- 5 Evaluation Protocol and Baselines
- 5.1 Evaluation Metric
- 5.2 Attribute-Mask R-CNN
- 5.3 Results Discussion
- 6 Conclusion
- References
- Privacy Preserving Structure-from-Motion
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Initialization
- 3.2 Triangulation
- 3.3 Camera Resectioning
- 3.4 Bundle Adjustment
- 3.5 Implementation Details
- 4 Experiments
- 4.1 Evaluation of Camera Pose Accuracy
- 4.2 Evaluation of Initialization Scheme
- 4.3 Comparison with Traditional Structure-from-Motion
- 4.4 Structure-from-Motion on Internet Images
- 4.5 Qualitative Comparison of Feature Inversion Results
- 5 Conclusion
- References
- Rewriting a Deep Generative Model
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Objective: Changing a Rule with Minimal Collateral Damage
- 3.2 Viewing a Convolutional Layer as an Associative Memory
- 3.3 Updating W to Insert a New Value
- 3.4 Generalize to a Nonlinear Neural Layer
- 4 User Interface
- 5 Results
- 5.1 Putting Objects into a New Context
- 5.2 Removing Undesired Features
- 5.3 Changing Contextual Rules
- 6 Discussion
- References
- Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Similar Images Set
- 3.2 Between-Set CIDEr (CIDErBtw)
- 3.3 CIDErBtw Training Strategies
- 4 Experiments
- 4.1 Implementation Details
- 4.2 Experiment Results
- 4.3 User Study
- 4.4 Qualitative Results
- 5 Conclusion
- References
- Long-Term Human Motion Prediction with Scene Context
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 GoalNet: Predicting 2D Path Destination
- 3.2 PathNet: Planning 3D Path towards Destination
- 3.3 PoseNet: Generating 3D Pose following Path
- 4 GTA Indoor Motion Dataset
- 5 Evaluation
- 5.1 Datasets
- 5.2 Evaluation Metric and Baselines
- 5.3 Comparison with Baselines
- 5.4 Evaluation and Visualization on Longer-Term Predictions
- 5.5 Discussion of Failure Cases
- 6 Conclusion
- References
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- 1 Introduction
- 2 Related Work
- 3 Neural Radiance Field Scene Representation
- 4 Volume Rendering with Radiance Fields
- 5 Optimizing a Neural Radiance Field
- 5.1 Positional Encoding
- 5.2 Hierarchical Volume Sampling
- 5.3 Implementation Details
- 6 Results
- 6.1 Datasets
- 6.2 Comparisons
- 6.3 Discussion
- 6.4 Ablation Studies
- 7 Conclusion
- References
- ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
- 1 Introduction
- 2 Related Work
- 3 Developing Referential 3D-Centric Data
- 3.1 Creating Template Based Spatial References
- 3.2 Natural Reference in 3D Scenes
- 4 Developing 3D Neural Listeners
- 5 Experiments and Analysis
- 6 Conclusion
- References
- MatryODShka: Real-time 6DoF Video View Synthesis Using Multi-sphere Images
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Multi-sphere Image Representation
- 3.2 Model Architecture
- 3.3 Training Losses
- 3.4 High-resolution Rendering
- 4 Experiments
- 5 Discussion and Conclusion
- References
- Learning and Aggregating Deep Local Descriptors for Instance-Level Recognition
- 1 Introduction
- 2 Related Work
- 3 Background
- 4 Method
- 4.1 Derivation of the Architecture
- 4.2 Relation to Prior Work
- 5 Experiments
- 5.1 Datasets
- 5.2 Implementation Details
- 5.3 Ablation Experiments
- 5.4 Large-Scale Instance-Level Search
- 5.5 Large-Scale Instance-Level Classification
- 6 Conclusions
- References
- A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem
- 1 Introduction
- 1.1 Related Work
- 1.2 The PnP as a Quadratic Program with Quadratic Constraints
- 1.3 Contributions
- 2 Method
- 2.1 Minima on the 8-Sphere
- 2.2 Sequential Quadratic Programming
- 2.3 The General Case
- 3 The SQPnP Algorithm
- 4 Experiments
- 4.1 Synthetic Experiments
- 5 Conclusion
- References
- Learn to Recover Visible Color for Video Surveillance in a Day
- 1 Introduction
- 2 Related Work
- 3 Dataset
- 3.1 Data Capturing
- 3.2 Data Preprocessing
- 4 State Synchronization Network
- 5 Experimental Setup
- 5.1 Baselines
- 6 Results and Discussions
- 6.1 Quantitative Evaluation
- 6.2 Qualitative Results
- 6.3 Perceptual Experiments
- 6.4 Generalization Analysis
- 6.5 Ablation Experiment
- 7 Conclusion
- References
- Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images
- 1 Introduction
- 2 Related Work
- 3 Dataset Construction
- 4 A Baseline Approach for Single-View Reconstruction
- 4.1 Template Mesh Generation
- 4.2 Learning Surface Reconstruction
- 4.3 Training
- 5 Experimental Results
- 5.1 Benchmarking on Single-View Reconstruction
- 5.2 Ablation Analysis
- 6 Conclusions and Discussions
- References
- Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Stochastic Sampling-Interpolation Network
- 3.2 Stochastic Sampling Module
- 3.3 Interpolation Module
- 3.4 Grid Prior
- 3.5 Integration with Residual Block
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Ablation Study
- 4.3 Object Detection
- 4.4 Semantic Segmentation
- 4.5 Image Classification
- 4.6 Analysis of Sampling and Interpolation Modules
- 4.7 Realistic Run-Time on CPU
- 5 Conclusion
- References
- BorderDet: Border Feature for Dense Object Detection
- 1 Introduction
- 2 Related Works
- 3 Our Approach
- 3.1 Motivation
- 3.2 Border Align
- 3.3 Network Architecture
- 3.4 Model Training and Inference
- 4 Experiments
- 4.1 Implementation Details
- 4.2 Ablation Study
- 4.3 Border Align
- 4.4 Analysis of BorderDet
- 4.5 Generalization of BorderDet
- 4.6 Comparisons with State-of-the-Art Detectors
- 5 Conclusion
- References
- Regularization with Latent Space Virtual Adversarial Training
- 1 Introduction
- 2 Related Work
- 3 Background
- 3.1 Virtual Adversarial Training and Local Constraint
- 3.2 Transformer
- 4 Method
- 5 Experiments
- 5.1 Datasets
- 5.2 Model Training
- 5.3 Results
- 6 Discussions
- 6.1 Adversarial Examples
- 6.2 Failure Analysis: Limitation of VAE Reconstruction Ability on CIFAR-10
- 7 Conclusion
- References
- Du2Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels
- 1 Introduction
- 2 Related Work
- 3 Dual-Pixel Sensors
- 4 Fusing Dual-Pixels and Dual-Cameras
- 4.1 Feature Extraction and Cost Volumes
- 4.2 Fused Confidence Volume
- 4.3 Disparity Refinement
- 4.4 Loss Function
- 5 Evaluation
- 5.1 Data Collection
- 5.2 Training Scheme
- 5.3 Ablation Study
- 5.4 Comparison to State-of-the-Art Methods
- 5.5 Applications in Computational Photography
- 6 Discussion
- References
- Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot Learning
- 1 Introduction
- 2 Related Work
- 2.1 Few-Shot Learning
- 2.2 Adversarial Learning
- 3 MABAS: Boundary-Adversarial Sample Generation
- 3.1 The Few-Shot Classification Problem
- 3.2 Test-Time Fine-Tuning of Embedding Functions
- 3.3 Fine-Tuning by Boundary-Adversarial Samples
- 4 Application to Various Few-Shot Methods
- 4.1 MetaOptNet
- 4.2 Few-Shot Without Forgetting
- 4.3 Standard Transfer Learning
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Quantitative Evaluation
- 5.3 Qualitative Evaluation
- 6 Conclusion
- References
- Targeted Attack for Deep Hashing Based Retrieval
- 1 Introduction
- 2 Related Work
- 2.1 Deep Hashing Based Similarity Retrieval
- 2.2 Adversarial Attack
- 3 The Proposed Method
- 3.1 Preliminaries
- 3.2 Deep Hashing Targeted Attack
- 4 Experiments
- 4.1 Benchmark Datasets and Evaluation Metrics
- 4.2 Overall Results on Image Retrieval
- 4.3 Overall Results on Video Retrieval
- 4.4 Discussion
- 4.5 Open-Set Targeted Attack
- 5 Conclusion and Future Work
- References
- Gradient Centralization: A New Optimization Technique for Deep Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Gradient Centralization
- 3.1 Motivation
- 3.2 Notations
- 3.3 Formulation of GC
- 3.4 Embedding of GC to SGDM/Adam
- 4 Properties of GC
- 4.1 Improving Generalization Performance
- 4.2 Accelerating Training Process
- 5 Experimental Results
- 5.1 Setup of Experiments
- 5.2 Results on Mini-Imagenet
- 5.3 Experiments on CIFAR100
- 5.4 Results on ImageNet
- 5.5 Results on Fine-Grained Image Classification
- 5.6 Object Detection and Segmentation
- 6 Conclusions
- References
- Content-Aware Unsupervised Deep Homography Estimation
- 1 Introduction
- 2 Related Work
- 3 Algorithm
- 3.1 Network Structure
- 3.2 Triplet Loss for Robust Homography Estimation
- 3.3 Unsupervised Content-Awareness Learning
- 4 Experimental Results
- 4.1 Dataset and Implementation Details
- 4.2 Comparisons with Existing Methods
- 4.3 Ablation Studies
- 5 Conclusions
- References
- Multi-view Optimization of Local Feature Geometry
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Overview
- 3.2 Two-View Refinement
- 3.3 Multi-view Refinement
- 4 Implementation Details
- 5 Experimental Evaluation
- 5.1 Image Matching
- 5.2 Triangulation
- 5.3 Camera Localization
- 5.4 Structure-from-Motion
- 6 Conclusion
- References
- The Phong Surface: Efficient 3D Model Fitting Using Lifted Optimization
- 1 Introduction
- 1.1 Related Work
- 2 Method
- 2.1 Phong Surface Model
- 2.2 Lifted Optimization with the Phong Surface
- 2.3 Correspondence Update on Triangles
- 3 Experiments
- 3.1 Rigid Pose Alignment of an Ellipsoid
- 3.2 Performance on Hand Tracking
- 4 Conclusions
- References
- Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
- 1 Introduction
- 2 Related Works
- 3 Method
- 3.1 Joint Modeling of Human-Object Interaction
- 3.2 Motor Attention Module
- 3.3 Interaction Hotspots Module
- 3.4 Anticipation Module
- 3.5 Training and Inference
- 3.6 Network Architecture
- 4 Experiments
- 4.1 Datasets and Annotations
- 4.2 FPV Action Anticipation on EPIC-Kitchens
- 4.3 Ablation Study
- 4.4 Remarks and Discussion
- 5 Conclusions
- References
- Learning Stereo from Single Images
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Stereo Training Data from Monocular Depth
- 3.2 Handling Occlusion and Collisions
- 3.3 Depth Sharpening
- 3.4 Implementation Details
- 4 Experiments
- 4.1 Evaluation Datasets and Metrics
- 4.2 Comparison to Alternative Data Generation Methods
- 4.3 Model Architecture Ablation
- 4.4 Comparing Different Monocular Depth Networks
- 4.5 Ablating Components of Our Method
- 4.6 Adapting to the Target Domain
- 4.7 Varying the Amount of Training Data
- 5 Discussion
- 6 Conclusion
- References
- Prototype Rectification for Few-Shot Learning
- 1 Introduction
- 2 Related Works
- 3 Methodology
- 3.1 Denotation
- 3.2 Cosine Similarity Based Prototypical Network
- 3.3 Bias Diminishing for Prototype Rectification
- 4 Theoretical Analysis
- 4.1 Lower Bound of the Expected Performance
- 4.2 Derivation of Shifting Term
- 5 Experiments
- 5.1 Datasets
- 5.2 Implementation Details
- 5.3 Results on MiniImageNet and TieredImageNet
- 5.4 Results on Meta-Dataset
- 5.5 Ablation Study
- 5.6 Comparison with Transductive Fine-Tuning
- 6 Conclusions
- References
- Learning Feature Descriptors Using Camera Pose Supervision
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Loss Formulation
- 3.2 Differentiable Matching Layer
- 3.3 Coarse-to-Fine Architecture
- 3.4 Discussion
- 3.5 Implementation Details
- 4 Experimental Results
- 4.1 Feature Matching Results
- 4.2 Results on Downstream Tasks
- 4.3 Ablation Analysis
- 5 Conclusion
- References
- Semantic Flow for Fast and Accurate Scene Parsing
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Preliminary
- 3.2 Flow Alignment Module
- 3.3 Network Architectures
- 4 Experiments
- 4.1 Experiments on Cityscapes
- 4.2 Experiment on More Datasets
- 5 Conclusion
- References
- Appearance Consensus Driven Self-supervised Human Mesh Recovery
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Representation and Notations
- 3.2 Mesh Estimation Architecture
- 3.3 Self-supervised Learning Objectives
- 4 Experiments
- 4.1 Ablative Study
- 4.2 Comparison with the State-of-the-Art
- 4.3 Qualitative Results
- 5 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF format displays each book page identically on any hardware, which makes it suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are cumbersome to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. There are no technical restrictions that prevent illegal distribution. However, a personalised watermark is embedded in the eBook; in the event of misuse it can be used to identify the purchaser and to provide evidence for legal purposes.
For more information, see our eBook Help page.