
Computer Vision - ECCV 2020
Description
The 1360 revised papers presented in these proceedings were carefully reviewed and selected from a total of 5025 submissions. The papers deal with topics such as computer vision; machine learning; deep neural networks; reinforcement learning; object recognition; image classification; image processing; object detection; semantic segmentation; human pose estimation; 3D reconstruction; stereo vision; computational photography; neural networks; image coding; image reconstruction; motion estimation.

Content
- Intro
- Foreword
- Preface
- Organization
- Contents - Part XVI
- Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target Shift
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Problem Statement
- 3.2 Overview of the Proposed Method
- 3.3 Disentangled CycleGAN with Feature Consistency Loss
- 3.4 Partially Shared VAEs
- 4 Evaluation
- 4.1 Evaluation on Human-Pose Dataset
- 4.2 Evaluation on Digit Classification
- 5 Conclusion
- References
- Learning Where to Focus for Efficient Video Object Detection
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Framework Overview
- 3.2 Learnable Spatio-Temporal Sampling
- 3.3 Sparsely Recursive Feature Updating
- 3.4 Dense Feature Aggregation
- 4 Experiments
- 4.1 Datasets and Evaluation Metrics
- 4.2 Implementation Detail
- 4.3 Results
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Learning Object Permanence from Video
- 1 Introduction
- 2 Related Work
- 3 The Learning Setup: Reason About Non-visible Objects
- 4 Our Approach
- 5 The LA-CATER Dataset
- 6 Experiments
- 6.1 Baselines and Model Variants
- 6.2 Evaluation Metric
- 7 Results
- 7.1 Reasoning with Perfect Perception
- 7.2 Learning only from Visible Frames
- 7.3 Comparison with CATER Data
- 7.4 Qualitative Examples
- 8 Conclusion
- References
- Adaptive Text Recognition Through Visual Matching
- 1 Introduction
- 2 Related Work
- 3 Model Architecture
- 3.1 Visual Similarity Encoder
- 3.2 Alphabet Agnostic Decoder
- 3.3 Training Loss
- 3.4 Discussion: One-Shot Sequence Recognition
- 4 Implementation Details
- 4.1 Visual Similarity Encoder
- 4.2 Alphabet Agnostic Decoder
- 4.3 Training and Optimization
- 5 Experiments
- 5.1 State-of-the-art Models in Text Recognition
- 5.2 Datasets and Metrics
- 5.3 Overview of Experiments
- 5.4 Ablation Study
- 5.5 Results
- 6 Conclusion
- References
- Actions as Moving Points
- 1 Introduction
- 2 Related Work
- 2.1 Object Detection
- 2.2 Spatio-Temporal Action Detection
- 3 Approach
- 3.1 Center Branch: Detect Center at Key Frame
- 3.2 Movement Branch: Move Center Temporally
- 3.3 Box Branch: Determine Spatial Extent
- 3.4 Tubelet Linking
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Ablation Studies
- 4.3 Comparison with the State of the Art
- 4.4 Runtime Analysis
- 4.5 Visualization
- 5 Conclusion and Future Work
- References
- Learning to Exploit Multiple Vision Modalities by Using Grafted Networks
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Network Grafting Algorithm
- 3.2 Event Camera and Feature Volume Representation
- 3.3 Datasets
- 4 Experiments
- 4.1 Object Detection on Thermal Driving Dataset
- 4.2 Car Detection on Event Camera Driving Dataset
- 4.3 Comparing NGA and Standard Supervised Learning
- 5 Network Analysis
- 5.1 Decoding Grafted Front End Features
- 5.2 Design of Grafted Network
- 5.3 Ablation Study on Loss Terms
- 6 Conclusion
- References
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild
- 1 Introduction
- 2 Related Work
- 2.1 Differentiable Rendering
- 2.2 3D Pose Estimation
- 2.3 3D Pose Refinement
- 3 Learned 3D Pose Refinement
- 3.1 Runtime Object Function
- 3.2 Refinement Feature Space
- 3.3 Geometric Correspondence Fields
- 3.4 Learned Differentiable Rendering
- 4 Experimental Results
- 4.1 Comparison to the State of the Art
- 4.2 Ablation Study
- 4.3 3D Model Retrieval
- 5 Conclusion
- References
- 3D Fluid Flow Reconstruction Using Compact Light Field PIV
- 1 Introduction
- 2 Related Work
- 3 Our Approach
- 3.1 3D Particle Reconstruction
- 3.2 Fluid Flow Reconstruction
- 4 Experimental Results
- 4.1 Synthetic Data
- 4.2 Johns Hopkins Turbulence Database (JHUTDB)
- 4.3 Real Data
- 5 Conclusions
- References
- Contextual Diversity for Active Learning
- 1 Introduction
- 2 Related Work
- 3 Active Frame Selection
- 3.1 Contextual Diversity
- 3.2 Frame Selection Strategy
- 3.3 Network Architecture and Training
- 4 Results and Comparison
- 4.1 Semantic Segmentation
- 4.2 Object Detection
- 4.3 Image Classification
- 5 Analysis and Ablation Experiments
- 6 Conclusion
- References
- Temporal Aggregate Representations for Long-Range Video Understanding
- 1 Introduction
- 2 Related Works
- 3 Representations
- 3.1 Pooling
- 3.2 Recent vs. Spanning Representations
- 4 Framework
- 4.1 Non-Local Blocks (NLB)
- 4.2 Coupling Block (CB)
- 4.3 Temporal Aggregation Block (TAB)
- 4.4 Prediction Model
- 5 Experiments
- 5.1 Datasets and Features
- 5.2 Component Validation
- 5.3 Anticipation on Procedural Activities - Breakfast & 50 Salads
- 5.4 How Much Spanning Past Is Necessary?
- 5.5 Recognition and Anticipation on Daily Activities - EPIC
- 5.6 Temporal Video Segmentation
- 6 Discussion and Conclusion
- References
- Stochastic Fine-Grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition
- 1 Introduction
- 2 Related Works
- 3 Methodology
- 3.1 Framework Overview
- 3.2 Visual Model
- 3.3 Contextual Model
- 3.4 Alignment Model
- 4 Experiments
- 4.1 Dataset and Metrics
- 4.2 Basic Settings
- 4.3 Results and Analysis
- 4.4 Comparison with State-of-the-Arts
- 5 Conclusions
- References
- General 3D Room Layout from a Single View by Render-and-Compare
- 1 Introduction
- 2 Related Work
- 2.1 Layout Generation from Image Features
- 2.2 Layout Generation from 3D Planes
- 3 Approach
- 3.1 Formalization
- 3.2 Set of Candidate 3D Polygons R0(I)
- 3.3 Cost Function K(X, I)
- 3.4 Optimization
- 3.5 Iterative Layout Refinement
- 3.6 Structured Output
- 4 Evaluation
- 4.1 ScanNet-Layout Benchmark
- 4.2 Evaluation on ScanNet-Layout
- 4.3 Evaluation on NYUv2 303
- 4.4 Failure Cases
- 5 Conclusion
- References
- Neural Dense Non-Rigid Structure from Motion with Latent Space Constraints
- 1 Introduction
- 2 Related Work
- 3 Revisiting NRSfM
- 4 Deformation Model with Shape Auto-Decoder
- 4.1 Modelling Deformation with Neural Networks
- 4.2 Differentiable Energy Function
- 4.3 Implementation Details
- 4.4 Applications of the Deformation Auto-Decoder f
- 5 Experiments
- 5.1 Quantitative Comparisons
- 5.2 Period Detection and Sequence Segmentation
- 5.3 Qualitative Results and Applications
- 6 Concluding Remarks
- References
- Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability
- 1 Introduction
- 2 Related Work
- 3 Memento10k: A Multimodal Memorability Dataset
- 4 Memory Decay: A Theoretical Formulation
- 5 Modeling Experiments
- 5.1 Modeling Visual Features
- 5.2 Modeling Semantic Features
- 5.3 Modeling Memorability Decay
- 6 Model Results
- 7 Conclusion
- References
- Yet Another Intermediate-Level Attack
- 1 Introduction
- 2 Related Work
- 2.1 Problem Setting
- 3 Our Method
- 3.1 Intermediate-Level Normalization
- 4 Experimental Results
- 4.1 Delve into the Multi-step Baseline Attacks
- 4.2 Our Method with Varying
- 4.3 Compare with the State-of-the-Arts
- 4.4 Experimental Settings and ℓ2 Attacks
- 5 Conclusions
- References
- Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
- 1 Introduction
- 2 Related Work
- 3 System Overview
- 4 Technical Details
- 4.1 Topology-Change-Aware Registration
- 4.2 Topology-Change-Aware Geometric Fusion
- 5 Experimental Results
- 5.1 Evaluation on Synthetic Data
- 5.2 Comparison to State-of-the-Art on Real Data
- 5.3 Ablation Study
- 6 Conclusion and Future Work
- References
- Early Exit or Not: Resource-Efficient Blind Quality Enhancement for Compressed Images
- 1 Introduction
- 2 Related Work
- 2.1 Quality Enhancement for Compressed Images
- 2.2 Blind Denoising for Images
- 3 Proposed Approach
- 3.1 Motivation
- 3.2 Dynamic DNN Architecture with Early-Exit Strategy
- 3.3 Image Quality Assessment for Enhanced Images
- 3.4 Loss Function
- 4 Experiments
- 4.1 Dataset
- 4.2 Implementation Details
- 4.3 Evaluation
- 5 Conclusions
- References
- PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 3.1 Implicit Patch Representation
- 3.2 Preliminaries
- 3.3 Loss Functions
- 3.4 Blended Surface Reconstruction
- 4 Experiments
- 4.1 Settings
- 4.2 Surface Reconstruction
- 4.3 Object-Level Priors
- 4.4 Articulated Deformation
- 5 Concluding Remarks
- References
- How Does Lipschitz Regularization Influence GAN Training?
- 1 Introduction
- 2 Related Work
- 2.1 GAN Loss Functions
- 2.2 Lipschitz Regularization
- 3 Restrictions of GAN Loss Functions
- 3.1 How Does the Restriction Happen?
- 3.2 Restricting Loss Functions by Domain Scaling
- 4 Experiments
- 4.1 Experiment Setup
- 4.2 Empirical Analysis of Lipschitz Regularization
- 4.3 Empirical Results on Domain Scaling
- 5 Conclusion
- References
- Infrastructure-Based Multi-camera Calibration Using Radial Projections
- 1 Introduction
- 2 Related Work
- 3 Multi-camera Calibration
- 3.1 The Sparse Map and Input Framesets
- 3.2 Initial Camera Pose Estimation
- 3.3 Camera Extrinsics and Rig Poses Estimation
- 3.4 Camera Extrinsics and Rig Poses Refinement
- 3.5 Camera Upgrading and Refinement
- 3.6 Final Refinement
- 4 Implementation
- 5 Experimental Evaluation
- 5.1 Evaluation Datasets and Setup
- 5.2 Calibration Accuracy and Run-Time on Full Image Sequence
- 5.3 Evaluation of Robustness on Shorter Image Sequences
- 5.4 Evaluation of Initial Estimates
- 5.5 Evaluation on RobotCar Dataset
- 5.6 Application: Robot Localization in a Garden
- 6 Conclusions
- References
- MotionSqueeze: Neural Motion Feature Learning for Video Understanding
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 3.1 MotionSqueeze (MS) Module
- 3.2 MotionSqueeze Network (MSNet)
- 4 Experiments
- 4.1 Datasets
- 4.2 Implementation Details
- 4.3 Comparison with State-of-the-Art Methods
- 4.4 Comparison with Other Motion Representation Methods
- 4.5 Ablation Studies
- 4.6 Visualization
- 5 Conclusion
- References
- Polarized Optical-Flow Gyroscope
- 1 Introduction
- 2 Prior Work
- 3 Theoretical Background
- 3.1 Reference Frames and Polarization
- 3.2 Traditional Optical-Flow
- 4 Polarized-Flow
- 4.1 Polarized Optical-Flow Equation
- 5 Quantifying Sensitivities
- 5.1 Image-Based Sensitivity
- 5.2 Polarimetric Sensitivity
- 6 Polarized-Flow Gyro Forward Model
- 7 Solving a Polarized-Flow Gyro Inverse Problem
- 8 Simulation Based on Real Data
- 9 Experiments
- 10 Conclusions
- References
- Online Meta-learning for Multi-source and Semi-supervised Domain Adaptation
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Background
- 3.2 Meta-learning for Domain Adaptation
- 3.3 Shortest Path Optimization
- 4 Experiments
- 4.1 Multi-source Domain Adaptation
- 4.2 Semi-supervised Domain Adaptation
- 4.3 Further Analysis
- 5 Conclusion
- A Short-Path Gradient Descent
- B Additional Illustrative Schematics
- C Additional Experiments
- References
- An Ensemble of Epoch-Wise Empirical Bayes for Few-Shot Learning
- 1 Introduction
- 2 Related Works
- 3 Preliminary
- 4 An Ensemble of Epoch-Wise Empirical Bayes Models
- 4.1 Empirical Bayes Method
- 4.2 Learning the Ensemble of Base-Learners
- 4.3 Meta-learning the Hyperprior Learners
- 4.4 Plugging-In E3BM to Baseline Methods
- 5 Experiments
- 5.1 Datasets and Implementation Details
- 5.2 Results and Analyses
- 6 Conclusions
- References
- On the Effectiveness of Image Rotation for Open Set Domain Adaptation
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Problem Formulation
- 3.2 Overview
- 3.3 Rotation Recognition for Open Set Domain Adaptation
- 3.4 Stage I: Known/Unknown Separation
- 3.5 Stage II: Domain Alignment
- 4 On Reproducibility and Open Set Metrics
- 5 Experiments
- 5.1 Setup: Baselines, Datasets
- 5.2 Implementation Details
- 5.3 Results
- 6 Discussion and Conclusions
- References
- Combining Task Predictors via Enhancing Joint Predictability
- 1 Introduction
- 2 The Predictor Combination Problem
- 3 Joint Predictor Combination Algorithm
- 3.1 Linear Predictor Combination (LPC)
- 3.2 Nonlinear Predictor Combination (NPC)
- 3.3 Automatic Identification of Relevant Tasks
- 3.4 Joint Denoising
- 4 Experiments
- 5 Discussions and Conclusions
- References
- Multi-scale Positive Sample Refinement for Few-Shot Object Detection
- 1 Introduction
- 2 Related Work
- 3 Background
- 3.1 Baseline Few-Shot Object Detection
- 3.2 Preliminary Attempts
- 4 Multi-scale Positive Sample Refinement
- 4.1 Multi-scale Positive Sample Refinement Branch
- 4.2 Framework
- 5 Experiments
- 5.1 Datasets and Settings
- 5.2 Results
- 5.3 Analysis of Sparse Scales
- 5.4 Ablation Studies
- 6 Conclusions
- References
- Single-Image Depth Prediction Makes Feature Matching Easier
- 1 Introduction
- 2 Related Work
- 3 Perspective Unwarping
- 3.1 Depth Estimation
- 3.2 Normal Computation and Clustering
- 3.3 Patch Rectification
- 3.4 Warping Back
- 4 Dataset for Strong Viewpoint Changes
- 5 Experiments
- 5.1 Matching Across Large Viewpoint Changes
- 5.2 Re-localization from Opposite Viewpoints
- 6 Conclusion
- References
- Deep Reinforced Attention Learning for Quality-Aware Visual Recognition
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Overview
- 3.2 Recurrent Critic
- 3.3 Attention Actor
- 3.4 Reinforced Optimization
- 4 Experiments
- 4.1 Category Recognition
- 4.2 Instance Recognition
- 5 Conclusion
- References
- CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization
- 1 Introduction
- 2 Related Works
- 2.1 Action Recognition
- 2.2 Spatio-Temporal Action Detection
- 2.3 Weight Prediction
- 3 Methodology
- 3.1 Framework
- 3.2 Coarse Tube Estimation
- 3.3 Selective Refinement
- 4 Experiment Results
- 4.1 Experiment Configuration
- 4.2 Ablation Study
- 4.3 Comparison with State-of-the-Art
- 4.4 Qualitative Results
- 5 Conclusion
- References
- Learning Joint Spatial-Temporal Transformations for Video Inpainting
- 1 Introduction
- 2 Related Work
- 3 Spatial-Temporal Transformer Networks
- 3.1 Overall Design
- 3.2 Spatial-Temporal Transformer
- 3.3 Optimization Objectives
- 4 Experiments
- 4.1 Dataset
- 4.2 Baselines and Evaluation Metrics
- 4.3 Comparisons with State-of-the-Arts
- 4.4 Ablation Study
- 5 Conclusions
- References
- Single Path One-Shot Neural Architecture Search with Uniform Sampling
- 1 Introduction
- 2 Review of NAS Approaches
- 3 Our Single Path One-Shot Approach
- 4 Experiment Results
- 5 Conclusion
- References
- Learning to Generate Novel Domains for Domain Generalization
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Generating Novel-Domain Data
- 3.2 Maintaining Semantic Consistency
- 3.3 Training
- 3.4 Design of Conditional Generator Network
- 3.5 Design of Distribution Divergence Measure
- 4 Experiments
- 4.1 Evaluation on Homogeneous DG
- 4.2 Evaluation on Heterogeneous DG
- 4.3 Ablation Study
- 4.4 Further Analysis
- 5 Conclusion
- References
- Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Interactive Segmentation Model
- 3.2 Learning from Corrections at Test-Time
- 3.3 Simulating User Corrections
- 4 Experiments
- 4.1 Adapting to Distribution Shift
- 4.2 Adapting to a Specific Class
- 4.3 Adapting to Domain Changes
- 4.4 Comparison to Previous Methods
- 4.5 Ablation Study
- 4.6 Adaptation Speed
- 5 Conclusion
- References
- Impact of Base Dataset Design on Few-Shot Image Classification
- 1 Introduction
- 2 Related Work and Classical Few-Shot Benchmarks
- 2.1 Data Selection and Sampling
- 2.2 Few-Shot Classification
- 3 Base Dataset Design and Evaluation for Few-Shot Classification
- 3.1 Dataset Evaluation Using Few-Shot Classification
- 3.2 A Large Base Dataset, ImageNet-6K
- 3.3 Class Definition and Sampling Strategies
- 3.4 Architecture and Training Details
- 4 Analysis
- 4.1 Importance of Base Data and Its Similarity to Test Data
- 4.2 Effect of the Number of Classes for a Fixed Number of Annotations
- 4.3 Redefining Classes
- 4.4 Selecting Classes Based on Their Diversity or Difficulty
- 5 Conclusion
- References
- Invertible Zero-Shot Recognition Flows
- 1 Introduction
- 2 Related Work
- 3 Preliminaries: Generative Flows and INNs
- 4 Formulation: Factorized Conditional Flow
- 4.1 Forward Pass: Factorizing the Semantics
- 4.2 Reverse Pass: Conditional Sample Generation
- 4.3 Network Structure
- 5 Training with the Merits of Generative Flow
- 5.1 Learning to Decode by Encoding
- 5.2 Centralizing Classification Prototypes
- 5.3 Measuring the Seen-Unseen Bias
- 5.4 Overall Objective and Training
- 5.5 Zero-Shot Recognition with IZF
- 6 Experiments
- 6.1 Implementation Details
- 6.2 Toy Experiments: Illustrative Analysis
- 6.3 Real Data Experimental Settings
- 6.4 Comparison with the State-of-the-Arts
- 6.5 Component Analysis
- 6.6 Hyper-Parameters
- 6.7 Discriminability on Unseen Classes
- 7 Conclusion
- References
- GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Parameterizing Depth Maps of Planes
- 3.2 Learning Depth Maps of Planes
- 3.3 Training on 2D Layout Datasets
- 3.4 Generating Layout Estimates
- 4 Matterport3D-Layout Dataset
- 5 Experimental Results
- 5.1 Implementation Details
- 5.2 Results on Matterport3D-Layout Dataset
- 5.3 Results on 2D Layout Datasets
- 6 Conclusion
- References
- Location Sensitive Image Retrieval and Tagging
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Learning with Hashtag Supervision
- 3.2 Location Sensitive Model (LocSens)
- 4 Experiments
- 4.1 Image by Tag Retrieval
- 4.2 Location Sensitive Image by Tag Retrieval
- 4.3 Image Tagging
- 5 Conclusions
- References
- Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Input and Pre-processing
- 3.2 Coarse Depth and Semantics
- 3.3 Layout Prediction
- 3.4 Depth Refinement
- 3.5 Training Details
- 4 Experiments
- 4.1 Layout Prediction
- 4.2 Depth Estimation
- 5 Conclusion
- References
- Guessing State Tracking for Visual Dialogue
- 1 Introduction
- 2 Related Work
- 3 Model: Guessing State Tracking
- 3.1 Update of Visual Representation (UoVR)
- 3.2 Question-Answer Encoder (QAEncoder)
- 3.3 Update of Guessing State (UoGS)
- 3.4 Early and Incremental Supervision
- 3.5 Training
- 4 Experiments and Analysis
- 4.1 Experimental Setup
- 4.2 Comparison with the State-of-the-Art
- 4.3 Ablation Study
- 4.4 Qualitative Evaluation
- 4.5 Discussion on Stop Questioning
- 5 Conclusion
- References
- Memory-Efficient Incremental Learning Through Feature Adaptation
- 1 Introduction
- 2 Related Work
- 3 Background on Incremental Learning
- 4 Memory-Efficient Incremental Learning
- 4.1 Network Training
- 4.2 Feature Adaptation
- 4.3 Training the Feature Classifier g
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Impact of Memory Footprint
- 5.3 Comparison to State of the Art
- 5.4 Impact of Parameters
- 6 Conclusions
- References
- Neural Voice Puppetry: Audio-Driven Facial Reenactment
- 1 Introduction
- 2 Related Work
- 3 Overview
- 4 Data
- 4.1 Preprocessing
- 5 Method
- 5.1 Audio2ExpressionNet
- 5.2 Neural Face Rendering
- 5.3 Training
- 5.4 Inference
- 6 Results
- 6.1 Ablation Studies
- 6.2 Comparisons to State-of-the-art Methods
- 7 Limitations
- 8 Conclusion
- References
- One-Shot Unsupervised Cross-Domain Detection
- 1 Introduction
- 2 Related Work
- 3 Method
- 4 Experiments
- 4.1 Datasets
- 4.2 Performance Analysis
- 4.3 Comparison with One-Shot Style Transfer
- 4.4 Ablation Study
- 5 Conclusions
- References
- Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks
- 1 Introduction
- 2 Related Work
- 3 Frequency Perspective on SR and Denoising
- 3.1 Super-Resolution
- 3.2 Extension to Denoising
- 4 Stochastic Frequency Masking (SFM)
- 4.1 Motivation and Implementation
- 4.2 Learning SR and Denoising with SFM
- 5 Experiments
- 5.1 SR: Bicubic and Gaussian Degradations
- 5.2 SR: Real-Image Degradations
- 5.3 Denoising: AWGN
- 5.4 Denoising: Real Poisson-Gaussian Images
- 6 Conclusion
- References
- Probabilistic Future Prediction for Video Scene Understanding
- 1 Introduction
- 2 Related Work
- 3 Model Architecture
- 3.1 Perception
- 3.2 Dynamics
- 3.3 Future Prediction
- 3.4 Present and Future Distributions
- 3.5 Control
- 3.6 Losses
- 4 Experiments
- 4.1 Training Data
- 4.2 Metrics
- 5 Results
- 5.1 Spatio-Temporal Representation
- 5.2 Probabilistic Future
- 5.3 Driving Policy
- 6 Conclusions
- References
- Suppressing Mislabeled Data via Grouping and Self-attention
- 1 Introduction
- 2 Related Work
- 2.1 Learning with Noisy Labeled Data
- 2.2 Mixup and Variations
- 3 Attentive Feature Mixup
- 3.1 Overview
- 3.2 Group-to-Attend Module
- 3.3 Mixup Module
- 3.4 Training and Inference
- 4 Experiments
- 4.1 Datasets and Implementation Details
- 4.2 Comparison on Food101N
- 4.3 Comparison on Clothing1M
- 4.4 Ablation Study
- 4.5 Visualizations
- 5 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware, which makes it suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are inconvenient to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. There are no technical restrictions that prevent illegal distribution. However, a personalised watermark embedded in the eBook can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.