Computer Vision - ECCV 2022 Workshops

Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part IV

Leonid Karlinsky, Tomer Michaeli, Ko Nishino (Editors)

Springer (Publisher)

Published on 13 February 2023

XXV, 775 pages

E-Book

PDF with digital watermarking

System requirements

978-3-031-25069-9 (ISBN)

€106.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

The 8-volume set, comprising LNCS volumes 13801 through 13809, constitutes the refereed proceedings of 38 of the 60 workshops held at the 17th European Conference on Computer Vision, ECCV 2022. The conference took place in Tel Aviv, Israel, on October 23-27, 2022; the workshops were held in hybrid or online format.

The 367 full papers included in this volume set were carefully reviewed and selected for inclusion in the ECCV 2022 workshop proceedings. They were organized in individual parts as follows:

Part I: W01 - AI for Space; W02 - Vision for Art; W03 - Adversarial Robustness in the Real World; W04 - Autonomous Vehicle Vision

Part II: W05 - Learning With Limited and Imperfect Data; W06 - Advances in Image Manipulation;

Part III: W07 - Medical Computer Vision; W08 - Computer Vision for Metaverse; W09 - Self-Supervised Learning: What Is Next?;

Part IV: W10 - Self-Supervised Learning for Next-Generation Industry-Level Autonomous Driving; W11 - ISIC Skin Image Analysis; W12 - Cross-Modal Human-Robot Interaction; W13 - Text in Everything; W14 - BioImage Computing; W15 - Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications; W16 - AI for Creative Video Editing and Understanding; W17 - Visual Inductive Priors for Data-Efficient Deep Learning; W18 - Mobile Intelligent Photography and Imaging;

Part V: W19 - People Analysis: From Face, Body and Fashion to 3D Virtual Avatars; W20 - Safe Artificial Intelligence for Automated Driving; W21 - Real-World Surveillance: Applications and Challenges; W22 - Affective Behavior Analysis In-the-Wild;

Part VI: W23 - Visual Perception for Navigation in Human Environments: The JackRabbot Human Body Pose Dataset and Benchmark; W24 - Distributed Smart Cameras; W25 - Causality in Vision; W26 - In-Vehicle Sensing and Monitorization; W27 - Assistive Computer Vision and Robotics; W28 - Computational Aspects of Deep Learning;

Part VII: W29 - Computer Vision for Civil and Infrastructure Engineering; W30 - AI-Enabled Medical Image Analysis: Digital Pathology and Radiology/COVID19; W31 - Compositional and Multimodal Perception;

Part VIII: W32 - Uncertainty Quantification for Computer Vision; W33 - Recovering 6D Object Pose; W34 - Drawings and Abstract Imagery: Representation and Analysis; W35 - Sign Language Understanding; W36 - A Challenge for Out-of-Distribution Generalization in Computer Vision; W37 - Vision With Biased or Scarce Data; W38 - Visual Object Tracking Challenge.

Content

Intro
Foreword
Preface
Organization
Contents - Part IV
W09 - Self-supervised Learning: What Is Next?
Towards Self-Supervised and Weight-preserving Neural Architecture Search
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary: Differentiable NAS
3.2 Towards Weight-preserving
3.3 Towards Self-supervised Learning
3.4 Network Inflation Challenge
3.5 Searching with SSWP-NAS
4 Experiments
4.1 Experimental Settings
4.2 Benchmarking SSWP-NAS
4.3 Ablation Study
4.4 Weight-preserving Benefits Semi-supervised Learning
5 Conclusion
References
MoQuad: Motion-focused Quadruple Construction for Video Contrastive Learning
1 Introduction
2 Related Work
3 Method
3.1 MoQuad Sample Construction
3.2 Extra Training Strategies for MoQuad
4 Experiments
4.1 Experimental Settings
4.2 Comparison with State of the Arts
4.3 Ablation Studies
4.4 More Analyses
5 Conclusion
References
On the Effectiveness of ViT Features as Local Semantic Descriptors
1 Introduction
2 Related Work
3 ViT Features as Local Patch Descriptors
3.1 Properties of ViT's Features
4 Deep ViT Features Applied to Vision Tasks
5 Results
5.1 Part Co-segmentation
5.2 Co-segmentation
5.3 Point Correspondences
6 Conclusion
References
Anomaly Detection Requires Better Representations
1 Introduction
2 Related Work
3 Anomaly Detection as a Downstream Task for Representation Learning
4 Successful Representation Learning Enables Anomaly Detection
5 Gaps in Anomaly Detection Point to Bottlenecks in Representation Learning
5.1 Masked-Autoencoder: Advances in Self-Supervised Learning Do Not Always Imply Better Anomaly Detection
5.2 Complex Datasets: Current Representations Struggle on Scenes, Fine-grained Classes, Multiple Objects
5.3 Unidentifiability: Representations for Anomaly Detection May Be Ambiguous Without Further Guidance
5.4 3D Point Clouds: Self-supervised Representations Do Not Always Improve over Handcrafted Ones
5.5 Tabular Data: When Representations Do Not Improve over the Original Data
6 Final Remarks
A Appendix
A.1 Anomaly detection comparison of MAE and DINO
A.2 Multi-modal datasets
A.3 Tabular domain
References
Leveraging Self-Supervised Training for Unintentional Action Recognition
1 Introduction
2 Related Work
3 Approach to Exploit the UA Inherent Biases
3.1 Framework Overview
3.2 Temporal Transformations of Inherent Biases of Unintentional Actions (T2IBUA)
4 Multi-Stage Learning for Unintentional Action Recognition
4.1 Transformer Block
4.2 [Stage 1] Frame2Clip (F2C) Learning
4.3 [Stage 2] Frame2Clip2Video (F2C2V) Learning
4.4 [Stage 3] Downstream Transfer to Unintentional Action Tasks
5 Experimental Results
5.1 Comparison to State-of-the-art
5.2 Ablation Study
6 Conclusion
References
A Study on Self-Supervised Object Detection Pretraining
1 Introduction
2 Related Work
2.1 Self-Supervised Learning from Images
2.2 Object Detection
3 Approach
3.1 View Construction and Box Sampling
3.2 SSL Backbone
3.3 Comparison with Prior Work
4 Experiments
4.1 Experimental Setup
4.2 Effect of Box Sampling Strategies
4.3 Effect of Methods to Extract Box Features
4.4 Effect of Multiple Views
4.5 Effect of Box Localization Auxiliary Task
5 Conclusion
References
Internet Curiosity: Directed Unsupervised Learning on Uncurated Internet Data
1 Introduction
2 Internet Curiosity
2.1 Image Search
2.2 Self-Supervised Training
2.3 "Densifying" the Target Dataset via Curiosity
2.4 Generating Queries
3 Results
References
W10 - Self-supervised Learning for Next-Generation Industry-Level Autonomous Driving
Towards Autonomous Grading in the Real World
1 Introduction
2 Related Work
2.1 Bulldozer Automation
2.2 Sand Simulation
2.3 Sim-to-Real
3 Method
3.1 Problem Formulation
3.2 Dozer Simulation
3.3 Baseline Algorithm
3.4 Privileged Behavioral Cloning
3.5 Scaled Prototype Environment
4 Experiments
4.1 Simulation Results
4.2 Scaled Prototype Environment Results
5 Conclusions
References
Bootstrapping Autonomous Lane Changes with Self-supervised Augmented Runs
1 Introduction
1.1 Challenge
1.2 Related Works
2 Problem Formulation
2.1 States of Lanes and Surrounding Vehicles
2.2 Formulation as a Learning Problem
3 Sample Preparation by Augmented Run
3.1 Extracting Anchor Features from Real Runs
3.2 Auto-labeling from Augmented Runs
4 Supervised Learning
4.1 Interval-Level Feature Aggregation
4.2 Classification
5 Experiments
5.1 Performance of Proposed Approach
5.2 Performance of Alternative Approach
6 Conclusion
References
W11 - ISIC Skin Image Analysis
Artifact-Based Domain Generalization of Skin Lesion Models
1 Introduction
2 Background
3 Methodology
3.1 Trap Sets
3.2 Artifact-Based Environments
3.3 NoiseCrop: Test-Time Feature Selection
4 Results
4.1 Data
4.2 Model Selection and Implementation Details
4.3 Debiasing of Skin Lesion Models
4.4 Ablation Study
4.5 Out-of-Distribution Evaluation
4.6 Qualitative Analysis
5 Related Work
6 Conclusion
References
An Evaluation of Self-supervised Pre-training for Skin-Lesion Analysis
1 Introduction
2 Related Work
2.1 Self-supervised Learning for Visual Tasks
2.2 Self-supervised Learning on Medical Tasks
2.3 Self-supervised Learning on Skin Lesion Analysis
3 Materials and Methods
3.1 Datasets
3.2 Experimental Design
3.3 SSL UCL/SCL FT Pipelines
3.4 Implementation Details
4 Results
4.1 Self-supervision Schemes vs. Baseline Comparison
4.2 Systematic Evaluation of Pipelines
4.3 Low Training Data Scenario
4.4 Qualitative Analysis
5 Conclusions
References
Skin_Hair Dataset: Setting the Benchmark for Effective Hair Inpainting Methods for Improving the Image Quality of Dermoscopic Images
1 Introduction
2 Related Work
3 Skin_Hair Dataset
4 Effective Hair Inpainting Algorithms
4.1 Navier-Stokes
4.2 Telea
4.3 Hair_SinGAN Architecture
4.4 R-MNet Method
5 Result Analysis
6 Conclusions
References
FairDisCo: Fairer AI in Dermatology via Disentanglement Contrastive Learning
1 Introduction
2 Related Works
2.1 Skin Lesion Diagnosis
2.2 Fairness
3 Methodology
3.1 Proposed FairDisCo Model
3.2 An Investigation for Three Approaches
4 Experiments
4.1 Results on the Fitzpatrick17k Dataset
4.2 Results on the DDI Dataset
4.3 Loss Analysis
5 Conclusion
References
CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions
1 Introduction
2 Method
2.1 Problem Definition
2.2 Feature Extractor and Classifier
2.3 Regularization Network
3 Experiments
3.1 Dataset
3.2 Implementation Details
3.3 Metrics
3.4 Models
4 Results and Analysis
4.1 Classification and Fairness Performance
4.2 Domain Adaptation Performance
4.3 Classification Performance Relation with Training Size
5 Discussion and Future Work
6 Conclusion
References
W12 - Cross-Modal Human-Robot Interaction
Distinctive Image Captioning via CLIP Guided Group Optimization
1 Introduction
2 Related Work
2.1 Image Captioning
2.2 Objectives for Image Captioning
2.3 Metrics for Distinctive Image Captioning
3 Methodology
3.1 Similar Image Group
3.2 Metrics
3.3 Group Embedding Gap Reward
4 Experiments
4.1 Implementation Details
4.2 Main Results
4.3 Comparison with State-of-the-Art
4.4 Ablation Study
4.5 User Study
4.6 Qualitative Results
5 Conclusions
References
W13 - Text in Everything
OCR-IDL: OCR Annotations for Industry Document Library Dataset
1 Introduction
2 Related Work
3 OCR-IDL Dataset
3.1 Data Collection
3.2 Comparison to Existing Datasets
3.3 Dataset Statistics
4 Conclusion
References
Self-paced Learning to Improve Text Row Detection in Historical Documents with Missing Labels
1 Introduction
2 Related Work
3 Method
4 Experiments
4.1 Data Sets
4.2 Evaluation Setup
4.3 Results
5 Conclusion
References
On Calibration of Scene-Text Recognition Models
1 Introduction
2 Related Work
3 Background
4 Sequence-Level Calibration
5 Experiments
5.1 Experimental Setup
5.2 Results and Analysis
6 Conclusion
References
End-to-End Document Recognition and Understanding with Dessurt
1 Introduction
2 Related Work
2.1 LayoutLM Family
2.2 End-to-End Models
3 Model
4 Pre-training Procedure
4.1 IIT-CDIP Dataset
4.2 Synthetic Wikipedia
4.3 Synthetic Handwriting
4.4 Synthetic Forms
4.5 Distillation
4.6 Training
5 Experiments
5.1 RVL-CDIP
5.2 DocVQA and HW-SQuAD
5.3 FUNSD and NAF
5.4 IAM Database
5.5 Ablation
6 Conclusion
References
Task Grouping for Multilingual Text Recognition
1 Introduction
2 Related Work
2.1 Multilingual Text Spotting
2.2 Multitask Learning and Grouping
3 Methodology
3.1 Grouping Module
3.2 Integrated Loss
3.3 Integrated Loss with a Base Loss Coefficient
3.4 Grouping Loss
4 Experimentals
4.1 Datasets
4.2 Model Training
4.3 Task Grouping Results
4.4 Ablation Study
4.5 Task Assignment on Models with Different Hyper-parameters
4.6 E2E Text Recognition
5 Conclusions
References
Incorporating Self-attention Mechanism and Multi-task Learning into Scene Text Detection
1 Introduction
2 Related Work
2.1 Mask R-CNN
2.2 Attention Based Methods
2.3 Multi-task Learning Based Methods
3 Methodology
3.1 Self-attention Mechanism-based Backbone
3.2 Multi-task Cascade Refinement Text Detection
4 Experiments
4.1 Experiment Setup
4.2 Main Results
4.3 Ablation Studies
4.4 Visualization
4.5 Inference Speed
4.6 Error Analysis
5 Conclusion
References
Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks
1 Introduction
2 Related Work
3 Method
3.1 Documents Graph Structure
3.2 Node and Edge Features
3.3 Architecture
4 Experiments and Results
4.1 Proposed Model
4.2 FUNSD
4.3 RVL-CDIP Invoices
5 Conclusion
References
MUST-VQA: MUltilingual Scene-Text VQA
1 Introduction
2 Related Work
3 Method
3.1 Task Definition
3.2 Visual Encoder
3.3 Textual Encoders
3.4 Baselines
4 Experiments
4.1 Implementation Details
4.2 TextVQA Results
4.3 ST-VQA Results
5 Analysis
6 Conclusions and Future Work
References
Out-of-Vocabulary Challenge Report
1 Introduction
2 Related Work
3 Competition Protocol
4 The OOV Dataset
4.1 Dataset Analysis
5 The OOV Challenge
5.1 Task 1
5.2 Task 2
5.3 Baselines
5.4 Evaluation Metrics
6 Results
6.1 Task 1
6.2 Task 2
7 Analysis
7.1 Task 1
7.2 Task 2
8 Conclusion and Future Work
References
W14 - BioImage Computing
Towards Structured Noise Models for Unsupervised Denoising
1 Introduction
2 Related Work
2.1 Self- And Unsupervised Methods for Removing Structured Noise
2.2 Noise Modelling
3 Background
3.1 DivNoising and HDN
3.2 Pixel-Independent Noise Models
3.3 Signal-Independent Noise Models (a Simplification)
3.4 Deep Autoregressive Models
4 Methods
5 Experiments
5.1 Synthetic Noise Datasets
5.2 Photoacoustic Dataset
5.3 Training the Noise Model
5.4 Training HDN
5.5 Denoising with Autoregressive Noise Models
5.6 Evaluating the Noise Model
5.7 Choice of Autoregressive Pixel Ordering
6 Conclusion
References
Comparison of Semi-supervised Learning Methods for High Content Screening Quality Control
1 Introduction
2 Related Work
2.1 Transfer Learning
2.2 Self-supervised Learning
3 Method
3.1 Data
3.2 Encoder Training
3.3 Downstream Classification Tasks
3.4 Evaluation Criteria
4 Results
4.1 Evaluations on Downstream Tasks
4.2 Effect of a Decreasing Amount of Annotated Data
4.3 Effect of a Domain Shift
5 Conclusion
References
Discriminative Attribution from Paired Images
1 Introduction
2 Related Work
3 Method
3.1 Creation of Counterfactuals
3.2 Discriminative Attribution
3.3 Evaluation of Attribution Maps
4 Experiments
5 Discussion
References
Learning with Minimal Effort: Leveraging in Silico Labeling for Cell and Nucleus Segmentation
1 Introduction
2 Materials and Methods
2.1 Image Acquisition
2.2 Nucleus Segmentation
2.3 Cell Segmentation
3 Results
3.1 Evaluation Metrics
3.2 Nucleus Segmentation
3.3 Cell Segmentation
4 Discussion
5 Conclusion
References
Towards Better Guided Attention and Human Knowledge Insertion in Deep Convolutional Neural Networks
1 Introduction
2 Related Work
3 Methods
3.1 Multi-Scale Attention Branch Network
3.2 Puzzle Module to Improve Fine-grained Recognition
3.3 Embedding Human Knowledge with Copy-Replace Augmentation
4 Experiments
4.1 Image Classification
4.2 Fine-grained Recognition
4.3 Attention Editing Performance
5 Conclusion
References
Characterization of AI Model Configurations for Model Reuse
1 Introduction
2 Methods
3 Experimental Results
4 Discussion
5 Summary
6 Disclaimer
References
Empirical Evaluation of Deep Learning Approaches for Landmark Detection in Fish Bioimages
1 Introduction
2 Related Work
3 Dataset Description
3.1 Zebrafish Microscopy Dataset
3.2 Medaka Microscopy Dataset
3.3 Seabream Radiography Dataset
4 Method Description
4.1 Direct Coordinates Regression
4.2 Heatmap-Based Regression
4.3 Training and Prediction Phases
4.4 Network Architectures
4.5 Experimental Protocol and Implementation
5 Results and Discussion
6 Conclusions
References
PointFISH: Learning Point Cloud Representations for RNA Localization Patterns
1 Introduction
2 Related Work
3 Problem Statement
4 PointFISH
4.1 Input Preparation
4.2 Model Architecture
5 Experiment
5.1 Training on Simulated Patterns
5.2 Analysis of the Embeddings Provided by PointFISH
5.3 Ablation Studies
6 Discussion
7 Conclusion
References
N2V2 - Fixing Noise2Void Checkerboard Artifacts with Modified Sampling Strategies and a Tweaked Network Architecture
1 Introduction
2 Related Work
3 Method
3.1 A Modified Network Architecture for N2V2
3.2 New Sampling Strategies to Cover Blind-Spots
4 Evaluation
4.1 Dataset Descriptions and Training Details
4.2 Evaluation Metrics
4.3 Results on Mouse SP3, SP6, and SP12 (Salt&Pepper Noise)
4.4 Evaluation Flywing G70, Mouse G20, BSD68
4.5 Evaluation of Real Noisy Data: Convallaria_95 and Convallaria_1
5 Discussion and Conclusions
References
W15 - Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications
Object Detection in Aerial Images with Uncertainty-Aware Graph Network
1 Introduction
2 Related Works
3 Method
3.1 Uncertainty-Aware Initial Object Detection
3.2 Uncertainty-Based Spatial-Semantic Graph Generation
3.3 Feature Refinement via GNNs with Spatial-Semantic Graph
3.4 Final Detection Pipeline and Training Losses
4 Experiments
4.1 Datasets
4.2 Experimental Setups
4.3 Quantitative Results
4.4 Analyses
5 Conclusion
References
W16 - AI for Creative Video Editing and Understanding
STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation
1 Introduction
2 Related Works
2.1 Instance Segmentation
2.2 Video Instance Segmentation
2.3 Contrastive Learning
3 Method
3.1 Mask Generation for Still-Image
3.2 Proposed Framework for VIS
3.3 Spatio-Temporal Contrastive Learning
3.4 Temporal Consistency
3.5 Training and Inference
4 Experiments
4.1 Dataset
4.2 Metrics
4.3 Implementation Details
4.4 Main Results
4.5 Ablation Studies
4.6 Visualizations
5 Conclusion
References
Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks
1 Introduction
2 Related Work
3 Formulation
4 Evaluation Benchmark
4.1 Crafting Evaluation Datasets
4.2 Evaluation of Existing Methods
5 Method
5.1 Spatial-aware Multi-Aspect Debiasing
5.2 Exploiting Web Media with OmniDebias
6 Experiments
6.1 Experiment Setting
6.2 Main Results
6.3 Spatial-aware Multi-Aspect Debiasing
6.4 Exploiting Web Media with OmniDebias
7 Conclusion
References
SegTAD: Precise Temporal Action Detection via Semantic Segmentation
1 Introduction
2 Related Work
2.1 Temporal Action Detection
2.2 Object Detection and Semantic Segmentation
3 Proposed SegTAD
3.1 Problem Formulation and SegTAD Framework
3.2 1D Semantic Segmentation Network
3.3 Proposal Detection Network
3.4 Training and Inference
4 Experimental Results
4.1 Datasets and Implementation Details
4.2 Comparison to State-of-the-Art
4.3 Ablation Study
4.4 Visualization of Segmentation Output
5 Conclusion
References
Text-Driven Stylization of Video Objects
1 Introduction
2 Related Work
2.1 Video Editing
2.2 Text-Based Stylization
3 Method
3.1 CLIP
3.2 Neural Layered Atlases (NLA)
3.3 Our Stylization Pipeline
4 Experiments
4.1 Varied Stylizations
4.2 Prompt Specificity
4.3 Text Augmentation
4.4 Ablation Study
4.5 Limitations
5 Conclusion
References
MND: A New Dataset and Benchmark of Movie Scenes Classified by Their Narrative Function
1 Introduction
1.1 Background
1.2 Research Objectives and Contributions
2 Background and Related Work
2.1 Introduction to Story Models
2.2 Related Datasets and Research
3 The Story Model, 15 Story Beats
4 The MND Dataset
4.1 Data Collection - Movies and Scenes
4.2 Collecting Story Beats Labels for the Scenes
4.3 Dataset Analytics
5 A MND Task: Movie Scenes Classification by Their Narrative Function
5.1 Data Pre-processing
5.2 Feature Engineering
5.3 Baseline Approach
5.4 Classification Experiment and Baseline Results
6 Conclusion and Future Research
References
Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval
1 Introduction
2 Related Work
3 Proposed Approach
3.1 Overall Architecture
3.2 Multiple Space Learning
3.3 Dual Softmax Inference
3.4 Specifics of Textual Information Processing
3.5 Specifics of Visual Information Processing
4 Experimental Results
4.1 Datasets and Experimental Setup
4.2 Results and Comparisons
4.3 Ablation Study
5 Conclusions
References
Scene-Adaptive Temporal Stabilisation for Video Colourisation Using Deep Video Priors
1 Introduction
2 Related Work
2.1 Video Colourisation
2.2 Deep Video Prior
2.3 Few-Shot Learning
3 Method
3.1 Extension of DVP to Multiple Scenes
3.2 Few-Shot Training Strategy
3.3 Network Architecture
4 Experiments
4.1 Training Strategy
4.2 Evaluation Metrics
4.3 Results
4.4 Ablations
5 Conclusions
References
Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences
1 Introduction
2 Related Works
3 Data
4 Methodology
4.1 Label Estimation Phase
4.2 Editing Patterns Analysis
4.3 Technical Details
5 Preliminary Results
5.1 Character-Environment Relationship
5.2 Environment Description
5.3 Character-Character Interaction
5.4 Undefined Classes
5.5 Misclassified Sequences
6 Discussion
References
W17 - Visual Inductive Priors for Data-Efficient Deep Learning
SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks Using cGANs
1 Introduction
2 Related Work
3 Approach
3.1 SKDCGN
4 Experiments and Results
4.1 Datasets
4.2 Baseline Model: CGN with Generator Replaced by TinyGAN Generator
4.3 Results of SKDCGN
4.4 Improving the SKDCGN Model
4.5 Additional Results: Study of the Shape IM
5 Discussion and Conclusion
6 Future Work
References
C-3PO: Towards Rotation Equivariant Feature Detection and Description
1 Introduction
2 Background
2.1 Theory
2.2 Related Work
3 Methodology
4 Experiments
5 Conclusion
References
Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
1 Introduction
2 Related Work
2.1 Convolution Neural Networks
2.2 Vision Transformers
2.3 Vision Transformers and Convolutions
3 Preliminaries
4 Inductive Bias Analysis of Various Architectures
4.1 Our Hypothesis
4.2 Data Scale Experiment
4.3 Fourier Analysis
5 Reparameterization Can Interpolate Inductive Biases
5.1 Experimental Settings
5.2 Interpolation of Convolutional Inductive Bias
6 Progressive Reparameterization Scheduling
7 Conclusion
References
Zero-Shot Image Enhancement with Renovated Laplacian Pyramid
1 Introduction
2 Related Work
2.1 Traditional Signal Processing Method and Deep Learning
2.2 Laplacian Pyramid and Image Restoration
3 Multiscale Laplacian Enhancement for Image Manipulation
3.1 Formulation of Multiscale Laplacian Enhancement
3.2 Internal Results of Multiscale Laplacian Enhancement
3.3 Comparison with Unsharp Masking Filter
3.4 Ablation Study of MLE
3.5 Application of MLE to Underwater Images
4 Zero-Shot Attention Network with Multiscale Laplacian Enhancement (ZA-MLE)
4.1 Process of ZA-MLE
4.2 Loss Function
5 Experiment
5.1 Experimental Setting
5.2 Results and Discussions of ZA-MLE
5.3 Ablation Study of Loss Function
6 Conclusion
References
Beyond a Video Frame Interpolator: A Space Decoupled Learning Approach to Continuous Image Transition
1 Introduction
2 Related Work
2.1 Video Frame Interpolation (VFI)
2.2 Continuous Image Transition (CIT)
3 Proposed Method
3.1 Problem Formulation
3.2 Space Decoupled Learning
3.3 Training Strategy
4 Experiments and Applications
4.1 Datasets and Training Settings for VFI
4.2 Comparisons with State-of-the-Arts
4.3 Ablation Experiments
4.4 Applications Beyond VFI
5 Conclusion
References
Diversified Dynamic Routing for Vision Tasks
1 Introduction
2 Related Work
3 DivDR: Diversified Dynamic Routing
3.1 Dynamic Routing Preliminaries
3.2 Metric Learning in A-space
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Semantic Segmentation
4.4 Object Detection and Instance Segmentation
5 Discussion and Future Work
References
Author Index
