Computer Vision - ACCV 2018

Name: Computer Vision - ACCV 2018 | 14th Asian Conference on Computer Vision, Perth, Australia, December 2-6, 2018, Revised Selected Papers, Part IV
Brand: Springer
Price: 53.49 EUR
Availability: OnlineOnly

C.V. Jawahar, Hongdong Li, Greg Mori, Konrad Schindler (Editors)

Springer (Publisher)

Published on 24 May 2019

XX, 749 pages

E-Book

PDF with digital watermarking

978-3-030-20870-7 (ISBN)

€53.49 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
Preface
Organization
Contents - Part IV
Poster Session P2
Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation
1 Introduction
2 Related Work
3 Background
3.1 Deep Belief Networks
3.2 GPLVM
3.3 Shape Boltzmann Machine
4 The GPDBN Model
5 Experiments
6 Conclusion
References
Gated Hierarchical Attention for Image Captioning
1 Introduction
2 Related Work
2.1 Image Captioning with Attention Mechanisms
2.2 Word-CNNs for NLP
3 Gated Hierarchical Attention
4 Experiments
4.1 Dataset and Preprocessing
4.2 Implementation Details
4.3 Results on Karpathy Test Split
4.4 Results on Online Test Set
4.5 Ablation Study and Analysis
5 Conclusion and Future Work
References
Dealing with Ambiguity in Robotic Grasping via Multiple Predictions
1 Introduction
2 Related Work
3 Methods
3.1 Grasp Belief Maps
3.2 CNN Regression
3.3 Multiple Grasp Predictions
3.4 Grasp Option Ranking
4 Experiments and Results
4.1 Dataset
4.2 Implementation Details
4.3 Grasp Detection Metric
4.4 Evaluation and Comparisons
4.5 Evaluation over Multiple Grasps
4.6 Generalization
5 Conclusion
References
Adaptive Visual-Depth Fusion Transfer
1 Introduction
2 Related Work
3 Adaptive Visual-Depth Fusion Transfer
3.1 Notations
3.2 Visual-Depth Metric Fusion
3.3 Adaptive Transfer
3.4 Reparametrization and Optimization
4 Experiments and Results
4.1 Datasets
4.2 Benchmarks and Settings
5 Experimental Results
5.1 Parameter Sensitivity Analysis
5.2 Special Cases Analysis
5.3 Conclusion
References
Solving Minimum Cost Lifted Multicut Problems by Node Agglomeration
1 Introduction
2 Related Work
3 Optimization Problem
3.1 Minimum Cost Multicut Problem
3.2 Minimum Cost Lifted Multicut Problem
4 Objectives
5 Proposed Approach
5.1 Algorithms
6 Experiments
6.1 Image Decomposition
6.2 Mesh Segmentation
6.3 ISBI 2012 Challenge
7 Conclusion
References
Robust Deep Multi-modal Learning Based on Gated Information Fusion Network
1 Introduction
2 Related Works
2.1 Deep Multi-modal Learning
2.2 Object Detection Using Multi-modal Data
3 Robust Deep Multi-modal Learning (R-DML)
3.1 R-DML Architecture
3.2 Training
4 Experimental Results
4.1 Datasets
4.2 Experimental Results on KITTI Dataset
4.3 Experimental Results on SUN-RGBD Dataset
5 Conclusions
References
Hardware-Aware Softmax Approximation for Deep Neural Networks
1 Introduction
2 Background and Motivation
3 Exploring the Search Space of Operand Bit-Width
4 Approximating Softmax Operation
5 Training with Softmax Approximation
6 Experiments
6.1 Impact of Operand Bit-Width
6.2 Evaluations on Softmax Approximation Variants
6.3 Tradeoff Between Energy/Area Cost and Accuracy
6.4 Evaluations on Clipped Training
6.5 Discussion
7 Related Work
8 Conclusions and Future Work
References
Video Object Segmentation with Language Referring Expressions
1 Introduction
2 Related Work
2.1 Grounding Natural Language Expressions
2.2 Video Object Segmentation
3 Method
3.1 Grounding Objects in Video by Referring Expressions
3.2 Pixel-Level Video Object Segmentation
4 Collecting Referring Expressions for Video
5 Evaluation of Natural Language Grounding in Video
5.1 DAVIS16/DAVIS17 Referring Expression Grounding
6 Video Object Segmentation Results
6.1 DAVIS16 Single Object Segmentation
6.2 DAVIS17 Multiple Object Segmentation
7 Conclusion
References
Nonlinear Subspace Feature Enhancement for Image Set Classification
1 Introduction
2 Related Work
3 Nonlinear Subspace Feature Enhancement (NSFE)
3.1 Structured Loss Function
3.2 Learning Algorithm
3.3 Concrete Embeddings
3.4 Classification
4 Experiments
4.1 YouTube Celebrities (YTC)
4.2 YouTube Faces (YTF)
4.3 Mobile Faces (MobFaces)
4.4 Results
5 Conclusion
References
Continual Occlusion and Optical Flow Estimation
1 Introduction
2 Related Work
3 ContinualFlow
3.1 Occlusion Estimation
3.2 Refinement Network
3.3 ContinualFlow Estimation over Image Sequence
3.4 Training Loss
4 Experiments
4.1 Ablation Study
4.2 Comparison with State of the Art
5 Conclusion
References
Adversarial Learning for Visual Storytelling with Sense Group Partition
1 Introduction
2 Related Work
3 Model of Adversarial Storytelling
3.1 Sense Group
3.2 Generative Model
3.3 Reward Model
4 Experiments
4.1 Dataset
4.2 Evaluation
5 Conclusion
References
Laser Scar Detection in Fundus Images Using Convolutional Neural Networks
1 Introduction
2 Related Work
3 A Large-Scale Dataset for Laser Scar Detection
4 Our Approach
4.1 CNNs for Laser Scar Detection
4.2 Transfer Learning
5 Evaluation
5.1 Experimental Setup
5.2 Experiments
6 Conclusions
References
Gradient-Guided DCNN for Inverse Halftoning and Image Expanding
1 Introduction
2 Background and Related Work
2.1 Halftoning and Inverse Halftoning
2.2 Image Companding
3 Proposed Method
3.1 Two-Stage DCNN
3.2 Loss Function and Training
4 Experimental Results
4.1 Experiment Settings
4.2 Inverse Halftoning
4.3 Image Expanding
4.4 Model Analysis
5 Conclusions
References
Learning from PhotoShop Operation Videos: The PSOV Dataset
1 Introduction
2 Dataset Construction Procedure
3 Dataset Description
4 Tasks and Evaluation
5 Methodology
5.1 Attention-Aware Filtering
5.2 Video Regularization
5.3 3-D CNN
5.4 Data Augmentation
6 Experiments
6.1 Ablation Study
6.2 Analysis on the Command Classification Task
7 Conclusion
References
A Joint Local and Global Deep Metric Learning Method for Caricature Recognition
1 Introduction
2 Related Work
2.1 Caricature Recognition
2.2 Deep Metric Learning
3 Joint Local and Global Deep Metric Learning
3.1 Network Structure
3.2 Pairwise Loss Function
3.3 Implementation
4 Experiments
4.1 Dataset
4.2 Data Preprocessing
4.3 Results of Different Deep Network Structures
4.4 Local and Global Methods
4.5 Indirect and Direct Fine-Tuning
4.6 Deep and Hand-Crafted Features
4.7 Deep and Shallow Metric Learning
5 Conclusions
References
Fast Single Shot Instance Segmentation
1 Introduction
2 Related Work
3 Fast Single Shot Instance Segmentation
3.1 Multi-task Design for Instance Segmentation
3.2 Global View of the Pipeline
3.3 Fusion Feature
3.4 SSD Head for Object Detection
3.5 Segmentation Sub-networks
3.6 Direction Map of Objects
3.7 Post-process for Generating Instance Mask
3.8 Training and Loss Functions
4 Experiments
4.1 Experiments on PASCAL VOC
4.2 Inference Time Comparison
4.3 Microsoft COCO Dataset
5 Conclusions
References
A Stable Algebraic Camera Pose Estimation for Minimal Configurations of 2D/3D Point and Line Correspondences
1 Introduction
2 Related Work
3 Notation and Geometrical Constraints
3.1 2D/3D Point Correspondence
3.2 2D/3D Line Correspondence
4 Minimal Solution
4.1 P3P
4.2 P2P1L
4.3 P1P2L
4.4 P3L
4.5 Solve the Rotation Matrix
4.6 Algorithm Summary
5 Simulation Results
5.1 Results of P3P Problem
5.2 Results of P2P1L and P1P2L Problem
5.3 Results of P3L Problem
5.4 Computational Time
6 Conclusion
References
Symmetry-Aware Face Completion with Generative Adversarial Networks
1 Introduction
2 Related Work
3 Proposed Method
3.1 Generator
3.2 Discriminator
3.3 Symmetry Detection for Face Components
3.4 Loss Functions
3.5 Training
4 Experimental Results
4.1 Datasets
4.2 Qualitative Results
4.3 Comparison with the State of the Art
4.4 Quantitative Comparison
4.5 Limitations and Discussion
5 Conclusions
References
GrowBit: Incremental Hashing for Cross-Modal Retrieval
1 Introduction
2 Related Work
3 Proposed Approach
3.1 Stage 1: Learning the Hash Code
3.2 Stage 2: Learning the Hash Functions
4 Experiments
4.1 Datasets and Evaluation Protocol
4.2 Baseline and Implementation Details
4.3 Results
4.4 Analysis of the Proposed Approach
5 Summary
References
Region-Semantics Preserving Image Synthesis
1 Introduction
2 Related Work
3 Fast RSP Image Synthesis
3.1 Technical Contributions
4 Experiments
4.1 Results Given Single-Object Regions
4.2 Results Given Complex Regions
4.3 Semantics Preserving Vs. Mode Collapse
4.4 Quantitative Comparison
4.5 Effect of l and Gradient Noises
5 Conclusion
References
SemiStarGAN: Semi-supervised Generative Adversarial Networks for Multi-domain Image-to-Image Translation
1 Introduction
2 Related Work
3 Proposed Method
3.1 Formulation
3.2 GAN Objective
3.3 Domain Classification Loss and Self-Ensembling
3.4 Cycle Consistency and Pseudo Cycle Consistency Loss
3.5 Y Model: Splitting Classifier and Discriminator
3.6 Full Objective
4 Experimental Validation
4.1 Evaluation Metrics
4.2 Implementation and Training
4.3 Experimental Results
5 Conclusion
References
Gated Transfer Network for Transfer Learning
1 Introduction
2 Related Work
3 Gated Transfer Network
3.1 Transfer Module
3.2 Interpretation
3.3 Auxiliary Classifier
3.4 Network Architecture
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Ablation Study
4.4 Visualizing Transfer Module
4.5 Learning Without Forgetting
4.6 Comparison to State-of-the-art
5 Conclusion
References
Detecting Anomalous Trajectories via Recurrent Neural Networks
1 Introduction
2 Related Work
2.1 Trajectory Anomaly Detection
2.2 Trajectory Similarity Measures
2.3 RNN-Based Autoencoder
3 Proposed Method
3.1 RNN Autoencoder Based Trajectory Distance
3.2 Distance Based Anomaly Detection
4 Experimental Results
4.1 Comparison of Distances on Different Trajectory Patterns
4.2 Anomaly Detection Performances
5 Conclusion
References
A Binary Optimization Approach for Constrained K-Means Clustering
1 Introduction
2 Related Work
3 Problem Formulation
3.1 K-Means Clustering
3.2 Constrained K-Means
4 Constrained K-Means as Binary Optimization
5 Optimization Strategy
5.1 Updating the Centroids
5.2 Updating the Assignment Matrix
6 Main Algorithm
7 Experiments
7.1 Balanced Clustering on Synthetic Data
7.2 Clustering on Real Datasets
8 Conclusion
References
LS3D: Single-View Gestalt 3D Surface Reconstruction from Manhattan Line Segments
1 Introduction
2 Prior Work
3 The LS3D Algorithm
3.1 Manhattan Line Segment Detection
3.2 Manhattan Tree Construction
3.3 Lifting 2D MTs to 3D
3.4 From Line Segments to Surfaces
3.5 Constrained L1-Minimization for Manhattan Building Reconstruction
4 Evaluation Dataset
5 Evaluation
6 Conclusion and Future Work
References
Deep Supervised Hashing with Spherical Embedding
1 Introduction
2 Related Work
3 Problem Overview
4 Hash Function Learning
5 Spherical Embedding
6 Quantization
7 Triplet Spherical Loss
7.1 Margin Loss
7.2 Label Likelihood Loss
7.3 Spring Loss
8 Experiments
8.1 Experimental Setup
8.2 Results
8.3 Ablation Study
9 Conclusions
References
Semantic Aware Attention Based Deep Object Co-segmentation
1 Introduction
2 Related Work
3 Model
3.1 Channel Wise Attention (CA)
3.2 Fused Channel Wise Attention (FCA)
3.3 Channel Spatial Attention (CSA)
3.4 Instant Group Co-segmentation
3.5 Training and Implementation Details
4 Experiments
4.1 Datasets
4.2 Baselines
4.3 Results and Visualization of Co-segmentation
4.4 Instant Group Co-segmentation Results
5 Conclusion
References
PIRC Net: Using Proposal Indexing, Relationships and Context for Phrase Grounding
1 Introduction
2 Related Work
3 Our Network
3.1 Framework Overview
3.2 Proposal Indexing Network (PIN)
3.3 Inter-phrase Regression Network (IRN)
3.4 Proposal Ranking Network (PRN)
3.5 Supervised Training and Inference
4 Weakly Supervised Training
4.1 Weak Proposal Indexing Network (WPIN)
4.2 Training and Inference
5 Experiments and Results
5.1 Datasets
5.2 Experimental Setup
5.3 Results on Flickr30k Entities
5.4 Results on ReferIt Game
5.5 Qualitative Results
6 Conclusions
References
Paired-D GAN for Semantic Image Synthesis
1 Introduction
2 Related Work
3 Semantic Levels of Image Features for Foreground/Background
4 Proposed Method
4.1 Network Design
4.2 Network Architecture
4.3 Adversarial Learning for Paired-GAN
5 Experiments
5.1 Dataset and Compared Methods
5.2 Implementation and Training Details
5.3 Evaluation Metrics
5.4 Qualitative Evaluation
5.5 Quantitative Evaluation
5.6 Detailed Analysis
6 Conclusion
References
Skeleton Transformer Networks: 3D Human Pose and Skinned Mesh from Single RGB Image
1 Introduction
2 Related Work
3 Skeleton Transformer Networks
3.1 Bone Rotation Regressor
3.2 Cross Heatmap Regressor
3.3 Loss Function
4 In-the-wild 3D Human Pose Dataset
5 Experiments
5.1 Dataset and Evaluation Protocols
5.2 Baselines
5.3 Implementation and Training Detail
5.4 Results
6 Conclusion
References
Detecting Text in the Wild with Deep Character Embedding Network
1 Introduction
2 Related Works
3 Method
3.1 Network Design
3.2 Training Character Detector
3.3 Learning Character Embedding
3.4 Post-processing
4 Experiments
4.1 Datasets and Evaluation
4.2 Implementation Details
4.3 Ablation Study
4.4 Experiments on Scene Text Benchmarks
4.5 Future Works
5 Conclusion
References
Design Pseudo Ground Truth with Motion Cue for Unsupervised Video Object Segmentation
1 Introduction
2 Related Work
3 Proposed Method
3.1 Learning to Tag the Foreground Object
3.2 Unsupervised Video Object Segmentation
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Comparison with State-of-the-art Methods
4.4 Ablation Studies
5 Conclusion and Future Work
References
Identity-Enhanced Network for Facial Expression Recognition
1 Introduction
2 Related Work
2.1 Facial Expression Recognition
2.2 Multi-task Learning
3 Identity-Enhanced Network
3.1 Spatial Fusion
3.2 Self-constrained Multi-task Learning
3.3 Network Architecture
4 Experiments
4.1 Dataset Description
4.2 Preprocessing
4.3 Implementation Details
4.4 Results
4.5 Ablation Study
5 Conclusions
References
A Novel Multi-scale Invariant Descriptor Based on Contour and Texture for Shape Recognition
1 Introduction
2 The Proposed Method
2.1 Feature Extraction
2.2 Invariance of the Descriptor
2.3 Dissimilarity Measure
3 Experiments
3.1 Performance Comparison on COIL20 Dataset
3.2 Performance Comparison on Flavia Dataset
3.3 Performance Comparison on Swedish Dataset
3.4 Performance Comparison on Leaf100 Dataset
3.5 Performance Comparison on ETH-80 Dataset
4 Conclusion
References
PReMVOS: Proposal-Generation, Refinement and Merging for Video Object Segmentation
1 Introduction
2 Related Work
3 Approach
3.1 Image Augmentation
3.2 Proposal Generation
3.3 Proposal Refinement
3.4 Mask Propagation Using Optical Flow
3.5 ReID Embedding Vectors
3.6 Proposal Merging
4 Experiments
4.1 Proposal Refinement
4.2 Proposal Merging
4.3 Runtime Evaluation
4.4 Further Large-Scale Evaluation
5 Conclusion
References
ColorNet: Investigating the Importance of Color Spaces for Image Classification
1 Introduction
2 Related Works
3 Proposed Approach
3.1 Architecture of Model Used
4 Experimental Analysis
4.1 Datasets
4.2 Training
4.3 Classification Results on CIFAR-10
4.4 Classification Results on CIFAR-100
4.5 Classification Results on Imagenet
4.6 Classification Results on SVHN
4.7 Further Analysis of Results
5 Conclusion
References
Pooling-Based Feature Extraction and Coarse-to-fine Patch Matching for Optical Flow Estimation
1 Introduction
2 Related Work
3 Proposed Method
3.1 Pooling-Based Feature Extraction
3.2 Coarse-to-fine Patch Matching
3.3 Post-processing and Optical Flow Estimation
4 Experiments
4.1 Experimental Methods
4.2 Algorithm Parameters
4.3 Ablation Study
4.4 Evaluation of Patch Matching
4.5 Evaluation of Optical Flow Estimation
5 Conclusion
References
Oral Session O4: Detection, Segmentation, and Action
Unseen Object Segmentation in Videos via Transferable Representations
1 Introduction
2 Related Work
3 Algorithmic Overview
3.1 Overview of the Proposed Framework
3.2 Objective Function
4 Transferring Visual Information for Segmentation
4.1 Mining Segment Proposals
4.2 Learning Transferable Feature Representations
4.3 Joint Formulation and Model Training
5 Experimental Results
5.1 Implementation Details
5.2 DAVIS Dataset
5.3 YouTube-Objects Dataset
6 Concluding Remarks
References
Forget and Diversify: Regularized Refinement for Weakly Supervised Object Detection
1 Introduction
2 Related Work
2.1 Weakly Supervised Object Detection
2.2 Regularization of Deep Neural Networks
3 Preliminaries
4 Our Approach
4.1 Multi-round Regularization of Refinement
4.2 Graph-Based Label Generation
5 Experiments
5.1 Implementation Details
5.2 Datasets and Evaluation Metrics
5.3 Ablation Study
5.4 Results on PASCAL VOC Datasets
6 Conclusion
References
Task-Adaptive Feature Reweighting for Few Shot Classification
1 Introduction
2 Related Work
3 Method
3.1 Problem
3.2 Baseline Method: Prototypical Network
3.3 Our Method: Task-Adaptive Prototypical Network
4 Experiments
4.1 Experimental Setting
4.2 Few-Shot Classification on miniImageNet
4.3 Few-Shot Classification on tieredImageNet
4.4 Visualization of Generated Feature Weight
5 Conclusions
References
Deep Attention-Based Classification Network for Robust Depth Prediction
1 Introduction
2 Related Work
3 Method
3.1 Network Architecture
3.2 Depth Discretization Strategy
3.3 Learning and Inference
4 Experiments
4.1 Datasets and Metrics
4.2 Experimental Setting
4.3 Experiment Results
5 Discussion
5.1 Effect of Multi-class Classification
5.2 Effect of Attention Mechanism
6 Conclusion
References
Predicting Video Frames Using Feature Based Locally Guided Objectives
1 Introduction
2 Proposed Architecture
2.1 Stage-1: Feature Generation
2.2 Stage-2: Reconstruction
3 Locally Guided Gram Loss (LGGL)
4 Multi-scale Correlational Loss (MSCL)
5 Overall Objective Function
6 Experiments
6.1 Results on KTH and Weizmann
6.2 Results on UCF-101
6.3 Results on KITTI
6.4 Cross-Dataset Evaluation
7 Conclusion
References
A New Temporal Deconvolutional Pyramid Network for Action Detection
1 Introduction
2 Related Work
3 Approach
3.1 Video Unit Feature Extraction
3.2 TDPN Network
3.3 Training
4 Experiments
4.1 Evaluation Datasets
4.2 Implementation Details
4.3 Ablation Study
4.4 Comparison with the State of the Art
5 Conclusions
References
Dynamic Temporal Pyramid Network: A Closer Look at Multi-scale Modeling for Activity Detection
1 Introduction
2 Related Work
3 Approach
3.1 Pyramidal Input Feature Extraction with Dynamic Sampling
3.2 Multi-scale Feature Hierarchy with Two-Branch Network
3.3 Local and Global Temporal Contexts
4 Experiments
4.1 Experimental Settings
4.2 Implementation Details
4.3 Comparison with State-of-the-Art
4.4 Ablation Study
5 Conclusions
References
Global Regularizer and Temporal-Aware Cross-Entropy for Skeleton-Based Early Action Recognition
1 Introduction
2 Related Works
3 Proposed Method
3.1 Global Regularization
3.2 Temporal-Aware Cross-Entropy
3.3 Network Training and Action Inference
4 Experiments
4.1 Datasets
4.2 Experimental Settings
4.3 Results on the NTU Dataset
4.4 Results on the CMU Dataset
4.5 Results on the SYSU 3DHOI Dataset
4.6 Comparison with Pair-Wise Distance
4.7 Parameter Analysis
5 Conclusion
References
Correction to: Symmetry-Aware Face Completion with Generative Adversarial Networks
Correction to: Chapter "Symmetry-Aware Face Completion with Generative Adversarial Networks" in: C. V. Jawahar et al. (Eds.): Computer Vision - ACCV 2018, LNCS 11364, https://doi.org/10.1007/978-3-030-20870-7_18
Author Index
