
Artificial Neural Networks and Machine Learning - ICANN 2022
Description
The 4-volume set LNCS 13529, 13530, 13531, and 13532 constitutes the proceedings of the 31st International Conference on Artificial Neural Networks, ICANN 2022, held in Bristol, UK, in September 2022.
The 255 full papers presented in these proceedings were carefully reviewed and selected from 561 submissions. ICANN 2022 is a dual-track conference featuring tracks in brain-inspired computing and in machine learning and artificial neural networks, with strong cross-disciplinary interactions and applications.

Content
- Intro
- Preface
- Organization
- Contents - Part IV
- Analysing the Predictivity of Features to Characterise the Search Space
- 1 Introduction
- 2 Related Work
- 3 Landscape Features
- 4 Experimental Results
- 4.1 Feature Exploratory Analysis
- 4.2 Operator Classification
- 5 Conclusions and Future Work
- References
- Boosting Feature-Aware Network for Salient Object Detection
- 1 Introduction
- 2 Related Work
- 3 Proposed Model
- 3.1 Overall Framework
- 3.2 Edge Guidance Sub-network
- 3.3 Object Sub-network
- 3.4 Loss Function
- 4 Experimental Results
- 4.1 Datasets and Evaluation Metrics
- 4.2 Implementation Details
- 4.3 Comparison with the State-of-the-Arts
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Continual Learning Based on Knowledge Distillation and Representation Learning
- 1 Introduction
- 2 Related Works
- 2.1 Class Incremental Learning
- 2.2 Beta-VAE
- 2.3 Knowledge Distillation
- 3 Model and Methodology
- 3.1 KRCL Model
- 3.2 KRCL Loss Function
- 3.3 Model Parameters and Update Rules
- 4 Experimental Comparison
- 4.1 Benchmark Datasets
- 4.2 Baseline Methods
- 4.3 Network Architecture
- 4.4 Evaluation Metrics
- 4.5 Experimental Results and Analysis
- 5 Conclusions and Future Works
- References
- Deep Feature Learning for Medical Acoustics
- 1 Introduction
- 2 The Considered Frontends
- 2.1 Mel-filterbanks
- 2.2 LEAF
- 2.3 nnAudio
- 3 Models
- 3.1 EfficientNet
- 3.2 VGG
- 4 Datasets
- 4.1 Respiratory Dataset
- 4.2 Heartbeat Dataset
- 5 Experiments
- 5.1 Pre-processing
- 5.2 System Parameterization
- 6 Results
- 6.1 Test 1 - Respiratory
- 6.2 Test 2 - Heartbeat
- 6.3 Overall
- 7 Conclusion
- References
- Feature Fusion Distillation
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Feature Fusion Module
- 3.2 Asymmetric Switch Function
- 3.3 Total Loss Function
- 4 Experiments
- 4.1 Image Classification (CIFAR-100)
- 4.2 Image Classification (ImageNet-1K)
- 4.3 Object Detection
- 4.4 Semantic Segmentation
- 5 Ablation Study
- 6 Conclusion
- A Margin Value
- References
- Feature Recalibration Network for Salient Object Detection
- 1 Introduction
- 2 Proposed Method
- 2.1 Consistency Recalibration Module
- 2.2 Multi-source Feature Recalibration Module
- 2.3 Loss Function
- 3 Experiments
- 3.1 Datasets and Evaluation Metrics
- 3.2 Implementation Details
- 3.3 Comparison with the State-of-the-Art
- 3.4 Ablation Studies
- 4 Conclusion
- References
- Feature Selection for Trustworthy Regression Using Higher Moments
- 1 Introduction
- 2 Trustworthy Regression
- 3 Feature Relevance
- 3.1 Feature Relevance for Classification
- 3.2 Feature Relevance for (MSE-)Regression
- 4 Feature Selection Methods
- 5 On the Relation of Relevance Notions
- 6 Application: Moment Feature Relevance
- 7 Empirical Evaluation
- 8 Conclusion
- References
- Fire Detection Based on Improved-YOLOv5s
- 1 Introduction
- 2 Method
- 2.1 Data Collection and Preprocessing
- 2.2 Network Model
- 2.3 Cosine Annealing + Warm-Up
- 2.4 Label Smoothing
- 2.5 Multi-scale
- 3 Result
- 3.1 Evaluation Index Calculation Formula
- 3.2 Results Presentation
- 4 Discussion
- 5 Conclusion
- References
- Heterogeneous Graph Neural Network for Multi-behavior Feature-Interaction Recommendation
- 1 Introduction
- 2 Methodology
- 2.1 Heterogeneous Bipartite Graph
- 2.2 User-Features Interaction
- 2.3 Graph Neural Network Aggregation Layer
- 2.4 Prediction Layer
- 2.5 Model Training
- 2.6 Complexity Analysis
- 3 Experiment
- 3.1 Dataset Description
- 3.2 Experimental Settings
- 3.3 Overall Performance
- 3.4 Model Analysis
- 4 Conclusion
- References
- JointFusionNet: Parallel Learning Human Structural Local and Global Joint Features for 3D Human Pose Estimation
- 1 Introduction
- 2 Related Works
- 2.1 3D Human Pose Estimation
- 2.2 Global-Local Features Fusion
- 3 Method
- 3.1 Inspiring Pattern of Human Pose
- 3.2 Global and Local Features Fusion
- 4 Experiments
- 4.1 Datasets, Evaluation Metrics and Details
- 4.2 Comparison with State-of-the-Art Methods
- 4.3 Cross-dataset Results on 3DPW
- 4.4 Visualization and Explanation
- 4.5 Ablation Study
- 5 Conclusion
- References
- Multi-scale Feature Extraction and Fusion for Online Knowledge Distillation
- 1 Introduction
- 2 Related Work
- 2.1 Traditional Knowledge Distillation
- 2.2 Online Knowledge Distillation
- 2.3 Multi-scale Feature
- 3 Proposed Method
- 3.1 Problem Definition
- 3.2 MFEF Framework
- 3.3 Loss Function
- 4 Experiment
- 4.1 Experiment Settings
- 4.2 Experiment Results
- 5 Conclusion
- References
- Multi-scale Vertical Cross-layer Feature Aggregation and Attention Fusion Network for Object Detection
- 1 Introduction
- 2 Related Work
- 3 Architecture of Proposed Network
- 3.1 Multi-scale Vertical Cross-Layer Feature Aggregation Network
- 3.2 Attention Fusion Module
- 3.3 Anchor Optimization Strategy
- 4 Experiment
- 4.1 Implementation Details
- 4.2 Comparison with Other Methods
- 4.3 Ablation Study
- 5 Conclusion
- References
- Multi-spectral Dynamic Feature Encoding Network for Image Demoiréing
- 1 Introduction
- 2 Proposed Method
- 2.1 Overall Network Architecture
- 2.2 DCT and Channel Attention
- 2.3 Multi-spectral Channel Attention (MSCA)
- 2.4 Multi-spectral Dynamic Feature Encoding (MSDFE)
- 2.5 Loss Function
- 3 Experiments
- 3.1 Datasets and Training Details
- 3.2 Comparison with State-of-the-Arts
- 3.3 Visual Results
- 3.4 Model Parameters
- 4 Ablation Study
- 4.1 Network Branches
- 4.2 Multi-spectral Dynamic Feature Encoding
- 5 Conclusion
- References
- Ranking Feature-Block Importance in Artificial Multiblock Neural Networks
- 1 Introduction
- 2 Block Importance Ranking Methods
- 2.1 Composite Strategy
- 2.2 Knock-In Strategy
- 2.3 Knock-Out Strategy
- 3 Experiments
- 3.1 Simulation Experiment
- 3.2 Real-World Experiment
- 4 Discussion
- 5 Conclusion
- References
- Robust Sparse Learning Based Sensor Array Optimization for Multi-feature Fusion Classification
- 1 Introduction
- 2 The Proposed Method
- 2.1 F,1 Norm Regularization Term
- 2.2 Sensor Selection Model
- 2.3 Model Optimization
- 2.4 Complexity Analysis
- 3 Experiment
- 3.1 Data Sets
- 3.2 Experiments Settings
- 3.3 Comparison of Classification Accuracy
- 4 Conclusion and Future Work
- References
- Stimulates Potential for Knowledge Distillation
- 1 Introduction
- 2 Related Literature
- 2.1 Knowledge Distillation
- 2.2 Normalization
- 3 Approach
- 3.1 Residual-Based Local Feature Normalization
- 3.2 Local Feature Normalized Extraction
- 3.3 How to Use Structure
- 4 Experiment
- 4.1 Experiments on CIFAR-10
- 4.2 Experiments on CIFAR-100
- 4.3 Ablation Experiments
- 5 Conclusion
- References
- Adaptive Compatibility Matrix for Superpixel-CRF
- 1 Introduction
- 2 Related Work
- 2.1 CRF and Superpixel-CRF
- 2.2 Compatibility Function
- 3 Preliminary
- 4 Adaptive Compatibility Matrix
- 5 Apply Adaptive Compatibility Matrix to Superpixel CRF
- 5.1 Binary Class
- 5.2 Multi-class
- 6 Experiments
- 7 Conclusions
- References
- BERT-Based Scientific Paper Quality Prediction
- 1 Introduction
- 2 BERT
- 3 Proposed Quality Prediction of Scientific Papers
- 3.1 Dataset of Scientific Papers
- 3.2 Quality Classification of Papers
- 3.3 BERT-Based Model of Quality Prediction of Scientific Papers
- 4 Experimental Results
- 4.1 Training in the Pre-training Phase on Abstracts from S2ORC
- 4.2 Training in the Fine-Tuning Phase
- 4.3 The Test Accuracy of Prediction of the Trained Model
- 4.4 Detailed Analysis of the Prediction
- 5 Conclusions
- References
- Effective ML-Block and Weighted IoU Loss for Object Detection
- 1 Introduction
- 2 Related Work
- 2.1 Box Regression Loss
- 2.2 One-Stage Object Detectors
- 3 Approach
- 3.1 Weighted IoU Loss
- 3.2 MobileLight Block
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Ablation Studies
- 4.3 Evaluation on PASCAL VOC
- 4.4 Evaluation on COCOmini
- 5 Conclusion
- References
- FedNet2Net: Saving Communication and Computations in Federated Learning with Model Growing
- 1 Introduction
- 2 Related Work
- 3 Proposed Approach
- 4 Datasets and Detailed Model Implementations
- 4.1 Data Description
- 4.2 Performance Evaluation
- 4.3 Parameters for Switching
- 4.4 Model Description and Hyper-parameters
- 5 Results
- 6 Conclusion
- References
- Reject Options for Incremental Regression Scenarios
- 1 Introduction
- 2 Problem Setting
- 3 Rejection Models
- 3.1 Drift Rejection
- 3.2 Local Outlier Probabilities Rejector
- 3.3 Baseline Rejection
- 4 Experiments
- 4.1 Chaotic Time Series Data
- 4.2 Real World Data
- 4.3 RMSE-Reject Curves
- 4.4 Chaotic Data Experiment
- 4.5 Real World Data Experiment
- 5 Results
- 5.1 Chaotic Data Results
- 5.2 Real World Data Results
- 5.3 Tabular Evaluation
- 6 Conclusion
- References
- Stream-Based Active Learning with Verification Latency in Non-stationary Environments
- 1 Introduction
- 2 Related Work
- 3 Proposed Active Learning Framework
- 3.1 Proposed Utility Estimator: PRopagate Labels
- 3.2 Proposed Budget Strategy: Dynamic Budget Allocation
- 4 Experimental Setup
- 5 Results and Discussion
- 6 Conclusion and Future Work
- References
- StTime-Net: Combining both Historical and Textual Factors for Stock Movement Prediction
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Dataset Setup
- 5 Methodology
- 5.1 Price Encoder
- 5.2 Text Information Encoder
- 5.3 Combined Encoder
- 5.4 Metrics and Optimizer
- 6 Evaluation and Results
- 6.1 Baselines
- 6.2 Discussion
- 7 Conclusion and Future Work
- References
- Subspace Clustering Multi-module Self-organizing Maps with Two-Stage Learning
- 1 Introduction
- 2 Self-organizing Maps and High-Dimensional Data Clustering
- 2.1 Clustering of High-Dimensional Data
- 2.2 Self-organizing Maps for Subspace Clustering
- 2.3 Learning Enhancement
- 3 Subspace Clustering Multi-module Self-organizing Maps (SC-MuSOM)
- 3.1 The Description of SC-MuSOM
- 3.2 The Pseudo-code
- 4 Experiments
- 4.1 Datasets and Setups
- 4.2 Results
- 4.3 Comparisons
- 5 Discussion and Conclusion
- References
- SVM Ensembles on a Budget
- 1 Introduction
- 2 Structured Subbagging for SVM Ensembles
- 3 Training Complexity of the Proposed Model
- 4 Experimental Evaluation
- 4.1 Comparison of Structured and Non-structured Sampling
- 4.2 Training Complexity
- 4.3 Accuracy
- 5 Conclusions
- References
- The Parallelization and Optimization of K-means Algorithm Based on MGPUSim
- 1 Introduction
- 2 Related Work
- 2.1 K-means Algorithm Implemented on GPU
- 2.2 MGPUSim System Architecture
- 3 The Proposed Method
- 3.1 Parallelized Design of the k-means Algorithm
- 3.2 Optimization of Initialized Clustering Centers
- 4 Experiment and Evaluation
- 4.1 Daisen Validation Performance Improvement Analysis
- 4.2 Experimental Analysis of the K-means Algorithm on the Real Data Set
- 4.3 Experimental Analysis of K-means Algorithm on Synthetic Data
- 5 Conclusion and Future Work
- References
- Two-Dimensional Encoding Method for Neural Synthesis of Tabular Transformation by Example
- 1 Introduction
- 2 Problem Formulation
- 2.1 Problem
- 2.2 Program Components
- 3 Related Work
- 3.1 ML-Based PBE for Tabular Transformation
- 3.2 Two-Dimensional Positional Encoding
- 4 Our Previous Work
- 4.1 The Transformer-Based Model
- 4.2 Tabular Data Linearization
- 4.3 Positional Encoding
- 5 Proposed Approach
- 5.1 Tabular Positional Encoding
- 5.2 Positional Encoding Implementations
- 6 Experiments
- 6.1 Experimental Settings
- 6.2 Experimental Results
- 7 Conclusion
- References
- Variational Autoencoders for Anomaly Detection in Respiratory Sounds
- 1 Introduction
- 2 The Dataset of Respiratory Sounds
- 3 Preprocessing of Respiratory Sounds
- 4 Anomaly Detection Model
- 4.1 VAE Training
- 5 Experimental Set-up and Results
- 6 Conclusions
- References
- An Innovate Hybrid Approach for Residence Price Using Fuzzy C-Means and Machine Learning Techniques
- 1 Introduction
- 2 Dataset Description
- 3 Dataset Pre-processing
- 3.1 Fuzzy C-Means
- 4 Classification Methodology
- 4.1 Coarse Tree
- 4.2 Evaluation of Model Classifier
- 5 Experimental Results
- 6 Conclusion
- References
- Contrastive Learning for Session-Based Recommendation
- 1 Introduction
- 2 Related Works
- 3 Preliminaries
- 3.1 Problem Statement
- 3.2 Supervised Recommendation Model Optimization
- 4 Methods
- 4.1 Data Augmentation
- 4.2 Contrastive Learning
- 4.3 Multi-task Training
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Baseline Methods
- 5.3 Experimental Results
- 6 Conclusion
- References
- Discriminative and Robust Analysis Dictionary Learning for Pattern Classification
- 1 Introduction
- 2 Preliminary
- 2.1 Analysis Dictionary Learning
- 2.2 Non-negative Representation
- 3 Discriminative and Robust ADL Model
- 3.1 Model Formulation
- 3.2 Solution to DR-ADL
- 3.3 Initialization
- 4 Robust Classification Scheme
- 5 Experiments
- 5.1 Experimental Settings
- 5.2 Results and Analysis
- 6 Conclusion
- References
- Domain Adaptative Semantic Segmentation by Fine-Grained Alignment
- 1 Introduction
- 2 Methods
- 2.1 Overview
- 2.2 Background and Foreground Classes Alignment
- 2.3 Self-supervised Learning
- 2.4 Spatial and Channel Parallel Attention Module
- 3 Experiments
- 3.1 Experimental Results
- 3.2 Ablation Studies
- 3.3 Hyperparameters Analysis
- 4 Conclusion
- References
- Hypergraph Variational Autoencoder for Multimodal Semi-supervised Representation Learning
- 1 Introduction
- 2 Preliminary
- 2.1 Introduction of Hypergraph
- 2.2 Constructing Hypergraph with Multimodal Data
- 3 Method
- 3.1 Problem Statement
- 3.2 Mask-Based Inference Model
- 3.3 Generative Model
- 4 Experiments
- 4.1 Dataset
- 4.2 Experimental Settings
- 4.3 Result Analyses and Discussions
- 4.4 Ablation
- 5 Conclusion
- References
- Intention-Aware Frequency Domain Transformer Networks for Video Prediction
- 1 Introduction
- 2 Related Work
- 3 Models
- 4 Experimental Results
- 5 Conclusion and Future Work
- References
- Knowledge-Aware Self-supervised Graph Representation Learning for Recommendation
- 1 Introduction
- 2 Related Work
- 3 KSGL Model
- 3.1 Framework
- 3.2 Supervised Learning Model
- 3.3 Self-supervised Learning Model
- 3.4 Multi-tasking Optimization
- 4 Experiment
- 4.1 Experimental Setup
- 4.2 Performance Comparison
- 4.3 Study of KSGL
- 5 Conclusion and Future Work
- References
- Meta-Style: Few-Shot Learning Dataset for Social Media Field
- 1 Introduction
- 2 Related Work
- 2.1 Prototype Estimation
- 2.2 Related Dataset
- 3 Meta-Style Dataset
- 3.1 Data Collection and Annotation
- 3.2 Dataset Analysis
- 4 Latent Prototype Estimation Based on Global Perception and Neighborhood Adaptation
- 4.1 Latent Prototype Estimation
- 4.2 Global Perception and Neighborhood Adaptation
- 4.3 Overview
- 5 Experimental Evaluation
- 5.1 Implementation Details
- 5.2 Quantitative Comparison
- 5.3 Ablation Experiments
- 6 Conclusion
- References
- Problem Classification for Tailored Helpdesk Auto-replies
- 1 Introduction
- 2 Natural Language Processing
- 3 Methodology
- 3.1 Training/Test Data Split
- 3.2 Data Augmentation
- 3.3 Stemming and Stop-Word Removal
- 4 Problem Classification
- 5 Solution
- 6 Evaluation
- 7 Conclusion
- References
- Self-Enhancer: A Self-supervised Framework for Low-Supervision, Drifted Data with Significant Missing Values
- 1 Introduction
- 2 Problem Setting
- 3 The Self-supervised Framework: Self-Enhancer
- References
- Self-supervised Anomaly Detection by Self-distillation and Negative Sampling
- 1 Introduction
- 2 Related Works
- 3 Proposed Method
- 3.1 The Vanilla DINO Framework
- 3.2 Negative Samples
- 3.3 Auxiliary Objective
- 3.4 Sensitivity Score
- 4 Experiments
- 4.1 Datasets and Negative Sample Variants
- 4.2 Evaluation Protocol for OOD Detection
- 4.3 Experimental Results
- 5 Discussion
- 6 Conclusion
- References
- Self-supervised Detransformation Autoencoder for Representation Learning in Open Set Recognition
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Pre-training Step
- 3.2 Fine-Tuning Step
- 3.3 Open Set Recognition (OSR)
- 4 Experimental Evaluation
- 4.1 Evaluation Network Architectures and Evaluation Criteria
- 4.2 Experimental Results
- 4.3 Analysis
- 5 Conclusion
- References
- Supervised Learning for Convolutional Neural Network with Barlow Twins
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Architecture for Supervised Learning with Barlow Twins
- 3.2 Cross Entropy Loss
- 3.3 Barlow Twins Loss
- 3.4 Barlow Twins in Supervised Classification CNN
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Setup
- 4.3 Evaluation on CIFAR-10 Dataset
- 4.4 Evaluation on STL-10 Dataset
- 5 Conclusion
- References
- Using Multiple Heads to Subsize Meta-memorization Problem
- 1 Introduction
- 2 Related Works
- 3 MIMO-MAML
- 3.1 Meta Learning Problem
- 3.2 The Design
- 3.3 Analysis
- 3.4 Early-Exiting or Ensemble
- 4 Experiment
- 5 Conclusion
- References
- Video Motion Perception for Self-supervised Representation Learning
- 1 Introduction
- 2 Related Work
- 2.1 Self-supervised Representation Learning
- 2.2 Video Action Recognition
- 3 Methods
- 3.1 Motion Direction Perception Module
- 3.2 Motion Change Perception Module
- 3.3 Representation Learning
- 4 Experiment
- 4.1 Ablation Study
- 4.2 Action Recognition
- 4.3 Video Retrieval
- 4.4 Action Similarity Labeling
- 5 Conclusion
- References
- A Differentiable Architecture Search Approach for Few-Shot Image Classification
- 1 Introduction
- 2 Related Work
- 2.1 Few-Shot Learning
- 2.2 Differentiable Architecture Search
- 3 Methodology
- 3.1 Preliminary
- 3.2 Improved Search Strategy
- 3.3 Spatial Pyramid Self-attention Mechanism
- 4 Experiments
- 4.1 Datasets
- 4.2 Results
- 5 Conclusions
- References
- AFS: Attention Using First and Second Order Information to Enrich Features
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 AFS
- 3.2 Parallel-AFS
- 3.3 Serial-AFS
- 4 Experiments
- 4.1 Ablation Studies on CIFAR-100
- 4.2 CIFAR Classification
- 4.3 Object Detection
- 5 Conclusion
- References
- Autonomous Driving Model Defense Study on Hijacking Adversarial Attack
- 1 Introduction
- 2 Background and Related Work
- 3 Methodology
- 3.1 Dataset
- 3.2 Attack Scenarios - Hijacking Adversarial Attacks
- 3.3 Defense Mechanism Against Adversarial Attack
- 4 Experiment Results
- 4.1 Experiment Environment
- 4.2 Attack Injection and Success Against Autonomous Driving
- 4.3 Defense Performance Result - Proposed
- 4.4 Performance of Autoencoder - Whitebox Setup
- 4.5 Performance of Proposed Method - Blackbox Setup
- 4.6 Time and Computation Overhead on Defense Mechanism
- 5 Discussion
- 6 Conclusion
- References
- Chinese Character Style Transfer Model Based on Convolutional Neural Network
- 1 Introduction
- 2 Related Work
- 3 Algorithm Design
- 3.1 Concept and Principle of Convolutional Neural Network
- 3.2 Overview of Transfer Model
- 3.3 Image Preprocessing
- 3.4 Feature Extraction of Chinese Character Samples
- 3.5 Main Architecture
- 3.6 Model Algorithm Description
- 4 Experimental Analysis
- 4.1 Dataset
- 4.2 Experimental Analysis
- 5 Conclusion
- References
- CNN-Transformer Hybrid Architecture for Early Fire Detection
- 1 Introduction
- 2 Related Works
- 2.1 Feature Analysis Based Fire Detection Methods
- 2.2 CNN Based Fire Detection Methods
- 2.3 Transformer Based Fire Detection Methods
- 3 Methods
- 3.1 Locality Feed Forward Network
- 3.2 Linear Highlighted Attention
- 3.3 GLCT
- 4 Experiments
- 4.1 Implementation Details
- 4.2 Compared with CNN
- 4.3 Compared with Transformer
- 5 Conclusion
- References
- DeepArtist: A Dual-Stream Network for Painter Classification of Highly-Varying Image Resolutions
- 1 Introduction
- 2 Background
- 2.1 Early Painting Classification
- 2.2 CNN-Based Extraction of Painting Features
- 2.3 CNN-Based Painter Classification
- 3 Proposed Method
- 3.1 The WikiArt Dataset
- 3.2 Local Structures and Global Elements of a Painting
- 3.3 Network Architecture
- 4 Experimental Results
- 5 Conclusions
- References
- Dual Branch Network Towards Accurate Printed Mathematical Expression Recognition
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Context Coupling Module
- 3.2 DST Strategy
- 4 Experiment
- 4.1 Implementation Details
- 4.2 Ablation Study
- 4.3 Impact of Hyper-parameters
- 4.4 Results on PMER Benchmarks
- 5 Conclusion
- References
- Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling
- 1 Introduction
- 2 Probabilistic Model-Based One-Shot NAS with Complexity Regularization
- 3 Proposed Method
- 3.1 Introducing Categorical Distributions
- 3.2 Searching Multiple Architectures via Importance Sampling
- 3.3 Overall Algorithm
- 4 Experiment and Results
- 4.1 CIFAR-10
- 4.2 ImageNet
- 5 Conclusion
- References
- End-to-End Large-Scale Image Retrieval Network with Convolution and Vision Transformers
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Network Architecture
- 3.2 Improved Triplet Loss and SOS Loss
- 4 Experiment and Discussion
- 4.1 Datasets and Evaluation Metrics
- 4.2 Implementation Details
- 4.3 Comparison with Baseline and State-of-the-Art Methods
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Ensemble Ranking for Image Retrieval via Deep Hash
- 1 Introduction
- 2 Related Work
- 2.1 Unsupervised Hashing Methods
- 2.2 Semi-supervised Hashing Methods
- 2.3 Supervised Hashing Methods
- 2.4 Deep Hash-Based Retrieval Method
- 3 Ensemble Learning and Deep Hashing CNNs
- 3.1 Motivation
- 3.2 Ensemble Learning
- 3.3 Hidden Hash Layer
- 3.4 Ensemble Learning Methods
- 4 Experiments
- 4.1 Evaluation Protocols
- 4.2 Retrieval Results on CIFAR-10
- 4.3 Retrieval Results on SUN397
- 4.4 Retrieval Results on ILSVRC2012
- 5 Conclusion
- References
- Examining the Proximity of Adversarial Examples to Class Manifolds in Deep Networks
- 1 Introduction
- 2 Experimental Setup
- 2.1 Models
- 2.2 Attack Methods
- 3 Distances to Classes
- 4 Proximity to Manifolds
- 4.1 Counting the Nearest Neighbours
- 4.2 Computing Distances to Class-Specific Manifolds
- 5 Assessment of Entanglement
- 6 Conclusion
- References
- Gesture MNIST: A New Free-Hand Gesture Dataset
- 1 Introduction
- 1.1 Related Work
- 1.2 Contribution
- 2 Dataset
- 3 Experiments
- 3.1 LSTM Network
- 3.2 CNN
- 4 Outlier Detection
- 4.1 Gaussian Mixture Models (GMMs)
- 4.2 Results
- 5 Discussion and Conclusion
- References
- GH-CNN: A New CNN for Coherent Hierarchical Classification
- 1 Introduction
- 2 State of the Art About Hierarchical Classification
- 3 Notations
- 4 An Architecture Compliant with ICE and ICH
- 4.1 The Hidden Layers
- 4.2 The Penultimate Layer of Primary Outputs (PNPO)
- 4.3 Updating the Probability of a Subclass in the Layer FAFO
- 4.4 Hierarchical Loss Function
- 5 Experiments and Results
- 6 Conclusion and Perspectives
- References
- Hyperspectral Endoscopy Using Deep Learning for Laryngeal Cancer Segmentation
- 1 Introduction
- 2 Related Work
- 3 Mobile Intraoperative HSI (miHSI) Acquisition
- 3.1 Clinical Hardware Setup
- 3.2 Image Acquisition and Data Set
- 4 Pre-processing
- 5 Efficient HSI Deep Learning System
- 5.1 U-Net
- 5.2 The EFX-UNet and the Deep UNet Architectures
- 5.3 Loss Function and Optimizer
- 6 Results
- 6.1 Evaluation Metric: Dice Score
- 6.2 Wavelength Analysis
- 6.3 Preliminary Evaluation of Accuracy and Discussion
- 7 Conclusion and Outlook
- References
- Image Super-Resolution Using Deep RCSA Network
- 1 Introduction
- 2 Related Works
- 3 Our Approach
- 3.1 Network Structure
- 3.2 Dense Module
- 3.3 Channel Spatial Attention Module
- 4 Experiments and Results
- 4.1 Settings
- 4.2 Study of the Basic Module
- 4.3 Quantitative Analysis
- 4.4 Visual Analysis
- 4.5 Model Complexity Analysis
- 5 Discussion
- References
- Lip Reading Using Deformable 3D Convolution and Channel-Temporal Attention
- 1 Introduction
- 2 Related Works
- 2.1 Lip Reading
- 2.2 Focus on Target Word
- 2.3 Attention Mechanism
- 3 The Proposed Work
- 3.1 Overview
- 3.2 Deformable 3D Convolution
- 3.3 Channel-Temporal Attention
- 4 Experiments
- 4.1 Datasets
- 4.2 Implementation Details
- 4.3 Experimental Results
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Loop Closure Detection Based on Siamese ConvNet Features and Geometrical Verification for Visual SLAM
- 1 Introduction
- 2 Related Work
- 2.1 Hand-Crafted Features
- 2.2 CNN-Based Features
- 3 Method
- 3.1 Generation of Loop Closure Candidate Frames
- 3.2 Geometrical Verification
- 3.3 Temporal Consistency Check
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Experiment Results
- 4.3 Comparative Result
- 4.4 Visualization of the Loop Detection Results
- 5 Conclusion
- References
- Multi-Sensor Data Fusion for Short-Term Traffic Flow Prediction: A Novel Multi-Channel Data Structure Integrated with Mixed-Pointwise Convolution and Channel Attention Mechanism
- 1 Introduction
- 2 Related Work
- 2.1 Traditional Traffic Prediction Methods
- 2.2 Deep Learning-Based Traffic Prediction Methods
- 2.3 Pointwise Convolution and Channel Attention Mechanism
- 3 Methodology
- 3.1 Multi-channel Data Structure
- 3.2 Mixed-Pointwise Convolution Integrated with Channel Attention Mechanism
- 4 Experiments
- 4.1 Computing Environment and Datasets
- 4.2 Baselines and Benchmarks
- 5 Results and Analysis
- 5.1 Experimental Results on PeMSD4
- 5.2 Experimental Results on PeMSD7
- 6 Conclusions and Future Work
- References
- Neural Architecture Search for Low-Precision Neural Networks
- 1 Introduction
- 2 Background and Related Work
- 3 Method
- 3.1 Shared Mixed-Precision Convolution Block
- 3.2 Forward-and-Backward Scaling
- 3.3 Power Estimation
- 4 Experiments
- 4.1 Effectiveness of Dominance Problem Solutions
- 4.2 Search for Models
- 5 Conclusion
- References
- RegionDrop: Fast Human Pose Estimation Using Annotation-Aware Spatial Sparsity
- 1 Introduction
- 2 Related Work
- 2.1 Pose Estimation
- 2.2 Spatial Sparsity
- 3 Annotation-Aware Spatial Sparsity
- 3.1 Spatially Sparse Network
- 3.2 Non-blind Mask Unit for Annotation-Aware Spatial Sparsity
- 4 Experimental Results
- 4.1 Experimental Setting for Hourglass Networks
- 4.2 Experimental Setting for HRNet
- 4.3 Accuracy and Execution Time Comparison for RegionDrop Using Hourglass Networks
- 4.4 Accuracy and Execution Time Comparison for RegionDrop Using HRNet
- 4.5 Comparison Between Top-Down and Bottom-Up Pose Estimation
- 5 Conclusions
- References
- SDCN: A Species-Disease Hybrid Convolutional Neural Network for Plant Disease Recognition
- 1 Introduction
- 2 Related Works
- 3 SDCN: Species-Disease Hybrid CNN
- 3.1 Decreasing CNN
- 3.2 A Lightly Decreasing CNN for Plant Disease Recognition (D-CNN)
- 3.3 SDCN: Species-Disease Hybrid Convolutional Neural Network
- 3.4 Training of SDCN
- 4 Experiment
- 4.1 Experimental Setting
- 4.2 Experiments
- 5 Conclusion and Future Work
- References
- TFCNs: A CNN-Transformer Hybrid Network for Medical Image Segmentation
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Overall Structure of TFCNs
- 3.2 RL-Transformer
- 3.3 CLAB (Convolutional Linear Attention Block)
- 4 Experimental Results and Discussion
- 4.1 Implementation Details
- 4.2 Comparison with Other SOTA Methods
- 4.3 Interpretability Analysis
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.