Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track

Name: Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track | European Conference, ECML PKDD 2021, Bilbao, Spain, September 13-17, 2021, Proceedings, Part V
Brand: Springer
Price: 85.59 EUR
Availability: Online only

European Conference, ECML PKDD 2021, Bilbao, Spain, September 13-17, 2021, Proceedings, Part V

Yuxiao Dong, Nicolas Kourtellis, Barbara Hammer, Jose A. Lozano (Editors)

Springer (Publisher)

Published on 9 September 2021

XXXIV, 516 pages

E-Book

PDF with digital watermarking

978-3-030-86517-7 (ISBN)

€85.59 incl. 7% VAT

E-Book Single Licence

Available for download

Description

The multi-volume set LNAI 12975 to 12979 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021, which was held September 13-17, 2021. The conference was originally planned to take place in Bilbao, Spain, but was changed to an online event due to the COVID-19 pandemic.

The 210 full papers presented in these proceedings were carefully reviewed and selected from a total of 869 submissions.

The volumes are organized in topical sections as follows:

Research Track:

Part I: Online learning; reinforcement learning; time series, streams, and sequence models; transfer and multi-task learning; semi-supervised and few-shot learning; learning algorithms and applications.

Part II: Generative models; algorithms and learning theory; graphs and networks; interpretation, explainability, transparency, safety.

Part III: Generative models; search and optimization; supervised learning; text mining and natural language processing; image processing, computer vision and visual analytics.

Applied Data Science Track:

Part IV: Anomaly detection and malware; spatio-temporal data; e-commerce and finance; healthcare and medical applications (including Covid); mobility and transportation.

Part V: Automating machine learning, optimization, and feature engineering; machine learning based simulations and knowledge discovery; recommender systems and behavior modeling; natural language processing; remote sensing, image and video processing; social media.

Content

Intro
Preface
Organization
Contents - Part V
Automating Machine Learning, Optimization, and Feature Engineering
PuzzleShuffle: Undesirable Feature Learning for Semantic Shift Detection
1 Introduction
2 Related Work
2.1 Out-of-Distribution Detection
2.2 Data Augmentation
2.3 Uncertainty Calibration
3 Preliminaries
3.1 The Effects by Perturbation
3.2 Adversarial Undesirable Feature Learning
4 Proposed Method
4.1 PuzzleShuffle Augmentation
4.2 Adaptive Label Smoothing
4.3 Motivation
5 Experiments
5.1 Experimental Settings
5.2 Compared Methods
5.3 Results
5.4 Analysis
6 Conclusion
References
Enabling Machine Learning on the Edge Using SRAM Conserving Efficient Neural Networks Execution Approach
1 Introduction
2 Background and Related Work
2.1 Deep Model Compression
2.2 Executing Neural Networks on Microcontrollers
3 Efficient Neural Network Execution Approach Design
3.1 Tensor Memory Mapping (TMM) Method Design
3.2 Loading Fewer Tensors and Tensors Re-usage
3.3 Finding the Cheapest NN Graph Execution Sequence
3.4 Core Algorithm
4 Experimental Evaluation
4.1 SRAM Usage
4.2 Model Performance
4.3 Inference Time and Energy Consumption
5 Conclusion
References
AutoML Meets Time Series Regression Design and Analysis of the AutoSeries Challenge
1 Introduction
2 Challenge Setting
2.1 Phases
2.2 Protocol
2.3 Datasets
2.4 Metrics
2.5 Platform, Hardware and Limitations
2.6 Baseline
2.7 Results
3 Post Challenge Experiments
3.1 Reproducibility
3.2 Overfitting and Generalisation
3.3 Comparison to Open Source AutoML Solutions
3.4 Impact of Time Budget
3.5 Dataset Difficulty
4 Conclusion and Future Work
References
Methods for Automatic Machine-Learning Workflow Analysis
1 Introduction
2 Problem Definition
3 Related Work
4 Residual Graph-Level Graph Convolutional Networks
5 Datasets
6 Workflow Similarity
7 Structural Performance Prediction
8 Component Refinement and Suggestion
9 Conclusion
References
ConCAD: Contrastive Learning-Based Cross Attention for Sleep Apnea Detection
1 Introduction
2 Related Work
2.1 Sleep Apnea Detection
2.2 Attention-Based Feature Fusion
2.3 Contrastive Learning
3 Methodology
3.1 Expert Feature Extraction and Data Augmentation
3.2 Feature Extractor
3.3 Cross Attention
3.4 Contrastive Learning
4 Experiments and Results
4.1 Datasets
4.2 Compared Methods
4.3 Experiment Setup
4.4 Results and Discussions
5 Conclusions and Future Work
References
Machine Learning Based Simulations and Knowledge Discovery
DeepPE: Emulating Parameterization in Numerical Weather Forecast Model Through Bidirectional Network
1 Introduction
2 Related Work
3 Methods
3.1 Problem Definition
3.2 Deep Parameterization Emulator
3.3 Transfer Scheme
3.4 Training
4 Experiments
4.1 Datasets
4.2 Experimental Setup
5 Results
5.1 DeepPE Performance Analysis
5.2 Transfer Analysis
6 Conclusion
References
Effects of Boundary Conditions in Fully Convolutional Networks for Learning Spatio-Temporal Dynamics
1 Introduction
2 Method
2.1 Learning an Auto-Regressive Model
2.2 Neural Network Convolutional Architecture
2.3 Boundary Condition Treatment
2.4 Loss Function
3 Applications: Time-Evolving PDEs
3.1 Acoustic Propagation of Gaussian Pulses
3.2 Diffusion of Temperature Spots
3.3 Datasets Generation and Parameters
4 Results
5 Conclusion
References
Physics Knowledge Discovery via Neural Differential Equation Embedding
1 Introduction
2 Phase-Field Model
3 Problem Statement
4 Neural Differential Equation Embedding
5 Related Work
6 Experiments
7 Conclusion
References
A Bayesian Convolutional Neural Network for Robust Galaxy Ellipticity Regression
1 Introduction
2 Estimating Galaxy Ellipticity from Images
3 A Method to Assess Uncertainty in Ellipticity Estimation
3.1 Estimation of Noise Related Uncertainty
3.2 Estimation of Blend Related Uncertainty
3.3 Training Protocol
4 Experiments
4.1 Estimation of Uncertainty Related to Noise
4.2 Estimation of Uncertainty Related to Blending
5 Conclusion
References
Precise Weather Parameter Predictions for Target Regions via Neural Networks
1 Introduction
2 Related Work
3 Pertinent Background
4 Learning-Based Modelets for Weather Forecasting
4.1 Micro Model
4.2 Micro-Macro Model
5 Experiment
5.1 Setting
5.2 Overall Performance
5.3 Comparing to Other Methods
5.4 Ablation Study
5.5 Abnormal Weather Forecasting
6 Conclusion
References
Action Set Based Policy Optimization for Safe Power Grid Management
1 Introduction
2 Related Work
3 Preliminary
3.1 Power Grid Management
3.2 Search-Based Planning
4 Methodology
4.1 Search with the Action Set
4.2 Policy Optimization
4.3 Discussion on Action Set Size
4.4 Algorithm Summary
5 Experiments
5.1 Experiment Setup
5.2 Implementation
5.3 Competition
6 Conclusion
A Grid2Op Environment
References
Conditional Neural Relational Inference for Interacting Systems
1 Introduction
2 Related Work
3 The Conditional Neural Inference Model
3.1 Encoding, Establishing the Body-Part Interactions
3.2 Decoding, Establishing the Dynamics
3.3 Conditional Generation
4 Experiments
4.1 Experimental Setup
4.2 Results
5 Conclusion
References
Recommender Systems and Behavior Modeling
MMNet: Multi-granularity Multi-mode Network for Item-Level Share Rate Prediction
1 Introduction
2 Related Works
3 Preliminary
4 Methodology
4.1 Overall Framework
4.2 Fine-Granularity Module
4.3 Coarse-Granularity Module
4.4 Meta-info Modeling Module
4.5 Optimization Objectives
5 Online Deployment
6 Experiments
6.1 Datasets
6.2 Baselines and Experimental Settings
6.3 Offline Item-Level Share Rate Prediction
6.4 Online A/B Tests
6.5 Ablation Studies
6.6 Parameter Analyses
7 Conclusion and Future Work
References
The Joy of Dressing Is an Art: Outfit Generation Using Self-attention Bi-LSTM
1 Introduction
2 Related Work
3 Methodology
3.1 Bayesian Personalized Ranking (MF) Embedding
3.2 Training Dataset Generation
3.3 Bi-LSTM
3.4 Self-attention Bi-LSTM
3.5 Generation of New Outfits
4 Results
5 Conclusion
References
On Inferring a Meaningful Similarity Metric for Customer Behaviour
1 Introduction
2 Problem Definition
3 SIMPRIM Framework
3.1 Journey Log to Journey Profiles
3.2 Measuring Similarity
3.3 Dimensionality Reduction
3.4 Co-learning of Metric Weights and Journey Clustering
3.5 Evaluation
4 Experimental Evaluation
4.1 Customer Service Process at Anonycomm
4.2 BPIC 2012 Real Dataset
5 Related Work
6 Conclusion
References
Quantifying Explanations of Neural Networks in E-Commerce Based on LRP
1 Introduction
2 Preliminaries
3 Formal Model of an Online Shop
4 Explanation Approach
4.1 Explanation via Layer-Wise Relevance Propagation
4.2 Input Analysis with Leave-One-Out Method
4.3 Explanation Quantity Measures
5 Evaluation
5.1 Evaluation Setting
5.2 Evaluation Data Set
5.3 Evaluation Results
6 Conclusion
References
Natural Language Processing
Balancing Speed and Accuracy in Neural-Enhanced Phonetic Name Matching
1 Introduction
1.1 Challenges
2 Related Work
3 Phonetic Name Matching Systems
3.1 Neural Name Transliteration
3.2 Neural Name Matching
4 Experimental Results
4.1 Training and Hyperparameters
4.2 Results
5 Conclusion and Future Work
References
Robust Learning for Text Classification with Multi-source Noise Simulation and Hard Example Mining
1 Introduction
2 Related Work
2.1 Noise Reduction
2.2 Adversarial Training
2.3 Training with Noisy Data
3 Problem
3.1 Notation
3.2 Text Classification
3.3 A Practical Scenario
3.4 OCR Noise Simulation
3.5 Robust Training
4 Approach
4.1 OCR Noise Simulation
4.2 Noise Invariance Representation
4.3 Hard Example Mining
4.4 The Overall Framework
5 Experiment
5.1 Dataset
5.2 Implementation
5.3 Results
6 Analysis
6.1 Naive Training with a Single Noise Simulation Method
6.2 The Impact of Different Noise Level
6.3 The Impact of Hard Example Mining
6.4 The Impact of Stability Loss
7 Conclusion
References
Topic-to-Essay Generation with Comprehensive Knowledge Enhancement
1 Introduction
2 Related Work
3 Methodology
3.1 Task Formulation
3.2 Model Description
3.3 Training and Inference
4 Experiments
4.1 Datasets
4.2 Settings
4.3 Baselines
4.4 Evaluation Metrics
4.5 Experimental Results
4.6 Validity of Knowledge Transfer
4.7 Case Study
5 Conclusion
References
Analyzing Research Trends in Inorganic Materials Literature Using NLP
1 Introduction
2 Related Work
3 Corpus Preparation
3.1 Definition of Types
3.2 Collecting Literature
3.3 Annotation
4 Approach
4.1 Sequence Labeling Architecture
4.2 Numeric Normalization
5 Results
5.1 Inter-Annotator Agreement
5.2 Comparing Language Models
5.3 Tuning Hyperparameters
5.4 Evaluation of Extracted NE Result
6 Research Trends Analysis
7 Conclusion
References
An Optimized NL2SQL System for Enterprise Data Mart
1 Introduction
2 Related Work
3 NL2SQL System
3.1 Question Textbox
3.2 Schema
3.3 Auto Completion
3.4 Log
4 Method
4.1 Problem Statement
4.2 Model Overview
4.3 Table Part
4.4 Table Expand
4.5 Where Value Matching
5 Template-Based Data Simulation
5.1 SQL Query
5.2 Natural Language Question
5.3 Iterative Template Writing
6 Experiment
6.1 Data
6.2 Experiment Settings
6.3 Experiment Results
7 Conclusion
References
Time Aspect in Making an Actionable Prediction of a Conversation Breakdown
1 Introduction
2 Related Works
3 Time Aspect in Prediction of Conversation Breakdown
3.1 Proposed Neural Network Architecture
3.2 Loss Functions Incorporating Time Aspect
3.3 Metrics Considering the Time-to-breakdown of Prediction
4 Experiments
4.1 Datasets
4.2 Experimental Setup
4.3 Results of Experiments
5 Conclusions
References
Feature Enhanced Capsule Networks for Robust Automatic Essay Scoring
1 Introduction
2 Related Work
3 Methodology
3.1 CapsRater
3.2 FeatureCapture
4 Experimentation
4.1 Dataset
4.2 Evaluation Metric
4.3 Baselines
4.4 Implementation
5 Result and Analysis
5.1 Testing with Adversarial Essays
6 Conclusion
References
TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy
1 Introduction
2 Related Work
2.1 Multi-class Classification with Hierarchical Taxonomy
2.2 Sentence Representation Methods
3 Methodology
3.1 Contextualized Input Representations
3.2 Hierarchical Label Representations
3.3 Loss Function
4 Experiments
4.1 Datasets
4.2 Analysis of Representation Methods for Encoding the Hierarchical Labels
4.3 Methods and Experimental Setup
5 Results and Discussion
6 Conclusion
References
Remote Sensing, Image and Video Processing
Checking Robustness of Representations Learned by Deep Neural Networks
1 Introduction
2 Method
3 Computational Experiments
3.1 ImageNet Feasibility Study
3.2 Sensitivity Study for Different Deep Models and Saliency Map Generators
3.3 Pascal VOC Feasibility Study
3.4 Inconsistency in the ImageNet Annotations
3.5 Adversarial Attacks
4 Conclusion
References
CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-teaching
1 Introduction
2 Related Work
2.1 Clickbait Headline Detection
2.2 Clickbait Thumbnail Detection
2.3 Vision-Language Model
3 Building Dataset
3.1 Data Acquisition
3.2 Label Collection
4 The Proposed Method: CHECKER
4.1 Generating Noisy Labels
4.2 Learning from Noisy Labels
5 Experimental Validation
5.1 Set-Up
5.2 Performance Comparison
5.3 Understanding Co-teaching
5.4 Limitation and Future Work
6 Conclusion
References
Crowdsourcing Evaluation of Saliency-Based XAI Methods
1 Introduction
2 Related Work
2.1 XAI Methods
2.2 Automated Evaluation Schemes for XAI Methods
2.3 Crowd-Based Evaluation Schemes for XAI Methods
3 Proposed Crowd-Based Evaluation Scheme for XAI Methods
4 Results
4.1 Experimental Settings
4.2 Results
5 Conclusion
References
Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems
1 Introduction
2 Related Work
3 Methodology
3.1 Original Auto-Keras System (V-AK)
3.2 Models Pre-trained Using ImageNet Dataset (IMG-AK)
3.3 Models Pre-trained Using Remote Sensing Datasets (RS-AK)
4 Experiments
4.1 Datasets
4.2 Experimental Setup
5 Results
5.1 AutoML vs Non-automated Models
5.2 AutoML Variants and the Different Type of Datasets
5.3 The Remote Sensing Block RS-AK
6 Conclusions and Future Work
References
Multi-task Learning for User Engagement and Adoption in Live Video Streaming Events
1 Introduction
2 Live Video Streaming Events
3 Proposed Model
3.1 MERLIN's Architecture
3.2 Policy Component
3.3 Task Importance Component
3.4 Multi-task Learner Component
4 Experiments
4.1 Setup
4.2 Performance Evaluation
4.3 Multi-task Vs Single-Task Learning in Parameter Configuration
5 Conclusions
References
Social Media
Explainable Abusive Language Classification Leveraging User and Network Data
1 Introduction
2 Related Work
3 Data
4 Methodology
4.1 Multimodal Classification Model
4.2 Explainable AI Technique
5 Results
5.1 Classification Performance
5.2 Explainability
6 Discussion
7 Conclusion and Outlook
References
Calling to CNN-LSTM for Rumor Detection: A Deep Multi-channel Model for Message Veracity Classification in Microblogs
1 Introduction
2 Related Works
2.1 Monomodal-Based Rumor Detection
2.2 Multimodal Rumor Detection
3 deepMONITOR Model
3.1 Problem Definition and Model Overview
3.2 LSTM Networks
3.3 Multimodal Feature Learning
3.4 Model Learning
4 Experimental Validation
4.1 Datasets
4.2 Experimental Settings
4.3 Baselines
4.4 Performance Analysis
5 Conclusion
References
Correction to: Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems
Correction to: Chapter "Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems" in: Y. Dong et al. (Eds.): Machine Learning and Knowledge Discovery in Databases, LNAI 12979, https://doi.org/10.1007/978-3-030-86517-7_28
Author Index
