
Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track
Description
The multi-volume set LNAI 12975 to 12979 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021, which was held during September 13-17, 2021. The conference was originally planned to take place in Bilbao, Spain, but was changed to an online event due to the COVID-19 pandemic.
The 210 full papers presented in these proceedings were carefully reviewed and selected from a total of 869 submissions.
The volumes are organized in topical sections as follows:
Research Track:
Part I: Online learning; reinforcement learning; time series, streams, and sequence models; transfer and multi-task learning; semi-supervised and few-shot learning; learning algorithms and applications.
Part II: Generative models; algorithms and learning theory; graphs and networks; interpretation, explainability, transparency, safety.
Part III: Generative models; search and optimization; supervised learning; text mining and natural language processing; image processing, computer vision and visual analytics.
Applied Data Science Track:
Part IV: Anomaly detection and malware; spatio-temporal data; e-commerce and finance; healthcare and medical applications (including COVID-19); mobility and transportation.
Part V: Automating machine learning, optimization, and feature engineering; machine learning based simulations and knowledge discovery; recommender systems and behavior modeling; natural language processing; remote sensing, image and video processing; social media.

Content
- Intro
- Preface
- Organization
- Contents - Part V
- Automating Machine Learning, Optimization, and Feature Engineering
- PuzzleShuffle: Undesirable Feature Learning for Semantic Shift Detection
- 1 Introduction
- 2 Related Work
- 2.1 Out-of-Distribution Detection
- 2.2 Data Augmentation
- 2.3 Uncertainty Calibration
- 3 Preliminaries
- 3.1 The Effects by Perturbation
- 3.2 Adversarial Undesirable Feature Learning
- 4 Proposed Method
- 4.1 PuzzleShuffle Augmentation
- 4.2 Adaptive Label Smoothing
- 4.3 Motivation
- 5 Experiments
- 5.1 Experimental Settings
- 5.2 Compared Methods
- 5.3 Results
- 5.4 Analysis
- 6 Conclusion
- References
- Enabling Machine Learning on the Edge Using SRAM Conserving Efficient Neural Networks Execution Approach
- 1 Introduction
- 2 Background and Related Work
- 2.1 Deep Model Compression
- 2.2 Executing Neural Networks on Microcontrollers
- 3 Efficient Neural Network Execution Approach Design
- 3.1 Tensor Memory Mapping (TMM) Method Design
- 3.2 Loading Fewer Tensors and Tensors Re-usage
- 3.3 Finding the Cheapest NN Graph Execution Sequence
- 3.4 Core Algorithm
- 4 Experimental Evaluation
- 4.1 SRAM Usage
- 4.2 Model Performance
- 4.3 Inference Time and Energy Consumption
- 5 Conclusion
- References
- AutoML Meets Time Series Regression Design and Analysis of the AutoSeries Challenge
- 1 Introduction
- 2 Challenge Setting
- 2.1 Phases
- 2.2 Protocol
- 2.3 Datasets
- 2.4 Metrics
- 2.5 Platform, Hardware and Limitations
- 2.6 Baseline
- 2.7 Results
- 3 Post Challenge Experiments
- 3.1 Reproducibility
- 3.2 Overfitting and Generalisation
- 3.3 Comparison to Open Source AutoML Solutions
- 3.4 Impact of Time Budget
- 3.5 Dataset Difficulty
- 4 Conclusion and Future Work
- References
- Methods for Automatic Machine-Learning Workflow Analysis
- 1 Introduction
- 2 Problem Definition
- 3 Related Work
- 4 Residual Graph-Level Graph Convolutional Networks
- 5 Datasets
- 6 Workflow Similarity
- 7 Structural Performance Prediction
- 8 Component Refinement and Suggestion
- 9 Conclusion
- References
- ConCAD: Contrastive Learning-Based Cross Attention for Sleep Apnea Detection
- 1 Introduction
- 2 Related Work
- 2.1 Sleep Apnea Detection
- 2.2 Attention-Based Feature Fusion
- 2.3 Contrastive Learning
- 3 Methodology
- 3.1 Expert Feature Extraction and Data Augmentation
- 3.2 Feature Extractor
- 3.3 Cross Attention
- 3.4 Contrastive Learning
- 4 Experiments and Results
- 4.1 Datasets
- 4.2 Compared Methods
- 4.3 Experiment Setup
- 4.4 Results and Discussions
- 5 Conclusions and Future Work
- References
- Machine Learning Based Simulations and Knowledge Discovery
- DeepPE: Emulating Parameterization in Numerical Weather Forecast Model Through Bidirectional Network
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Problem Definition
- 3.2 Deep Parameterization Emulator
- 3.3 Transfer Scheme
- 3.4 Training
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Setup
- 5 Results
- 5.1 DeepPE Performance Analysis
- 5.2 Transfer Analysis
- 6 Conclusion
- References
- Effects of Boundary Conditions in Fully Convolutional Networks for Learning Spatio-Temporal Dynamics
- 1 Introduction
- 2 Method
- 2.1 Learning an Auto-Regressive Model
- 2.2 Neural Network Convolutional Architecture
- 2.3 Boundary Condition Treatment
- 2.4 Loss Function
- 3 Applications: Time-Evolving PDEs
- 3.1 Acoustic Propagation of Gaussian Pulses
- 3.2 Diffusion of Temperature Spots
- 3.3 Datasets Generation and Parameters
- 4 Results
- 5 Conclusion
- References
- Physics Knowledge Discovery via Neural Differential Equation Embedding
- 1 Introduction
- 2 Phase-Field Model
- 3 Problem Statement
- 4 Neural Differential Equation Embedding
- 5 Related Work
- 6 Experiments
- 7 Conclusion
- References
- A Bayesian Convolutional Neural Network for Robust Galaxy Ellipticity Regression
- 1 Introduction
- 2 Estimating Galaxy Ellipticity from Images
- 3 A Method to Assess Uncertainty in Ellipticity Estimation
- 3.1 Estimation of Noise Related Uncertainty
- 3.2 Estimation of Blend Related Uncertainty
- 3.3 Training Protocol
- 4 Experiments
- 4.1 Estimation of Uncertainty Related to Noise
- 4.2 Estimation of Uncertainty Related to Blending
- 5 Conclusion
- References
- Precise Weather Parameter Predictions for Target Regions via Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Pertinent Background
- 4 Learning-Based Modelets for Weather Forecasting
- 4.1 Micro Model
- 4.2 Micro-Macro Model
- 5 Experiment
- 5.1 Setting
- 5.2 Overall Performance
- 5.3 Comparing to Other Methods
- 5.4 Ablation Study
- 5.5 Abnormal Weather Forecasting
- 6 Conclusion
- References
- Action Set Based Policy Optimization for Safe Power Grid Management
- 1 Introduction
- 2 Related Work
- 3 Preliminary
- 3.1 Power Grid Management
- 3.2 Search-Based Planning
- 4 Methodology
- 4.1 Search with the Action Set
- 4.2 Policy Optimization
- 4.3 Discussion on Action Set Size
- 4.4 Algorithm Summary
- 5 Experiments
- 5.1 Experiment Setup
- 5.2 Implementation
- 5.3 Competition
- 6 Conclusion
- A Grid2Op Environment
- References
- Conditional Neural Relational Inference for Interacting Systems
- 1 Introduction
- 2 Related Work
- 3 The Conditional Neural Inference Model
- 3.1 Encoding, Establishing the Body-Part Interactions
- 3.2 Decoding, Establishing the Dynamics
- 3.3 Conditional Generation
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Results
- 5 Conclusion
- References
- Recommender Systems and Behavior Modeling
- MMNet: Multi-granularity Multi-mode Network for Item-Level Share Rate Prediction
- 1 Introduction
- 2 Related Works
- 3 Preliminary
- 4 Methodology
- 4.1 Overall Framework
- 4.2 Fine-Granularity Module
- 4.3 Coarse-Granularity Module
- 4.4 Meta-info Modeling Module
- 4.5 Optimization Objectives
- 5 Online Deployment
- 6 Experiments
- 6.1 Datasets
- 6.2 Baselines and Experimental Settings
- 6.3 Offline Item-Level Share Rate Prediction
- 6.4 Online A/B Tests
- 6.5 Ablation Studies
- 6.6 Parameter Analyses
- 7 Conclusion and Future Work
- References
- The Joy of Dressing Is an Art: Outfit Generation Using Self-attention Bi-LSTM
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Bayesian Personalized Ranking (MF) Embedding
- 3.2 Training Dataset Generation
- 3.3 Bi-LSTM
- 3.4 Self-attention Bi-LSTM
- 3.5 Generation of New Outfits
- 4 Results
- 5 Conclusion
- References
- On Inferring a Meaningful Similarity Metric for Customer Behaviour
- 1 Introduction
- 2 Problem Definition
- 3 SIMPRIM Framework
- 3.1 Journey Log to Journey Profiles
- 3.2 Measuring Similarity
- 3.3 Dimensionality Reduction
- 3.4 Co-learning of Metric Weights and Journey Clustering
- 3.5 Evaluation
- 4 Experimental Evaluation
- 4.1 Customer Service Process at Anonycomm
- 4.2 BPIC 2012 Real Dataset
- 5 Related Work
- 6 Conclusion
- References
- Quantifying Explanations of Neural Networks in E-Commerce Based on LRP
- 1 Introduction
- 2 Preliminaries
- 3 Formal Model of an Online Shop
- 4 Explanation Approach
- 4.1 Explanation via Layer-Wise Relevance Propagation
- 4.2 Input Analysis with Leave-One-Out Method
- 4.3 Explanation Quantity Measures
- 5 Evaluation
- 5.1 Evaluation Setting
- 5.2 Evaluation Data Set
- 5.3 Evaluation Results
- 6 Conclusion
- References
- Natural Language Processing
- Balancing Speed and Accuracy in Neural-Enhanced Phonetic Name Matching
- 1 Introduction
- 1.1 Challenges
- 2 Related Work
- 3 Phonetic Name Matching Systems
- 3.1 Neural Name Transliteration
- 3.2 Neural Name Matching
- 4 Experimental Results
- 4.1 Training and Hyperparameters
- 4.2 Results
- 5 Conclusion and Future Work
- References
- Robust Learning for Text Classification with Multi-source Noise Simulation and Hard Example Mining
- 1 Introduction
- 2 Related Work
- 2.1 Noise Reduction
- 2.2 Adversarial Training
- 2.3 Training with Noisy Data
- 3 Problem
- 3.1 Notation
- 3.2 Text Classification
- 3.3 A Practical Scenario
- 3.4 OCR Noise Simulation
- 3.5 Robust Training
- 4 Approach
- 4.1 OCR Noise Simulation
- 4.2 Noise Invariance Representation
- 4.3 Hard Example Mining
- 4.4 The Overall Framework
- 5 Experiment
- 5.1 Dataset
- 5.2 Implementation
- 5.3 Results
- 6 Analysis
- 6.1 Naive Training with a Single Noise Simulation Method
- 6.2 The Impact of Different Noise Level
- 6.3 The Impact of Hard Example Mining
- 6.4 The Impact of Stability Loss
- 7 Conclusion
- References
- Topic-to-Essay Generation with Comprehensive Knowledge Enhancement
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Task Formulation
- 3.2 Model Description
- 3.3 Training and Inference
- 4 Experiments
- 4.1 Datasets
- 4.2 Settings
- 4.3 Baselines
- 4.4 Evaluation Metrics
- 4.5 Experimental Results
- 4.6 Validity of Knowledge Transfer
- 4.7 Case Study
- 5 Conclusion
- References
- Analyzing Research Trends in Inorganic Materials Literature Using NLP
- 1 Introduction
- 2 Related Work
- 3 Corpus Preparation
- 3.1 Definition of Types
- 3.2 Collecting Literature
- 3.3 Annotation
- 4 Approach
- 4.1 Sequence Labeling Architecture
- 4.2 Numeric Normalization
- 5 Results
- 5.1 Inter-Annotator Agreement
- 5.2 Comparing Language Models
- 5.3 Tuning Hyperparameters
- 5.4 Evaluation of Extracted NE Result
- 6 Research Trends Analysis
- 7 Conclusion
- References
- An Optimized NL2SQL System for Enterprise Data Mart
- 1 Introduction
- 2 Related Work
- 3 NL2SQL System
- 3.1 Question Textbox
- 3.2 Schema
- 3.3 Auto Completion
- 3.4 Log
- 4 Method
- 4.1 Problem Statement
- 4.2 Model Overview
- 4.3 Table Part
- 4.4 Table Expand
- 4.5 Where Value Matching
- 5 Template-Based Data Simulation
- 5.1 SQL Query
- 5.2 Natural Language Question
- 5.3 Iterative Template Writing
- 6 Experiment
- 6.1 Data
- 6.2 Experiment Settings
- 6.3 Experiment Results
- 7 Conclusion
- References
- Time Aspect in Making an Actionable Prediction of a Conversation Breakdown
- 1 Introduction
- 2 Related Works
- 3 Time Aspect in Prediction of Conversation Breakdown
- 3.1 Proposed Neural Network Architecture
- 3.2 Loss Functions Incorporating Time Aspect
- 3.3 Metrics Considering the Time-to-breakdown of Prediction
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Setup
- 4.3 Results of Experiments
- 5 Conclusions
- References
- Feature Enhanced Capsule Networks for Robust Automatic Essay Scoring
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 CapsRater
- 3.2 FeatureCapture
- 4 Experimentation
- 4.1 Dataset
- 4.2 Evaluation Metric
- 4.3 Baselines
- 4.4 Implementation
- 5 Result and Analysis
- 5.1 Testing with Adversarial Essays
- 6 Conclusion
- References
- TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy
- 1 Introduction
- 2 Related Work
- 2.1 Multi-class Classification with Hierarchical Taxonomy
- 2.2 Sentence Representation Methods
- 3 Methodology
- 3.1 Contextualized Input Representations
- 3.2 Hierarchical Label Representations
- 3.3 Loss Function
- 4 Experiments
- 4.1 Datasets
- 4.2 Analysis of Representation Methods for Encoding the Hierarchical Labels
- 4.3 Methods and Experimental Setup
- 5 Results and Discussion
- 6 Conclusion
- References
- Remote Sensing, Image and Video Processing
- Checking Robustness of Representations Learned by Deep Neural Networks
- 1 Introduction
- 2 Method
- 3 Computational Experiments
- 3.1 ImageNet Feasibility Study
- 3.2 Sensitivity Study for Different Deep Models and Saliency Map Generators
- 3.3 Pascal VOC Feasibility Study
- 3.4 Inconsistency in the ImageNet Annotations
- 3.5 Adversarial Attacks
- 4 Conclusion
- References
- CHECKER: Detecting Clickbait Thumbnails with Weak Supervision and Co-teaching
- 1 Introduction
- 2 Related Work
- 2.1 Clickbait Headline Detection
- 2.2 Clickbait Thumbnail Detection
- 2.3 Vision-Language Model
- 3 Building Dataset
- 3.1 Data Acquisition
- 3.2 Label Collection
- 4 The Proposed Method: CHECKER
- 4.1 Generating Noisy Labels
- 4.2 Learning from Noisy Labels
- 5 Experimental Validation
- 5.1 Set-Up
- 5.2 Performance Comparison
- 5.3 Understanding Co-teaching
- 5.4 Limitation and Future Work
- 6 Conclusion
- References
- Crowdsourcing Evaluation of Saliency-Based XAI Methods
- 1 Introduction
- 2 Related Work
- 2.1 XAI Methods
- 2.2 Automated Evaluation Schemes for XAI Methods
- 2.3 Crowd-Based Evaluation Schemes for XAI Methods
- 3 Proposed Crowd-Based Evaluation Scheme for XAI Methods
- 4 Results
- 4.1 Experimental Settings
- 4.2 Results
- 5 Conclusion
- References
- Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Original Auto-Keras System (V-AK)
- 3.2 Models Pre-trained Using ImageNet Dataset (IMG-AK)
- 3.3 Models Pre-trained Using Remote Sensing Datasets (RS-AK)
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Setup
- 5 Results
- 5.1 AutoML vs Non-automated Models
- 5.2 AutoML Variants and the Different Type of Datasets
- 5.3 The Remote Sensing Block RS-AK
- 6 Conclusions and Future Work
- References
- Multi-task Learning for User Engagement and Adoption in Live Video Streaming Events
- 1 Introduction
- 2 Live Video Streaming Events
- 3 Proposed Model
- 3.1 MERLIN's Architecture
- 3.2 Policy Component
- 3.3 Task Importance Component
- 3.4 Multi-task Learner Component
- 4 Experiments
- 4.1 Setup
- 4.2 Performance Evaluation
- 4.3 Multi-task Vs Single-Task Learning in Parameter Configuration
- 5 Conclusions
- References
- Social Media
- Explainable Abusive Language Classification Leveraging User and Network Data
- 1 Introduction
- 2 Related Work
- 3 Data
- 4 Methodology
- 4.1 Multimodal Classification Model
- 4.2 Explainable AI Technique
- 5 Results
- 5.1 Classification Performance
- 5.2 Explainability
- 6 Discussion
- 7 Conclusion and Outlook
- References
- Calling to CNN-LSTM for Rumor Detection: A Deep Multi-channel Model for Message Veracity Classification in Microblogs
- 1 Introduction
- 2 Related Works
- 2.1 Monomodal-Based Rumor Detection
- 2.2 Multimodal Rumor Detection
- 3 deepMONITOR Model
- 3.1 Problem Definition and Model Overview
- 3.2 LSTM Networks
- 3.3 Multimodal Feature Learning
- 3.4 Model Learning
- 4 Experimental Validation
- 4.1 Datasets
- 4.2 Experimental Settings
- 4.3 Baselines
- 4.4 Performance Analysis
- 5 Conclusion
- References
- Correction to: Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems
- Correction to: Chapter "Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems" in: Y. Dong et al. (Eds.): Machine Learning and Knowledge Discovery in Databases, LNAI 12979, https://doi.org/10.1007/978-3-030-86517-7_28
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are inconvenient to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.