Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track

Name: Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track | European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9-13, 2024, Proceedings, Part X
Brand: Springer
Price: 85.59 EUR
Availability: OnlineOnly

European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9-13, 2024, Proceedings, Part X

Albert Bifet, Tomas Krilavicius, Ioanna Miliou, Slawomir Nowaczyk (Editors)

Springer (Publisher)

Published on 1 September 2024

LVII, 465 pages

E-Book

PDF with digital watermarking

978-3-031-70381-2 (ISBN)

€85.59 incl. 7% VAT

E-Book Single Licence

Available for download

Description

This multi-volume set, LNAI 14941 to LNAI 14950, constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2024, held in Vilnius, Lithuania, in September 2024.

The papers presented in these proceedings are from the following three conference tracks:

Research Track: The 202 full papers in this track were carefully reviewed and selected from 826 submissions. They appear in Parts I to VIII.

Demo Track: The 14 papers in this track were selected from 30 submissions. They appear in Part VIII.

Applied Data Science Track: The 56 full papers in this track were carefully reviewed and selected from 224 submissions. They appear in Parts IX and X.

Content

Intro
Preface
Organization
Invited Talks Abstracts
The Dynamics of Memorization and Unlearning
The Emerging Science of Benchmarks
Enhancing User Experience with AI-Powered Search and Recommendations at Spotify
How to Utilize (and Generate) Player Tracking Data in Sport
Resource-Aware Machine Learning - A User-Oriented Approach
Contents - Part X
Applied Data Science Track
MT-HCCAR: Multi-task Deep Learning with Hierarchical Classification and Attention-Based Regression for Cloud Property Retrieval
1 Introduction
2 Related Work
3 Problem Statement and Data Simulation
3.1 Radiative Transfer Simulation
3.2 Cloud Property Retrieval
4 MT-HCCAR Model
4.1 Encoder-Decoder Sub-Network
4.2 Hierarchical Classification (HC) Sub-Network
4.3 Classification Assisted Regression Sub-Network Based on Cross Attention Mechanism (CAR)
4.4 Model Training of MT-HCCAR
5 Experiments
5.1 Experiment Setup
5.2 Evaluation Metrics
5.3 Comparison with Baseline Models
5.4 Ablation Study
5.5 Earth Science Evaluation
6 Conclusions
References
Machine Learning Based Tool for Automated Sperm Cell Tracking and Sperm Bundle Detection
1 Introduction
2 Background
2.1 Computer-Assisted Sperm Analysis Systems
3 Methodology
3.1 Sperm Cell Detection
3.2 Path Reconstruction
3.3 The Kalman Filter Implementation
3.4 Bounding Box Classification
3.5 Final Analysis
4 Results
5 Conclusion
References
DISCO: An End-to-End Bandit Framework for Personalised Discount Allocation
1 Introduction
2 Problem Formulation
3 Disco Architecture
3.1 Action Feature Representation
3.2 Context Feature Representation
3.3 Reward Prediction: Bayesian Log-Linear Regression
3.4 Optimisation of Discount Code Allocation
4 Experiments
4.1 Information Sharing and Price Elasticity with RBF Encoding
4.2 Reward Prediction Model
4.3 Active Learning with Global Constraints
5 Online A/B Test
6 Concluding Discussion
References
Advancing Solar Flare Prediction Using Deep Learning with Active Region Patches
1 Introduction
2 Related Work
3 Data and Model
4 Experimental Evaluation
4.1 Experimental Settings
4.2 Evaluation
4.3 Discussion
5 Conclusion and Future Work
References
Exceptional Subitizing Patterns: Exploring Mathematical Abilities of Finnish Primary School Children with Piecewise Linear Regression
1 Introduction
2 The FUnctional Numerical Assessment Study
3 Background
3.1 Segmented Linear Regression
3.2 Connections to Existing SD/EMM Approaches
4 Our Proposed Flattening Approach
4.1 Domain-Specific Aggregations Functions
5 Our Proposed Target Model
6 Experiments
6.1 Results Experiment 1
6.2 Results Experiment 2
7 Discussion and Conclusion
References
Intent Enhanced Self-supervised Hypergraph Learning for Session-Based Recommendation
1 Introduction
2 Related Work
2.1 Traditional Methods
2.2 Deep Learning-Based Methods
2.3 Self-supervised Learning
3 Preliminaries
3.1 Problem Statement
3.2 Hypergraph
4 Methodology
4.1 Hypergraph Construction
4.2 Hypergraph Convolutional Neural Network
4.3 Session Representation Learning
4.4 Recommendation Generation
4.5 Enhancing SBR with Self-supervised Learning Task
4.6 Model Optimization
5 Experiments
5.1 Experimental Setup
5.2 Overall Performance (Q1)
5.3 Ablation Study (Q2)
5.4 Hyperparameters Analysis (Q3)
6 Conclusion
References
Missing Data Imputation: Do Advanced ML/DL Techniques Outperform Traditional Approaches?
1 Introduction
2 Related Work
3 Background
4 Imputation Methods
4.1 Statistical Methods
4.2 Machine Learning Methods
4.3 Deep Learning Methods
5 Experiments
5.1 Datasets and Experimental Setting
5.2 Experimental Results
6 Discussion
7 Conclusions and Future Directions
References
Evaluating Vision Transformer Models for Visual Quality Control in Industrial Manufacturing
1 Introduction
2 Related Work
2.1 Vision Backbone
2.2 Anomaly Detection and Localization
3 Experimental Setup
3.1 Backbone Architectures
3.2 Anomaly Detection Architectures
3.3 Datasets
3.4 Implementation Details
3.5 Metrics
3.6 Experiments
4 Results and Discussion
4.1 Comparison with the VT-ADL and FastFlow Models
4.2 Comparison of GMMs and NF Models
4.3 Performance of the Backbones
4.4 Considerations and Limitations for Practical Application
5 Conclusion
References
GraphRPM: Risk Pattern Mining on Industrial Large Attributed Graphs
1 Introduction
2 Problem Formulation
3 Methods
3.1 Potential Subgraph Enumeration
3.2 Two-Staged Pattern Mining
3.3 Pattern Risk Assessment
4 Experiments
5 Deployment
6 Conclusion
References
Solving a Real-World Optimization Problem Using Proximal Policy Optimization with Curriculum Learning and Reward Engineering
1 Introduction
2 Related Work
3 Real Environment and Problem Description
4 Reinforcement Learning Problem Formulation
4.1 State Space
4.2 Action Space
4.3 Environment Dynamics
5 Reward Tuning
5.1 Simple Gaussian Reward
5.2 Custom Reward
5.3 Precision Reward
6 Methodology
7 Experimental Evaluation
7.1 Experimental Setup
7.2 Results
7.3 Discussion
8 Conclusion
References
Spatial-Temporal PDE Networks for Traffic Flow Forecasting
1 Introduction
2 Related Work
3 Preliminaries
3.1 Problem Formulation
3.2 Network Architectures
4 Methods
4.1 Vanilla PDE for Traffic Flow
4.2 Discrete-Time PDE Solutions
4.3 Graph PDE Layer
4.4 Integrating PDE Layer with GNNs
5 Experiments
5.1 Datasets and Baselines
5.2 Performance Evaluation
5.3 Model Analysis
5.4 Case Study
6 Conclusion
References
Symbolic Prompt Tuning Completes the App Promotion Graph
1 Introduction
2 Background and Related Work
2.1 Definitions
2.2 Collected App Promotion Dataset
2.3 APHG Completion Task
2.4 Related Work
3 SymPrompt
3.1 Embedding-Based Symbolic Prompts
3.2 Metapath-Based Symbolic Prompts
3.3 Combined Input Tokens
4 Experiment
4.1 Setup
4.2 Performance on App Promotion HGC
4.3 Component Analysis
4.4 Random Permutation on Model Learning
5 Conclusion
References
Boosting Protein Language Models with Negative Sample Mining
1 Introduction
2 Related Work
3 Method
3.1 Negative Sampling
3.2 Negative Mining in Cross Attention Space
3.3 Inference Phase of NM-Transformer
4 Experiments
4.1 Experimental Settings
4.2 Main Results
4.3 Interpretability of NM-Transformer, Case Study
5 Conclusion
References
MedSyn: LLM-Based Synthetic Medical Text Generation Framework
1 Introduction
2 Related Work
2.1 Medical Knowledge Graphs
2.2 LLMs in Medical Domain
3 Method
3.1 Medical Knowledge Graph
3.2 Instruction-Following Dataset
3.3 Fine-Tuning
3.4 Generation Task
3.5 Symptoms Sampling
3.6 Synthetic Dataset
4 Experiments
4.1 Datasets and Tasks
4.2 Models
4.3 Evaluation
4.4 Results
4.5 Human Assessment
5 Discussion
6 Conclusion
References
A Crystal Knowledge-Enhanced Pre-training Framework for Crystal Property Estimation
1 Introduction
2 Related Work
3 Preliminaries
4 Methodology
4.1 Framework Overview
4.2 Reconstruction Under Mutually Exclusive Masked Views
4.3 Multi-graph Attention Module
4.4 Crystal Knowledge Enhanced Module
4.5 Optimization Objectives
5 Experiments
5.1 Dataset Description
5.2 CROP Configurations
5.3 Experimental Results
5.4 Ablation Studies
5.5 Parameter Sensitivity Analysis
6 Conclusion
References
Multiplex Community Detection for Resilient Electrical Segmentation Enabling Management of an Increasingly Complex Power Grid
1 Introduction
2 Related Work
3 Modeling of Resilient Electrical Segmentation
3.1 Formulation of Multiplex Graph Flattening
3.2 Optimization Methods
4 Experiments
4.1 Resilient Segmentation Pipeline
4.2 Simulation
4.3 Electrical Application: Security Analysis
5 Conclusion
References
Bandits for Sponsored Search Auctions Under Unknown Valuation Model: Case Study in E-Commerce Advertising
1 Introduction
2 Related Work
3 Problem Formulation
3.1 Learning in Sponsored Search Auctions
4 BatchEXP3: Algorithm for Learning in SSA
5 Deployment
5.1 Bidding System Architecture
5.2 Live Test Design and Unfolding
6 Experimental Results and Discussion
6.1 Group Level Analysis
6.2 Risk of Decreasing Costs
7 From Practice Back to Theory: Additional Insights
8 Conclusion
References
Unbiased Recommendation Through Invariant Representation Learning
1 Introduction
2 Related Work
2.1 Unbiased Recommendation
2.2 Causal Inference and Invariant Risk Minimization
3 Problem Formulation
4 Causal Analysis of Recommendation Bias
5 Causal Invariant Recommendation Model
5.1 Recommendation Model
5.2 Invariant Representation Learning
5.3 Data Partition Learning
6 Experiments
6.1 Experimental Settings
6.2 Main Results
6.3 Ablation Study
7 Conclusion
References
Enhancing Multi-objective Optimisation Through Machine Learning-Supported Multiphysics Simulation
1 Introduction
2 Related Work
2.1 Surrogate Modelling
2.2 Multiphysics Optimisation and Data Extension
3 Machine Learning Supported Optimisation Strategy
3.1 Data Acquisition
3.2 Surrogate Models
3.3 Interpretable Surrogate Modelling (xAI Module)
3.4 Multiobjective Optimisation and Validation
4 Experimental Design
4.1 Use Case 1: Motor Dataset
4.2 Use Case 2: U-Bend Dataset
5 Results
5.1 Prediction Performance of Surrogate Models
5.2 Identifying Critical Features and Relevant Dependencies
5.3 Evaluation of Solution Candidates in the Multiobjective Optimisation Task
5.4 Validation of Solution Candidates in the Multiobjective Optimisation
6 Conclusion
References
DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem
1 Introduction
2 Related Work
3 Dataset
4 Source Details
5 Preliminaries
6 Methodology
6.1 Stage 1: Construction and Matching of Dictionary
6.2 Stage 2: Entity Distillation and Dictionary Expansion
6.3 Stage 3: The NER Model
7 Heuristics
8 Experimental Setup
9 Results
9.1 Progressive Learning
9.2 Motivation of Different Split of HOn
10 Issues in Finding Distant Labels Using LLMs
11 Additional Experiments for Task-Based Evaluation
11.1 Relation Extraction
12 Error Analysis
13 Conclusion
References
DiffSynth: Latent In-Iteration Deflickering for Realistic Video Synthesis
1 Introduction
2 Related Work
2.1 Diffusion Models
2.2 Diffusion-Based Video Synthesis
3 Methodology
3.1 Preliminaries
3.2 Latent In-Iteration Deflickering
3.3 Patch Blending Algorithm
4 Experiments
4.1 Experimental Settings
4.2 Quantitive Comparison
4.3 Ablation Study
5 More Pipelines for Video Synthesis Applications
5.1 Image-Guided Video Stylization
5.2 Video Restoring
5.3 3D Rendering
6 Industrial Application
7 Conclusion and Future Work
References
Offline Imitation of Badminton Player Behavior via Experiential Contexts and Brownian Motion
1 Introduction
2 Related Work
2.1 Inverse Reinforcement Learning
2.2 Offline Imitation Learning via Behavior Cloning
3 Preliminaries
3.1 Badminton: A Typical Example of a Turn-Based Sport
3.2 The Contextual Markov Decision Process
3.3 Problem Formulation
4 Methodology
4.1 Experiential Context Selector (ECS)
4.2 Latent Geometric Brownian Motion (LGBM)
4.3 Action Projection Layer
4.4 Loss Function
5 Experiments
5.1 Experimental Setup
5.2 Quantitative Results (RQ1)
5.3 Length Distribution Difference (RQ2)
5.4 Win Rate Difference (RQ3)
5.5 Case Studies (RQ4)
6 Conclusion
References
Fast and Adaptive Questionnaires for Voting Advice Applications
1 Introduction
2 Methods
2.1 Data
2.2 Spatial Models
2.3 Selection Methods
2.4 Evaluation Metrics
3 Results
3.1 Selecting the Spatial Model
3.2 Optimizing the Questionnaire
4 Conclusions
References
Job Title Prediction as a Dual Task of Expertise Prediction in Open Source Software
1 Introduction
2 Related Work
3 The TOSE Dataset
3.1 Raw Data Collection
3.2 API Expertise Sequence Construction
3.3 Job Title Sequence Construction
3.4 Sequence Alignment
4 The DualJE Model
4.1 The Primal Task: Expertise to Job Titles
4.2 The Dual Task: Job Titles to Expertise
4.3 Model Training
5 Performance Evaluation
5.1 Data Configuration
5.2 Baseline Models
5.3 Hyperparameters and Evaluation Metrics
5.4 RQ1: DualJE vs. Baseline Models
5.5 RQ2: DualJE vs. Ablation Models
5.6 RQ3: Hyperparameter Tuning
6 Discussion and Conclusion
References
LLMs in the Loop: Leveraging Large Language Model Annotations for Active Learning in Low-Resource Languages
1 Introduction
2 LLMs in the Loop
3 Experiments
3.1 Foundation Model Selection
3.2 Effect of Prompt Design and Querying LLMs in Batches
3.3 Data Contamination
3.4 Active Learning
4 Conclusion
References
Multi-spectral Gradient Residual Network for Haze Removal in Multi-sensor Remote Sensing Imagery
1 Introduction
2 Related Work
2.1 Haze Removal with Priors and Assumptions
2.2 Haze Removal with Deep Learning
3 Methodology
3.1 Problem Formulation - Input and Output
3.2 Model Architecture
3.3 Loss Function
4 Experiments
4.1 Dataset
4.2 Experimental Setup
5 Results
5.1 Quantitative Results
6 Conclusions
References
ExTea: An Evolutionary Algorithm-Based Approach for Enhancing Explainability in Time-Series Models
1 Introduction
2 Related Work
3 Method
3.1 Problem Definition and Individual Coding
3.2 Population Generation
3.3 Fitness Function Design
3.4 Growth
3.5 Crossover and Mutation
3.6 Explanation
4 Experiment
4.1 Experiment Setup
4.2 Evaluation
5 Conclusion
References
BiCAE - A Bimodal Convolutional Autoencoder for Seed Purity Testing
1 Introduction
2 Related Work
2.1 Computer Vision for Seed Analysis
2.2 Multimodal Autoencoders
3 Methodology
3.1 Unimodal Baselines
3.2 Bimodal Convolutional Autoencoder (BiCAE)
4 Experiments
4.1 Data
4.2 Training Setting
4.3 Results
5 Discussion
5.1 Model Comparison
5.2 Societal Impact
6 Conclusion and Outlook
References
Author Index
