
Machine Learning and Knowledge Discovery in Databases. Research Track
Description
This multi-volume set, LNAI 14941 to LNAI 14950, constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2024, held in Vilnius, Lithuania, in September 2024.
The papers presented in these proceedings come from the following three conference tracks:
Research Track: The 202 full papers in this track were carefully reviewed and selected from 826 submissions. They appear in Parts I, II, III, IV, V, VI, VII, and VIII.
Demo Track: The 14 papers in this track were selected from 30 submissions. They appear in Part VIII.
Applied Data Science Track: The 56 full papers in this track were carefully reviewed and selected from 224 submissions. They appear in Parts IX and X.

Content
- Intro
- Preface
- Organization
- Invited Talks Abstracts
- The Dynamics of Memorization and Unlearning
- The Emerging Science of Benchmarks
- Enhancing User Experience with AI-Powered Search and Recommendations at Spotify
- How to Utilize (and Generate) Player Tracking Data in Sport
- Resource-Aware Machine Learning-A User-Oriented Approach
- Contents - Part VII
- Research Track
- Data with Density-Based Clusters: A Generator for Systematic Evaluation of Clustering Algorithms
- 1 Introduction
- 2 Related Work
- 3 A Reliable Data Generator for Density-Based Clusters
- 3.1 Main Concept of DENSIRED
- 3.2 Generation of Skeletons
- 3.3 Instantiating Data Points
- 3.4 Delimitations
- 3.5 Analysis Intrinsic Dimensionality
- 4 Experiments
- 4.1 Discussion of the Data Generator
- 4.2 Benchmarking
- 5 Conclusion
- References
- Model-Based Reinforcement Learning with Multi-task Offline Pretraining
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Method
- 4.1 Why Model-Based RL for Domain Transfer?
- 4.2 Multi-task Offline Pretraining
- 4.3 Domain-Selective Dynamics Transfer
- 4.4 Domain-Selective Behavior Transfer
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Main Results
- 5.3 Ablation Studies
- 5.4 Analyses of Task Relations
- 5.5 Results on CARLA Environment
- 5.6 Results with Medium Offline Data
- 6 Conclusion
- References
- Advancing Graph Counterfactual Fairness Through Fair Representation Learning
- 1 Introduction
- 2 Related Work
- 2.1 Graph Neural Networks
- 2.2 Fairness in Graph
- 3 Notations
- 4 Methodology
- 4.1 Causal Model
- 4.2 Framework Overview
- 4.3 Fair Ego-Graph Generation Module
- 4.4 Counterfactual Data Augmentation Module
- 4.5 Fair Disentangled Representation Learning Module
- 4.6 Final Optimization Objectives
- 5 Experiment
- 5.1 Datasets
- 5.2 Evaluation Metrics
- 5.3 Baselines
- 5.4 Experiment Results
- 6 Conclusion
- References
- Continuously Deep Recurrent Neural Networks
- 1 Introduction
- 2 Shallow and Deep Echo State Networks
- 3 Continuously Deep Echo State Networks
- 4 Analysis of Deep Dynamics
- 5 Mathematical Analysis
- 6 Experiments
- 6.1 Memory Capacity
- 6.2 Time-Series Reconstruction
- 7 Conclusions
- References
- Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator
- 1 Introduction
- 2 Related Work
- 2.1 Safe Reinforcement Learning
- 2.2 Sim-to-Real Reinforcement Learning
- 3 Problem Formulation
- 4 Method
- 4.1 Theoretical Motivation
- 4.2 Value Estimation Alignment with an Inverse Dynamics Model
- 4.3 Conservative Cost Critic Learning via Uncertainty Estimation
- 5 Experiments
- 5.1 Baselines and Environments
- 5.2 Overall Performance Comparison
- 5.3 Ablation Studies and Data Sensitivity Study
- 5.4 Visualization Analysis
- 5.5 Parameter Sensitivity Studies
- 6 Final Remarks
- References
- CRISPert: A Transformer-Based Model for CRISPR-Cas Off-Target Prediction
- 1 Introduction
- 2 Computational Methods for Off-Target Prediction
- 3 Method
- 3.1 Problem Formalisation
- 3.2 Model Architecture
- 3.3 CRISPR-Cas Binding Concentration Features
- 3.4 Data Imbalance Handling
- 3.5 Model Implementation
- 4 Experimental Setting
- 4.1 Data
- 4.2 Test Scenarios
- 4.3 Hyper-parameter Optimisation
- 4.4 Pre-training
- 5 Results and Analysis
- 6 Conclusion
- References
- Improved Topology Features for Node Classification on Heterophilic Graphs
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Notation
- 3.2 Motivations
- 3.3 Bin of Paths Embedding
- 3.4 Confidence and Class-Wise Training Accuracy Weighting
- 4 Evaluation
- 4.1 Experimental Settings
- 4.2 Node Classification
- 4.3 Improvements on Base GNN Models
- 4.4 Distribution of CCAW Weights
- 4.5 Class-Wise Node Classification Accuracy
- 4.6 Ablations
- 4.7 Hyperparameter Analysis
- 4.8 Efficiency Analysis
- 5 Conclusion
- References
- Fast Redescription Mining Using Locality-Sensitive Hashing
- 1 Introduction
- 2 The Algorithm
- 2.1 The ReReMi Algorithm
- 2.2 Primer on LSH
- 2.3 Finding Initial Pairs
- 2.4 Extending Initial Pairs
- 2.5 Time Complexity
- 3 Experimental Evaluation
- 3.1 Experimental Setup
- 3.2 Finding Initial Pairs
- 3.3 Extending Initial Pairs
- 3.4 Building Full Redescriptions
- 4 Conclusions
- References
- sigma-GPTs: A New Approach to Autoregressive Models
- 1 Introduction
- 2 Methodology
- 2.1 sigma-GPTs: Shuffled Autoregression
- 2.2 Double Positional Encodings
- 2.3 Conditional Probabilities and Infilling
- 2.4 Token-Based Rejection Sampling
- 2.5 Other Orders
- 2.6 Denoising Diffusion Models
- 3 Results
- 3.1 General Performance
- 3.2 Training Efficiency
- 3.3 Curriculum Learning
- 3.4 Open Text Generation: t-SNE of Generated Sequences
- 3.5 Training and Generating in Fractal Order
- 3.6 Memorizing
- 3.7 Infilling and Conditional Density Estimation
- 3.8 Token-Based Rejection Sampling Scheme
- 4 Related Works
- 5 Conclusion
- References
- FairFlow: An Automated Approach to Model-Based Counterfactual Data Augmentation for NLP
- 1 Introduction
- 2 Background and Related Literature
- 3 Approach
- 3.1 Attribute Classifier Training
- 3.2 Generating Word-Pair List
- 3.3 Error Correction
- 3.4 Training the Generative Model
- 4 Experimental Set-Up
- 4.1 Training Set-Up
- 4.2 Evaluation Datasets
- 4.3 Comparative Techniques
- 5 Evaluation and Results
- 5.1 Utility
- 5.2 Extrinsic Bias Mitigation
- 5.3 Task Performance
- 5.4 Qualitative Analysis and Key Observations
- 6 Conclusion
- References
- GrINd: Grid Interpolation Network for Scattered Observations
- 1 Introduction
- 2 Related Work
- 3 Method
- 3.1 Fourier Interpolation Layer
- 3.2 NeuralPDE
- 3.3 GrINd
- 4 Experiments
- 4.1 Data
- 4.2 Baseline Models
- 4.3 Model Configuration
- 4.4 Training
- 5 Results and Discussion
- 5.1 Interpolation Accuracy
- 5.2 DynaBench
- 5.3 Limitations
- 6 Conclusion and Future Work
- References
- MEGA: Multi-encoder GNN Architecture for Stronger Task Collaboration and Generalization
- 1 Introduction
- 2 Related Works
- 3 Methods
- 3.1 Preliminaries
- 3.2 Task Interference Problem in MT-SSL
- 3.3 MEGA Architecture
- 3.4 Pretext Tasks
- 4 Experiments
- 4.1 Experiment Setting
- 4.2 Results
- 5 Conclusion
- References
- MetaQuRe: Meta-learning from Model Quality and Resource Consumption
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Automated Algorithm Selection
- 3.2 Incorporating Resource Awareness
- 3.3 Relative Index Scaling
- 3.4 Compositional Meta-learning
- 3.5 Additional Remarks
- 4 Data on Model Quality and Resource Consumption
- 5 Experimental Results
- 5.1 Insights from MetaQuRe
- 5.2 Learning from MetaQuRe
- 6 Conclusion
- References
- Propagation Structure-Semantic Transfer Learning for Robust Fake News Detection
- 1 Introduction
- 2 Related Work
- 3 Propagation Structure-Semantic Transfer Learning Framework
- 3.1 Overview
- 3.2 Dual Teacher Models
- 3.3 Local-Global Propagation Interaction Enhanced Student Model
- 3.4 Multi-channel Knowledge Distillation Training Objective
- 4 Experiment
- 4.1 Experimental Setups
- 4.2 Main Results
- 4.3 Ablation Study
- 4.4 Generalization Evaluation
- 4.5 Robustness Evaluation
- 4.6 Parameter Analysis
- 5 Conclusion
- References
- Exploring Contrastive Learning for Long-Tailed Multi-label Text Classification
- 1 Introduction
- 2 Related Work
- 2.1 Supervised Contrastive Learning
- 2.2 Multi-label Classification
- 2.3 Supervised Contrastive Learning for Multi-label Classification
- 3 Method
- 3.1 Contrastive Baseline LBase
- 3.2 Motivation
- 3.3 Multi-label Supervised Contrastive Loss
- 4 Experimental Setup
- 4.1 Datasets
- 4.2 Comparison Baselines
- 4.3 Implementation Details
- 5 Experimental Results
- 5.1 Comparison with Standard MLTC Losses
- 5.2 Fine-Tuning After Supervised Contrastive Learning
- 5.3 Representation Analysis
- 6 Conclusion
- References
- Simultaneous Linear Connectivity of Neural Networks Modulo Permutation
- 1 Introduction
- 2 Methods
- 2.1 Preliminaries
- 2.2 Aligning Networks via Permutation
- 3 Related Work
- 4 Notions of Linear Connectivity Modulo Permutation
- 5 Empirical Findings
- 5.1 Training Trajectories Are Simultaneously Weak Linearly Connected Modulo Permutation
- 5.2 Iteratively Sparsified Networks Are Simultaneously Weak Linearly Connected Modulo Permutation
- 5.3 Evidence for Strong Linear Connectivity Modulo Permutation
- 6 Algorithmic Aspects of Network Alignment
- 7 Conclusion
- References
- Fast Fishing: Approximating Bait for Efficient and Scalable Deep Active Image Classification
- 1 Introduction
- 2 Related Work
- 3 Notation
- 4 Time and Space Complexity of Bait
- 5 Approximations
- 5.1 Expectation
- 5.2 Gradient
- 6 Experimental Results
- 6.1 Setup
- 6.2 Assessment of Approximations
- 6.3 Benchmark Experiments
- 7 Conclusion
- References
- Understanding Domain-Size Generalization in Markov Logic Networks
- 1 Introduction
- 2 Related Work
- 3 Background
- 3.1 Basic Definitions
- 3.2 First-Order Logic
- 4 Learning in Markov Logic
- 5 Markov Logic Across Domain Sizes
- 6 Domain-Size Generalization
- 7 Experiments
- 7.1 Datasets
- 7.2 Methodology
- 7.3 Results
- 8 Conclusion
- References
- Retrieval-Augmented Mining of Temporal Logic Specifications from Data
- 1 Introduction
- 2 Background
- 3 Retrieval-Augmented STL Requirement Mining
- 3.1 Building and Querying the Semantic Vector Database
- 3.2 Bayesian Optimization in the Semantic Space of Formulae
- 4 Experiments
- 4.1 Experimental Setting
- 4.2 Case Studies
- 5 Related Work
- 6 Conclusion
- References
- CAM-Based Methods Can See Through Walls
- 1 Introduction
- 1.1 Related Work
- 1.2 Organization of the Paper
- 2 Mathematical Description
- 2.1 A Simple CNN
- 2.2 Closed-Form Expression
- 2.3 Theoretical Analysis
- 3 Experiments
- 3.1 Model
- 3.2 Proposed Datasets
- 3.3 Results
- 4 Conclusion
- References
- Making Alice Appear Like Bob: A Probabilistic Preference Obfuscation Method For Implicit Feedback Recommendation Models
- 1 Introduction
- 2 Related Work
- 2.1 Privacy-Aware Recommender Systems
- 2.2 Fairness Through Adversarial Training in Recommendation
- 3 Methodology
- 3.1 Item's Group Inclination
- 3.2 Item Stereotypicality
- 3.3 User Group Stereotypicality
- 3.4 Stereotypicality-Based Obfuscation
- 3.5 Attacker Network
- 4 Experimental Setup
- 4.1 Datasets
- 4.2 Dataset Obfuscation
- 4.3 Algorithms
- 5 Results and Discussion
- 6 Conclusion and Future Work
- References
- Leiden-Fusion Partitioning Method for Effective Distributed Training of Graph Embeddings
- 1 Introduction
- 2 Background on Graph Embeddings
- 3 Related Work
- 3.1 Partitioning Methods
- 3.2 Distributed Training Frameworks
- 4 Leiden-Fusion Method
- 4.1 Essential Features for Graph Partitioning
- 4.2 Leiden Community Detection
- 4.3 Community Fusion
- 4.4 Partition Visualization on Karate Dataset
- 5 Experimental Results
- 5.1 Analysis of Partitions
- 5.2 Quality Comparison
- 5.3 Speed Analysis
- 5.4 Impact of Our Fusion Method on Other Partitioning Methods
- 6 Conclusion
- References
- Automated Design of Linear Bounding Functions for Sigmoidal Nonlinearities in Neural Networks
- 1 Introduction
- 2 Related Work
- 2.1 Convex Relaxation for Neural Network Verification
- 2.2 Automated Algorithm Configuration
- 3 Method
- 3.1 Configuration Objective
- 3.2 Configuration Space
- 4 Setup for Empirical Evaluation
- 5 Experimental Results and Discussion
- 5.1 Sigmoid-Based Networks
- 5.2 Tanh-Based Networks
- 5.3 Distribution of Tangent Points
- 6 Conclusions and Future Work
- References
- Efficiently Predicting Mutational Effect on Homologous Proteins by Evolution Encoding
- 1 Introduction
- 2 Preliminary and Problem
- 3 Related Work
- 4 Framework
- 4.1 Embedding Initialisation
- 4.2 Residue Embedding Update
- 4.3 Evolution Encoding
- 4.4 Final Embedding and Optimisation
- 4.5 Extensions on Observed Graph
- 4.6 Theoretical Analysis
- 5 Experimental Study
- 5.1 Effectiveness
- 5.2 Analysis of Performance
- 6 Conclusion and Future Work
- References
- Interpretable and Fair Mechanisms for Abstaining Classifiers
- 1 Introduction
- 2 Related Literature
- 3 Background
- 3.1 Selective Classification
- 3.2 Measuring Fairness With Association Rules and Situation Testing
- 4 Methodology
- 4.1 Step 2: Learn At-Risk Subgroups
- 4.2 Step 3: Situation Testing
- 4.3 Step 4: Calibrate Rejection Strategy
- 5 Experimental Evaluation
- 5.1 Experimental Settings
- 5.2 Results
- 6 Discussion and Conclusion
- References
- Boosting Long-Tail Data Classification with Sparse Prototypical Networks
- 1 Introduction
- 2 Related Work
- 3 Task: Diagnoses Prediction
- 4 Methods
- 5 Experiments
- 5.1 Finetuning and Hyperparameters
- 6 Results
- 7 Analysis and Discussion
- 8 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. There are no technical restrictions that prevent illegal distribution. However, a personalised watermark is embedded in the eBook; in the event of misuse, it can be used to identify the purchaser and to provide evidence for legal purposes.
For more information, see our eBook Help page.