
Advances in Knowledge Discovery and Data Mining
Description
The 143 papers presented in these proceedings were carefully reviewed and selected from 813 submissions. They deal with new ideas, original research results, and practical development experiences from all KDD-related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, big data technologies, and foundations.

Content
- Intro
- General Chairs' Preface
- PC Chairs' Preface
- Organization
- Contents - Part I
- Anomaly and Outlier Detection
- BAARD: Blocking Adversarial Examples by Testing for Applicability, Reliability and Decidability
- 1 Introduction
- 2 Background
- 3 BAARD: Blocking Adversarial Examples
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Detection Results
- 5 Conclusion and Future Work
- References
- Fast and Attributed Change Detection on Dynamic Graphs with Density of States
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation and Notations
- 4 Scalable Change Point Detection
- 5 Synthetic Experiments
- 6 Real World Experiments
- 7 Conclusion
- References
- Outlying Aspect Mining via Sum-Product Networks
- 1 Introduction
- 2 Preliminaries and Related Work
- 2.1 Outlier Detection and Outlying Aspect Mining
- 2.2 Sum-Product Networks
- 3 Outlying Aspect Mining via SPNs
- 4 Experimental Evaluation
- 4.1 Data Sets
- 4.2 Experiments
- 5 Results
- 6 Discussion and Conclusion
- References
- TSI-GAN: Unsupervised Time Series Anomaly Detection Using Convolutional Cycle-Consistent Generative Adversarial Networks
- 1 Introduction
- 2 Related Work
- 3 Encoding Time-series to Images
- 3.1 Gramian Angular Field (GAF)
- 3.2 Recurrence Plot (RP)
- 3.3 Combining Two Channels
- 4 The TSI-GAN Model
- 4.1 Model Architecture
- 4.2 Loss Function and Training Strategy
- 4.3 Post-processing and Anomaly Detection
- 5 Performance Evaluation
- 5.1 Datasets
- 5.2 Performance Metrics
- 5.3 Experimental Results
- 6 Conclusion
- References
- Achieving Counterfactual Fairness for Anomaly Detection
- 1 Introduction
- 2 Preliminary
- 3 Counterfactually Fair Anomaly Detection
- 3.1 Counterfactual Fairness
- 3.2 Overview of Counterfactually Fair Anomaly Detection (CFAD)
- 3.3 Phase One: Counterfactual Data Generation
- 3.4 Phase Two: Fair Anomaly Detection
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Experimental Results
- 5 Conclusions
- References
- The Common-Neighbors Metric Is Noise-Robust and Reveals Substructures of Real-World Networks
- 1 Introduction
- 2 Preliminaries
- 2.1 Setting
- 2.2 Random-Graph Models
- 3 Theoretical Results
- 3.1 Probabilities of Vertex Placements
- 3.2 The CN Score of Different Edges
- 4 Empirical Results
- 4.1 Graph Model as Base Graph
- 4.2 Real-World Network as Base Graph
- 4.3 Real-World Graph as Mixed Graph
- 5 Conclusion
- References
- An Effective WGAN-Based Anomaly Detection Model for IoT Multivariate Time Series
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Unsupervised Time Series Anomaly Detection
- 3.2 WPS Model
- 3.3 Anomaly Scoring
- 4 Experiment
- 4.1 Experimental Setup
- 4.2 Baseline Comparisons
- 4.3 Convergence Analysis
- 4.4 Ablation Experiments
- 5 Conclusion
- References
- Association Rules
- RL-Net: Interpretable Rule Learning with Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 RL-Net
- 3.2 Training and Tuning of RL-Net
- 3.3 RL-Net for Multi-label Classification
- 4 Experiments
- 4.1 Datasets
- 4.2 Data Preprocessing
- 4.3 Algorithms
- 4.4 Protocol
- 4.5 Results
- 5 Conclusion
- References
- Classification
- The Causal Strength Bank: A New Benchmark for Causal Strength Classification
- 1 Introduction
- 2 Related Work
- 3 The Causality Strength Datasets
- 3.1 Data Collection and Processing
- 3.2 Data Labeling
- 3.3 Dataset Analysis
- 4 Method
- 5 Experiment
- 5.1 Datasets and Evaluation Metrics
- 5.2 Baselines
- 5.3 Results and Discussion
- 6 Conclusion and Future Work
- References
- Topological Graph Convolutional Networks Solutions for Power Distribution Grid Planning
- 1 Introduction
- 2 Related Work
- 3 Methods
- 3.1 Base Resilience Evaluation Method
- 3.2 Zigzag Persistent Homology
- 3.3 Bootstrapped Zigzag Image Representation Learning with Graph Convolutional Nets
- 4 Experimental Studies
- 4.1 Datasets and Baselines
- 4.2 Experimental Settings
- 4.3 Overall Results
- 4.4 Ablation Experiments
- 5 Discussion
- References
- Label Distribution Learning with Discriminative Instance Mapping
- 1 Introduction
- 2 Related Works
- 3 Approach
- 3.1 Notations
- 3.2 Discriminative Instance Pool
- 3.3 Output Model
- 4 Experiments
- 4.1 Datasets
- 4.2 Evaluations
- 4.3 Parameter Analysis
- 4.4 Baselines and Settings
- 4.5 Performance Comparison
- 4.6 DIP Selection Result
- 5 Conclusion and Further Works
- References
- Leveraging Generative Models for Combating Adversarial Attacks on Tabular Datasets
- 1 Introduction
- 2 Related Work and Preliminaries
- 3 Proposed Technique
- 3.1 Pre-conditioning Layer
- 3.2 Utilizing Generative Model During Discriminative Training
- 3.3 Algorithm
- 4 Experimental Results
- 4.1 Experiment Setup
- 4.2 Generative vs. Discriminative Models
- 4.3 Analysis on Deep Model
- 4.4 Analysis on Shallow Model
- 4.5 Ablation Study on Data Size and Feature Order
- 5 Conclusion and Future Work
- References
- Weak Correlation-Based Discriminative Dictionary Learning for Image Classification
- 1 Introduction
- 2 Proposed Method
- 2.1 Dictionary Learning Algorithm
- 2.2 Solution
- 2.3 Classification Algorithm
- 3 Experiments
- 3.1 Datasets
- 3.2 Performance Comparison
- 4 Conclusion
- References
- Data-dependent and Scale-Invariant Kernel for Support Vector Machine Classification
- 1 Introduction
- 2 PMK: Probability Mass-based Kernel
- 2.1 Notations and Preliminaries
- 2.2 m0-dissimilarity
- 2.3 Proposed New Kernel Function
- 2.4 Positive Definiteness of the PMK Kernel
- 3 Empirical Evaluation
- 3.1 Comparison of SVM with PMK and Other Kernels
- 3.2 Comparison of SVM with PMK Against Other SoTA Classifiers
- 3.3 Robustness Towards Scales of Measurement
- 3.4 Dimensionality Reduction Using Kernel PCA
- 4 Concluding Remarks
- References
- Enhancing Robustness of Prototype with Attentive Information Guided Alignment in Few-Shot Classification
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Problem Formulation
- 3.2 Query-Prototype Cross-Adaptation
- 3.3 Information-Rich Prototypes
- 3.4 Explicit Spatial and Channel Attention Module
- 4 Experimental Evaluation
- 4.1 Datasets
- 4.2 Implementation Details
- 4.3 Quantitative Assessment
- 5 Ablation Study
- 5.1 Analysis for Explicit Modules
- 5.2 Effect of Augmentation
- 5.3 Visualization
- 6 Conclusion
- References
- Clustering
- An Improved Visual Assessment with Data-Dependent Kernel for Stream Clustering
- 1 Introduction
- 2 Related Work
- 3 Kernel-Based inc-siVAT
- 4 Experiment and Analysis
- 4.1 Evaluation of inc-siVAT Variations
- 4.2 Comparison with Different Clustering Algorithms
- 5 Conclusions
- References
- Selecting the Number of Clusters K with a Stability Trade-off: An Internal Validation Criterion
- 1 Introduction
- 2 Related Work
- 3 Clustering Stability
- 4 Between-cluster and Within-Cluster Stability
- 4.1 Stadion: A Novel Stability-Based Validity Index
- 5 Experiments
- 5.1 A Simple Example with Stability Paths
- 5.2 Benchmark of Clustering Validation Methods
- 6 Conclusion
- References
- Adaptive View-Aligned and Feature Augmentation Network for Partially View-Aligned Clustering
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Problem Formulation
- 3.2 Multi-view Autoencoder
- 3.3 Adaptive View-Aligned Module
- 3.4 Self-augmentation Strategy
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Experimental Results and Analysis
- 4.3 Ablation Studies
- 4.4 Parameter Analysis
- 4.5 Alignment Ratio Analysis
- 5 Conclusion
- References
- Data Mining Processes and Pipelines
- Continuously Predicting the Completion of a Time Intervals Related Pattern
- 1 Introduction
- 2 Background
- 3 Methods
- 3.1 The Unfinished Coinciding STIs Challenge
- 3.2 TIRP-Prefixes
- 3.3 Continuous Prediction Models (CPMs)
- 3.4 Early Warning Strategies
- 4 Evaluation
- 4.1 Datasets
- 4.2 Experimental Setup
- 4.3 Experiments and Results
- 5 Discussion
- References
- Interactive Pattern Mining Using Discriminant Sub-patterns as Dynamic Features
- 1 Introduction
- 2 Preliminaries
- 3 DiSPaLe: Discriminating Sub-Pattern Feature Learning
- 3.1 Towards More Expressive and Learnable Pattern Descriptions
- 3.2 Discriminating Sub-patterns as Descriptors
- 3.3 Updating the Weights of Feature Pattern Representations
- 4 Experiments
- 5 Conclusion
- References
- A Novel Explainable Link Forecasting Framework for Temporal Knowledge Graphs Using Time-Relaxed Cyclic and Acyclic Rules
- 1 Introduction
- 2 Related Work
- 3 Problem Statement
- 4 Proposed Framework: TRKG-Miner
- 5 Experiments and Results
- 6 Conclusion
- References
- A Consumer-Good-Type Aware Itemset Placement Framework for Retail Businesses
- 1 Introduction
- 2 Related Work
- 3 Proposed Framework of the Problem
- 4 Proposed Itemset Placement Framework
- 5 Performance Evaluation
- 6 Conclusion
- References
- Deep Learning
- M-EBM: Towards Understanding the Manifolds of Energy-Based Models
- 1 Introduction
- 2 Background
- 3 Manifold EBM
- 3.1 Informative Initialization and M-EBM
- 3.2 Injected Noise in M-EBM
- 4 Manifold JEM
- 4.1 Injected Noise in M-JEM
- 4.2 Energy Function Regularization in M-JEM
- 5 Experiments
- 5.1 M-EBM
- 5.2 M-JEM
- 5.3 Analysis
- 6 Conclusion
- References
- Small Temperature is All You Need for Differentiable Architecture Search
- 1 Introduction
- 2 Methodology
- 2.1 Sparse-Noisy Softmax
- 2.2 Exponential Temperature Schedule
- 2.3 Entropy-Based Adaptive Scheme
- 3 Evaluations
- 4 Further Experiments, Analyses and Conclusion
- 5 Conclusion
- References
- Document-Level Relation Extraction with Cross-sentence Reasoning Graph
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Encoding Module
- 3.2 Document-Level Graph Aggregation Module
- 3.3 Entity-Level Graph Reasoning Module
- 3.4 Classification Module
- 4 Experiments and Results
- 4.1 Dataset
- 4.2 Experiment Settings and Evaluation Metrics
- 4.3 Main Results
- 4.4 Ablation Study
- 4.5 Intra- and Inter-sentence Relation Extraction
- 4.6 Case Study
- 5 Conclusion
- References
- Weight Prediction Boosts the Convergence of AdamW
- 1 Introduction
- 2 Related Work
- 3 Methods
- 4 Experiments
- 4.1 Experiment Settings
- 4.2 CNNs on CIFAR-10
- 4.3 LSTMs on Penn TreeBank
- 5 Conclusions
- References
- Model-Agnostic Reachability Analysis on Deep Neural Networks
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 4 Lipschitz Analysis on Neural Networks
- 4.1 Lipschitz Continuity of FNN
- 4.2 Lipschitz Analysis on Recurrent Neural Networks
- 5 Reachability Analysis with Provable Guarantees
- 5.1 Verification via Lipschitz Optimization
- 5.2 Global Convergence Analysis
- 6 Experiments
- 6.1 Performance Comparison with State-of-the-art Methods
- 6.2 Ablation Study
- 6.3 Case Study 1
- 6.4 Case Study 2
- 7 Conclusion
- References
- Adaptive Bi-nonlinear Neural Networks Based on Complex Numbers with Weights Constrained Along the Unit Circle
- 1 Introduction
- 2 Background
- 3 Related Work
- 4 Methods
- 4.1 Feed-Forward Layer
- 4.2 Convolutional Layer
- 5 Experimental Methods and Limitations
- 6 Analytical and Experimental Results
- 6.1 XOR-Problem
- 6.2 Minimal Networks and Expressive Power
- 6.3 Classification
- 7 Summary and Discussion
- References
- CopyCAT: Masking Strategy Conscious Augmented Text for Machine Generated Text Detection
- 1 Introduction
- 2 Related Work
- 2.1 Detecting Machine-Generated Articles in Real-World Settings
- 2.2 Left-to-Right Language Modeling
- 2.3 Text-Infilling
- 2.4 Saliency Score
- 3 Preliminary Analysis
- 3.1 Datasets
- 3.2 Experiment
- 4 Methodology
- 4.1 Training CopyCAT Generator
- 4.2 Creating Masked Dataset
- 4.3 Post-processing Methods
- 5 Experiments
- 5.1 Datasets
- 5.2 Data Augmentation Methods
- 5.3 Experiment Results
- 5.4 Selection of Mask Threshold
- 6 Conclusion
- References
- Federated Learning Under Statistical Heterogeneity on Riemannian Manifolds
- 1 Introduction
- 2 Problem Formulation and Related Work
- 3 Riemannian Geometry of SPD Matrices
- 4 Federated Learning on Riemannian Manifold
- 5 Experiments
- 6 Conclusions
- References
- Enhanced Topic Modeling with Multi-modal Representation Learning
- 1 Introduction
- 2 Related Work
- 3 Methodology: The GDF-NTM Framework
- 4 Experiments
- 5 Conclusion and Future Work
- References
- Dynamic Multi-View Fusion Mechanism for Chinese Relation Extraction
- 1 Introduction
- 2 Related Work
- 2.1 Chinese Relation Extraction
- 2.2 Chinese Character Representation
- 2.3 Multi-View Learning
- 3 Methodology
- 3.1 Multi-View Features Representation
- 3.2 Mixture-of-View-Experts
- 3.3 Relation Classifier
- 4 Experiments
- 4.1 Datasets
- 4.2 Baselines
- 4.3 Experimental Results
- 4.4 Ablation Studies
- 5 Conclusion
- References
- Alignment-Aware Word Distance
- 1 Introduction
- 2 Background
- 2.1 Earth Mover's Distance
- 2.2 Word Mover's Distance and Word Rotator's Distance
- 3 Alignment-Aware Word Degree
- 3.1 Position-Based AWD
- 3.2 Syntax-Based AWD
- 3.3 Imbalanced Alignment Adaptation
- 3.4 Transportation Problem
- 4 Experimental Settings
- 4.1 Task Definition and Evaluation Criterion
- 4.2 Datasets and Pre-trained Word Embeddings
- 4.3 Baselines
- 5 Experimental Results
- 5.1 Main Results
- 5.2 AWD(S) vs. AWD(P)
- 5.3 Applying Task-Specific Word Embeddings
- 5.4 The Effectiveness of IAA
- 6 Related Work
- 7 Conclusion
- References
- MISNN: Multiple Imputation via Semi-parametric Neural Networks
- 1 Introduction
- 1.1 Missing Value Mechanisms and Imputation
- 1.2 Feature Selection in Imputation Models
- 1.3 Our Contribution
- 2 Related Work
- 3 Data Setup
- 3.1 A Framework for Multiple Imputation
- 4 Multiple Imputation with Semi-parametric Neural Network (MISNN)
- 4.1 Sampling from Posterior and Predictive Distributions
- 4.2 Flexibility of MISNN Framework
- 4.3 Other Properties of MISNN
- 4.4 MISNN for General Missing Patterns
- 5 Numerical Results
- 5.1 Viewpoint of Statistical Inference
- 5.2 Viewpoint of Prediction
- 6 Discussion
- References
- LSG Attention: Extrapolation of Pretrained Transformers to Long Sequences
- 1 Introduction
- 2 Related Works
- 3 LSG: Mixing Local, Sparse and Global Attentions
- 4 Experiments
- 4.1 RoBERTa Extrapolation on MLM
- 4.2 Classification Tasks
- 4.3 Summarization Tasks
- 5 Conclusion
- References
- Dimensionality Detection and Feature Selection
- Compressing the Embedding Matrix by a Dictionary Screening Approach in Text Classification
- 1 Introduction
- 2 Methodology
- 2.1 Problem Set up
- 2.2 Dictionary Screening
- 3 Experiments
- 3.1 Task Description and Datasets
- 3.2 Model Settings
- 3.3 Tuning Parameter Specification
- 3.4 Competing Methods
- 3.5 Performance Measures and Implementation
- 4 Results Analysis
- 4.1 Tuning Parameter Effects
- 4.2 Performance of Compression Results
- 4.3 Competing Methods
- 5 Conclusions
- References
- Ethics and Fairness
- Disentangled Representation with Causal Constraints for Counterfactual Fairness
- 1 Introduction
- 2 Background
- 3 Proposed Method
- 3.1 The Theory of Learning Counterfactually Fair Representations
- 3.2 CF-VAE
- 4 Experiments
- 4.1 Framework Comparison
- 4.2 Evaluation Metrics
- 4.3 Law School
- 4.4 Adult
- 4.5 Ablation Study
- 5 Related Works
- 6 Conclusion
- References
- F3: Fair and Federated Face Attribute Classification with Heterogeneous Data
- 1 Introduction
- 2 Preliminaries
- 2.1 Federated Learning (FL) Setting
- 2.2 Fairness Notions
- 3 Methodology
- 3.1 Heuristic-Based F3
- 3.2 FairGrad: A Gradient-Based F3
- 4 Experiments
- 4.1 Results
- 4.2 Discussion
- 5 Conclusion
- References
- Estimating the Risk of Individual Discrimination of Classifiers
- 1 Introduction
- 2 Related Work
- 3 Problem Setting
- 4 Discrimination Notion and Risk Scores
- 5 Risk Estimation
- 5.1 Method 1: BART
- 5.2 Method 2: FORESEE
- 6 Experimental Evaluation
- 7 Results and Discussions
- 8 Conclusions
- References
- Multi-fair Capacitated Students-Topics Grouping Problem
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Methodology for the MFC Grouping Problem
- 4.1 A Greedy Heuristic Approach
- 4.2 A Knapsack-Based Approach
- 4.3 An MFC Knapsack Approach
- 5 Evaluation
- 5.1 Datasets
- 5.2 Experimental Setup
- 5.3 Experimental Results
- 6 Conclusions and Outlook
- References
- GroupMixNorm Layer for Learning Fair Models
- 1 Introduction
- 2 Related Work
- 3 Proposed GroupMixNorm Layer
- 4 Datasets and Experimental Details
- 4.1 Fairness Evaluation Metrics
- 4.2 Implementation Details
- 5 Results and Analysis
- 5.1 Comparison with State-of-the-art Algorithms
- 5.2 Learned Representation Analysis
- 5.3 Generalizability to New Protected Groups
- 5.4 Debias Pre-trained Model with Limited Data
- 6 Conclusion and Future Work
- References
- Quantifying the Bias of Transformer-Based Language Models for African American English in Masked Language Modeling
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Corpora for Spoken English
- 3.2 Bias in Masked Language Modeling
- 4 Results and Discussion
- 4.1 Measuring the Bias of LMs
- 4.2 Bias on AAE Features
- 5 Conclusion
- References
- Fairness for Robust Learning to Rank
- 1 Introduction
- 2 Related Works
- 3 Preliminary
- 3.1 Probabilistic Ranking
- 4 Methodology
- 5 Optimization
- 5.1 Inference and Runtime Analysis
- 6 Experiments
- 6.1 Fairness Benchmark Datasets
- 6.2 Microsoft Learning to Rank Dataset
- 7 Conclusions
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware, which makes it suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.