Advances in Knowledge Discovery and Data Mining

Name: Advances in Knowledge Discovery and Data Mining | 23rd Pacific-Asia Conference, PAKDD 2019, Macau, China, April 14-17, 2019, Proceedings, Part I
Brand: Springer
Price: 85.59 EUR
Availability: OnlineOnly

23rd Pacific-Asia Conference, PAKDD 2019, Macau, China, April 14-17, 2019, Proceedings, Part I

Qiang Yang, Zhi-Hua Zhou, Zhiguo Gong, Min-Ling Zhang, Sheng-Jun Huang (Editors)

Springer (Publisher)

Published on 3 April 2019

XL, 627 pages

E-Book

PDF with digital watermarking

978-3-030-16148-4 (ISBN)

€85.59 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Intro
PC Chairs' Preface
General Chairs' Preface
Organization
Contents - Part I
Contents - Part II
Contents - Part III
Classification and Supervised Learning
Multitask Learning for Sparse Failure Prediction
1 Introduction
2 Related Work
2.1 Failure Predictions for Sparse Data
2.2 Multi-task Learning in Hierarchical BNP Models
3 Preliminary Model
3.1 Stick-Breaking Process
3.2 Chinese Restaurant Process
4 Our Proposed Model for Sparse Predictions
4.1 Problem Definition Based on HBP Model
4.2 Sharing Parameter Estimation with HDP-HBP Model
4.3 Inference Algorithm
5 Experiments
5.1 Synthetic Data
5.2 Case Study: Water Pipe Failure Prediction
6 Conclusion
References
Cost Sensitive Learning in the Presence of Symmetric Label Noise
1 Introduction
1.1 Some Relevant Background
2 Cost Sensitive Bayes Classifiers Using l_{0-1,α} and l_{α,usq} Need Not Be Uniform Noise Robust
3 (l_{α,usq}, H_lin) Is (,,) Robust
3.1 l_{α,usq} Based Classifier from Corrupted Data and Its Performance
4 A Re-sampling Based Algorithm (,)
5 Comparison of l_{α,usq} Based Regularized ERM and Algorithm (,) to Existing Methods on UCI Datasets
6 Discussion
References
Semantic Explanations in Ensemble Learning
1 Introduction
2 Problem Definition
3 Explanation-Based Combination Method (EBCM)
3.1 Explanation Extraction
3.2 Consistency Measurement
4 Experiments
4.1 Experiment Setup
4.2 Results
5 Related Work
6 Conclusion
References
Latent Gaussian-Multinomial Generative Model for Annotated Data
1 Introduction
2 Related Works
2.1 Gaussian-Multinomial Mixture Model
2.2 Variational Autoencoders
3 Latent Gaussian-Multinomial Generative Model
3.1 Generative Models
3.2 Neural Variational Inference
3.3 The Variational Bound
3.4 SGVB Estimator
4 Experimental Analysis
4.1 Automatic Image Annotation
4.2 Generative Likelihood Performance
5 Conclusion
References
Investigating Neighborhood Generation Methods for Explanations of Obscure Image Classifiers
1 Introduction
2 Related Work
3 Background
4 Explanation Methods
5 Experiments
5.1 Dataset and Black Box
5.2 Evaluation Measures
5.3 Assessing Explanation Quality
6 Conclusion
References
On Calibration of Nested Dichotomies
1 Introduction
2 Nested Dichotomies
3 Probability Calibration
3.1 Calibration Methods
3.2 Measuring Miscalibration
4 Internal Calibration
4.1 Theoretical Motivation
5 External Calibration
6 Experiments
6.1 Well-Calibrated Base Learners
6.2 Poorly Calibrated Base Learners
7 Conclusion
References
Ensembles of Nested Dichotomies with Multiple Subset Evaluation
1 Introduction
2 Class Subset Selection Methods
2.1 Random Selection
2.2 Balanced Selection
2.3 Random-Pair Selection
3 Multiple Subset Evaluation
3.1 Effect on Growth Functions
3.2 Analysis of Error
4 Experimental Results
4.1 Individual Nested Dichotomies
4.2 Ensembles of Nested Dichotomies
5 Related Work
6 Conclusion
References
Text and Opinion Mining
Topic-Level Bursty Study for Bursty Topic Detection in Microblogs
1 Introduction
2 Bursty Topic Detection Model
2.1 Bursty Level of Words
2.2 Topic-Level Bursty Detection Model
2.3 Parameter Estimation
2.4 Hypothesis Testing
3 Experiment Evaluation
3.1 Dataset
3.2 Experiment Setup
3.3 Topics Discovered from Sina Weibo Dataset
3.4 Precision and Recall for Bursty Topic Detection
4 Related Work
5 Conclusion
References
Adaptively Transfer Category-Classifier for Handwritten Chinese Character Recognition
1 Introduction
2 Related Work
2.1 Handwritten Chinese Character Recognition
2.2 Transfer Learning
3 Adaptively Transfer Category-Classifier for Chinese Handwriting Recognition
3.1 Problem Formulation
3.2 Adaptively Transferring Category-Classifier Model
4 Experimental Evaluation
4.1 Data Preparation
4.2 Baselines and Implementation Details
4.3 Experimental Results
4.4 The Influence of Trade-Off Parameter
5 Conclusion
References
Syntax-Aware Representation for Aspect Term Extraction
1 Introduction
2 Related Work
3 The Proposed Method
3.1 Overview of the Model
3.2 Initial Representation
3.3 Syntax-Directed Attention Network
3.4 Contextual Gating Mechanism
3.5 Training and Inference
4 Experiment
4.1 Datasets and Experimental Settings
4.2 Baselines
4.3 Main Results
4.4 Analysis of the Proposed Methods
5 Conclusion
References
Short Text Similarity Measurement Based on Coupled Semantic Relation and Strong Classification Features
1 Introduction
2 The Proposed Methods
2.1 Coupled Semantic Relation for Similarity Measure
2.2 The Strong Classification Feature-Based Similarity
2.3 Combination of the Coupled Semantic Relation and Strong Classification Feature for Similarity Measure
3 Experimental Results
3.1 Experimental Setup
3.2 Parameter Analysis
3.3 Performance Comparison with Different Similarity Methods
4 Conclusion
References
A Novel Hybrid Sequential Model for Review-Based Rating Prediction
1 Introduction
2 Related Work
3 Preliminary
3.1 Problem Formulation
3.2 Biased Matrix Factorization
4 Proposed Methodology
4.1 Review Representation
4.2 Review-Based States of User and Item
4.3 Joint Rating Prediction
4.4 Inference
5 Experiments
5.1 Dataset
5.2 Baselines
5.3 Hyper-parameter Setting
5.4 Results Analysis
5.5 Impact of Time Window Size
6 Conclusion
References
Integrating Topic Model and Heterogeneous Information Network for Aspect Mining with Rating Bias
1 Introduction
2 Preliminary
3 The THAM Model
3.1 The Phrase-Rating LDA
3.2 Topic Propagation on Review Network
3.3 Uniform Optimization Framework
3.4 Aspect Identification and Rating Prediction
4 Experiments
4.1 Dataset
4.2 Preparation
4.3 Comparison Methods
4.4 Aspect Identification
4.5 Effectiveness Experiments
5 Conclusion
References
Dependency-Aware Attention Model for Emotion Analysis for Online News
1 Introduction
2 Approach
2.1 Context Guided Attentive CNN
2.2 Emotion Dependency-Aware Attentive CNN
2.3 Composition of Networks
2.4 Training
3 Experiment
3.1 Datasets
3.2 Baselines
3.3 Model Configuration
4 Results
5 Discussion
5.1 Multi-task Training
5.2 Attention Visualization
6 Related Work
7 Conclusions
References
Multi-task Learning for Target-Dependent Sentiment Classification
1 Introduction
2 Related Work
3 MTTDSC: A Multi-task Approach to TDSC
3.1 Auxiliary Task
3.2 Main Task
3.3 Training the Tasks
3.4 Implementation Details
4 Experiments
4.1 Datasets for Auxiliary SC Task
4.2 Datasets for Main TDSC Task
4.3 Details of Performance Measures
4.4 Various Methods and Their Performance
4.5 Side-by-Side Diagnostics and Anecdotes
5 Conclusion
References
SC-NER: A Sequence-to-Sequence Model with Sentence Classification for Named Entity Recognition
1 Introduction
2 Related Work
3 Model
3.1 Overview
3.2 Encoder and Classifier
3.3 Decoder
3.4 Restricted Beam Search
4 Data
4.1 Data Preprocessing
4.2 Named Entity Recognition
5 Experiments and Analysis
5.1 Experiments
5.2 Results
6 Conclusion
References
BAB-QA: A New Neural Model for Emotion Detection in Multi-party Dialogue
1 Introduction
2 Related Work
3 Model
3.1 Word Embedding Layer
3.2 Sentence Encoding Layer
3.3 Contextualization and Classification Network
3.4 QA Network
4 Experiments and Results
4.1 Dataset
4.2 Training Settings
4.3 Results and Discussion
5 Conclusion
References
Unsupervised User Behavior Representation for Fraud Review Detection with Cold-Start Problem
1 Introduction
2 Related Work
2.1 Fraud Review Detection
2.2 Cold-Start Problem
3 Proposed Method
3.1 Behavior Representation Architecture
3.2 Entities Relation Embedding
3.3 Social Relation Mining
3.4 Integrating Entities and Social Relation for Behavior Representation
3.5 Dynamic Link Re-weighting Strategy
3.6 SUPER-COLD Fraud Review Detection
4 Experiments
4.1 Data Sets
4.2 Evaluation Metrics
4.3 Parameters Settings
4.4 Effectiveness on Cold-Start Fraud Detection
4.5 Effectiveness on General Fraud Detection
4.6 Quality of Behavior Representation
4.7 Ablation Study
5 Conclusion
References
Gated Convolutional Encoder-Decoder for Semi-supervised Affect Prediction
1 Introduction
2 Related Work
3 Method
4 Experiments
5 Discussion
6 Conclusion
References
Complaint Classification Using Hybrid-Attention GRU Neural Network
Abstract
1 Introduction
2 Related Work
3 Our Approach
3.1 Character Embedding
3.2 Sentiment Embedding
3.3 Attention Mechanism
3.4 Network Structure
4 Experiments
4.1 Evaluation
4.2 Datasets
4.3 Comparison Models
4.4 Results and Discussion
5 Conclusion
Acknowledgments
References
Spatio-Temporal and Stream Data Mining
FGST: Fine-Grained Spatial-Temporal Based Regression for Stationless Bike Traffic Prediction
1 Introduction
2 Problem Definition and Framework
2.1 Problem Definition
2.2 Framework
3 Methodology
3.1 The Basic Model
3.2 Spatial Correlation
3.3 Temporal Correlation
3.4 Flow Conservation Constraint
3.5 The Unified Optimization Model
3.6 Projection Inference for Traffic Prediction
4 Experiment
4.1 Dataset and Settings
4.2 Evaluation Metrics
4.3 Baseline
4.4 Experiment Results
4.5 Evaluation on Model Components
4.6 Case Study
5 Related Work
6 Conclusion
References
Customer Segmentation Based on Transactional Data Using Stream Clustering
1 Introduction
2 Background
2.1 Customer Segmentation
2.2 Stream Clustering
3 Customer Segmentation Using Stream Clustering
4 Evaluation
4.1 Experimental Setup
4.2 Results
5 Conclusion
References
Spatio-Temporal Event Detection from Multiple Data Sources
1 Introduction
2 Related Work
2.1 Topic Modeling
2.2 Event Extraction from Text
2.3 Geospatial and Temporal Models
3 The Proposed STED Model
3.1 Problem Statement
3.2 Model Definition
3.3 Model Inference
3.4 Priors for Model Initialization
4 Experiments
4.1 Dataset Description and Preprocessing
4.2 Performance Evaluation
4.3 Baseline Methods
4.4 Parameter Setting
4.5 Experimental Results
5 Conclusion
References
Discovering All-Chain Set in Streaming Time Series
Abstract
1 Introduction
2 Related Work
3 Notations and Problem Definition
3.1 Notations
3.2 Problem Definition
4 Proposed Method
4.1 Naive Algorithm
4.2 All-Chain Set Mining Algorithm About Streaming Time Series (ASMSTS)
5 Experiments
5.1 Dataset
5.2 Discovery of All-Chain Set
5.3 Performance
6 Conclusion and Future Work
Acknowledgement
References
Hawkes Process with Stochastic Triggering Kernel
1 Introduction
2 Related Work
3 Proposed Model
3.1 Hawkes Process
3.2 HP with Stochastic Triggering Kernel
3.3 Stability Condition
4 Inference
4.1 Inference with Uniform Triggering Kernel
4.2 Inference with VAE
5 Synthetic Data Experiment
5.1 Homoscedastic Stochastic Triggering Kernel
5.2 Heteroscedastic Stochastic Triggering Kernel
6 Applications
6.1 Datasets and Experiment Setting
6.2 Use Case for HP-STK
7 Conclusion
References
Concept Drift Based Multi-dimensional Data Streams Sampling Method
1 Introduction
2 Related Work
3 Summary for Concept Drift Data Streams
3.1 Probability Sampling for Concept Drift Data Streams
3.2 Change Detection
3.3 Complexity Analysis
4 Experimental Evaluation
4.1 Parameter Settings
4.2 Experiments on Synthetic Data
4.3 Experiments on Real Data
5 Conclusion
References
Spatial-Temporal Multi-Task Learning for Within-Field Cotton Yield Prediction
1 Introduction
2 Related Work
3 Proposed Model
3.1 Overview
3.2 Cotton Yield Prediction
3.3 Spatial Feature in the Loss Function
4 Experiments
4.1 Dataset and Feature Extraction
4.2 Competing Approaches and Comparison Metrics
4.3 Experimental Results
5 Conclusion
References
Factor and Tensor Analysis
Online Data Fusion Using Incremental Tensor Learning
1 Introduction
2 Related Work
3 Online Damage Identification Using Incrementally-Coupled Tensor Learning
3.1 Data Fusion Using Coupled Tensor-Tensor Decomposition
3.2 Incremental Tensor Update
3.3 Online Damage Identification
4 Experimental Results
4.1 Synthetic Data
4.2 Real Bridge Data
5 Conclusion
References
Co-clustering from Tensor Data
1 Introduction
2 From Latent Block Model for 2D Data Matrix to Tensor Data
2.1 Latent Block Model
2.2 Latent Block Model for Tensor Data (TLBM)
3 Variational EM Algorithm
3.1 E-step
3.2 M-step
4 Experimental Results
4.1 Synthetic Datasets
4.2 Competitive Methods
4.3 Recommender System Application
4.4 Multi-spectral Images Analysis
5 Conclusion
A Appendix: Update ik and j i,k,j,
References
A Data-Aware Latent Factor Model for Web Service QoS Prediction
Abstract
1 Introduction
2 Preliminaries
2.1 LF Model
2.2 DPClust Algorithm
3 The Proposed DALF Model
3.1 Extracting LF Matrices for Users and Services
3.2 Identifying Neighborhoods of QoS Data and Detecting Unreliable QoS Data
3.3 Prediction
3.4 Algorithm Design and Analysis
4 Experimental Results
4.1 Datasets
4.2 Evaluation Protocol
4.3 Prediction According to the Characteristics of QoS Data
4.4 Predicting Based on Neighborhoods of Users
4.5 Predicting Based on Reliable Services
4.6 Comparisons Between DALF and Other Models
5 Conclusions
Acknowledgments
References
Keyword Extraction with Character-Level Convolutional Neural Tensor Networks
1 Introduction
2 Keyword Extraction
2.1 Supervised Approaches
2.2 Unsupervised Approaches
2.3 Problem Formulation
3 Proposed CharCNTN Architecture
3.1 Document Model and Word Model
3.2 Learning for Keyword Extraction
3.3 Optimization
4 Experiments
4.1 Datasets and Preprocessing
4.2 Experimental Setup
4.3 Keyphrase Extraction
4.4 Experimental Results
5 Conclusion and Future Work
References
Neural Variational Matrix Factorization with Side Information for Collaborative Filtering
1 Introduction
2 Notation and Problem Definition
3 Neural Variational Matrix Factorization
3.1 Neural Variational Matrix Factorization
3.2 Optimization
3.3 Prediction
4 Experimental Setup
4.1 Dataset
4.2 Baselines and Experimental Settings
4.3 Evaluation Metrics
5 Results and Analysis
6 Conclusion
References
Variational Deep Collaborative Matrix Factorization for Social Recommendation
1 Introduction
2 Notations and Problem Definition
3 Variational Deep Collaborative Matrix Factorization
3.1 The Proposed Model
3.2 Inference
3.3 Prediction
4 Experiments
4.1 Experimental Setup
4.2 Experimental Results and Discussions
5 Conclusion
References
Healthcare, Bioinformatics and Related Topics
Time-Dependent Survival Neural Network for Remaining Useful Life Prediction
1 Introduction
2 Motivation
3 Proposed Approach
3.1 Time-Dependent Survival Neural Network
3.2 RUL-Specific Probability Evaluation
3.3 TSNN Learning
4 Experiments
4.1 Data and Pre-processing
4.2 Competitors
4.3 Evaluation Metrics
4.4 Results and Discussion
5 Conclusions
References
ACNet: Aggregated Channels Network for Automated Mitosis Detection
Abstract
1 Introduction
2 Related Work
3 Methods
3.1 CLBP
3.2 SIFT
3.3 Edge
3.4 Hard Examples Mining
4 Experimental Evaluation
4.1 Dataset
4.2 Implementation Details
4.3 Experimental Results and Comparison
5 Conclusion
Acknowledgments
References
Attention-Based Hierarchical Recurrent Neural Network for Phenotype Classification
1 Introduction
2 Preliminary
2.1 Problem Statement
2.2 A Basic RNN Solution
3 Methodology
3.1 AHRNN Model
3.2 Learning
3.3 Time Complexity Analysis
4 Experimental Evaluation
4.1 Dataset
4.2 Evaluation Metrics
4.3 Compared Methods
4.4 Implementation
4.5 Main Results
4.6 Case Study for Interpretability (RQ3)
5 Related Work
6 Conclusion
References
Identifying Mobility of Drug Addicts with Multilevel Spatial-Temporal Convolutional Neural Network
Abstract
1 Introduction
2 Related Work
2.1 Movement Patterns Mining
2.2 Behavior Understanding
3 Preliminaries
3.1 Problem Formulation
3.2 CNN in Sequence Classification
4 MST-CNN: Multiple Spatial-Temporal Convolutional Neural Network
4.1 Motivation and Overview
4.2 The MST-CNN Framework
5 Experiments
5.1 Data Set and Experimental Platform
5.2 Baselines Setup
5.3 Result Summary
5.4 Parameter Sensitivity
6 Conclusion
References
MC-eLDA: Towards Pathogenesis Analysis in Traditional Chinese Medicine by Multi-Content Embedding LDA
1 Introduction
2 Related Work
3 The Proposed Framework
3.1 Problem Definition
3.2 Multi-Content embedding LDA Model (MC-eLDA)
3.3 Parameter Estimation
3.4 Embedding with Prior Knowledge
4 Experiments
4.1 Experimental Settings
4.2 Pathogenesis Evaluation
4.3 Topic Coherence Evaluation
4.4 Qualitative Evaluation
4.5 Herbs Recommendation Accuracy
5 Conclusion
References
Enhancing the Healthcare Retrieval with a Self-adaptive Saturated Density Function
1 Introduction
2 Related Work
3 Preliminaries
3.1 Information Content
3.2 Kernel Density Functions and Influence Scope
3.3 Unit Influence
4 Methodology
4.1 Saturated Density Function
4.2 Self-adaptive SDF Building Approach
4.3 Actual Influence
4.4 Density Based Weighting Method
5 Experiment
5.1 Datasets and Evaluation Metrics
5.2 Experimental Results
6 Analysis
6.1 Influence of Proximity with Density Functions
6.2 Effectiveness of SDF
6.3 Triangle vs. Cosine
6.4 Comparisons with the State-of-the-Art Approaches
7 Conclusions and Future Work
References
CytoFA: Automated Gating of Mass Cytometry Data via Robust Skew Factor Analyzers
1 Introduction
2 Motivation
3 The CytoFA Algorithm
3.1 Multivariate Skew Normal Distributions
3.2 Finite Mixture Model
3.3 Multivariate Skew Normal Factor Analyzers
3.4 Finite Mixtures of Multivariate Skew Normal Factor Analyzers
3.5 Robust Double Trimming
3.6 The Final Model
4 Parameter Estimation of the CytoFA Model via an EM Algorithm
5 Analysis of High-Dimensional CyTOF Data
5.1 CyTOF-13 Data
5.2 CyTOF-32 Data
6 Conclusions
References
Clustering and Anomaly Detection
Consensus Graph Learning for Incomplete Multi-view Clustering
1 Introduction
2 Related Work
3 Graph-Based Incomplete Multi-view Clustering
3.1 Graph Construction for Incomplete Multi-view Data
3.2 Graph Fusion
4 Optimization Procedure
4.1 Optimization for Constructing Graph
4.2 Optimization for Graph Fusion
5 Experiments
5.1 Experiments on Real-World Data
5.2 Complexity and Convergence Study
6 Conclusions
References
Beyond Outliers and on to Micro-clusters: Vision-Guided Anomaly Detection
1 Introduction
2 Related Work
3 Proposed Model
4 Our Proposed Method
4.1 Water-Level Tree Algorithm
4.2 Tree Explore Algorithm
5 Experiments
5.1 Q1. Anomaly Detection
5.2 Case Study and Found Patterns
5.3 Q2. Summarization Evaluation on Real Data
5.4 Q3. Scalability
6 Conclusions
References
Clustering of Mixed-Type Data Considering Concept Hierarchies
1 Introduction
2 Clustering Mixed Data Types
2.1 Concept Hierarchy
2.2 Cluster-Specific Elements
2.3 Integrative Objective Function
3 Algorithm
4 Related Work
5 Evaluation
5.1 Mixed-Type Clustering of Synthetic Data
5.2 Real Experiments
6 Conclusion
A Probability Adjustment
B MPG
C Adult Dataset
D Open Flights Dataset
References
DMNAED: A Novel Framework Based on Dynamic Memory Network for Abnormal Event Detection in Enterprise Networks
1 Introduction
2 Related Work
3 DMNAED Framework
3.1 Data Preparation Layer
3.2 Representation Layer
3.3 Memory Formation Layer
3.4 Prediction Layer
3.5 Anomaly Detection Layer
4 Experimental Evaluation
4.1 Dataset
4.2 Baselines
4.3 Results and Discussion
5 Conclusion
References
NeoLOD: A Novel Generalized Coupled Local Outlier Detection Model Embedded Non-IID Similarity Metric
1 Introduction
2 Related Work
3 Preliminary
4 NeoLOD Model and Instantiation
4.1 Attribute Structure Learning
4.2 Neo-Based Coupled Similarity
4.3 NeoLOD Model and Instantiation
5 Experiments and Evaluation
5.1 Experiment Environment
5.2 Experiment Design and Evaluation Method
5.3 Data Sets
5.4 Results and Analysis
6 Conclusions and Future Work
References
Dynamic Anomaly Detection Using Vector Autoregressive Model
1 Introduction
2 Preliminary
2.1 Graph Spectral Projections
2.2 Non-randomness Measure
2.3 Vector Autoregression
3 Methodology
3.1 Overview
3.2 Adjusted Node Nonrandomness Measure
3.3 Variable and Model Selection
3.4 Causal Analysis with Granger Causality
4 Empirical Evaluation
4.1 Case Study I
4.2 Case Study II
4.3 Case Study III
5 Summary
References
A Convergent Differentially Private k-Means Clustering Algorithm
1 Introduction
2 Related Work
3 Preliminaries
3.1 Lloyd's k-Means Algorithm
3.2 Differential Privacy
4 Algorithm and Analysis
4.1 Approach Overview
4.2 Preliminary Analysis
4.3 Our Approach and Its Analysis
5 Experimental Evaluation
5.1 Datasets and Configuration
5.2 Experimental Results
6 Conclusion
References
Author Index
