
Advances in Knowledge Discovery and Data Mining
Description
This three-volume set, LNAI 10937, 10938, and 10939, constitutes the thoroughly refereed proceedings of the 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018, held in Melbourne, VIC, Australia, in June 2018.
The 164 full papers were carefully reviewed and selected from 592 submissions. The volumes present papers focusing on new ideas, original research results, and practical development experiences from all KDD-related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, decision-making systems, and emerging applications.

Content
- Intro
- PC Chairs' Preface
- General Chairs' Preface
- Organization
- Contents - Part II
- Graphical Models, Latent Variables and Statistical Methods
- Probabilistic Topic and Role Model for Information Diffusion in Social Network
- 1 Introduction
- 2 Related Work
- 3 TRM Model
- 3.1 Formulation
- 3.2 Model Description
- 3.3 Model Learning
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Experimental Results
- 5 Conclusions
- References
- Topic-Sensitive Influential Paper Discovery in Citation Network
- 1 Introduction
- 2 Related Work
- 3 Empirical Observation
- 4 Proposed Model
- 4.1 Generation of Citation Network
- 4.2 Parameter Learning
- 4.3 Initialization and Parallelization
- 5 Experiment
- 5.1 Dataset
- 5.2 Citation Prediction
- 5.3 Finding Influential Papers
- 6 Conclusion
- References
- Course-Specific Markovian Models for Grade Prediction
- 1 Introduction
- 2 Related Work
- 2.1 Academic Performance Prediction
- 2.2 Markovian Models for Educational Data
- 3 Problem Formulation and Notations
- 4 Methods
- 4.1 Hidden Markov Model (HMM)
- 4.2 Hidden Semi-Markov Model (HSMM)
- 4.3 Baseline Methods
- 5 Experiments
- 5.1 Dataset Description and Preprocessing
- 5.2 Evaluation Metrics
- 6 Results and Discussion
- 6.1 Comparative Performance
- 6.2 Case Study: At-Risk Students
- 7 Conclusions
- References
- A Temporal Topic Model for Noisy Mediums
- 1 Introduction
- 2 Related Work
- 3 Background and Definitions
- 4 Topic Flow Model
- 4.1 TFM Overview
- 4.2 Identifying Important Terms
- 4.3 Semantic Graph Construction
- 4.4 Finding Topics
- 5 Empirical Evaluation
- 5.1 Data Sets
- 5.2 Synthetic Data Evaluation
- 5.3 Twitter and Newspaper Evaluation
- 5.4 TFM Flood Words Evaluation
- 5.5 Execution Time Comparison: TFM and Cataldi et al.
- 6 Conclusion
- References
- A CRF-Based Stacking Model with Meta-features for Named Entity Recognition
- 1 Introduction
- 2 Related Work
- 3 Model
- 3.1 Stacking
- 3.2 Stacking with Meta-features
- 3.3 Stacking with Joint Meta-Features
- 3.4 Stacking with Local Embedding Features
- 4 Experiment
- 4.1 Dataset and Evaluation
- 4.2 Training
- 4.3 Overall Results
- 4.4 Effectiveness of Our Model and Meta-features
- 4.5 Our Model with the Existing Level-0 Classifiers
- 5 Conclusion
- References
- Adding Missing Words to Regular Expressions
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 4 Repairing Regular Expressions
- 5 Experiments
- 5.1 Setup
- 5.2 Experimental Evaluation and Results
- 6 Conclusion
- References
- Marrying Community Discovery and Role Analysis in Social Media via Topic Modeling
- 1 Introduction
- 2 Background
- 2.1 Observed Social-Media Properties: Topology and Messages
- 2.2 Unobserved Social-Media Properties: Topics, Affiliations, Roles and Communities
- 2.3 Problem Statement
- 3 NOODLES
- 4 Posterior Inference
- 5 Tasks
- 5.1 Exploratory Network Analysis
- 5.2 Predictive Analysis
- 5.3 Descriptive Analysis
- 6 Experimental Evaluation
- 6.1 Data Sets and Competitors
- 6.2 Quantitative Evaluation
- 6.3 Qualitative Evaluation
- 7 Conclusions
- References
- Text Generation Based on Generative Adversarial Nets with Latent Variables
- 1 Introduction
- 2 Preliminary
- 2.1 LSTM Architecture
- 2.2 Variational Autoencoder
- 2.3 Generative Adversarial Nets
- 3 Model Description
- 3.1 The Generative Model of VGAN
- 3.2 Adversarial Training of VGAN
- 4 Experimental Studies
- 4.1 Training Details
- 4.2 Results and Evaluation
- 5 Conclusions
- References
- GEMINIO: Finding Duplicates in a Question Haystack
- Abstract
- 1 Introduction
- 2 Description
- 2.1 Relation Aided Duplicate Question Detection (REL-DQD)
- 2.2 BraidNet Duplication Question Detection Scheme (BraidNet-DQD)
- 3 Experiments
- 3.1 Dataset
- 3.2 Experimental Setup
- 3.3 Experimental Results
- 4 Related Work
- 4.1 Similar Question Retrieval
- 4.2 Natural Language Sentence Matching
- 5 Conclusion and Future Work
- References
- Fast Converging Multi-armed Bandit Optimization Using Probabilistic Graphical Model
- 1 Introduction
- 2 Model Definition
- 2.1 General Function Approximation
- 2.2 Graphical Model Representation
- 2.3 Graphical Model Inference
- 2.4 Review on the Multi-armed Bandit Problem
- 3 Graphical Model Learning
- 3.1 Decision Making Policy
- 3.2 Experiments - Learning Synthetic Data
- 4 Test Bench Experiments - Policy Variation
- 5 Conclusion
- References
- Leveraging Label Category Relationships in Multi-class Crowdsourcing
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation and Proposed Model
- 4 Parameter Estimation
- 5 Experiments and Results
- 5.1 True Label Prediction
- 5.2 True Label Prediction Under Response Sparsity
- 5.3 Consistency of Learned Relatedness Between Categories
- 6 Conclusion
- References
- Embedding Knowledge Graphs Based on Transitivity and Asymmetry of Rules
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Our Model
- 4.1 Restricted Triple Model (RTM)
- 4.2 Approximate Order Logic Model (AOLM)
- 4.3 Global Objective
- 4.4 Discussions
- 5 Experiments
- 5.1 Datasets and Experiment Settings
- 5.2 Knowledge Base Completion
- 5.3 Relational Learning
- 6 Conclusion and Future Work
- References
- Representation Learning and Embedding
- SIGNet: Scalable Embeddings for Signed Networks
- 1 Introduction
- 2 Problem Formulation
- 3 Scalable Embedding of Signed Networks (SIGNet)
- 3.1 SIGNet for Undirected Networks
- 3.2 SIGNet for Directed Networks
- 3.3 Efficient Optimization by Targeted Node Sampling
- 4 Experiments
- 5 Other Related Work
- 6 Conclusion
- References
- Sub2Vec: Feature Learning for Subgraphs
- 1 Introduction
- 2 Problem Formulation
- 3 Our Methods
- 3.1 Overview
- 3.2 Subgraph Truncated Random Walks
- 3.3 Sub2Vec-DM
- 3.4 Sub2Vec-DBON
- 3.5 Algorithm
- 4 Experiments
- 4.1 Community Detection
- 4.2 Graph Classification
- 4.3 Case Studies
- 5 Related Work
- 6 Conclusions and Discussion
- References
- Interaction Content Aware Network Embedding via Co-embedding of Nodes and Edges
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 4 Model Development
- 4.1 Node Representation Learning
- 4.2 Edge Representation Learning
- 4.3 Joint Learning
- 5 The Optimization
- 6 Empirical Evaluation
- 6.1 Datasets
- 6.2 Experiment Settings
- 6.3 Representation Visualization
- 6.4 Link Prediction
- 6.5 Multi-label Classification
- 6.6 Multi-class Classification
- References
- MetaGraph2Vec: Complex Semantic Path Augmented Heterogeneous Network Embedding
- 1 Introduction
- 2 Preliminaries and Problem Definition
- 3 Methodology
- 3.1 MetaGraph Guided Random Walk
- 3.2 MetaGraph2Vec Embedding Learning
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Node Classification Results
- 4.3 Node Clustering Results
- 4.4 Node Similarity Search
- 4.5 Parameter Sensitivity
- 5 Conclusions and Future Work
- References
- Multi-network User Identification via Graph-Aware Embedding
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 Notations
- 3.2 Graph-Aware Embedding (GAEM)
- 3.3 Optimization
- 4 Experiment
- 4.1 Datasets and Configurations
- 4.2 Experiment Results
- 5 Conclusion
- References
- Knowledge-Based Recommendation with Hierarchical Collaborative Embedding
- 1 Introduction
- 2 Preliminary
- 2.1 Implicit Feedback Recommendation
- 2.2 Knowledge Graph
- 2.3 Problem Definition
- 3 Hierarchical Collaborative Embedding
- 3.1 Knowledge Graph Structured Embedding
- 3.2 Knowledge Conceptual Level Connection
- 3.3 Collaborative Learning
- 4 Experiment
- 4.1 Dataset
- 4.2 Baselines
- 4.3 Comparison
- 5 Related Work
- 5.1 Knowledge Graph Structured Embedding
- 5.2 Collaborative Filtering Using Implicit Feedback
- 6 Conclusions
- References
- DPNE: Differentially Private Network Embedding
- 1 Introduction
- 2 Preliminaries
- 2.1 Network Embedding
- 2.2 Differential Privacy
- 3 Differentially Private Network Embedding
- 3.1 Differentially Private Network Embedding (DPNE)
- 3.2 DPNE vs. Other DP-Preserving Embedding Approaches
- 4 Evaluation
- 4.1 Vertex Classification Task
- 4.2 Link Prediction Task
- 5 Conclusions and Future Work
- References
- A Generalization of Recurrent Neural Networks for Graph Embedding
- 1 Introduction
- 2 Framework of Graph Recurrent Neural Network
- 2.1 Subgraph Extraction
- 2.2 G-RNN Training
- 3 Application of G-RNN in Knowledge Graph
- 4 Experiments
- 4.1 Link Prediction
- 4.2 Node Classification
- 5 Related Work
- 6 Conclusion
- References
- NE-FLGC: Network Embedding Based on Fusing Local (First-Order) and Global (Second-Order) Network Structure with Node Content
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Our Model
- 4.1 Structure-Based Module
- 4.2 Text-Based Module
- 4.3 Training
- 4.4 Context-Enhance
- 5 Experiments
- 5.1 Dataset
- 5.2 Experimental Settings
- 5.3 Multi-class Classification
- 5.4 Link Prediction
- 5.5 Parameter Sensitivity
- 5.6 Network Visualization
- 6 Conclusion and Future Work
- References
- Semi-structured Data and NLP
- Category Multi-representation: A Unified Solution for Named Entity Recognition in Clinical Texts
- 1 Introduction
- 2 Problem Definition
- 3 Our Proposed Approach
- 3.1 Generating Semantic Space
- 3.2 Category Multi-representation
- 3.3 Generating CMR Features
- 4 Experiments
- 4.1 Data Sets
- 4.2 Our Models and Parameter Settings
- 4.3 Threshold Settings for Determining CMR Features
- 4.4 Comparison with Different CMR Features
- 4.5 Comparison with Previous Systems
- 5 Related Works
- 6 Conclusion and Future Work
- References
- A Heterogeneous Information Network Method for Entity Set Expansion in Knowledge Graph
- 1 Introduction
- 2 Preliminary
- 3 The Method Description
- 3.1 Random Walk Based Concatenated Meta Path Generation Method
- 3.2 Multi-Type-Constrained Meta Path Extraction and Similarity Calculation
- 3.3 Weight Learning of Meta Paths
- 4 Experiment
- 4.1 Experiment Settings
- 4.2 Effectiveness Experiments
- 4.3 Efficiency Study
- 5 Conclusion
- References
- Identifying In-App User Actions from Mobile Web Logs
- 1 Introduction
- 2 Related Work
- 3 Transaction Identification
- 3.1 Definition
- 3.2 Identifying Transactions
- 4 Experiment
- 4.1 Experiment Environment
- 4.2 Measurement Metrics
- 4.3 Parameters
- 4.4 Experiment Results
- 5 Conclusion
- References
- Harvesting Knowledge from Cultural Heritage Artifacts in Museums of India
- 1 Introduction
- 2 Related Work
- 3 Harvesting Data from MOI into CultKB
- 4 Evaluation of CultKB
- 5 Conclusion
- References
- Query-Based Automatic Training Set Selection for Microblog Retrieval
- 1 Introduction
- 2 Related Works
- 3 The Proposed Model
- 3.1 Determining Top-k Documents
- 3.2 Expansion Terms Selection
- 4 Experiments and Results
- 4.1 Datasets
- 4.2 Experiment Metrics
- 4.3 Parameter Tuning
- 4.4 Baseline Algorithms
- 4.5 Experimental Results
- 4.6 Discussions
- 5 Conclusions
- References
- Distributed Representation of Multi-sense Words: A Loss Driven Approach
- 1 Introduction
- 2 Definitions, Notations and Background
- 3 Prior Approaches for Dealing with Multi-sense Words
- 4 Loss Driven Multisense Identification (LDMI)
- 4.1 Identifying the Words with Multiple Senses
- 4.2 Clustering the Occurrences
- 4.3 Putting Everything Together
- 5 Experimental Methodology
- 5.1 Datasets
- 5.2 Evaluation Methodology and Metrics
- 6 Results and Discussion
- 6.1 Quantitative Analysis
- 6.2 Qualitative Analysis
- 7 Conclusion
- References
- Active Blocking Scheme Learning for Entity Resolution
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Active Scheme Learning Framework
- 4.1 Active Sampling
- 4.2 Active Branching
- 4.3 Algorithm Description
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Experimental Results
- 6 Conclusions
- References
- Mining Relations from Unstructured Content
- 1 Introduction
- 2 Related Work
- 3 Relation Classification
- 3.1 Models, Data Representations and Parameter Choices
- 3.2 Active Learning by Pruning
- 4 Experiments
- 4.1 Datasets
- 4.2 Fixed Active Learning Strategy VS Dynamic Selection
- 5 Conclusions and Future Work
- References
- Incorporating Word Embeddings into Open Directory Project Based Large-Scale Classification
- 1 Introduction
- 2 Preliminary
- 2.1 ODP-Based Knowledge Representation
- 2.2 Word2Vec
- 3 Joint Models of Explicit and Implicit Representation
- 3.1 Generating Category Vector with Algebraic Operation
- 3.2 Generating Category Vector with Embedding
- 4 Semantic Similarity Measure
- 4.1 Using Word-Level Semantics
- 4.2 Using Category- and Word-Level Semantics
- 5 Experiments
- 5.1 Datasets
- 5.2 Evaluation Metrics
- 5.3 Experimental Setup
- 5.4 Experimental Results
- 5.5 Analysis
- 6 Related Work
- 7 Conclusion
- References
- Inference of a Concise Regular Expression Considering Interleaving from XML Documents
- 1 Introduction
- 2 Preliminaries
- 3 Inference Algorithm GenICHARE
- 4 Experiments
- 4.1 Usage of ICHARE in Practice
- 4.2 Analysis of Inference Results Among Different Methods
- 5 Conclusion and Future Work
- References
- Spatial-Temporal, Time-Series and Stream Mining
- Accelerating Adaptive Online Learning by Matrix Approximation
- 1 Introduction
- 2 Related Work
- 3 Main Results
- 3.1 Problem Setting
- 3.2 The Proposed ADA-GP Method
- 3.3 The Proposed ADA-DP Method
- 4 Experiments
- 4.1 Online Convex Optimization
- 4.2 Non-convex Optimization in Convolutional Neural Networks
- 5 Conclusions and Future Work
- References
- Cruising or Waiting: A Shared Recommender System for Taxi Drivers
- 1 Introduction
- 2 Related Work
- 3 Definition and Problem
- 4 Online Shared Recommender System
- 4.1 Overview
- 4.2 Offline Optimal Policy Learning
- 4.3 Online Recommendation
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Parameter Tuning
- 5.3 Comparison Results
- 6 Conclusion
- References
- A Local Online Learning Approach for Non-linear Data
- 1 Introduction
- 2 Related Work
- 2.1 Global Methods
- 2.2 Local Methods
- 3 SCW Local Online Learning
- 3.1 Preliminary
- 3.2 Model
- 3.3 Online Clustering
- 3.4 Algorithm
- 4 Evaluation
- 4.1 Environment Setup
- 4.2 Results
- 5 Conclusion
- References
- Contextual Location Imputation for Confined WiFi Trajectories
- 1 Introduction
- 2 Related Work
- 2.1 Collaborative Filtering in Location Recommendation
- 2.2 Social Ties and Mobility Patterns
- 3 Problem Description
- 4 Matrix Factorization for Location Imputation
- 4.1 Location Imputation with Implicit Feedback
- 4.2 Contextual Imputation
- 4.3 Implicit Social Ties and GNMF
- 5 Experiments
- 5.1 Dataset and Experimental Setup
- 5.2 Results
- 6 Conclusion
- References
- Low Redundancy Estimation of Correlation Matrices for Time Series Using Triangular Bounds
- 1 Introduction
- 2 Related Work
- 3 Low Redundancy Estimation
- 3.1 Preliminaries
- 3.2 Problem Statement
- 4 COREQ
- 4.1 Approximations with Quality Guarantees
- 4.2 A Greedy Estimation Algorithm
- 4.3 Formal Relation to Clustering Algorithms
- 5 Empirical Evaluation
- 5.1 Experimental Setup
- 5.2 Quality of Estimates
- 5.3 Comparison with Existing Methods
- 6 Conclusion and Future Work
- References
- Traffic Accident Detection with Spatiotemporal Impact Measurement
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 4 Traffic Accident Classification with Traffic MTS
- 4.1 A Discrete Unsupervised Solution: IIG
- 4.2 Severity Features Based Solution (Severity-I)
- 5 Empirical Evaluation
- 5.1 Experiments Settings
- 5.2 Parameter Effects in Impact Based Approaches
- 5.3 Comparison of Different Approaches
- 6 Conclusions
- References
- MicroGRID: An Accurate and Efficient Real-Time Stream Data Clustering with Noise
- 1 Introduction
- 2 Related Work
- 3 The MicroGRID Approach
- 3.1 The Micro-cluster Structure
- 3.2 The Proposed Grid Clustering
- 4 Experiments and Analysis
- 5 Conclusion
- References
- UFSSF - An Efficient Unsupervised Feature Selection for Streaming Features
- 1 Introduction
- 2 Related Work
- 3 The UFSSF Approach
- 3.1 The UFSSF Model
- 4 Experimental Evaluation
- 5 Conclusion
- References
- Online Clustering for Evolving Data Streams with Online Anomaly Detection
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Methodology
- 4.1 Step 1 (Cluster Update Rule)
- 4.2 Step 2 (Detecting Emerging Clusters)
- 5 Evaluation
- 6 Conclusion and Future Directions
- References
- An Incremental Dual nu-Support Vector Regression Algorithm
- Abstract
- 1 First Section
- 1.1 A Subsection Sample
- 2 An Incremental Dual-v-SVR
- 2.1 The Formulation
- 2.2 KKT Conditions
- 2.3 Incremental and Decremental Adjustment
- 3 Experiment Result
- 3.1 Experiment Setup
- 3.2 Performance Evaluation
- 4 Concluding Remarks
- References
- Text Stream to Temporal Network - A Dynamic Heartbeat Graph to Detect Emerging Events on Twitter
- 1 Introduction
- 2 Preliminaries
- 3 Dynamic Heartbeat Graph (DHG)
- 3.1 Network Series
- 3.2 DHG Series
- 3.3 Event Detection Method
- 4 Experiment and Results
- 4.1 Evaluation
- 4.2 Dataset
- 4.3 Results
- 5 Conclusion
- References
- Model the Dynamic Evolution of Facial Expression from Image Sequences
- 1 Introduction
- 2 Related Work
- 3 Our Approach
- 3.1 Facial Appearance Convolutional Recurrent Network
- 3.2 Facial Geometry Recurrent Network
- 3.3 Integration Method
- 4 Experiments
- 4.1 Databases
- 4.2 Evaluation of the FACRN
- 4.3 Evaluation of the FGRN
- 4.4 Classification Results
- 5 Conclusion
- References
- Unsupervised Disaggregation of Low Granularity Resource Consumption Time Series
- 1 Introduction
- 2 Related Work
- 3 Model Formulation and Disaggregation Algorithms
- 3.1 Method Description
- 4 Evaluation
- 4.1 Baseline
- 4.2 Water Consumption Dataset
- 4.3 Energy Consumption Dataset
- 5 Conclusion
- References
- STARS: Soft Multi-Task Learning for Activity Recognition from Multi-Modal Sensor Data
- 1 Introduction
- 2 Preliminaries
- 3 Methodology
- 3.1 Multi-class Learning with Softmax Regression
- 3.2 Proposed Method: STARS
- 3.3 Optimization
- 4 Experimental Evaluation
- 4.1 Baseline Algorithms
- 4.2 Experimental Results
- 5 Related Work
- 6 Conclusion
- References
- A Refined MISD Algorithm Based on Gaussian Process Regression
- 1 Introduction
- 2 Related Work
- 3 Proposed Model
- 3.1 Hawkes Process
- 3.2 MISD
- 3.3 GP-MISD
- 4 Experiment
- 4.1 Synthetic Data
- 4.2 Real Data
- 5 Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.