
Advances in Knowledge Discovery and Data Mining

Content
- Title
- Preface
- Organization
- Table of Contents
- Graph Mining
- Spectral Analysis of k-Balanced Signed Graphs
- Introduction
- Notation
- The Spectral Property of k-Balanced Graph
- Non-negative Block-Wise Diagonal Matrix
- A General Perturbation Result
- Moderate Inter-community Edges
- Increase the Magnitude of Inter-community Edges
- Unbalanced Signed Graph
- Evaluation
- Synthetic Balanced Graph
- Synthetic Unbalanced Graph
- Comparison with Laplacian Spectrum
- Related Work
- Conclusion
- References
- Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementation
- Introduction
- Discoveries
- Spotting Near-Cliques
- Triangle Counting
- Background - Sequential Algorithms
- Proposed Method
- Summary of the Contributions
- Careful Algorithm Choice
- Selective Parallelization
- Blocking
- Exploit Skewness: Matrix-Vector Multiplication
- Exploiting Skewness: Matrix-Matrix Multiplication
- Analysis
- Performance
- Scalability
- Optimizations
- Related Works
- Conclusion
- References
- LGM: Mining Frequent Subgraphs from Linear Graphs
- Introduction
- Preliminaries
- Enumeration of Linear Subgraphs
- Reduction Map
- Frequent Pattern Mining
- Complexity Analysis
- Experiments
- Motif Extraction from Protein 3D Structures
- Conclusion
- References
- Efficient Centrality Monitoring for Time-Evolving Graphs
- Introduction
- Problem Motivation
- Related Work
- Preliminary
- Centrality Monitoring
- Ideas Behind Sniper
- Node Aggregation
- Tree Estimation
- Search Algorithm
- Extension
- Directed or Weighted Graphs
- Other Types of Queries
- Theoretical Analysis
- Experimental Evaluation
- Efficiency of Sniper
- Exactness of the Search Results
- Conclusions
- References
- Graph-Based Clustering with Constraints
- Introduction
- Our Contributions
- Relevant Background
- Preliminaries
- Definitions
- Graph-Based Hierarchical Clustering
- Constrained Graph-Based Clustering
- Embedding Constraints
- The Proposed Algorithm
- Experimental Results
- Conclusion
- References
- Social Network/Online Analysis
- A Partial Correlation-Based Bayesian Network Structure Learning Algorithm under SEM
- Introduction
- PCB Algorithm
- Restrict Step
- Search Step
- Time Complexity of PCB Algorithm
- Experimental Results
- Networks, Datasets and Measures of Performance
- Experimental Results and Analyses
- Conclusions and Future Work
- References
- Predicting Friendship Links in Social Networks Using a Topic Modeling Approach
- Introduction
- Related Work
- Topic Modeling and Latent Dirichlet Allocation (LDA)
- System Architecture
- Interest Based Features
- Graph Based Features
- Experimental Design and Results
- Dataset Description and Preprocessing
- Experiments
- Results
- Summary and Discussion
- References
- Info-Cluster Based Regional Influence Analysis in Social Networks
- Introduction
- Related Work
- Frameworks
- Modeling the Social Network Data
- Problem Formulation
- Framework of Our Solutions
- Algorithms
- Clustering
- Info-Cluster Detection
- Experiments
- Conclusion
- References
- Utilizing Past Relations and User Similarities in a Social Matching System
- Introduction
- The Proposed Social Matching Method
- Empirical Analysis
- Dataset: The Online Dating Network
- Evaluation Criteria
- Results and Discussion
- Conclusion
- References
- On Sampling Type Distribution from Heterogeneous Social Networks
- Introduction
- Related Work
- Problem Statement
- Sampling Algorithms
- Random-Based Sampling
- Chain-Referral Sampling
- Respondent-Driven Sampling
- Evaluation
- Twitter Data Sets
- Evaluation Index
- Results of the Type Distribution Preserving Goal
- Results of the Intra-Relationship Preserving Goal
- Analysis on the Effects of the Number of Groups and the Sample Size
- Conclusion
- References
- Ant Colony Optimization with Markov Random Walk for Community Detection in Graphs
- Introduction
- Algorithm
- The Main Idea
- Algorithm Description
- Parameter Setting
- Experiments
- Computer-Generated Networks
- Real-World Networks
- Parameters Analysis
- Conclusions and Future Work
- References
- Time Series Analysis
- Faster and Parameter-Free Discord Search in Quasi-Periodic Time Series
- Introduction
- Time Series Discords
- Direct Discord Search
- Efficient Way to Estimate
- Quasi-Periodic Time Series
- Implementation of the Search Strategy
- Empirical Evaluation
- Conclusions and Future Work
- References
- INSIGHT: Efficient and Effective Instance Selection for Time-Series Classification
- Introduction
- Related Work
- Score Functions in INSIGHT
- The Hubness Property
- Score Functions Based on Hubness
- Coverage and Instance Selection
- Coverage Graphs for Instance-Based Learning Methods
- 1-NN Coverage Graphs
- Experiments
- Conclusion and Outlook
- References
- Multiple Time-Series Prediction through Multiple Time-Series Relationships Profiling and Clustered Recurring Trends
- Introduction
- Global and Local Model of Time-Series Prediction
- Local Modeling of Multiple Time-Series Data
- Extracting Profiles of Relationship of Multiple Time-Series
- Clustering Recurring Trends of a Time-Series
- Knowledge Repository and Multiple Time-Series Prediction
- Experiments and Evaluation of Results
- New Zealand Climate Data
- Prediction Accuracy Evaluation
- Conclusion and Future Work
- References
- Probabilistic Feature Extraction from Multivariate Time Series Using Spatio-Temporal Constraints
- Introduction
- Related Work
- Methodology
- Gaussian Process Latent Variable Model
- Spatio-Temporal Gaussian Process Latent Variable Model
- Validation of ST-GPLVM Approach
- Application of ST-GPLVM to Activity Recognition
- Conclusion
- References
- Sequence Analysis
- Real-Time Change-Point Detection Using Sequentially Discounting Normalized Maximum Likelihood Coding
- Introduction
- Motivation
- Purpose and Significances of This Paper
- Related Works
- Sequentially Discounting Normalized Maximum Likelihood Coding
- Proposed Method
- Empirical Evaluation for Artificial Datasets
- Methods to Be Compared
- Discontinuous Change-Point Detection
- Continuous Change-Point Detection
- Applications to Malware Detection
- Conclusion
- References
- Compression for Anti-Adversarial Learning
- Introduction
- Context-Based Data Compression Model-Prediction by Partial Matching
- Compression-Based Classification and Adversarial Attacks
- Robust Classification via Subsequence Differentiation
- Experimental Results
- Concluding Remarks
- References
- Mining Sequential Patterns from Probabilistic Databases
- Introduction
- Problem Statement
- Computing Expected Support
- Optimizations
- Candidate Generation
- Experimental Evaluation
- Conclusions and Future Work
- References
- Large Scale Real-Life Action Recognition Using Conditional Random Fields with Stochastic Training
- Introduction
- Related Work and Motivations
- Conditional Random Fields
- Stochastic Gradient Descent
- Averaged SGD with Feedback
- Experiments and Discussion
- How to Design and Implement Good Features
- Experimental Setting
- Results and Discussion
- A Challenge in Real-Life Action Recognition: Axis Rotation
- Conclusions and Future Work
- References
- Packing Alignment: Alignment for Sequences of Various Length Events
- Introduction
- Packing Alignment of Event Sequences
- Properties of Packing Alignment
- Relation to Edit Distance
- Comparison with General String Alignment
- Comparison with DTW
- Gap Constraint
- Experiments
- Frequent Approximate Pattern Extraction
- Running Time Comparison
- Musical Variation Extraction Experiment
- Concluding Remarks
- References
- Outlier Detection
- Multiple Distribution Data Description Learning Algorithm for Novelty Detection
- Introduction
- Single Hypersphere Approach: SVDD
- Proposed Multiple Hypersphere Approach
- Problem Formulation
- Calculating Radii, Centres and Slack Variables
- Calculating Membership U
- Iterative Learning Process
- Experimental Results
- Conclusion
- References
- RADAR: Rare Category Detection via Computation of Boundary Degree
- Introduction
- Problem Formalization
- RADAR Algorithm
- Working Principle
- Algorithm
- Performance Evaluation
- Synthetic Data Sets
- Real Data Sets
- Conclusion
- References
- RKOF: Robust Kernel-Based Local Outlier Detection
- Introduction
- Main Framework
- Robust Kernel-Based Outlier Factor (RKOF)
- Choice of Kernel Functions
- Robustness and Computation Complexity of RKOF
- Experiments
- Synthetic Data
- Real Data
- Conclusions
- References
- Chinese Categorization and Novelty Mining
- Introduction
- Related Work
- Preprocessing for English and Chinese
- English
- Chinese
- Categorization
- Novelty Mining
- Mixed Metric on Chinese Novelty Mining
- Evaluation Measures
- Novelty Evaluation Measures
- Experiments and Results
- Dataset
- Effect of Preprocessing Rules on Chinese Novelty Mining
- Chinese Novelty Mining Using Mixed Metric
- Categorization in English and Chinese
- Novelty Mining Based on Categorization
- Conclusion
- References
- Finding Rare Classes: Adapting Generative and Discriminative Models in Active Learning
- Introduction
- Adaptive Active Learning
- Active Learning
- Generative-Discriminative Model Pairs
- Combining Active Query Criteria
- Adaptive Selection of Classifiers
- Experiments
- Conclusion
- References
- Imbalanced Data Analysis
- Margin-Based Over-Sampling Method for Learning from Imbalanced Datasets
- Introduction
- Related Works
- Large Margin Principle Analysis for Over-Sampling
- The Margin-Guided Synthetic Over-Sampling Algorithm
- Experiment Study
- Synthetic Datasets
- Real World Problems
- Conclusion and Future Work
- References
- Improving k Nearest Neighbor with Exemplar Generalization for Imbalanced Classification
- Introduction
- Related Work
- Main Ideas
- Pivot Positive Instances
- k Exemplar-Based Nearest Neighbor Classification
- Experiments
- Performance Evaluation Using AUC
- The ROC Convex Hull Analysis
- The Impact of Confidence Level on kENN
- Conclusions
- References
- Sample Subset Optimization for Classifying Imbalanced Biological Data
- Introduction
- Ensemble System
- Sample Subset Optimization
- Formulation of Sample Subset Optimization
- Analysis of Behavior
- Base Classifier and Fitness Function
- Base Classifier of Support Vector Machine
- Fitness Function
- Experimental Results
- Datasets
- Performance Comparison
- Conclusion
- References
- Class Confidence Weighted kNN Algorithms for Imbalanced Data Sets
- Introduction
- Existing kNN Classifiers
- Handling Imbalanced Data
- CCW Weighted kNN
- Justification of CCW
- Estimations of CCW Weights
- Mixture Models
- Bayesian Networks
- Experiments and Analysis
- Comparisons among NN Algorithms
- Comparisons among kNN Algorithms
- Effects of Distance Metrics
- Conclusions and Future Work
- References
- Agent Mining
- Multi-agent Based Classification Using Argumentation from Experience
- Introduction
- Argumentation-Based Multi Agent Classification: The PISA Framework
- PISA Dynamic CAR Mining
- Applications of PISA
- Application 1: PISA-Based Classification
- Application 2: PISA-Based Ordinal Classification
- Application 3: PISA-Based Solution to the Imbalanced Class Problem
- Conclusions
- References
- Agent-Based Subspace Clustering
- Introduction
- Background and Related Works
- Agent-Based Subspace Clustering
- Problem Statement
- Agent-Based Subspace Clustering
- Algorithm
- Experiments
- Data and Evaluation Criteria
- Experimental Results
- A Case Study
- Conclusion and Future Work
- References
- Evaluation (Similarity, Ranking, Query)
- Evaluating Pattern Set Mining Strategies in a Constraint Programming Framework
- Introduction
- Pattern Set Mining Task
- Constraint Programming Framework
- Constraint Programming Notation
- Two-Step Pattern Set Mining
- One-Step Pattern Set Mining
- Experiments
- Two-Step Pattern Set Mining
- One-Step Pattern Set Mining
- Conclusions
- References
- Asking Generalized Queries with Minimum Cost
- Introduction
- Related Work
- Algorithm for Asking Generalized Queries
- Constructing Generalized Queries
- Updating Learning Model
- Empirical Study
- Experimental Configurations
- Results for Balancing Acc./Cost Trade-Off
- Results for Minimizing Total Cost
- Approximate Probabilistic Answers
- Conclusion
- References
- Ranking Individuals and Groups by Influence Propagation
- Introduction
- The Motivation
- Ranking Nodes and Groups
- IPRanking Nodes
- IPRanking Groups
- Experimental Study
- Related Work
- Conclusion
- References
- Dynamic Ordering-Based Search Algorithm for Markov Blanket Discovery
- Introduction
- Background and Notations
- DOS Algorithm
- Algorithm Formulation
- Theoretical Analysis
- Experimental Results
- Related Work
- Conclusion and Future Work
- References
- Mining Association Rules for Label Ranking
- Introduction
- Label Ranking
- Association Rules Mining
- Pruning
- Class Association Rules
- Association Rules for Label Ranking
- Similarity-Based Support and Confidence
- APRIORI-LR Algorithm
- Parameter Tuning
- Experimental Results
- Results
- Conclusions
- References
- Tracing Evolving Clusters by Subspace and Value Similarity
- Introduction
- Related Work
- A Novel Tracing Model
- Tracing of Behavior Types
- Cluster Distance Measure
- Clustering for Improved Tracing Quality
- Experiments
- Conclusion
- References
- An IFS-Based Similarity Measure to Index Electroencephalograms
- Introduction
- Background
- Fractal Interpolation
- K-Medoid Clustering
- Choice of Number of Clusters
- An IFS-Based Similarity Measure
- Fractal Interpolation Step
- Fractal Dimensions Estimation
- Similarity Matrix Computation
- Description of the Dataset and Experiments
- Results
- Conclusion
- References
- DISC: Data-Intensive Similarity Measure for Categorical Data
- Introduction
- Key Contributions
- Related Work
- Problem Formulation
- DISC Algorithm
- Motivation and Design
- Data Structure Description
- Algorithm Overview
- DISC Computation
- Validity of Similarity Measure
- Experimental Study
- Pre-processing and Experimental Settings
- Experimental Results
- Discussion of Results
- Conclusion
- References
- ListOPT: Learning to Optimize for XML Ranking
- Introduction
- Related Work
- Learning-to-Rank
- Ranking Function BM25
- ListOPT: A Learning-to-Optimize Approach
- BM25 in XML Retrieval
- Training Process
- Loss Functions
- Cosine Similarity
- Euclidean Distance
- Cross Entropy
- Experiment
- Data Collection
- Effect of BM25 Tuning
- Number of Training Queries
- Conclusions and Future Work
- References
- Item Set Mining Based on Cover Similarity
- Introduction
- Frequent Item Set Mining
- Jaccard Item Sets
- The Eclat Algorithm
- The JIM Algorithm (Jaccard Item Set Mining)
- Other Similarity Measures
- Experiments
- Conclusions
- References
- Applications
- Learning to Advertise: How Many Ads Are Enough?
- Introduction
- Problem Definition
- Data Insight Analysis
- Data Set
- Position vs. Click-Through Rate (CTR)
- Click Entropy
- Ad Ranking and Number Prediction
- Basic Idea
- Learning Algorithm
- Feature Definition
- Experimental Results
- Evaluation, Baselines and Experiment Setting
- Results and Analysis
- Feature Contribution Analysis
- Related Work
- Conclusion
- References
- TeamSkill: Modeling Team Chemistry in Online Multi-player Games
- Introduction
- Related Work
- Proposed Approaches
- TeamSkill-K
- TeamSkill-AllK
- TeamSkill-AllK-EV
- TeamSkill-AllK-LS
- Dataset
- Experimental Analysis
- Findings and Analysis
- Discussion
- Conclusions
- References
- Learning the Funding Momentum of Research Projects
- Introduction
- Basic Models of Topic Popularity
- Modeling Funding Momentum
- Related Work
- Funding Momentum
- Technical Analysis Indicators of Momentum
- Funding Momentum for Research Topics
- Funding Momentum of Research Projects: Percentage Model
- Methods
- Experimental Results
- Analyzing Bursts for Research Topics and Projects
- Experimental Validation of the Funding Momentum Definition
- Prediction of Funding Momentum for Research Topics
- Prediction of Funding Momentum for Projects
- Conclusion and Future Work
- References
- Local Feature Based Tensor Kernel for Image Manifold Learning
- Introduction
- Tensor Kernels Built on Local Features
- Manifold Learning with Twin Kernel Embedding
- Twin Kernel Embedding
- Manifold Learning Process
- Experimental Results
- COIL 3D Images
- Frey Faces
- Handwritten Digits
- Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.