
Advanced Data Mining and Applications

Content
- Intro
- Title page
- Preface
- Organization
- Table of Contents
- Generating Syntactic Tree Templates for Feature-Based Opinion Mining
- Introduction
- Related Work
- The Approach
- Syntactic Tree Template
- Building Syntactic Tree Template Lexicon
- Similarity Computation
- Lexicon Generation
- Experiments and Results
- Corpus
- Preprocessing Results
- Appraisal Expressions Extraction Experiments
- Conclusions and Future Work
- References
- Handling Concept Drift via Ensemble and Class Distribution Estimation Technique
- Introduction
- Class Distributions Estimation-Based Methods (CDE-Based Methods)
- Cost Sensitive Classifier (CSC)
- Class Distribution
- Positive-to-Negative Ratio (pr/nr)
- Distribution Mismatch Ratio (DMR)
- Class Distribution Estimation Based Method (CDE)
- CDE Oracle Method
- Our Methods
- CDE-EM-AVG-N
- CDE-EM-BM-N
- CDE-EM-AVGBM-M-N
- CDE-EM-EX-N
- Experiment Methodology
- Experiment Results
- Conclusion
- References
- HUE-Stream: Evolution-Based Clustering Technique for Heterogeneous Data Streams with Uncertainty
- Introduction
- Basic Concepts of Evolution-Based Stream Clustering with Uncertainty
- Tuple-Level and Dimension-Level Uncertainty
- Cluster Representation Using Fading Cluster Structure with Histogram
- Distance Functions
- Evolution-Based Stream Clustering
- The Algorithm
- Overview of HUE-Stream Algorithm
- Experimental Results
- Effectiveness Test
- Sensitivity Test
- Efficiency Test
- Conclusions
- References
- Hybrid Artificial Immune Algorithm and CMAC Neural Network Classifier for Supporting Business and Medical Decision Making
- Introduction
- Related Works
- MIMO CMAC NN Classifier
- Artificial Immune Algorithm
- Method
- Results
- Australian Credit Approval Credit Dataset
- Diabetes Dataset
- Conclusions
- References
- Improving Suffix Tree Clustering with New Ranking and Similarity Measures
- Introduction
- Related Work
- STHAC: Suffix Tree Hierarchical Agglomerative Clustering
- Step 1 - Document Cleaning
- Step 2 - Base Cluster Identification
- Step 3 - Clusters Ranking and Filtering
- Step 4 - Merging Clusters
- Step 5 - Cluster Cleaning
- Evaluation
- Comparison STHAC with STC and Other Algorithms
- Experimental Results
- Further Experiments
- Conclusions
- References
- Individual Doctor Recommendation Model on Medical Social Network
- Introduction
- The Method
- Doctor-Patient Relationships Mining via TPFG
- Features Extraction for Calculating AD-CDs
- Authority Degrees Sorting via Ranking SVM
- Individual Doctor Recommendation via Weighted Average Method
- Experiments and Evaluations
- Mining Patient-Doctor Relationships via TPFG
- Effectiveness of Ranking-SVM-Based Algorithm
- AD-CDs Sorting Evaluation
- Recommendation Performance
- Related Work
- Conclusions
- References
- Influence Maximizing and Local Influenced Community Detection Based on Multiple Spread Model
- Introduction
- Related Works
- Independent Cascade Model (ICM)
- Local Community Detection
- Multiple Spread Model (MSM)
- Influence Maximizing
- Approximation Guarantees
- Greedy Algorithm
- Greedy Algorithm Based on Community Detection
- Local Influenced Community Detection
- Experiments
- Data Sets
- Influence Maximizing
- Local Influenced Community
- Conclusion and Future Work
- References
- Interactive Predicate Suggestion for Keyword Search on RDF Graphs
- Introduction
- Problem Statement
- The Framework
- The Structures of Search Results
- Framework of Interactive Predicate Suggestion
- Relevance Evaluation
- Prioritized Propagation and Aggregation
- MPA
- PPA
- Interactive Predicate Suggestion
- Experimental Study
- Data Set and Query Set
- Quality Evaluation
- Performance Evaluation
- Related Work
- Conclusion
- References
- Intrinsic Dimension Induced Similarity Measure for Clustering
- Introduction
- Intrinsic Dimension Estimation
- Dimension Estimation Based on Local PCA
- Dimension Estimation Based on k-Nearest Neighbor Graphs
- Dimension Estimation Based on Maximum Likelihood
- New Similarity Measure for Clustering
- Intrinsic Dimension Feature
- Similarity Measure
- Experiments
- Artificial Data Sets
- Image Segmentation
- Conclusion
- References
- Learning to Make Social Recommendations: A Model-Based Approach
- Introduction
- Problem Definition
- Model Creation
- Representation of the Target Function
- Feature Selection and Construction
- Model Selection
- Candidate Generation
- Ranking
- Experimental Evaluation
- Experimental Setup
- Evaluation Metrics
- Experimental Results for Model Selection
- Experimental Results for Recommendation
- Conclusion
- References
- Microgroup Mining on TSina via Network Structure and User Attribute
- Introduction
- Related Work
- Data Set
- Characterizing Microgroup
- Basic Analysis
- Assortativity Coefficient
- Density Difference
- Attribute Similarity
- United Microgroup Detection Method
- Experiments and Results
- Conclusion
- References
- Mining Good Sliding Window for Positive Pathogens Prediction in Pathogenic Spectrum Analysis
- Introduction
- Related Works
- Traditional Time Series Prediction Methods
- GEP-Based Time Series Prediction
- Sliding Window Mining
- Sub-sliding Windows Enumeration
- Finding the Best Sliding Window
- Experimental Study
- Real-World Positive Pathogens Prediction
- Synthetic Data Prediction
- Discussions and Conclusions
- References
- Mining Patterns from Longitudinal Studies
- Introduction
- Background
- Trees
- Tree Pattern Mining
- Longitudinal Data Analysis
- Tree-Based Representation Schemes for Longitudinal Data
- Timestamp-Based Representation
- Shallow Time-Based Representation
- Deep Time-Based Representation
- Variable-Based Representation
- Identifying Frequent Patterns in Longitudinal Studies
- Induced Patterns in Longitudinal Studies
- Embedded Patterns in Longitudinal Studies
- Choosing the Right Representation Scheme
- Conclusions
- References
- Mining Top-K Sequential Rules
- Introduction
- Problem Definition and Related Work
- The TopSeqRules Algorithm
- The Algorithm
- Implementing TopSeqRules Efficiently
- Extensions
- Evaluation
- Influence of k
- Influence of minconf
- Influence of |S|
- Performance Comparison
- Influence of Optimizations and of Expanding the Most Promising Rules First
- Conclusion
- References
- Mining Uncertain Data Streams Using Clustering Feature Decision Trees
- Introduction
- Related Works
- VFDT and CFDTu
- Uncertain Data Models
- VFDT
- CF Vector
- CFDTu Overview
- Uncertain Clustering
- Heuristic Evaluation Function
- Functional Tree Leaves
- Experimental Results
- Datasets and Experiment Setup
- Classification Accuracy, Runtime, and Memory Usage
- Runtime and Scalability
- Conclusions
- References
- Multi-view Laplacian Support Vector Machines
- Introduction
- Multi-view Laplacian SVMs (MvLapSVM)
- Manifold Regularization
- Multi-view Regularization
- MvLapSVM
- Optimization
- Theoretical Analysis
- Background Theory
- The Generalization Error of MvLapSVM
- The Empirical Rademacher Complexity $\hat{R}_l(\mathcal{G})$
- Experiments
- Two-Moons-Two-Lines Synthetic Data
- Image-Text Classification
- Web Page Categorization
- Conclusion
- References
- New Developments of Determinacy Analysis
- Introduction to DA and the Problem Statement
- Determination and Its Characteristics
- Example
- Problems
- New Approach
- Basis of the New Approach
- Description of the Algorithm
- Example
- Conclusion
- References
- On Mining Anomalous Patterns in Road Traffic Streams
- Introduction
- Background
- Statistical Background
- Proposed Framework
- Preliminaries
- Statistical Models
- Upper-Bounding Strategy and Pruning Mechanism for Proposed Framework
- Upper-Bounding Strategy
- Precomputation and Pruning Mechanism
- Experiments, Results and Analysis
- Evaluations on Synthetic Data
- Case Study: Beijing Taxi GPS Data
- Related Work
- Conclusions
- References
- Ontology Guided Data Linkage Framework for Discovering Meaningful Data Facts
- Introduction
- Related Work
- Ontology Guided Data Linkage (OGDL) Framework
- Data Uncertainties Analysis
- Multi-layer Cluster Formation
- Multi-faceted Cluster-Mapping
- Generating Global Schema
- Empirical Evaluation
- Conclusion and Future Work
- References
- Predicting New User's Behavior in Online Dating Systems
- Introduction
- Problem Definition and Basic Prediction Methods
- Algorithm BehvPred
- Clustering Users
- Assigning Users to Groups
- Combining Group Behavior
- Algorithm
- Experiments
- Dataset
- Evaluation Criteria
- Classification Method and Collaborative Filtering Method
- Comparison and Discussion
- Related Works
- Conclusions
- References
- Sequential Pattern Mining from Stream Data
- Introduction
- Basic Notions
- Algorithm PrefixSpan
- SS-BE: Stream Sequence Miner Using Bounded Error
- Algorithm SS-BE2
- Further Extensions: SS-LC and SSLC2
- Experimental Results
- Experiment 1 (Variable Support Threshold)
- Experiment 2 (Variable Number of Patterns)
- Conclusions
- References
- Social Influence Modeling on Smartphone Usage
- Introduction
- Modeling Inter-personal Influence
- Asymmetric Relationship
- Deriving Influence Factors from Observations
- Latent Group Model
- Data and Experiments
- Collected Data
- Evaluation Process
- Latent Structure by Matrix Factorization
- Predictive Performance Comparison
- Discussion
- Influence / Influencee Factor on Individuals
- Conclusion
- References
- Social Network Inference of Smartphone Users Based on Information Diffusion Models
- Introduction
- Inference of a Social Network Based on Information Diffusion
- Preliminaries
- Modeling User Behavior on Application Adoption
- Estimation of User Influence
- Extension
- Related Work
- Experiments
- Baseline Methods and Evaluation Measures
- Results
- Conclusion
- References
- Support Vector Regression with A Priori Knowledge Used in Order Execution Strategies Based on VWAP
- Introduction
- Introduction to ε-SVR and SVC
- Introduction to δ-SVR
- Introduction to Detractors for SVC, ε-SVR and δ-SVR
- Introduction to Volume Participation Strategy
- Volume Participation Strategy
- Errors for Volume Participation Strategy
- Predicting Volume Participation
- Incorporating A Priori Knowledge about Prices
- Detractors in Practice
- Experiments
- Prediction Performance and (18) Comparison
- Final Execution Error Comparison for A Priori Knowledge
- Conclusions
- References
- Terrorist Organization Behavior Prediction Algorithm Based on Context Subspace
- Introduction
- Characteristic of the Context Dataset
- Preliminaries of Spectral Clustering
- Prediction Algorithm Based on the Context Subspace
- Extract the Context Subspace
- Prediction Algorithm Based on the Extracted Subspace
- Experiment and Analysis
- Analysis Based on the Artificial Context Dataset
- Analysis Based on the MAROB Dataset
- Conclusion
- References
- Topic Discovery and Topic-Driven Clustering for Audit Method Datasets
- Introduction
- Related Work
- Background
- Audit Method
- System of Audit Methods
- Preparation of Audit Methods
- Document Representation
- Topic-Driven Clustering
- Goals
- Method
- Topics Induced by LDA and Domain Experts
- Topic Discovery
- Goals
- Method
- Experiments
- Metric
- Topics Discovered by LDA
- Comparison of Various Clustering Methods
- Conclusion
- References
- Transportation Modes Identification from Mobile Phone Data Using Probabilistic Models
- Introduction
- Preliminary and Problem Statement
- Proposed Approaches
- General Identification Formula
- SDL-Based Approach
- CPT-Based Approach
- Experiment Setting
- Results
- The Low-Speed Bias Index Variation
- The Mean of the Driving Speed Variation
- The Ambiguous Parameter Variation
- Related Works
- Conclusion and Discussions
- References
- User Graph Regularized Pairwise Matrix Factorization for Item Recommendation
- Introduction
- Matrix Factorization Based on Bayesian Personalized Ranking
- Bayesian Personalized Ranking
- Matrix Factorization Based on BPR (BPR-MF)
- User Graph Regularized Pairwise Matrix Factorization
- User Graph Regularization
- Integration of PMF and User Graph Regularization
- Experiments
- Datasets
- Evaluation Criteria
- Comparison Settings
- Results
- Impact of Parameters
- Conclusion
- References
- Using Predicate-Argument Structures for Context-Dependent Opinion Retrieval
- Introduction
- Related Work
- Problem Formulation
- The Natural Language Approach
- Grammatical Tree Derivations
- Predicate-Argument Relations
- The Subjective Component Score
- Context-Dependent Opinion Relevance
- Constructing a Relevance Function
- Transformed Terms Similarity (TTS)
- A Linear Relevance Model
- Evaluation and Experiment
- Evaluation Using Cross-Entropy
- Opinion Retrieval Task
- Conclusions
- References
- XML Document Clustering Using Structure-Preserving Flat Representation of XML Content and Structure
- Introduction
- Problem Background
- Method Description
- Conversion of Tree-Structured Database to Flat Representation
- Characteristics/Implications of the Proposed Approach
- Experimental Evaluation
- Experimental Methodology
- Data
- Results
- Conclusions
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The file format PDF always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). Unfortunately, on the small screens of e-readers or smartphones, PDFs are rather annoying, requiring too much scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.