
Advanced Data Mining and Applications
Description
This book constitutes the proceedings of the 12th International Conference on Advanced Data Mining and Applications, ADMA 2016, held in Gold Coast, Australia, in December 2016.
The 70 papers presented in this volume were carefully reviewed and selected from 105 submissions. The selected papers covered a wide variety of important topics in the area of data mining, including parallel and distributed data mining algorithms, mining on data streams, graph mining, spatial data mining, multimedia data mining, Web mining, the Internet of Things, health informatics, and biomedical data mining.

Content
- Intro
- Preface
- Organization
- Contents
- Spotlight Research Papers
- Effective Monotone Knowledge Integration in Kernel Support Vector Machines
- 1 Introduction
- 2 Background
- 2.1 Partially Monotone Classification
- 2.2 Monotone Support Vector Machines
- 3 Partially Monotone Support Vector Machines
- 3.1 PM-SVM Technique
- 3.2 Measurement of Partial Monotonicity
- 4 Experiments and Datasets
- 5 Results and Discussion
- 6 Conclusions
- References
- Textual Cues for Online Depression in Community and Personal Settings
- 1 Introduction
- 2 Method
- 2.1 Datasets
- 2.2 Feature Sets
- 2.3 Statistical Testing
- 2.4 Classification
- 3 Depression in Community and Personal Settings
- 3.1 Affective Information
- 3.2 Psycho-Linguistic Features
- 3.3 Topical Representation
- 4 Classification
- 4.1 Performance
- 4.2 Linguistic Features as the Predictors
- 4.3 Topics as the Predictors
- 5 Limitation and Further Research
- 6 Conclusion
- References
- Confidence-Weighted Bipartite Ranking
- 1 Introduction
- 2 Related Work
- 3 Online Confidence-Weighted Bipartite Ranking
- 3.1 Problem Setting
- 3.2 Update Buffer
- 3.3 Update Ranker
- 4 Experimental Results
- 4.1 Real World Datasets
- 4.2 Compared Methods and Model Selection
- 4.3 Results on Benchmark Datasets
- 4.4 Results on High-Dimensional Datasets
- 5 Conclusions and Future Work
- References
- Mining Distinguishing Customer Focus Sets for Online Shopping Decision Support
- 1 Introduction
- 2 Problem Definition
- 3 Related Work
- 3.1 Recommendation
- 3.2 Opinion Mining
- 3.3 Contrast Mining
- 4 Design of dFocus-Miner
- 4.1 Candidate Customer Focus Generation
- 4.2 Customer Focus Selection
- 4.3 Mining Top-k Distinguishing Customer Focus Sets
- 5 Empirical Evaluation
- 5.1 Case Study for Effectiveness Evaluation
- 5.2 Efficiency Evaluation
- 6 Conclusions
- References
- Community Detection in Networks with Less Significant Community Structure
- 1 Introduction
- 2 Background: LPA, LPAm and LPAm+
- 2.1 LPA
- 2.2 LPAm
- 2.3 LPAm+
- 3 Meta-LPAm+
- 4 Experimental Results
- 5 Conclusions
- References
- Prediction-Based, Prioritized Market-Share Insight Extraction
- 1 Introduction
- 2 Review of Time-Series Predictors
- 3 Our Solution
- 3.1 Analytics Engine
- 3.2 Display and Interactivity
- 4 Surprise Factor
- 4.1 Logit Transform
- 5 Experiments
- 5.1 Synthetic Data
- 5.2 Real-World Data
- 6 Conclusion
- References
- Interrelationships of Service Orchestrations
- 1 Introduction
- 2 Related Work
- 3 Interrelationships of Service Orchestrations
- 3.1 Service Orchestration Discovery by Topic Modeling
- 3.2 Proposed Model for Interrelationship Discovery
- 4 Experiments
- 4.1 Service Orchestration Discovery by Topic Modeling
- 4.2 Proposed Model for Interrelationship Discovery
- 4.3 Performance Analysis
- 5 Conclusions
- References
- Outlier Detection on Mixed-Type Data: An Energy-Based Approach
- 1 Introduction
- 2 Related Work
- 3 Mixed-Type Outlier Detection
- 3.1 Density Estimation for Mixed Data
- 3.2 Mixed-Variate Restricted Boltzmann Machines
- 3.3 Outlier Detection on Mixed-Type Data
- 4 Experiments
- 4.1 Synthetic Data
- 4.2 Real Data
- 5 Discussion
- References
- Low-Rank Feature Reduction and Sample Selection for Multi-output Regression
- 1 Introduction
- 2 Preliminary
- 3 Method
- 3.1 LFR_SS Algorithm
- 3.2 Optimization
- 3.3 Proving of the Convergence
- 4 Experiments
- 4.1 Datasets and Comparison Algorithms
- 4.2 Experimental Settings
- 4.3 Regression Results
- 5 Conclusion
- References
- Biologically Inspired Pattern Recognition for E-nose Sensors
- 1 Introduction
- 2 Related Work
- 3 Methodology
- 3.1 Problem Definition
- 3.2 E-nose Olfactory System
- 3.3 E-nose Adaptation with AORC
- 4 Experiments and Evaluation
- 4.1 E-nose Data
- 4.2 Synthetic Data Set
- 4.3 Analysis and Results
- 5 Conclusions
- References
- Addressing Class Imbalance and Cost Sensitivity in Software Defect Prediction by Combining Domain Costs and Balancing Costs
- 1 Introduction
- 1.1 Main Contributions of This Study
- 2 Related Work
- 2.1 Measuring Source Code
- 2.2 Sampling Techniques
- 2.3 Classification Methods
- 2.4 Cost-Sensitive Classification for Class Imbalance Treatment
- 3 Our Framework: BCF
- 3.1 Step 1: Generation of Class Specific Clusters (CSCs)
- 3.2 Step 2: Calculation of Record Specific Balancing Costs
- 3.3 Step 3: Combination of Domain Costs and Balancing Costs
- 3.4 Step 4: Cost-Sensitive Classification Using Modified CSForest
- 4 Experiments
- 4.1 Experimental Setup
- 4.2 Results and Discussion
- 4.3 Extracted Knowledge
- 5 Conclusion
- References
- Unsupervised Hypergraph Feature Selection with Low-Rank and Self-Representation Constraints
- 1 Introduction
- 2 Related Work
- 3 Approach
- 3.1 Notations
- 3.2 Method
- 3.3 Optimization
- 4 Experiments
- 4.1 Experimental Settings
- 4.2 Parameter Sensitivity
- 4.3 Experimental Results
- 5 Conclusion
- References
- Improving Cytogenetic Search with GPUs Using Different String Matching Schemes
- 1 Introduction
- 2 Background
- 2.1 Related Work
- 3 Algorithms
- 3.1 Parallel Brute-Force String Matching
- 3.2 Parallel FA String Matching
- 3.3 Application to Triple Inference
- 4 Experiments
- 4.1 MESH 2016 Benchmark
- 4.2 Bio2RDF Benchmark
- 5 Conclusion and Future Work
- References
- CEIoT: A Framework for Interlinking Smart Things in the Internet of Things
- 1 Introduction
- 2 The CEIoT Approach
- 2.1 Correlation Discovery Process
- 2.2 Framework Architecture and System Entities
- 2.3 Correlation Extraction
- 2.4 Correlation Integration
- 2.5 Correlation Representation
- 3 Experimental Results
- 3.1 System Performance
- 3.2 Things Correlation Graph
- 3.3 Message Volume
- 4 Related Work
- 5 Conclusion
- References
- Adopting Hybrid Descriptors to Recognise Leaf Images for Automatic Plant Specie Identification
- 1 Introduction
- 2 Related Work
- 3 Research Problem
- 4 Descriptors
- 4.1 Global Feature Extraction
- 4.2 Local Feature Extraction
- 4.3 Hybrid Descriptor
- 5 Experimental Evaluation
- 5.1 Experimental Design
- 5.2 Experimental Result Analysis
- 6 Conclusions
- References
- Efficient Mining of Pan-Correlation Patterns from Time Course Data
- 1 Introduction
- 2 Problem Formulation
- 2.1 Correlation Patterns: Definitions
- 2.2 Unified Representation of All Correlation Patterns
- 3 Mining Algorithms
- 3.1 Transform Time-Course Data Set M into Sequential Transaction Data Set S
- 3.2 Opposite Mirror Copy of S
- 3.3 Mine Frequent Closed Sequential Value Movements in S'
- 3.4 Opposite Mirror Copy Causes Redundancy in Patterns
- 3.5 Parameter Setting
- 3.6 An Illustrative Example
- 4 Performance Evaluation and Application
- 4.1 Efficiency and Scalability Results on Synthetic Data Sets
- 4.2 Application in Time-Course Gene Expression Data
- 5 Conclusion
- References
- Recognizing Daily Living Activity Using Embedded Sensors in Smartphones: A Data-Driven Approach
- 1 Introduction
- 2 System Overview
- 2.1 Built-In Sensors
- 2.2 Defining Activity List
- 3 Methodology
- 3.1 Collection of Training Data
- 3.2 Classification Algorithms
- 4 Evaluation
- 4.1 Comparison of Different Methods
- 4.2 Optimal Selection of Parameters
- 4.3 Development of Real-Time HAR System
- 5 In-situ Experiments
- 6 Related Work
- 6.1 Wearable Sensors Based HAR
- 6.2 Environmental Sensors Based HAR
- 6.3 Smartphone Based HAR
- 7 Conclusion
- References
- Dynamic Reverse Furthest Neighbor Querying Algorithm of Moving Objects
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 3.1 Uncertain Moving Object Model
- 3.2 Dynamic RFN Query Algorithm
- 4 Experiments
- 4.1 Evaluation of DRFN
- Evaluation: P 8 RFN and P 9 RFN
- 5 Conclusion and Future Work
- References
- Research Papers
- Relative Neighborhood Graphs Uncover the Dynamics of Social Media Engagement
- 1 Introduction and Background
- 2 Methodology
- 3 Results
- 4 Discussion and Conclusion
- References
- An Ensemble Approach for Better Truth Discovery
- 1 Introduction
- 2 Related Work
- 3 Problem Formulation
- 4 Ensemble Approaches
- 4.1 Feasibility Analysis
- 4.2 Parallel Model
- 4.3 Serial Model
- 5 Experiments
- 5.1 Experimental Setup
- 5.2 Experiments on Real-World Datasets
- 5.3 Experiments on Synthetic Datasets
- 5.4 Impact of Method Numbers on Serial Ensemble Model
- 6 Conclusion
- References
- Single Classifier Selection for Ensemble Learning
- 1 Introduction
- 2 Definitions of Accurate and Diverse Classifiers for Ensemble
- 2.1 Accurate Classifier for Ensemble
- 2.2 Diverse Classifier for Ensemble
- 3 Picking Up Single Classifiers for Ensemble
- 4 Experimental Study
- 4.1 Benchmark Data Set
- 4.2 Experimental Setup
- 4.3 Experimental Results and Analysis
- 5 Conclusions
- References
- Community Detection in Dynamic Attributed Graphs
- 1 Introduction
- 2 Related Work
- 3 Community Detection in Dynamic Attributed Graphs
- 3.1 Problem Statement
- 3.2 Algorithm for Community Detection in Dynamic Attributed Graphs
- 3.3 Benchmark Dynamic Attributed Graphs for Testing Community Detection Algorithms
- 4 Experimental Evaluation
- 4.1 Benchmark Graphs
- 4.2 Real-World Networks
- 5 Conclusions
- References
- Secure Computation of Skyline Query in MapReduce
- 1 Introduction
- 2 Related Work
- 2.1 Skyline Query
- 2.2 Multi-party Secure Computation
- 2.3 MapReduce Implementations of Skyline Query
- 3 Preliminaries
- 3.1 Dominance and Skyline
- 3.2 Hadoop MapReduce
- 3.3 Multi-party Secure Skyline Problem
- 4 Multi-party Secure Skyline Algorithm
- 4.1 Preparing the <key, value> Pair
- 4.2 Ordering with MapReduce
- 4.3 Disguise the Original Order
- 4.4 Return of Disguised Order Values
- 4.5 Merging and Sorting
- 4.6 Skyline Computation
- 5 Experiments
- 6 Conclusion
- References
- Recommending Features of Mobile Applications for Developer
- 1 Introduction
- 2 Related Work
- 3 Framework of Apps Features Recommendation with Hybrid Information
- 4 Mobile App Information Analysis
- 4.1 Explicit Information Analysis
- 4.2 Implicit Information (API) Processing
- 5 Functional Feature Recommendation of Mobile App
- 6 Experiment and Evaluation
- 6.1 Experiment Set Up
- 6.2 Experiment Evaluation
- 6.3 Experiment Results and Analysis
- 7 Conclusion
- References
- Adaptive Multi-objective Swarm Crossover Optimization for Imbalanced Data Classification
- Abstract
- 1 Introduction
- 2 Related Works
- 3 Design of AMSCO
- 3.1 Optimized SMOTE for Over-Sampling Minority Instances
- 3.2 Swarm Instance Selection (SIS) for Reducing Majority Instances
- 4 Experiment
- 5 Conclusions
- Acknowledgement
- References
- Causality-Guided Feature Selection
- 1 Introduction
- 2 Problem Statement
- 3 Method
- 3.1 Constructing Causal Graphs and Selecting Potential Causal Relationships
- 3.2 Estimating Causal Effects and Assessing Its Statistical Significance
- 3.3 Feature Selection via Clustering
- 4 Empirical Evaluation
- 4.1 Data Description
- 4.2 Data Preprocessing
- 4.3 Performance Comparison
- 4.4 Time Complexity
- 5 Related Work
- 6 Conclusion
- References
- Temporal Interaction Biased Community Detection in Social Networks
- 1 Introduction
- 2 Related Work
- 3 Problem Definition
- 4 Proposed Algorithms
- 4.1 Activity Biased Re-weighting Process
- 4.2 TIB Community Detection
- 5 Experiments
- 6 Conclusions and Future Work
- References
- Extracting Key Challenges in Achieving Sobriety Through Shared Subspace Learning
- 1 Introduction
- 2 Methods
- 2.1 Datasets
- 2.2 Feature Extraction
- 2.3 Topic Extraction
- 2.4 Classification
- 2.5 Classification Results
- 3 Discussion
- 3.1 Common Topics
- 3.2 Discriminative Topics
- 4 Conclusion
- References
- Unified Weighted Label Propagation Algorithm Using Connection Factor
- 1 Introduction
- 2 Related Works
- 3 The Connection Factor
- 3.1 Definition of the Connection Factor
- 3.2 Calculation of the Connection Factor
- 4 The Connection Factor and the Unified Weight
- 4.1 Discussion of the Topological Feature
- 4.2 The Unified Weighted LPA
- 5 Experiment
- 5.1 Experiment Setup
- 5.2 Accuracy Analysis
- 5.3 Quality Analysis
- 5.4 Parameter Sensitivity Analysis
- 6 Conclusions
- References
- MetricRec: Metric Learning for Cold-Start Recommendations
- 1 Introduction
- 2 MetricRec Model
- 2.1 User Based Collaborative Filtering
- 2.2 User Attribute Based Collaborative Filtering
- 2.3 Metric Learning
- 2.4 Interior-Point Stochastic Gradient Descent
- 3 Experiments
- 3.1 Experiment Setup
- 3.2 Methods for Comparison
- 3.3 Experimental Results
- 3.4 Sensitivity to the Number of Neighbors
- 3.5 Convergence of ISGD
- 4 Conclusion and Future Work
- References
- Time Series Forecasting on Engineering Systems Using Recurrent Neural Networks
- 1 Introduction
- 2 Related Work
- 2.1 Time Series Forecasting
- 2.2 Recurrent Neural Networks
- 3 Problem Background
- 3.1 Problem Statement
- 3.2 Data Description
- 4 Preliminaries
- 4.1 VAR
- 4.2 LSTM
- 4.3 SGD
- 5 Approach
- 5.1 Overall Workflow
- 5.2 Feature Selection
- 5.3 Predictive Modeling
- 6 Result Presentation
- 6.1 Experimental Evaluation
- 7 Conclusion
- References
- EDAHT: An Expertise Degree Analysis Model for Mass Comments in the E-Commerce System
- Abstract
- 1 Introduction
- 2 Analysis Model for Expertise Degree
- 2.1 Expertise Features of Comments
- 2.2 Construction Algorithm for Concept Hierarchy Tree
- 2.3 Computing the Expertise Features Based on AHT
- 3 Experiment Results and Analysis
- 3.1 Datasets and Evaluating Methods
- 3.2 Experiments Setup
- 3.3 Experiment Results
- 4 Conclusion
- Acknowledgement
- References
- A Scalable Document-Based Architecture for Text Analysis
- 1 Introduction
- 2 Related Works
- 2.1 Text Cubes and OLAP
- 2.2 Text Preprocessing and Analysis
- 2.3 Document Indexing
- 3 Proposed Approach and Implementation
- 3.1 Approach Overview
- 3.2 Data Models
- 3.3 Text Preprocessing
- 3.4 Index Management
- 4 Experimental Validation
- 4.1 News Articles Corpus Experiments
- 4.2 Twitter Corpus Experiments
- 4.3 Scientific Articles Corpus Experiments
- 5 Conclusion
- References
- DAPPFC: Density-Based Affinity Propagation for Parameter Free Clustering
- Abstract
- 1 Introduction
- 2 Principles
- 2.1 Figure 1 the Whole Process of the Proposed Algorithm
- 2.2 Parameter Normalization
- 2.3 Density Clustering
- 2.4 Clustering Synthesis
- 2.4.1 Fusing the Results of Multiple Density Clustering
- 2.4.2 Core Region Connection Synthesis
- 3 Experiment Analysis
- 3.1 The Clustering Quality
- 3.2 The Extra Time
- 4 Conclusions
- Acknowledgements
- References
- Effective Traffic Flow Forecasting Using Taxi and Weather Data
- 1 Introduction
- 2 Related Work
- 2.1 Prediction Techniques
- 2.2 Applications
- 3 Problem Definition
- 3.1 Basic Definitions
- 3.2 Problem Definitions
- 4 Time Series Prediction Model
- 4.1 Prediction Framework
- 4.2 Data Source
- 4.3 Algorithm for Extracting Boundary
- 4.4 Number of Floating Taxis in Three Airports
- 4.5 Prediction Algorithms
- 5 Experiment Study
- 5.1 Prediction Performance for the JFK Airport
- 5.2 Prediction Results for the JFK Airport
- 6 Conclusion
- References
- Understanding Behavioral Differences Between Short and Long-Term Drinking Abstainers from Social Media
- 1 Introduction
- 2 Methods
- 2.1 Datasets
- 2.2 Feature Extraction
- 2.3 Classification
- 2.4 Classification Results
- 3 Discussion
- 3.1 Class 1 (Users with the Drinking Abstinence Period of at Least 365 days)
- 3.2 Class 0 (Users with the Drinking Abstinence Period of at Most 30 days)
- 4 Conclusion
- References
- Discovering Trip Hot Routes Using Large Scale Taxi Trajectory Data
- Abstract
- 1 Introduction
- 2 Measuring Trajectory Similarity Based on LCS
- 2.1 Measuring Similarity Between Two Points
- 2.2 Measuring Similarity Between Two Sub Trajectories
- 2.3 Measuring Similarity Between Two Trajectories
- 3 LCS-Based DBSCAN Trajectory Clustering
- 4 Hot Routes Extracting
- 5 Experiment
- 5.1 Taxi Trajectory Data
- 5.2 Trajectory Data Preprocessing
- 5.3 Results
- 6 Conclusion
- Acknowledgments
- References
- Discovering Spatially Contiguous Clusters in Multivariate Geostatistical Data Through Spectral Clustering
- 1 Introduction
- 2 Method
- 2.1 Similarity Measure
- 2.2 Similarity Graph
- 2.3 Spectral Clustering Algorithm
- 2.4 Hyper-parameters Selection
- 3 Application
- 3.1 Dataset
- 3.2 Results
- 4 Conclusion
- References
- On Improving Random Forest for Hard-to-Classify Records
- 1 Introduction
- 2 Our Technique
- 3 Experimental Results
- 4 Conclusion
- References
- Scholarly Output Graph: A Graphical Article-Level Metric Indicating the Impact of a Scholar's Publications
- 1 Introduction
- 2 Related Work
- 3 Methods for the Metric
- 3.1 Data Preparation
- 3.2 Three-Dimensional gALM
- 4 Scholarly Output Graph
- 5 Applications and Results
- 6 Conclusion
- References
- Distributed Lazy Association Classification Algorithm Based on Spark
- Abstract
- 1 Introduction
- 2 Introduction of the Distributed Lazy Association Classification Algorithm (DLAC)
- 2.1 The Procedure of DLAC Algorithm
- 2.2 The Shortcomings of the DLAC Algorithm
- 3 Distributed Lazy Association Classification Algorithm Based on Spark
- 3.1 The Procedure of the SDLAC Algorithm
- 3.2 Clustering Unclassified Samples
- 3.3 Distributed Projection
- 3.4 Classifier Construction
- 3.5 Implementation on the Spark Framework
- 4 Experiment and Analysis
- 4.1 Experimental Environment
- 4.2 Experimental Data
- 4.3 Evaluation of Accuracy of SDLC Algorithm
- 4.4 Evaluation of the Efficiency of SDLC Algorithm
- 4.4.1 Impaction Evaluation of the Distributed Projection Operation
- 4.4.2 Impaction Evaluation of the Samples Clustering Operation
- 4.4.3 Impaction Evaluation of the Implementation
- 5 Conclusions
- References
- Event Evolution Model Based on Random Walk Model with Hot Topic Extraction
- Abstract
- 1 Introduction
- 2 Relevant Work
- 2.1 Event Evolution Analysis
- 2.2 Hot Topic Detection
- 2.3 Parallelization
- 3 Event Evolution Analysis
- 3.1 Data Pre-processing
- 3.2 Time Relation
- 3.3 Cosine Similarity
- 3.4 Random Walk Probability
- 4 Hot Topic Detection
- 5 Parallelization of EEM_RW_T
- 5.1 Implementation with Apache Spark
- 5.2 Implementation with MapReduce Model
- 6 Valuation and Result
- 6.1 Evaluation Data
- 6.2 Evaluation Methodology
- 6.3 Evaluation Result and Analysis
- 7 Conclusion
- References
- Knowledge-Guided Maximal Clique Enumeration
- 1 Introduction
- 2 Problem Statement
- 3 Biased Clique Enumeration
- 4 State Space Indexing and Querying Strategy
- 4.1 Storing and Indexing the Search Space
- 4.2 Query Processing
- 5 Biased Clique Applications
- 6 Experimental Analysis of Dynamic Index
- 6.1 Benefits of Dynamic Indexing
- 6.2 Computational Overhead of Dynamic Indexing
- 7 Related Work
- 8 Conclusion and Discussion
- References
- Got a Complaint?- Keep Calm and Tweet It!
- 1 Introduction
- 2 Experimental Setup and Characterization
- 3 Data Enhancement and Enrichment
- 3.1 Hashtag Expansion
- 3.2 Sentence Segmentation
- 3.3 Spell Error Correction
- 3.4 Acronyms and Slang Treatment
- 4 Complaints and Grievance Tweets Classification
- 4.1 Appreciation, Information Sharing and Promotion (AISP) Tweets Classifier
- 4.2 Features Identification
- 4.3 Classification
- 5 Empirical Analysis and Experimental Results
- 6 Conclusions and Future Work
- References
- Query Classification by Leveraging Explicit Concept Information
- 1 Introduction
- 2 Related Work
- 3 Concept-Based Query Representation
- 3.1 Identifying Wikipedia Concepts from Queries
- 3.2 Identifying Probase Concepts from Queries
- 4 Query Classification by Leveraging Concept Information
- 4.1 Classification Using Concepts as Query Enrichment
- 4.2 Classification Incorporating Concepts
- 5 Experiment
- 5.1 Experiment Setup
- 5.2 Query Term Disambiguation Performance
- 5.3 Concept-Based Query Representation
- 5.4 Query Classification Performance
- 6 Conclusion
- References
- Stabilizing Linear Prediction Models Using Autoencoder
- 1 Introduction
- 2 Framework
- 2.1 Correlation by Factorization in Linear Models
- 2.2 Learning Higher Order Correlations Using Autoencoder
- 3 Experiments
- 3.1 Models and Baselines
- 3.2 Temporal Validation
- 3.3 Measuring Stability
- 4 Results
- 4.1 Capturing Higher Order Correlations
- 4.2 Effect on Model Sparsity
- 4.3 Effect on Stability
- 5 Discussion and Conclusion
- 5.1 Conclusion
- References
- Mining Source Code Topics Through Topic Model and Words Embedding
- 1 Introduction
- 2 Methodology
- 2.1 Data Pre-processing
- 2.2 Topic Extraction
- 2.3 Automated Terms Selection for Topic Extraction
- 2.4 The Coherence Measurement
- 3 Related Works
- 4 Experiments
- 4.1 Implementation
- 4.2 Results
- 5 Conclusion
- References
- IPC Multi-label Classification Based on the Field Functionality of Patent Documents
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Proposed Method
- 3.1 IPC Structure
- 3.2 Patent Document Structure
- 3.3 Construction of Proposed Patent Classification Model
- 4 Experiments and Results
- 4.1 Precision Measurements
- 4.2 Evaluation
- 5 Conclusions
- Acknowledgments
- References
- Unsupervised Component-Wise EM Learning for Finite Mixtures of Skew t-distributions
- 1 Introduction
- 2 Finite Mixtures of Skew t-distributions
- 3 Existing Method for Parameter Estimation
- 4 Minimum Message Length (MML) Approach for FM-MST
- 5 Cell Population Segmentation from a DLBCL Sample
- 6 Conclusions
- References
- Supervised Feature Selection by Robust Sparse Reduced-Rank Regression
- 1 Introduction
- 2 Approach
- 2.1 Notations
- 2.2 Objective Function
- 2.3 Optimization
- 3 Experiments
- 3.1 Experiments Setup
- 3.2 Regression Results and Analysis
- 4 Conclusion
- References
- PUEPro: A Computational Pipeline for Prediction of Urine Excretory Proteins
- Abstract
- 1 Introduction
- 2 Materials and Methods
- 2.1 Data Collection
- 2.2 Model Construction
- 2.3 Identification of Differentially Expressed Genes
- 3 Results
- 3.1 Features of Urine-Excretory Proteins
- 3.2 Performance of Urine-Excretory Proteins
- 3.3 The Prediction of Origins of Urinary Proteins
- 3.4 Identification of Urinary Biomarkers for Lung Cancer
- 4 Discussions and Conclusion
- Acknowledgments
- References
- Partitioning Clustering Based on Support Vector Ranking
- Abstract
- 1 Introduction
- 2 Partitioning Clustering Based on Support Vector Ranking
- 2.1 Support Vector Sorting
- 2.2 Partition Clustering
- 2.3 The Implementation of PC-SVR Algorithm
- 3 Experiment Analysis
- 3.1 Evaluation Criteria of Experimental Results
- 3.2 Experimental Datasets
- 3.3 Experimental Results and Analysis
- 4 Conclusions and Future Work
- References
- Global Recursive Based Node Importance Evaluation
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Global Recursive Based WCN Node Importance Evaluation
- 3.1 Preliminaries
- 3.2 Algorithm of Global Recursive Based Node Importance Evaluation
- 4 Experiment
- 4.1 Experimental Conditions
- 4.2 Baseline Algorithms
- 4.3 Friedmann Dataset
- 4.4 Learning Curves
- 5 Conclusions
- Acknowledgments
- References
- Extreme User and Political Rumor Detection on Twitter
- 1 Introduction
- 2 Related Work
- 3 Political Rumor Detection
- 3.1 Determining News Propagation by Clustering
- 3.2 Rumor Detection Based on Extreme Users
- 3.3 Judging Extreme Users
- 4 Experimental Analysis
- 4.1 News Rumor Detection
- 4.2 Picture Rumor Detection
- 5 Discussion
- 6 Conclusion
- References
- Deriving Public Sector Workforce Insights: A Case Study Using Australian Public Sector Employment Profiles
- 1 Introduction
- 2 Methods
- 2.1 Problem Description
- 2.2 Approach for Addressing Workforce Diversity
- 2.3 Approach for Addressing Workforce Ageing Issues
- 3 Results
- 3.1 Dataset Description
- 3.2 On Workforce Diversity
- 3.3 On Workforce Ageing
- 4 Hypothesis and Discussion
- 5 Conclusion
- References
- Real-Time Stream Mining Electric Power Consumption Data Using Hoeffding Tree with Shadow Features
- Abstract
- 1 Introduction
- 2 Background
- 3 Proposed Shadow Feature Prediction Model
- 3.1 New Prediction Model
- 3.2 Shadow Feature Generation
- 4 Experiment
- 5 Conclusions
- Acknowledgement
- References
- Real-Time Investigation of Flight Delays Based on the Internet of Things Data
- 1 Introduction
- 2 IoT Crawler and Search Engine
- 3 Related Works
- 4 Model Features
- 5 Results
- 5.1 Data Sets
- 5.2 Data Collection
- 5.3 Data Cleaning
- 5.4 Data Integration
- 5.5 Data Exploration and Visualization
- 5.6 Predictive Model
- 6 Conclusion
- References
- Demo Papers
- IRS-HD: An Intelligent Personalized Recommender System for Heart Disease Patients in a Tele-Health Environment
- 1 Introduction
- 2 Proposed Recommendation System
- 3 Demonstration Plan
- References
- Sentiment Analysis for Depression Detection on Social Networks
- 1 Introduction
- 2 Framework
- 3 Demonstration Plan
- 4 Conclusions
- References
- Traffic Flow Visualization Using Taxi GPS Data
- 1 Introduction
- 2 Implementation
- 3 Demonstration
- 4 Summary
- References
- Erratum to: Advanced Data Mining and Applications
- Erratum to: Chapter 52 in: J. Li et al. (Eds.) Partitioning Clustering Based on Support Vector Ranking DOI: 10.1007/978-3-319-49586-6_52
- Erratum to: Chapter 59 in: J. Li et al. (Eds.) Sentiment Analysis for Depression Detection on Social Networks DOI: 10.1007/978-3-319-49586-6_59
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.