
Machine Learning and Data Mining in Pattern Recognition

Content
- Title
- Preface
- Organization
- Table of Contents
- Classification and Decision Theory
- Quadratically Constrained Maximum a Posteriori Estimation for Binary Classifier
- Introduction
- Maximum a Posteriori-Based Classifier
- Proposed Method
- Linear Model and Training Method
- A Unified Characterization of LSR and SVM
- A New Classifier
- Construction of GQCM Classifier
- Experiments
- Experiment Using Artificial Samples
- Performance with UCI Data Sets
- Discussion
- Conclusions and Future Work
- References
- Hubness-Based Fuzzy Measures for High-Dimensional k-Nearest Neighbor Classification
- Introduction
- Related Work
- Hubness-Weighted kNN
- Fuzzy Nearest Neighbor Algorithm
- Proposed Hubness-Based Fuzzy Measures
- Experimental Evaluation
- UCI Data Sets
- ImageNet Data
- Conclusions and Future Work
- References
- Decisions: Algebra and Implementation
- Introduction
- Decision Algebra
- Decision Functions
- Learning and Deciding
- Auxiliary Decision Function Operations
- Decision Lattices
- The "More Accurate" Relations
- Approximating Decision Functions
- Experiments
- Implementation Details
- Decision Graph Sizes
- k-Approximated Decision Graphs
- Related Work
- Conclusions and Future Work
- References
- Smoothing Multinomial Naïve Bayes in the Presence of Imbalance
- Introduction
- Related Work
- Random OverSampling Expected Smoothing
- ROSE Smoothing Background
- ROSE Smoothing Approach
- Experiments
- Experiment 1: Standard Datasets
- Experiment 2: Class Prior Controlled Data Sets
- Conclusion
- References
- ACE-Cost: Acquisition Cost Efficient Classifier by Hybrid Decision Tree with Local SVM Leaves
- Introduction
- Preliminaries and Related Work
- Computing Average Test Cost in Decision Tree and SVM
- Cost Efficient SVM
- Cost Efficient Decision Trees
- Preprune and Postprune
- ACE-Cost Approach: The Hybrid Decision Tree with Local SVM Leaves
- Decision Tree Sketch
- Postpruning with Local SVM
- Feature Selection at Local SVM Leaves
- Experimental Results
- Performance Comparison on Standard Dataset
- Synthetic Dataset
- A Practical Application with Dependent Cost
- Discussion and Future Work
- Conclusion
- References
- Informative Variables Selection for Multi-relational Supervised Learning
- Introduction
- Approach Illustration
- Evaluation Criterion
- Grid Optimisation
- Experiments
- Protocol
- Artificial Datasets
- Stulong Dataset
- Conclusion
- References
- Separability of Split Value Criterion with Weighted Separation Gains
- Introduction
- Split Criteria
- Weighting Separability Gains
- The Analysis
- Comparison of the Split Criteria
- Analysis of the $a$ Parameter
- Conclusions
- References
- Granular Instances Selection for Fuzzy Modeling
- Introduction
- Background Studies
- Instance Selection
- Fuzzy Modeling
- The Proposed Methodology
- The Framework of the Proposed GIS Fuzzy Modeling Methodology
- Granular Instances Selection
- Evaluation
- Experimental Studies
- Conclusion
- References
- Parameter-Free Anomaly Detection for Categorical Data
- Introduction
- Related Work
- Our Proposed Method
- Problem Definition
- Outlier Detection as an Optimization Problem
- Outlier Factor
- Update of Entropy and Weight
- The ITB Methods and Approximate Optimization
- Experimental Results
- Effectiveness Test
- Efficiency Test
- Conclusion
- References
- Fuzzy Semi-supervised Support Vector Machines
- Introduction
- Fuzzy Semi-supervised Support Vector Machines Approach
- Experimental Results
- Datasets
- Experimental Design
- The Classification Tasks
- Evaluation Procedure
- Parameter Optimization
- Experiments and Results
- Conclusion
- References
- GENCCS: A Correlated Group Difference Approach to Contrast Set Mining
- Introduction
- Related Work
- Problem Definition
- Mutual Information and All Confidence
- Correlated Group Difference
- Background
- Data Format
- Search for Quantitative Contrast Sets
- Distribution Difference
- Our Proposed Approach
- Tests for Significance
- Comparison of Contrasting Groups
- Discretization
- Mining Correlated Group Differences
- Experimental Results
- Performance of GENCCS
- Effect of Mutual Information and All Confidence
- Conclusion
- References
- Collective Classification Using Heterogeneous Classifiers
- Introduction
- Background
- Notation
- Collective Classification
- Collective Classification Using Heterogeneous Classifiers
- Related Work
- Experimental Setup
- Datasets
- Sampling
- Classification Methods
- Experimental Results
- Analysis of Average Local Accuracy Values
- Performance of Different Classifiers
- Performance of Classifier Combination
- Discussion
- References
- Spherical Nearest Neighbor Classification: Application to Hyperspectral Data
- Introduction
- Mapping of Images to a Hypersphere
- Tangent Space and Manifolds
- Exponential and Log Maps
- Spherical Metrics
- Spherical Geodesic and Mahalanobis Metrics
- Spherical Discriminant Adaptive Nearest-Neighbor Classifier
- Experiments
- Data
- Results
- Conclusions
- References
- Adaptive Kernel Diverse Density Estimate for Multiple Instance Learning
- Introduction
- Maximum Diverse Density
- Kernel Density Estimate
- Kernel Density Estimate for MIL
- Kernelized Diverse Density Estimate of Positive Bags
- Kernel Density Estimate of Negative Bags
- Objective Function
- Optimization and Implementation
- Experiments
- Conclusions
- References
- Boosting Inspired Process for Improving AUC
- Introduction
- Boosting Inspired Process AUCBoost
- Weighted AUC (WAUC)
- Experiments
- Base Learning Algorithm: C4.4
- Base Learning Algorithm: Naive Bayes
- Conclusions and Future Work
- References
- Theory of Learning
- Investigation in Transfer Learning: Better Way to Apply Transfer Learning between Agents
- Introduction
- Reinforcement Learning
- The Q-Learning Algorithm
- Accelerating Reinforcement Learning through Heuristics
- Case Based Reasoning and Transfer Learning
- Combination of the Techniques: Transfer Learning with Case Based Reasoning and Reinforcement Learning
- The Transfer Learning Experience
- Conclusion
- References
- Exploration Strategies for Learned Probabilities in Smart Terrain
- Probabilistic Smart Terrain
- Learned Probabilities
- Object Categories and Prior Knowledge
- Bayesian Parameter Learning
- Exploration Strategies and Benchmarks
- Defining Information Gain
- Defining a Simple Two-Object Case
- Estimating Distance Traveled and Error
- Normalizing Error over Categories Based on Prevalence
- Estimating Information Gain
- A Simple Example of Information Gain
- Creating the Influence Map
- Inverse Falloff
- Cumulative Effect of Objects
- Updating the Influence Map When Information Is Learned
- Benchmark Testing
- Different Levels of Prior Knowledge
- Different Category Prevalence
- Significantly Closer Objects
- Aggregate Influence
- Empirical Demonstration of Learning
- Evaluating Ability to Move Towards the Best Objects
- Evaluating the Value of Learning
- Conclusions and Ongoing Research
- Ongoing Research
- References
- Sensitivity Analysis for Weak Constraint Generation
- Introduction
- Abstract Problem Statement
- Formulation of the Multi-constraint Planning Problem
- Examples
- Solution Approach
- Plan Changes
- Sensitivity Analysis, Simulation, and Clustering
- Strategic Release Planning: A Multi-constraint Planning Problem
- Simulation Overview
- Data Generation and Pre-processing
- Sensitivity Analysis and Clustering
- Simulation and Clustering Experiment
- Use Case Scenario
- Discussion
- Limitations
- Conclusions and Future Work
- References
- Dictionary Learning Based on Laplacian Score in Sparse Coding
- Introduction
- Related Work
- Laplacian Score for Dictionary Learning
- Experiments
- Experiment Datasets
- Experiment Setup
- Experiment Result
- Conclusion
- References
- Clustering
- A Practical Approach for Clustering Transaction Data
- Introduction
- Clustering Algorithm
- Objective Function
- Clustering Procedure
- Empirical Evaluation
- Comparing Algorithms
- Quality Measure
- Experiments on Synthetic Data
- Experiments on Real-World Data
- Conclusion
- References
- Hierarchical Clustering with High Order Dissimilarities
- Introduction
- Dissimilarity Increments
- Dissimilarity Increments Distribution
- Hierarchical Clustering Algorithm
- Algorithm
- Minimum Description Length Criterion
- Algorithm Analysis
- Experimental Results
- Datasets
- Parameter Selection
- Experimental Results and Discussion
- Conclusions
- References
- Clust-XPaths: Clustering of XML Paths
- Introduction
- Related Work
- XML Structure Clustering
- XML Content Clustering
- Clust-XPaths: Our Approach
- Thesaurus
- Paths Matrix
- Clustering
- Experiments
- Evaluation Prototype
- Test Collection and Results
- Conclusion
- References
- Comparing Clustering and Metaclustering Algorithms
- Introduction
- Related Work
- Meta-clustering
- Bagged Clustering
- Majority Voting
- Graph Partitioning
- Cluster Validation Techniques
- Experimental Results
- Conclusions
- References
- Applications in Medicine
- Detection of Phenotypes in Microarray Data Using Force-Directed Placement Transforms
- Introduction
- Force-Directed Placement Strategies
- Modified Force-Directed Placement Transform
- Application to Model Data
- Feature Selection Algorithm
- Results on Model Data
- Application to Real Data
- Discussion
- References
- On the Temporal Behavior of EEG Recorded during Real Finger Movement
- Introduction
- Exploiting the Temporal Information
- Hidden Markov Models for BCI
- Conditional Random Fields for BCI
- Methods for Temporal Classification of Self-Paced EEG
- Method I
- Method II
- Method III
- Experiments
- Results
- Discussion and Conclusion
- References
- A Machine Learning and Data Mining Framework to Enable Evolutionary Improvement in Trauma Triage
- Problem Description
- A Model for Intelligent Triage Support
- Data
- Realtime Decision Support through Machine Learning
- Experiments: Methodology and Results
- Discussion
- Mining for Deeper Understanding
- Knowledge Frontiers
- Related Work
- Algorithm Description
- Experimental Methodology
- Results and Discussion
- Conclusion and Future Work
- References
- A Decision Support System Based on the Semantic Analysis of Melanoma Images Using Multi-elitist PSO and SVM
- Introduction
- Formulation of the Problem of the Semantic Analysis of the Melanoma Malignum
- The MEPSO Algorithm
- The Basic Concept of the SVM Classifier
- Experimental Results
- Conclusion
- References
- WebMining/Information Mining
- Authorship Similarity Detection from Email Messages
- Introduction
- Stylistic Features
- Authorship Similarity Detection
- Frequent Pattern Matching
- Style Differentiation
- Detection Algorithm
- Baseline Comparison Methods
- The Enron Email Corpus
- Experiment Results
- Conclusion
- References
- An Investigation Concerning the Generation of Text Summarisation Classifiers Using Secondary Data
- Introduction
- Related Work
- Problem Definition
- Classifier Generation Using Secondary Data
- The SAVSNET Application
- Evaluation
- Conclusion
- References
- Comparing the One-vs-One and One-vs-All Methods in Benthic Macroinvertebrate Image Classification
- Introduction
- Method
- Linearly Separable Case
- Linearly Non-separable Case
- Nonlinear Support Vector Machines
- One-vs-All
- One-vs-One
- Experimental Tests
- Data Description and Test Arrangements
- Results
- Discussion
- References
- Incremental Web-Site Boundary Detection Using Random Walks
- Introduction
- Preliminaries
- k-Means Clustering
- Clustering Based on Random Crawling
- Experiments
- Data Set
- Evaluation Criteria
- Results
- Discussion
- References
- Discovering Text Patterns by a New Graphic Model
- Introduction
- The Method
- Defining the Task
- The Model
- Properties of the Model
- Finding $c_1^*, \ldots, c_N^*$
- Complexities
- Three Tasks
- Descriptions
- Definition of Task 1
- Definition of Task 2
- Definition of Task 3
- Empirical Results
- Experiments Set Up
- Results on the First Task
- Results on the Second Task
- Results on the Third Task
- Related Researches and More Comparisons
- Conclusions
- References
- Topic Sentiment Change Analysis
- Introduction
- Related Works
- Solution Overview
- Topic-Level Sentiment Analysis
- Topic Content Division
- Topic Sentiment Evaluation
- Sentiment Change Analysis
- Time Period Partition
- Cause Identification
- Experiments
- Experiment Setup
- Topic Identification Results
- Sentiment Change Analysis Results
- Conclusions
- References
- Adaptive Context Modeling for Deception Detection in Emails
- Introduction
- Related Work
- Proposed Deception Detector
- Prediction by Partial Matching
- Generalized Suffix Tree Data Structure
- Adaptive Context Modeling
- Experimental Results
- Conclusions
- References
- Contrasting Correlations by an Efficient Double-Clique Condition
- Introduction
- Preliminaries
- Correlation Based on k-Way Mutual Information
- Problem of Mining Correlation Contrast Sets
- Detecting Correlation Contrast Sets with Double-Clique Methods
- Excluding Useless Itemsets with Anti-correlation Graphs
- Enumerating Cliques in Undirected Graph
- Additional Support Constraint
- Algorithm for Extracting Correlation Contrast Sets
- Experimental Results
- Contrasted Databases
- Extracted Correlation Contrast Sets
- Computational Performance
- Conclusion and Further Research
- References
- Machine Learning and Image Mining
- Estimating Image Segmentation Difficulty
- Introduction
- Mathematical Background
- Details of the Approach
- Transformation of Difficulty Measures
- Feature Extraction
- Modeling Process
- Experiments and Results
- Building the Model with Labeled Data
- Applying the Model to Additional Data
- Conclusion and Future Work
- References
- Mining Spatial Trajectories Using Non-parametric Density Functions
- Introduction
- Related Work
- Trajectory Mining with Density Functions
- Trajectory Density Estimation
- The DENTRAC Trajectory Clustering Algorithm
- Hill Climbing Procedure
- Complexity of DENTRAC
- Post Analysis for Trajectory Clusters
- Experimental Evaluations
- Datasets
- Results for the Oldenburg Traffic Data
- Post Analysis by Cluster Average Density and the Density of Density Attractors
- Results of Atlantic Hurricane Tracks Data
- Conclusion and Future Works
- References
- Exploring Synergetic Effects of Dimensionality Reduction and Resampling Tools on Hyperspectral Imagery Data Classification
- Introduction
- Methodology
- Data Preprocessing
- Classification
- Experimental Set-Up
- Classification Performance Measures
- Results and Discussion
- Conclusions and Further Extensions
- References
- A Comparison between Haralick's Texture Descriptor and the Texture Descriptor Based on Random Sets for Biological Images
- Introduction
- Texture Descriptors
- Haralick's Texture Descriptor
- Texture Descriptor Based on Random Sets
- Material and Application
- Results
- Discussion
- Conclusion
- References
- Time Series and Frequent Item Set Mining
- Unsupervised Discovery of Motifs under Amplitude Scaling and Shifting in Time Series Databases
- Introduction
- Background
- Related Work
- Algorithm
- Thresholding
- Experimental Results
- Synthetic Data
- Real-World Data with Known Motifs
- Discord Discovery in Synthetic Data
- Empirical Results and Discussion
- Spurious Motifs
- Conclusion and Future Work
- References
- Static Load Balancing of Parallel Mining of Frequent Itemsets Using Reservoir Sampling
- Introduction
- Notation
- The Lattice of All Itemsets
- Sampling Methods
- Database Sample
- The Reservoir Sampling Algorithm
- Error of the Estimation of the Size of a Union of PBECs
- Summary of the Previous Two Methods
- Proposal of a New DM Parallel Method
- Detailed Description of Phase 1
- Detailed Description of Phase 2
- Detailed Description of Phase 3
- Detailed Description of Phase 4
- The Parallel-FIMI-Reservoir Method
- Experimental Evaluation
- Evaluation of the Speedup
- Conclusion and Future Work
- References
- GA-TVRC: A Novel Relational Time Varying Classifier to Extract Temporal Information Using Genetic Algorithms
- Introduction
- Background
- Classification Method
- Genetic Algorithms
- Related Work
- Genetic Algorithm Enhanced Time Varying Relational Classifier (GA-TVRC)
- Training Phase
- Validation Phase Using Evolutionary Strategies
- Test Phase
- Experimental Results
- Datasets
- Methodology
- Results and Analysis
- Conclusions and Future Work
- References
- Aspects of Machine Learning and Data Mining
- Detection of Communities and Bridges in Weighted Networks
- Introduction
- Background and Motivation
- Methodology
- Fuzzy Clustering of Weighted Graphs
- Fuzzy Clustering of Unweighted Graphs
- Bridgeness Measure
- Experimental Setup
- Datasets
- Edge Weights
- Label Correspondence
- Evaluation Metrics
- Software
- Results
- Accuracy
- Sensitivity Analysis
- Bridgeness Analysis
- Conclusion and Future Work
- References
- Techniques for Improving Filters in Power Grid Contingency Analysis
- Introduction
- Context
- A Metric for Evaluating Filters
- Resource-Aware Filter Combination
- Multi-criteria Optimization
- Related Work
- Future Work
- Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.