
Machine Learning and Data Mining in Pattern Recognition

Content
- Title
- Preface
- Table of Contents
- Theory
- Bayesian Approach to the Concept Drift in the Pattern Recognition Problems
- Introduction
- Bayesian Approach to the Problem of Concept Drift for the Pattern Recognition Problem
- Dynamic Programming Procedure for the Estimation of the Decision Rule Parameters under Concept Drift
- Experimental Evaluation
- "Ground-Truth" Experiments
- Case Study: "Spam" E-Mail Problem
- Conclusion
- References
- Transductive Relational Classification in the Co-training Paradigm
- Introduction
- Related Work
- Transductive Learning
- Multi-view Learning
- Co-training
- The CoTReC Method
- Constructing Multi-views from Relational Data
- Reliability Measures
- Computing Reliability Thresholds
- Implementation Considerations
- Experiments
- Datasets
- Experimental Results
- Conclusions
- References
- Generalized Nonlinear Classification Model Based on Cross-Oriented Choquet Integral
- Introduction
- Signed Efficiency Measure, Choquet Integral, and Classification
- Signed Efficiency Measure
- Choquet Integral
- Classification by Choquet Integral Projections
- Generalized Classification Model by Cross-Oriented Projection
- Algorithms
- Experiments
- Simulation on Artificial Data
- Simulation on Real Data
- Conclusions
- References
- A General Lp-norm Support Vector Machine via Mixed 0-1 Programming
- Introduction
- A General Lp-norm Support Vector Machine
- Optimizing General Lp-norm Support Vector Machine
- Experiment
- Experimental Settings
- Experimental Results
- Conclusions
- References
- Reduction of Distance Computations in Selection of Pivot Elements for Balanced GHT Structure
- Introduction
- Distance Matrices
- Selection of Pivot Elements
- Improvements of the Local Optimization Method
- Interval Model of Distance Calculations
- Conclusions
- References
- Hot Deck Methods for Imputing Missing Data
- Introduction
- Review of Literature
- Simulation Design
- Research Questions
- Factorial Design
- Quality Criteria
- Simulation Details
- Results
- Donor Limitation Impact
- Analysis of Donor Limit Influencing Factors
- Analysis of Donor Limitation Advantages
- Conclusions
- References
- BINER
- Introduction
- Motivation and Contribution
- Organization of Paper
- Problem Formulation
- Related Work
- Intuition and Methodology of BINER
- The BINER Algorithm
- Complexity Analysis
- Experimental Study
- Performance Model
- Results
- Discussion
- Conclusions
- References
- A New Approach for Association Rule Mining and Bi-clustering Using Formal Concept Analysis
- Introduction
- Terminology
- FIST Algorithm
- Database Preprocessing
- Mining Frequent Closed Patterns
- Generating Conceptual Clusters and Bases of Association Rules
- Performance Analysis
- Conclusions
- References
- Top-N Minimization Approach for Indicative Correlation Change Mining
- Introduction
- Preliminaries
- Correlation Based on k-way Mutual Information
- Problem of Mining Top-N Correlation Contrast Sets
- Extracting Top-N Correlation Contrast Sets with Extended Double-Clique Search
- Double-Clique Search with Dynamic Update on Base Graph
- Pruning Mechanism
- Algorithm for Extracting Top-N Correlation Contrast Sets
- Experimental Results
- Computational Performance
- Extracted Correlation Contrast Sets
- Concluding Remark
- References
- Theory: Evaluation of Models and Performance Evaluation Methods
- Selecting Classification Algorithms with Active Testing
- Background and Motivation
- Relative Landmarks
- Active Testing
- Empirical Evaluation
- Evaluation Methodology and Experimental Set-up
- Results
- Related Work in Other Scientific Areas
- Significance and Impact
- References
- Comparing Logistic Regression, Neural Networks, C5.0 and M5´ Classification Techniques
- Introduction
- The Proposed Algorithm
- Experimental Setup
- Neural Networks
- Logistic Regression
- C5.0 and M5´
- Data
- Experiments
- Experimental Results
- Discussion and Conclusions
- Future Work
- References
- Unsupervised Grammar Inference Using the Minimum Description Length Principle
- Introduction
- Related Work
- Minimum Description Length Approach
- Grammar Description Length (GDL)
- Derivation Description Length (DDL)
- Operators
- Experimental Result and Evaluation
- Conclusion and Future Work
- References
- How Many Trees in a Random Forest?
- Introduction
- Related Work
- Random Trees and Random Forests
- Density-Based Metrics for Datasets
- Datasets
- Experimental Methodology
- Results and Discussion
- Conclusion
- References
- Theory: Learning
- Constructing Target Concept in Multiple Instance Learning Using Maximum Partial Entropy
- Introduction
- Partial Entropy
- Approaching MIL by Maximizing Partial Entropy
- Implementation and Experiments
- Artificial Datasets
- MUSK Datasets
- Conclusion
- References
- A New Learning Structure Heuristic of Bayesian Networks from Data
- Introduction
- BNs and Structure Learning from Data Problems
- A New Clustering-Based Heuristic: Theoretical Framework and Methodology
- Theoretical Background
- Procedure and Applied Methodologies
- Experimentations Procedures
- Data-Bases
- Clustering
- The Classical Learning Structure Compared to Our New Heuristic
- Both Attained Structures' Relevant Inferences and Result Comparisons
- Discussion
- Conclusion
- References
- Discriminant Subspace Learning Based on Support Vectors Machines
- Introduction
- Principal Component Analysis (PCA)
- Linear Discriminant Analysis (LDA)
- Margin Maximizing Discriminant Analysis (MMDA)
- Proposed Approach
- Definition and Derivation of the Problem
- Deflation Procedure
- Within-Class Variance Deflation
- Deflated Within-Class Support Vector Machines
- Deflation Kernel
- Feature Extraction
- Experimental Results
- Use of WSVDA for Visualization Purposes
- Dimensionality Reduction and Classification Results
- Conclusions
- References
- A New Learning Strategy of General BAMs
- Introduction
- Bidirectional Associative Memory Models
- Related Works
- Studies about BAMs
- Learning Process of BAMs
- Given Learning Strategies
- Our Approach
- Our Learning Strategy
- Advantages of the New Learning Strategy
- Experiments
- Experiment 1: Our Model Compared to Non-iterative Learning Models
- Experiment 2: Our Model Compared to Iterative Learning Models
- Conclusion
- References
- Proximity-Graph Instance-Based Learning, Support Vector Machines, and High Dimensionality: An Empirical Comparison
- Introduction
- Proximity Graphs
- Reducing the Size of the Training Data
- Experiments and Results
- Concluding Remarks
- References
- Theory: Clustering
- Semi Supervised Clustering: A Pareto Approach
- Introduction
- Related Works
- Clustering
- Semi Supervised Clustering
- Problem Definition
- Clustering Algorithm
- Centroid Update and Assignment
- Multi Objective Evolutionary Algorithms
- Experiments
- Data Sets
- Clustering Evaluation Measure
- Analysis
- Conclusion
- References
- Semi-supervised Clustering: A Case Study
- Introduction
- Semi-supervised Clustering
- Unsupervised Clustering with K-Means
- Unsupervised Clustering with Expectation Maximization
- Semi-supervised Clustering by Seeding
- Semi-supervised Clustering with Pairwise Constraints
- Case Study
- Data Description
- Experimental Results
- Discussion
- Conclusions
- References
- SOStream: Self Organizing Density-Based Clustering over Data Stream
- Introduction
- Related Work
- SOStream Framework
- SOStream Overview
- Density-Based Centroid
- SOStream Algorithm
- Online Merging
- Experiments
- Synthetic Data
- Real-World Dataset
- Parameter Analysis
- Scalability and Complexity of SOStream
- Conclusion
- References
- Clustering Data Stream by a Sub-window Approach Using DCA
- Introduction
- Clustering Data Stream Based on Sub-windows
- The First Analysis: Comparison between Independent Local Clustering and Real Clustering
- The Second Analysis: The Adequation between Independent Local Clustering and Global Clustering with the Same Clustering Algorithm
- A DCA Clustering Algorithm
- General DC Programs
- Difference of Convex Functions Algorithms (DCA)
- DCA for Solving the Clustering Problem via MSSC (Minimum Sum of Squares Clustering) Formulation
- Numerical Experiments
- Conclusion
- References
- Improvement of K-means Clustering Using Patents Metadata
- Introduction
- Patent Analysis
- Prior Work
- K-means Algorithm
- Data Model for Patents
- Weighting Functions
- Implementation and Evaluation
- Conclusions
- References
- WebMining
- Content Independent Metadata Production as a Machine Learning Problem
- Introduction
- Data Acquisition
- Classifiers Predictions for Metadata Production
- Experimental Protocols
- Results and Discussions
- Association Rules Generation for Metadata Production
- Experimental Protocol
- Initial Results
- Rules Pruning
- Improving FP-Growth with Efficient χ² Based Rules Pruning
- Results and Discussion
- Conclusion
- References
- Discovering K Web User Groups with Specific Aspect Interests
- Introduction
- Related Works
- Model
- Problem Formulation
- Model Overview
- Generalized Mallows Model (GMM) over Permutations
- Generative Process
- Analysis of Parameters Setting
- Inference
- Experiment
- Datasets
- Evaluation Methodology
- Parameter Setting
- Comparison of Grouping Performance with K-means Baseline
- Illustration of User Groups with Specific Aspect Interests
- Conclusion
- References
- Image Mining
- An Algorithm for the Automatic Estimation of Image Orientation
- Introduction
- The Approach for Automatic Image Orientation Estimation
- Experimental Results
- Conclusions
- References
- Multi-label Image Annotation Based on Neighbor Pair Correlation Chain
- Introduction
- Model Label Correlation under a Graph
- Multi-label Image Annotation Based on Neighbor Pair Correlation
- Experiments
- Conclusions
- References
- Enhancing Image Retrieval by an Exploration-Exploitation Approach
- Introduction
- Nearest-Neighbor Relevance Feedback for Relevance Score
- The Exploration Phase
- Experimental Results
- Datasets
- Experimental Setup
- Results
- Conclusion
- References
- Finding Correlations between 3-D Surfaces: A Study in Asymmetric Incremental Sheet Forming
- Introduction
- Previous Work
- Grid Representation
- Springback Measurement
- Surface Representation (The Local Geometry Matrix)
- Classifier Generation
- Classifier Application
- Evaluation
- Conclusions and Perspectives
- References
- Data Mining in Biometry and Security
- Combination of Physiological and Behavioral Biometric for Human Identification
- Introduction
- Background
- Multimodal Identification Scheme
- PCA-LDA
- MLP
- J48 Classifier
- Naïve Bayes Classifier
- SMO Classifier
- Experimental Results and Discussion
- Conclusions and Further Plan
- References
- Detecting Actions by Integrating Sequential Symbolic and Sub-symbolic Information in Human Activity Recognition
- Introduction
- Methods
- Dataset
- Results
- Object Recognition
- Action Recognition
- Conclusions
- References
- Computer Recognition of Facial Expressions of Emotion
- Introduction
- Emotion Recognition Problem
- Facial Detection
- Facial Expression Representation
- Emotion Recognition
- Experimental Results
- Conclusion
- References
- Data Mining in Medicine
- Outcome Prediction for Patients with Severe Traumatic Brain Injury Using Permutation Entropy Analysis of Electronic Vital Signs Data
- Introduction
- Method
- Ordinal Pattern and Permutation Entropy
- Multivariate Time Series
- Evaluations
- Experiments and Results
- Data and Setup
- Prediction for Mortality and 3-Month GOSE
- Baseline
- Conclusion
- Summary
- Future Work
- Clinical Implication
- References
- EEG Signals Classification Using a Hybrid Method Based on Negative Selection and Particle Swarm Optimization
- Introduction
- Artificial Immune Systems
- Particle Swarm Optimization
- Materials and Methods
- EEG Data
- Discrete Wavelet Transform: Feature Extraction
- Adaptive Particle Swarm Negative Selection: EEG Classification
- Experimental Results
- Performance Measures
- Results and Discussion
- Conclusion
- References
- Data Mining in Environment and Water Quality Detection
- DAGSVM vs. DAGKNN: An Experimental Case Study with Benthic Macroinvertebrate Dataset
- Introduction
- Methods
- Support Vector Machine
- DAGSVM and DAGKNN
- Experimental Tests
- Data Description and Test Arrangements
- Results
- Discussion
- References
- Image Mining in Medicine
- Lung Nodules Classification in CT Images Using Shannon and Simpson Diversity Indices and SVM
- Introduction
- Methodology
- Image Acquisition
- Preprocessing with Histogram Equalization
- Lung Nodule Segmentation
- Features Extraction Using Diversity Indices
- Support Vector Machine
- Selection of Features Using Stepwise Linear Discriminant Analysis
- Validation of the Classification Method
- Experiments
- Results
- Conclusions
- References
- Comparative Analysis of Feature Selection Methods for Blood Cell Recognition in Leukemia
- Introduction
- Problem Statement
- Feature Selection
- Fisher Method
- Correlation of the Feature with the Class
- The Feature Selection Based on the Application of the Multiple Input Linear SVM
- The Feature Selection Based on the Application of Nonlinear Kernel
- Principal Component Analysis for Feature Extraction and Selection
- Independent Component Analysis for Feature Extraction and Selection
- Selection of the Features Using Genetic Algorithm
- Ensemble of Classifiers and Final Results of Classification
- Conclusions
- References
- Classification of Breast Tissues in Mammographic Images in Mass and Non-mass Using McIntosh's Diversity Index and SVM
- Introduction
- Related Work
- Materials and Methods
- McIntosh's Diversity Index
- Histogram
- GLCM - Gray Level Co-occurrence Matrix
- GLRLM - Gray Level Run Length Matrix
- Proposed Methodology
- Database
- Features Extraction and Classification
- Results and Discussion
- Conclusion
- References
- Text Mining
- A Semi-Automated Approach to Building Text Summarisation Classifiers
- Introduction
- Related Work
- Problem Formalisation
- Classifier Generation Using SARSET (Semi-Automated Rule Summarisation Extraction Tool)
- Phrase Identification and Generation of Phrase Variations (Step 1)
- Identification of Questionnaires Covered by Identified Phrases (Step 2)
- Rule Generation (Steps 3 and 4)
- Continuation of the Process or Exit (Step 5)
- Applying Classification Rules to Unseen Documents
- The SAVSNET Application
- Evaluation
- Conclusion
- References
- A Pattern Recognition System for Malicious PDF Files Detection
- Introduction
- An Overview of PDF Technology
- A Brief History
- PDF Structure
- Attacks against PDF Documents
- Attack Types
- Evasion Techniques
- Typical Attack Procedure
- Related Works on PDF Security
- General PDF Security Research
- State-of-the-Art Malicious PDF Detection Tools
- A New PDF Detector
- Feature Extraction
- Classification
- Results
- Data Collection
- Feature Extraction
- Choice of the Classifier
- Accuracy on the Test Set
- Weaknesses
- Conclusions
- References
- Text Categorization Using an Ensemble Classifier Based on a Mean Co-association Matrix
- Introduction
- Problem Overview
- Methodology
- Methodology Application
- Dataset
- Preprocessing
- Ensemble Building
- Evaluation Metrics
- Results
- Results Validation (Friedman Test)
- Discussion
- Conclusions and Future Work
- References
- A Pattern Discovery Model for Effective Text Mining
- Introduction
- Related Work
- Basic Definitions
- Mining Informative Contents
- A Pattern Fusion Model
- Transferable Belief Model (TBM)
- Evidential Mapping
- Weight Fusion
- Reasoning
- Experimental Evaluation
- Experimental Dataset
- Data Preprocessing and Measures
- Baseline Models and Settings
- Quality of Extracted Features
- PFM vs. Pattern Mining Models
- PFM vs. Term-Based Models
- Conclusion
- References
- Investigating Usage of Text Segmentation and Inter-passage Similarities to Improve Text Document Clustering
- Introduction
- Related Work
- Basic Idea
- Text Segmentation
- Similarity Computation
- Traditional Inter-document Similarity
- Passage-Based Inter-document Similarity
- Combined Similarity Measure
- Experimental Results
- Data Sets
- Evaluation Measure
- Clustering Algorithm
- Baseline Approach
- Results
- Conclusion and Future Work
- References
- Data Mining in Network
- Mining Ranking Models from Dynamic Network Data
- Introduction
- Related Work
- Learning Ranking Functions with Different Time Windows
- Experiments
- Conclusions
- References
- Machine Learning-Based Classification of Encrypted Internet Traffic
- Introduction
- Related Work
- BitTorrent Architecture Overview
- Proposed P2P Identification System
- Database
- Pre-filtering
- Flow Conversion and Features Computation
- Feature Selection and Classification
- Support Vector Machines
- Experimental Results
- Conclusion
- References
- Application of Bagging, Boosting and Stacking to Intrusion Detection
- Intrusion Detection System
- Data Mining for IDS
- Ensemble Classifier
- Bagging
- Boosting
- Stacking
- Experimental Settings
- Intrusion Dataset
- Performance Metric
- Experimental Settings
- Conclusions
- References
- Applications of Data Mining in Process Automation, Organisation Change Management, Telecommunication and Post Services
- Classification of Elementary Stamp Shapes by Means of Reduced Point Distance Histogram Representation
- Introduction
- Features Extraction/Reduction
- Point Distance Histogram Representation
- Features Dimensionality Reduction by Means of PCA/LDA
- Classification
- Experimental Results
- Summary
- References
- A Multiclassifier Approach for Drill Wear Prediction
- Introduction
- The Drill Wear Problem
- The Drilling Process
- Methodology
- Feature Selection
- Supervised Classification
- Classification Algorithms
- Validation Method
- Multiclassifier Approach
- Experimental Results
- Burr Detection
- Roughness Detection
- Conclusions
- References
- Measuring the Dynamic Relatedness between Chinese Entities Orienting to News Corpus
- Introduction
- Related Work
- Methods
- Outline
- Development Law of Dynamic Relatedness between Entity-Pair
- Development Process of Dynamic Relatedness
- Experiments
- Dataset
- Evaluation of Metrics and Methods
- Results and Analysis
- Conclusions
- References
- Prediction of Telephone User Attributes Based on Network Neighborhood Information
- Introduction
- Some Interesting Scenarios
- Data Description and Preparation
- Exploratory Analysis and Learning Approach
- Isolated Link Approach
- SMS and Call Metrics
- Gender Prediction
- Age Prediction
- Ego-Network Approach
- Network Metrics
- Gender Prediction
- Age Prediction
- Conclusion and Future Work
- References
- Data Mining in Biology
- A Hybrid Approach to Increase the Performance of Protein Folding Recognition Using Support Vector Machines
- Introduction
- Protein Database and Its Features
- Protein Database
- Features Vectors
- Machine Learning Classifiers
- Support Vector Machine Classifiers (SVM)
- Quadratic Discriminant Analysis
- Feature Selection
- Generalized Linear Model
- Sequential Feature Selection Algorithm
- Feature Selection Algorithms
- Results
- Conclusions and Future Work
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.