Machine Learning and Knowledge Discovery in Databases, Part II

Name: Machine Learning and Knowledge Discovery in Databases, Part II | European Conference, ECML PKDD 2011, Athens, Greece, September 5-9, 2011, Proceedings, Part II
Brand: Springer
Price: 53.49 EUR
Availability: Online only

European Conference, ECML PKDD 2011, Athens, Greece, September 5-9, 2011, Proceedings, Part II

Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, Michalis Vazirgiannis (Editors)

Springer (Publisher)

Published on 6 September 2011

XXII, 681 pages

E-Book

PDF with digital watermarking

978-3-642-23783-6 (ISBN)

€53.49 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Title
Preface
Organization
Table of Contents
Regular Papers
Common Substructure Learning of Multiple Graphical Gaussian Models
Introduction
Structure Learning of Graphical Gaussian Model
Graphical Gaussian Model
Sparse Estimation of GGM
Learning Structural Changes
Multi-task Approach for Learning a Set of GGMs
Common Substructure Learning
Algorithm
Block Coordinate Descent
Subproblem
Continuous Quadratic Knapsack Problem
Hyper-Parameters ? and ?
Simulation
Synthetic Experiment
Analysis of City-Cycle Fuel Consumption Data
Application to Anomaly Detection
Conclusion
References
Mining Research Topic-Related Influence between Academia and Industry
Introduction
Related Work
Models
Simple Additive Model
Weighted Additive Model
Clustering-Based Additive Model
Experimental Results
Experiments Settings
Influence of Academia Researchers to Company
Influence of Universities to Company
Simulated Data
Conclusion
References
Typology of Mixed-Membership Models: Towards a Design Method
Introduction
Networks of Mixed Membership
Numerical Properties
Model Structure
Model Decomposition
Typology of Sub-structures
Towards a Model Design Method
Designing a Design Method
Example: Expert-Tag-Topic Model
Empirical Analysis
Conclusions and Future Work
References
ShiftTree: An Interpretable Model-Based Approach for Time Series Classification
Introduction
Related Works
Classification of Time Series
Problem Definition
Notation
Concept
The ShiftTree Algorithm
The Structure of a ShiftTree Node
Classification Process
Training Process
About Interpretability
Forest Methods for ShiftTree
Boosting
XV Method
Numerical Results
Datasets and Testing Environment
Results of the Basic ShiftTree
Results of the Forest Methods
Results of the Blind Tests
Conclusion
References
Image Classification for Age-related Macular Degeneration Screening Using Hierarchical Image Decompositions and Graph Mining
Introduction
Previous Work
Age-related Macular Degeneration
AMD Classifier Generation
Image Decomposition
Weighted Frequent Sub-tree (wFST) Mining
Feature Selection
Classification Technique
Evaluation
Performances Using Different Levels of Decomposition
Performances of AMD Classification According to the Size of the Identified Feature Space
Performance Comparison of AMD Classification Using Various Classification Techniques
Conclusions
References
Online Structure Learning for Markov Logic Networks
Introduction
Background
Terminology and Notation
MLNs
Natural Language Field Segmentation
Online Max-Margin Structure and Parameter Learning
Online Max-Margin Structure Learning with Mode-Guided Relational Pathfinding
Online Max-Margin l1-Regularized Weight Learning
Experimental Evaluation
Data
Input MLNs
Methodology
Results and Discussion
Related Work
Future Work
Conclusions
References
Fourier-Information Duality in the Identity Management Problem
Introduction
Probabilistic Identity Management
Inference Operations
Two Dueling Representations
Fourier Domain Representation
Information Form Representation
Comparing the Two Representations
Discussion
Representation Conversion
From Information Coefficients to Fourier Coefficients
From Fourier Coefficients to Information Coefficients
Computation of the Matrix Permanent
A Hybrid Approach for Identity Management
An Adaptive Approach for Identity Management
Experiments
Conclusion
References
Eigenvector Sensitive Feature Selection for Spectral Clustering
Introduction
Feature Selection Based on Perturbation Analysis
Problem Definition
$\delta q_{t,r}$ with Respect to $L$
$\delta q_{rw,t,r}$ with Respect to $L_{rw}$
$\delta q_{sym,t,r}$ with Respect to $L_{sym}$
Eigenvector Sensitive Feature Selection
Eigenvector Sensitive Feature Selection for Spectral Clustering
Related Work
Empirical Analysis
Dataset Description
Evaluation Criterion
Experiment Setup
Experiment Results
Conclusion
References
Restricted Deep Belief Networks for Multi-view Learning
Introduction
Related Work
Exponential Family Harmonium
Multi-Wing Harmonium
Restricted DBNs
Multi-view Harmonium
Restricted DBN
Inferring One View from the Other
Numerical Experiments
Synthetic Example
Object Conversion on NORB-Small
Image Annotation on ESL Photo Dataset
Conclusions
References
Motion Segmentation by Model-Based Clustering of Incomplete Trajectories
Introduction
Extracting Trajectories
Clustering Trajectories of Variable Length
Initialization Strategy
Experimental Results
Experiments with Simulated Data Sets
Experiments with Real Data Sets
Conclusion
References
PTMSearch: A Greedy Tree Traversal Algorithm for Finding Protein Post-Translational Modifications in Tandem Mass Spectra
Introduction
MS/MS Spectra and PTMs
Related Work
Method: The PTMSearch Algorithm
Speedup Techniques
Significance Calculation of a Hit
Experimental Results
Results on a Toy Dataset
Calculations on Real Data
Discussion, Future Plans
References
Efficient Mining of Top Correlated Patterns Based on Null-Invariant Measures
Introduction
Preliminaries
New Properties of Null-Invariant Measures
Level-Based Properties
Properties Based on a Single Item
NICoMiner Algorithm
Threshold-Based Correlation Mining
Top-k Correlation Mining
Experiments
Synthetic Datasets
Real Datasets
Related Work
Conclusions and Future Work
References
Smooth Receiver Operating Characteristics (smROC) Curves
Introduction
Motivation and Related Work
Constructing a Smooth ROC Curve
Experiments
Performance Similarities
Detecting Differences
Conclusions
References
A Boosting Approach to Multiview Classification with Cooperation
Introduction
The Mumbo Algorithm
Principles and Assumptions
Framework and Notations
The Core of Mumbo
Properties of Mumbo
Bounding the Training Error on Each View
Bounding the Whole Empirical Error
Results in Generalization
Experiments on Mumbo
Protocols
Results
Discussion
Related Works and Discussion
Related Works
Discussion and Improvements
Conclusion and Future Works
References
ARTEMIS: Assessing the Similarity of Event-Interval Sequences
Introduction
Event-Interval Sequences
Distance Measures
The Vector-Based DTW Distance
Artemis: A Bipartite-Based Matching Distance
Lower Bounding Artemis
Experiments
Experimental Setup
Results
Lessons Learned
Related Work
Summary and Conclusions
References
Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms
Introduction
Related Work
Theorems and Correspondences
Arithmetic Examples
Analysis of Convergence
Proposed Algorithm: FaBP
Experiments
Q1: Accuracy
Q2: Convergence
Q3: Sensitivity to Parameters
Q4: Scalability
Conclusions
References
Online Clustering of High-Dimensional Trajectories under Concept Drift
Introduction
Related Work
Method TRACER
Gaussian Mixture Model
Expectation-Maximisation Algorithm
Kalman Filter
Kalman Filter Initialisation
Update and Clustering
Experiments
Data Sets
Methods
Results
Conclusions
References
Gaussian Logic for Predictive Classification
Introduction
A Probabilistic Framework
Parameter Estimation
Structure Search
A Straightforward Predictive Classification Method
Feature Construction for Predictive Classification
Conclusions and Future Work
References
Toward a Fair Review-Management System
Introduction
Contribution
Roadmap
Related Work
Notation
Spotlight Shuffling
Attribute Coverage
Review Quality
Fair Spotlight Share
Compactness
Reviewer Motivation and Utilization
Experiments
Datasets
Qualitative Evidence
Evaluation of ImportanceSampling on the Spotlight-Shuffling Task
The Effect of the Seed of Minimal Covers on ImportanceSampling
Compactness Evaluation
Evaluating the Attribute-Recommendation System
Conclusion
References
Focused Multi-task Learning Using Gaussian Processes
Introduction
Symmetric and Asymmetric Multi-task Learning
Dependency Structure in Multi-task Learning with Gaussian Processes
Symmetric Dependency Structure
Predictive Mean for Symmetric Multi-task GP
Asymmetric Dependency Structure
Hyperparameter Learning
Related Work and Discussion
Examining the Generalisation Error for Asymmetric and Symmetric Models
Generalisation Error for a Test Point x*
Intuition about the Generalisation Errors
Experiments
Synthetic Data
fMRI Data
Conclusion
References
Reinforcement Learning through Global Stochastic Search in N-MDPs
Introduction
Background and Related Work
Notation
Previous Results for N-MDPs
A Sound Local Algorithm
The Algorithm: SoSMC
Exploration: Gathering Information
Assessment
Exploration Strategies
Experimental Evaluation
Parr and Russell's Grid World
Sutton's Grid World
Keepaway
Conclusions
References
Analyzing Word Frequencies in Large Text Corpora Using Inter-arrival Times and Bootstrapping
Introduction
Related Work
Problem Setting
Methods
Method 1: Bernoulli Trials
Method 2: Inter-arrival Times
Method 3: Bootstrapping
Experiments
BNC: A Simple Benchmark
BNC: Differences between Male and Female Authors
BNC: Differences between the Main Genres
SFCNC: Language Change over Time
SFCNC: Locating Dates of Important Events
Conclusion
References
Discovering Temporal Bisociations for Linking Concepts over Time
Introduction
Related Works and Contribution
Formal Definition of the Problem
Discovering Temporal Bisociations
Check for Direct Connections
Generation of Abstract Descriptions
Linking Concepts over Time
Experiments on Biomedical Literature
Conclusions
References
Minimum Neighbor Distance Estimators of Intrinsic Dimension
Introduction
Related Works
The Proposed Algorithms
Base Theoretical Results
Maximum Likelihood Approaches
A pdf Comparison Approach
Algorithm Evaluation
Dataset Description
Experimental Setting
Experimental Results
Conclusions and Future Works
References
Graph Evolution via Social Diffusion Processes
Introduction
Social Diffusion Process for Friendship Broadening
Preliminaries
Social Events and Broadening of Friendship
Social Diffusion Process
Graph Evolution Based on Social Diffusion Process
The Evolution Algorithm
Application of Graph Evolution
Experimental Results
Convergence Analysis
Clustering
Semi-supervised Learning
Graph Evolution for microRNA Functionality Analysis
Conclusions
References
Multi-Subspace Representation and Discovery
Introduction
Problem Description and Our Solution
Multi-Subspace Discovery Problem
A Constructive Solution
Multi-Subspace Representation with Noise
Multi-Subspace Representation
Relation to Previous Work
An Efficient Algorithm and Analysis
Outline of the Algorithm
Optimization Algorithm
Theoretical Analysis of Algorithm 1
Applications
Using Multi-Subspace Representation as Preprocessing
Using Multi-Subspace Representation as Classifier
Experiment
A Toy Example
Experimental Settings
Experimental Results
Conclusions
References
A Novel Stability Based Feature Selection Framework for k-means Clustering
Introduction
Spectral k-means
Stable Sparse PCA
Stability Maximizing Objective and the Cluster Separation/Variance Tradeoff
Two-Way Stability
Optimization Framework
Useful Bounds for Optimizing Stability
Greedy Solutions
Efficient Deflation for Multiple Clusters
Related Work
Experiments
Conclusions and Further Work
References
Link Prediction via Matrix Factorization
The Link Prediction Problem
Challenges in Link Prediction
Our Contributions
Problem Definition and Notation
Existing Link Prediction Models
Do Existing Methods Meet the Challenges in Link Prediction?
Extending Matrix Factorization for Link Prediction
Why is the Factorization Approach Appealing?
How Do We Combine Explicit and Latent Features?
How Do We Overcome Imbalance?
The Final Model
Experimental Design
Experimental Results
Do latent features improve on unsupervised scores?
Conclusion
References
On Oblique Random Forests
Introduction
Oblique Random Forests
Comparison of Classification Performances
Advantages of Oblique Model Trees
Feature Importance and Sample Proximity
Conclusion
References
An Alternating Direction Method for Dual MAP LP Relaxation
Introduction
MAP and LP Relaxation
The Alternating Direction Method of Multipliers
The Augmented Dual LP Algorithm
Experimental Results
Discussion
References
Aggregating Independent and Dependent Models to Learn Multi-label Classifiers
Introduction
Notation and Related Work
Formal Framework for Multi-label Classification
Some Approaches for Multi-label Classification
Aggregating Independent and Dependent Classifiers
Comparison with Related Approaches
Experiments
AID Classifier vs. Stacking Approach
AID Classifier vs. State-of-the-Art Methods
Conclusions
References
Tensor Factorization Using Auxiliary Information
Introduction
Tensor Completion Problem with Auxiliary Information
Tensor Analysis Using Low-Rank Decomposition
Tensor Completion with Auxiliary Information
Proposed Methods: Within-Mode and Cross-Mode Regularization
Regularization Using Auxiliary Similarity Matrices
Method 1: Within-Mode Regularization
Proposed Method 2: Cross-Mode Regularization
Experiments
Datasets
Experimental Settings
Results
Related Work
Conclusion and Future Work
References
Kernels for Link Prediction with Latent Feature Models
Introduction
Latent Feature Models of Graphs
Biological Motivation
Latent Feature Models of Graphs
Ideal Kernels
Link Kernels with Latent Features
Node Kernels with Latent Features
Relation to Ideal Kernels on Sparse Graphs
Link Kernels with Latent Features
Demonstration
Application on Non-similarity Networks with Latent Features
Latent Feature versus Similarity
Execution Time
Link Prediction Results
Comparison to Sequence-Based Prediction
Conclusion
References
Frequency-Aware Truncated Methods for Sparse Online Learning
Introduction
Linear Sparse Online Supervised Learning
Problem Setting
Related Works
Frequency-Aware Truncated Methods
Subgradient Method with Frequency-Aware Truncation
Regret Analysis of SGFT
Lazy Update
SGFT with Cumulative Penalty
Evaluation
Conclusion
References
A Shapley Value Approach for Influence Attribution
Introduction
Related Work
Problem Setting
Example: Author-Publication Instantiation
Methods
Naïve Approach
The Shapley Value Approach
The Iterative Algorithm
Enforcing Monotonicity of the Gain Function
Experiments
Setup
Experimental Results
Conclusions
References
Fast Approximate Text Document Clustering Using Compressive Sampling
Introduction
Coherence and Random Projections
Sampling Cyclic Signals
Sampling Sparse Signals
Compressive Clustering
Document Clustering
Complex Radial K-means
Approximate k-means Document Clustering
Performance
Cluster Accuracy
Radial K-means without Sampling
Radial K-means with DFT Sampling
Radial K-means with DCT Sampling
Clustering Large Scale Document Sets
Related Work
Conclusion
References
Ancestor Relations in the Presence of Unobserved Variables
Introduction
Bayesian Discovery of Ancestor Relations
Computation
Experiments
Challenges of Learning Ancestor Relations
A Simulation Study
Real Life Data
Discussion
References
ShareBoost: Boosting for Multi-view Learning with Performance Guarantees
Introduction
Related Work
Shared Sampling Algorithm
Randomized Shared Sampling Algorithm
Adversarial Multi-armed Bandit Approach
Exp3.P: Exponential-Weight Algorithm for Exploration and Exploitation
Randomized ShareBoost: Combining ShareBoost and Exp3.P
Convergence Analysis of Randomized ShareBoost
Experiments
Summary
References
Analyzing and Escaping Local Optima in Planning as Inference for Partially Observable Domains
Introduction
Background
Partially Observable Markov Decision Processes
Planning as Inference
State Splitting
Local Optima Analysis
Escaping Local Optima
Forward Search
Node Splitting
Computational Complexity
Experiments
Conclusion
References
Abductive Plan Recognition by Extending Bayesian Logic Programs
Introduction
Background
Logical Abduction
Bayesian Logic Programs
Bayesian Abductive Logic Programs
Logical Abduction
Probabilistic Parameters and Inference
Parameter Learning
Experimental Evaluation
Datasets
Comparison with Other Approaches
Parameter Learning Experiments
Discussion
Related Work
Future Work
Conclusions
References
Higher Order Contractive Auto-Encoder
Introduction
Considered Framework
Setup and Notation
Basic Auto-Encoder
The First-Order Contractive Auto-Encoder
Proposed Higher Order Regularization
Geometric Interpretation
Related Previous Work
Experiments
Analysis of CAE+H
Experimental Results
MNIST Variants
CIFAR-10
Discussion
References
The VC-Dimension of SQL Queries and Selectivity Estimation through Sampling
Introduction
Related Work
Preliminaries
The VC-Dimension of Classes of Queries
Implementation
Experiments
Conclusions
References
Author Index

Schweitzer Fachinformationen