
Machine Learning and Knowledge Discovery in Databases
Description
The three-volume set LNAI 9284, 9285, and 9286 constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2015, held in Porto, Portugal, in September 2015.
The 131 papers presented in these proceedings were carefully reviewed and selected from a total of 483 submissions. These include 89 research papers, 11 industrial papers, 14 nectar papers, and 17 demo papers. They are organized in topical sections named: classification, regression and supervised learning; clustering and unsupervised learning; data preprocessing; data streams and online learning; deep learning; distance and metric learning; large scale learning and big data; matrix and tensor analysis; pattern and sequence mining; preference learning and label ranking; probabilistic, statistical, and graphical approaches; rich data; and social and graphs. Part III is divided into an industrial track, a nectar track, and a demo track.

Content
- Intro
- Preface
- Organization
- Abstracts of Invited Talks
- Towards Declarative, Domain-Oriented Data Analysis
- Sum-Product Networks: Deep Models with Tractable Inference
- Mining Online Networks and Communities
- Learning to Acquire Knowledge in a Smart Grid Environment
- Untangling the Web's Invisible Net
- Towards a Digital Time Machine Fueled by Big Data and Social Mining
- Abstracts of Journal Track Articles
- Contents - Part I
- Contents - Part II
- Contents - Part III
- Research Track Classification, Regression and Supervised Learning
- Data Split Strategies for Evolving Predictive Models
- 1 Introduction
- 2 Data Splits for Model Fitting, Selection, and Assessment
- 3 Issues with Evolving Models
- 4 Data Splits for Evolving Models
- 4.1 Parallel Dump Workflow
- 4.2 Serial Waterfall Workflow
- 4.3 Hybrid Workflow
- 5 Bias Due to Test Set Reuse
- 6 Illustration on Synthetic Data
- 7 Case Study: Paraphrase Detection
- 8 Related Work
- 9 Conclusions
- A Appendix: Bias Due to Test Set Reuse
- References
- Discriminative Interpolation for Classification of Functional Data
- 1 Introduction
- 2 Function Representations and Wavelets
- 3 Related Work
- 4 Classification by Discriminative Interpolation
- 4.1 Training Formulation
- 4.2 Testing Formulation
- 5 Experiments
- 6 Conclusion
- References
- Fast Label Embeddings via Randomized Linear Algebra
- 1 Introduction
- 1.1 Contributions
- 2 Algorithm Derivation
- 2.1 Notation
- 2.2 Background
- 2.3 Rank-Constrained Estimation and Embedding
- 2.4 Rembrandt
- 3 Related Work
- 4 Experiments
- 4.1 ALOI
- 4.2 ODP
- 4.3 LSHTC
- 5 Discussion
- References
- Maximum Entropy Linear Manifold for Learning Discriminative Low-Dimensional Representation
- 1 Introduction
- 2 General Idea
- 3 Theory
- 4 Closed form Solution for Objective and its Gradient
- 5 Experiments
- 6 Conclusions
- References
- Novel Decompositions of Proper Scoring Rules for Classification: Score Adjustment as Precursor to Calibration
- 1 Introduction
- 2 Proper Scoring Rules
- 2.1 Scoring Rules
- 2.2 Divergence, Entropy and Properness
- 2.3 Expected Loss and Empirical Loss
- 3 Decompositions with Ideal Scores and Calibrated Scores
- 3.1 Ideal Scores Q and the Decomposition L=EL+IL
- 3.2 Calibrated Scores C and the Decomposition L=CL+RL
- 4 Adjusted Scores A and the Decomposition L=AL+PL
- 4.1 Adjustment
- 4.2 The Right Adjustment Procedure Guarantees Decreased Loss
- 5 Decomposition Theorems and Terminology
- 5.1 Decompositions with S,C,Q,Y
- 5.2 Decompositions with S,A,C,Q,Y and Terminology
- 6 Algorithms and Experiments
- 7 Related Work
- 8 Conclusions
- References
- Parameter Learning of Bayesian Network Classifiers Under Computational Constraints
- 1 Introduction
- 2 Related Work
- 3 Background and Notation
- 4 Algorithms for Online Learning of Reduced-Precision Parameters
- 4.1 Learning Maximum Likelihood Parameters
- 4.2 Learning Maximum Margin Parameters
- 5 Experiments
- 5.1 Datasets
- 5.2 Results
- 6 Discussions
- References
- Predicting Unseen Labels Using Label Hierarchies in Large-Scale Multi-label Learning
- 1 Introduction
- 2 Multi-label Classification
- 3 Model Description
- 3.1 Joint Space Embeddings
- 3.2 Learning with Hierarchical Structures Over Labels
- 3.3 Efficient Gradients Computation
- 3.4 Label Ranking to Binary Predictions
- 4 Experimental Setup
- 5 Experimental Results
- 5.1 Learning All Labels Together
- 5.2 Learning to Predict Unseen Labels
- 6 Pretrained Label Embeddings as Good Initial Guess
- 6.1 Understanding Label Embeddings
- 6.2 Results
- 7 Conclusions
- Regression with Linear Factored Functions
- 1 Introduction
- 1.1 Kernel Regression
- 1.2 Factored Basis Functions
- 2 Regression
- 3 Linear Factored Functions
- 3.1 Function Class
- 3.2 Constraints
- 3.3 Regularization
- 3.4 Optimization
- 4 Empirical Evaluation
- 4.1 Demonstration
- 4.2 Evaluation
- 5 Discussion
- Appendix A LFF Definition and Properties
- Appendix B Inner Loop Derivation
- Appendix C Proofs of the Propositions
- References
- Ridge Regression, Hubness, and Zero-Shot Learning
- 1 Introduction
- 1.1 Background
- 1.2 Research Objective and Contributions
- 2 Zero-Shot Learning as a Regression Problem
- 3 Hubness Phenomenon and the Variance of Data
- 4 Hubness in Regression-Based Zero-Shot Learning
- 4.1 Shrinkage of Projected Objects
- 4.2 Influence of Shrinkage on Nearest Neighbor Search
- 4.3 Additional Argument for Placing Target Objects Closer to the Origin
- 4.4 Summary of the Proposed Approach
- 5 Related Work
- 6 Experiments
- 6.1 Experimental Setups
- 6.2 Task Descriptions and Datasets
- 6.3 Experimental Results
- 7 Conclusion
- References
- Solving Prediction Games with Parallel Batch Gradient Descent
- 1 Introduction
- 2 Problem Setting and Data Transformation Model
- 3 Analysis of Equilibrium Points
- 3.1 Existence of Equilibrium Points
- 3.2 Uniqueness of Equilibrium Points
- 4 Finding the Unique Equilibrium Point Efficiently
- 4.1 Inexact Line Search
- 4.2 Arrow-Hurwicz-Uzawa Method
- 4.3 Parallelized Methods
- 5 Experimental Results
- 5.1 Reference Methods
- 5.2 Performance of the Parameterized Transformation Model
- 5.3 Optimization Algorithms
- 5.4 Parallelized Models
- 6 Conclusion
- References
- Structured Regularizer for Neural Higher-Order Sequence Models
- 1 Introduction
- 2 Related Work
- 3 Higher-Order Conditional Random Fields
- 3.1 Parameter Learning
- 3.2 Forward Algorithm for 2nd-Order CRFs
- 4 Structured Regularizer
- 5 Experiments
- 5.1 TIMIT Data Set
- 5.2 Experimental Setup
- 5.3 Labeling Results Using Only MLP Networks
- 5.4 Labeling Results Using LC-CRFs with Linear or Neural Higher-Order Factors
- 6 Conclusion
- References
- Versatile Decision Trees for Learning Over Multiple Contexts
- 1 Introduction
- 2 Dataset Shift
- 3 Versatile Decision Trees
- 3.1 Constructing Splits Using Percentiles
- 3.2 Adapting for Output Shifts
- 3.3 Versatile Model for Decision Trees
- 4 Experimental Results
- 4.1 Generating Synthetic Shifts
- 4.2 Results of the Synthetic Shifts
- 4.3 Results on Non-synthetic Shifts
- 5 Conclusion
- References
- When is Undersampling Effective in Unbalanced Classification Tasks?
- 1 Introduction
- 2 The Warping Effect of Undersampling on the Posterior Probability
- 3 The Interaction Between Warping and Variance of the Estimator
- 4 Experimental Validation
- 4.1 Synthetic Datasets
- 4.2 Real Datasets
- 5 Conclusion
- References
- Clustering and Unsupervised Learning
- A Kernel-Learning Approach to Semi-supervised Clustering with Relative Distance Comparisons
- 1 Introduction
- 2 Related Work
- 3 Kernel Learning with Relative Distances
- 3.1 Basic Definitions
- 3.2 Relative Distance Constraints
- 3.3 Extension to a Kernel Space
- 3.4 Log Determinant Divergence for Kernel Learning
- 3.5 Problem Definition
- 4 Semi-supervised Kernel Learning
- 4.1 Bregman Projections for Constrained Optimization
- 4.2 Semi-supervised Kernel Learning with Relative Comparisons
- Selecting the Bandwidth Parameter.
- Semi-Supervised Kernel Learning with Relative Comparisons.
- Clustering Method.
- 5 Experimental Results
- 5.1 Datasets
- 5.2 Relative Constraints vs. Pairwise Constraints
- 5.3 Multi-resolution Analysis
- 5.4 Generalization Performance
- 5.5 Effect of Equality Constraints
- 6 Conclusion
- References
- Bayesian Active Clustering with Pairwise Constraints
- 1 Introduction
- 2 Problem Statement
- 3 Bayesian Active Clustering
- 3.1 The Bayesian Clustering Model
- Marginalization of Cluster Labels.
- 3.2 Active Query Selection
- Selection Criteria.
- Computing the Selection Objectives.
- 3.3 The Sequential MCMC Sampling of W
- 3.4 Find the MAP Solution
- 4 Experiments
- 4.1 Dataset and Setup
- 4.2 Effectiveness of the Proposed Clustering Model
- 4.3 Effectiveness of the Overall Active Clustering Model
- 4.4 Analysis of the Acyclic Graph Restriction
- 5 Related Work
- 6 Conclusion
- References
- ConDist: A Context-Driven Categorical Distance Measure
- 1 Introduction
- 2 Related Work
- 3 The Distance Measure ConDist
- 3.1 Definition of ConDist
- 3.2 Attribute Distance dX
- 3.3 Attribute Weighting Function wX
- 3.4 Correlation, Context and Impact
- 3.5 Heterogeneous Data Sets
- 4 Experiments
- 4.1 Evaluation Methodology
- 4.2 Experiment 1 -- Context Attribute Selection
- 4.3 Experiment 2 -- Comparison in the Context of Classification
- 4.4 Experiment 3 -- Comparison in the Context of Clustering
- 5 Discussion
- 5.1 Experiment 1 -- Context Attribute Selection
- 5.2 Experiment 2 -- Comparison in the Context of Classification
- 5.3 Experiment 3 -- Comparison in the Context of Clustering
- 6 Summary
- References
- Discovering Opinion Spammer Groups by Network Footprints
- 1 Introduction
- 2 Measuring Network Footprints
- 2.1 Neighbor Diversity of Nodes
- 2.2 Self-Similarity in Real-World Graphs
- 2.3 NFS Measure
- 3 Detecting Spammer Groups
- 4 Evaluation
- 4.1 Performance of NFS on Synthetic Data
- 4.2 Performance of GroupStrainer on Synthetic Data
- 4.3 Results on Real-World Data
- 5 Related Work
- 6 Conclusion
- References
- Gamma Process Poisson Factorization for Joint Modeling of Network and Documents
- 1 Introduction
- 2 Background and Related Work
- 2.1 Negative Binomial Distribution
- 2.2 Gamma Process
- 2.3 Network Modeling, Topic Modeling and Count Matrix Factorization
- 3 Joint Gamma Process Poisson Factorization (J-GPPF)
- 3.1 Inference via Gibbs Sampling
- 3.2 Special Cases: Network Only GPPF (N-GPPF) and Corpus Only GPPF (C-GPPF)
- 3.3 Computation Complexity
- 4 Experimental Results
- 4.1 Experiments with Synthetic Data
- 4.2 Experiments with Real World Data
- 5 Conclusion
- References
- Generalization in Unsupervised Learning
- 1 Introduction
- 1.1 Preliminaries and Setup
- 2 A General Learning Framework
- 2.1 Generalization and Stability
- 3 Empirical Generalization Analysis
- 3.1 Estimating n From a Finite Data Set
- 3.2 The Trend of "0362n and The Stability Line
- 3.3 Comparing Two Algorithms: A1 vs. A2
- 4 Empirical Validation on Real Data Sets
- 4.1 Generalization Assessment of k-Means Clustering
- 4.2 Generalization Assessment of PCA, LEM, and LLE
- 5 Concluding Remarks
- References
- Multiple Incomplete Views Clustering via Weighted Nonnegative Matrix Factorization with L2,1 Regularization
- 1 Introduction
- 2 Problem Formulation and Backgrounds
- 2.1 Problem Formulation
- 2.2 Weighted Nonnegative Matrix Factorization
- 3 Multi-Incomplete-View Clustering
- 3.1 Objective Function of MIC
- 3.2 Optimization
- Fixing {U(i)} and {V(i)}, minimize O over U*.
- Fixing U*, minimize O over {U(i)} and {V(i)}.
- 4 Experiments and Results
- 4.1 Comparison Methods
- 4.2 Dataset
- 4.3 Results
- 4.4 Parameter Study
- 4.5 Convergence Study
- 5 Related Work
- 6 Conclusion
- References
- Solving a Hard Cutting Stock Problem by Machine Learning and Optimisation
- 1 Introduction
- 2 Cutting Stock Problems
- 3 Problem Formalization
- 4 ILP Model for the CSAWCSP
- 5 Machine Learning Approach for the CSAWCSP
- 5.1 Distribution Learning
- 5.2 Generating Uniformly Distributed Random Vectors
- 5.3 k-Medoids Clustering
- 6 Empirical Study
- 7 Conclusions and Future Work
- References
- Data Preprocessing
- Markov Blanket Discovery in Positive-Unlabelled and Semi-supervised Data
- 1 Introduction
- 2 Background: Markov Blanket
- 2.1 Markov Blanket Discovery Algorithms
- 2.2 Testing Conditional Independence in Categorical Data
- 2.3 Suggested Approach for Semi-supervised MB Discovery
- 3 Background: Partially-Labelled Data
- 3.1 Positive-Unlabelled Data
- 3.2 Semi-supervised Data
- 4 Markov Blanket Discovery in Positive-Unlabelled Data
- 4.1 Testing Conditional Independence in PU Data
- 4.2 Evaluation of Markov Blanket Discovery in PU Data
- 5 Markov Blanket Discovery in Semi-supervised Data
- 5.1 Testing Conditional Independence in Semi-supervised Data
- 5.2 Incorporating Prior Knowledge on Markov Blanket Discovery
- 6 Exploring our Framework Under Class Prior Change --- When and how the Unlabelled Data Help
- 7 Conclusions and Future Work
- A Generation of Network Data and Experimental Protocol
- References
- Multi-view Semantic Learning for Data Representation
- 1 Introduction
- 2 Related Work
- 2.1 Common Notations
- 2.2 NMF-Based Latent Subspace Learning
- 3 Multi-view Semantic Learning
- 3.1 Matrix Factorization with Multi-view Data
- 3.2 Graph Embedding for Multi-view Semantic Learning
- 3.3 Sparseness Constraint
- 3.4 Objective Function of MvSL
- 4 Optimization
- 4.1 Optimizing {U(v)} for v = 1, ..., H
- 4.2 Optimizing V
- 5 Experiment
- 5.1 Data Set
- 5.2 Baselines
- 5.3 Evaluation Metric
- 5.4 Experiment Results
- 5.5 Parameter Sensitive Analysis
- 6 Conclusion
- References
- Unsupervised Feature Analysis with Class Margin Optimization
- 1 Introduction
- 2 Notations and Definitions
- 3 Proposed Method
- 4 Optimization
- 5 Experiments
- 5.1 Experiment Setup
- 5.2 Experimental Results
- 5.3 Studies on Parameter Sensitivity and Convergence
- 6 Conclusion
- References
- Data Streams and Online Learning
- Ageing-Based Multinomial Naive Bayes Classifiers Over Opinionated Data Streams
- 1 Introduction
- 2 Related Work
- 3 Basic Concepts
- 3.1 Basic Model: Multinomial Naive Bayes
- 4 Ageing-Based Multinomial Naive Bayes
- 4.1 Ageing-Based MNB Model
- 4.2 Ageing-Based MNB Classification
- 4.3 Aggressive Fading MNB Alternative
- 5 Experiments
- 5.1 Data and Concept Changes
- 5.2 Evaluation Methods and Evaluation Measures
- 5.3 Classifier Performance
- 5.4 Impact of the Fading Factor on the New Algorithms
- 5.5 The Effect of Temporal Granularity and How to Set
- 6 Conclusions and Outlook
- References
- Drift Detection Using Stream Volatility
- 1 Introduction
- 2 Related Work
- 3 Preliminaries
- 4 Our Concept and Design
- 4.1 Predictive Approach
- 4.2 Online Adaptation Approach
- 4.3 Application onto ADWIN
- 5 Experimental Evaluation
- 5.1 False Positive Test
- 5.2 True Positive Test
- 5.3 False Negative Test
- 5.4 Real-World Data: Power Supply Dataset
- 5.5 Case Study: Incremental Classifier
- 6 Conclusion and Future Work
- References
- Early Classification of Time Series as a Non Myopic Sequential Decision Making Problem
- 1 Introduction
- 2 A Generic Framework and Positions of Related Works
- 3 A Formal Analysis and a Naïve Approach
- 4 The Proposed Approach
- 5 Implementation
- 6 Experiments
- 6.1 Controlled Experiments
- 6.2 Experiments on a Real Data Set
- 7 Conclusion and Future Works
- References
- Ising Bandits with Side Information
- 1 Introduction
- 2 Background and Preliminaries
- 2.1 Semi-supervised Graph Classifier Complexity
- 2.2 Ising Model at Low Temperature
- 2.3 Multi-Armed Bandit Problem (MAB)
- 2.4 Formulation
- 3 Maximum Flow Computation
- 3.1 Playing Ising Bandits
- 4 Experiments
- 4.1 Dataset Description
- 4.2 Synthetic Dataset
- 4.3 Graph Generation from Datasets
- 4.4 Evaluation Criteria
- 4.5 Results
- 5 Conclusion
- References
- Refined Algorithms for Infinitely Many-Armed Bandits with Deterministic Rewards
- 1 Introduction
- 2 Model Formulation and Lower Bound
- 3 Optimal Sample Size
- 4 Extensions
- 4.1 Anytime Algorithm
- 4.2 Non-retainable Arms
- 5 Experiments
- 5.1 Retainable Arms
- 5.2 Anytime Algorithm
- 5.3 Non-Retainable Arms
- 6 Conclusion and Discussion
- References
- Deep Learning
- An Empirical Investigation of Minimum Probability Flow Learning Under Different Connectivity Patterns
- 1 Introduction
- 2 Restricted Boltzmann Machines
- 3 Minimum Probability Flow
- 3.1 Dynamics of the Model
- 3.2 Form of the Transition Matrix
- 4 Probability Flow Rates
- 4.1 1-bit Flip Connections
- 4.2 Factorized Minimum Probability Flow
- 4.3 Persistent Minimum Probability Flow
- 5 Experiments
- 5.1 MNIST - Exact Log Likelihood
- 5.2 MNIST - Estimating Log Likelihood
- 5.3 Caltech 101 Silhouettes - Estimating Log Likelihood
- 6 Conclusion
- References
- A Minimum Probability Flow
- A.1 Dynamics of The Model
- Difference Target Propagation
- 1 Introduction
- 2 Target Propagation
- 2.1 Formulating Targets
- 2.2 How to Assign a Proper Target to Each Layer
- 2.3 Difference Target Propagation
- 2.4 Training an Auto-Encoder with Difference Target Propagation
- 3 Experiments
- 3.1 Deterministic Feedforward Deep Networks
- 3.2 Networks with Discretized Transmission Between Units
- 3.3 Stochastic Networks
- 3.4 Auto-Encoder
- 4 Conclusion
- References
- A Proof of Theorem 1
- B Proof of Theorem 2
- Online Learning of Deep Hybrid Architectures for Semi-supervised Categorization
- 1 Introduction
- 2 Related Work
- 3 Deep Hybrid Architectures
- 3.1 The Stacked Boltzmann Experts Network (SBEN)
- 3.2 Hybrid Stacked Denoising Auto-Encoders (HSDA)
- 3.3 Ensembling of Layer-Wise Experts
- 4 Experimental Results
- 4.1 Finite Dataset Learning Performance
- 4.2 Incremental Learning Performance
- 5 Conclusions
- References
- Scoring and Classifying with Gated Auto-Encoders
- 1 Introduction
- 2 Gated Auto-Encoders
- 3 Gated Auto-Encoder Scoring
- 3.1 Vector Field Representation
- 3.2 Scoring the GAE
- 4 Relationship to Restricted Boltzmann Machines
- 4.1 Gated Auto-Encoder and Factored Gated Conditional Restricted Boltzmann Machines
- 4.2 Mean-Covariance Auto-Encoder and Mean-Covariance Restricted Boltzmann Machines
- 5 Classification with Gated Auto-Encoders
- 5.1 Classification Using Class-Specific Gated Auto-Encoders
- 5.2 Multi-label Classification via Optimization in Label Space
- 6 Conclusion
- References
- Sign Constrained Rectifier Networks with Applications to Pattern Decompositions
- 1 Introduction
- 2 The Categories of Separable Pattern Sets
- 3 Binary Classification with Rectifier Networks
- 4 Single-Hidden-Layer Sign Constrained Rectifier Networks
- 5 Two-Hidden-Layer Sign Constrained Rectifier Networks
- 6 Discussion
- References
- Aggregation Under Bias: Rényi Divergence Aggregation and Its Implementation via Machine Learning Markets
- 1 Introduction
- 2 Background
- 3 Problem Statement
- 4 Weighted Divergence Aggregation
- 4.1 Weighted Rényi Divergence Aggregation
- Properties.
- 5 Maximum Entropy Arguments
- Interim Summary.
- 6 Implementation
- 7 Experiments
- Task 1: Aggregation on Simulated Data.
- Task 2: Aggregation on Chords from Bach Chorales.
- Task 3: Aggregation on Kaggle Competition.
- Results.
- 8 Machine Learning Markets and Rényi Divergence Aggregation
- 9 Discussion
- References
- Distance and Metric Learning
- Higher Order Fused Regularization for Supervised Learning with Grouped Parameters
- 1 Introduction
- 2 Regularized Supervised Learning
- 3 Higher Order Fused Regularizer
- 3.1 Review of Submodular Functions and Robust Pn Potential
- 3.2 Definition of HOF Penalty
- 4 Optimization
- 4.1 Proximity Operator via Minimum-Norm-Point Problem
- 4.2 Network Flow Algorithm
- 5 Related Work
- 6 Experiments
- 6.1 Synthetic Data
- 6.2 Real-World Data
- 7 Conclusion
- References
- Joint Semi-supervised Similarity Learning for Linear Classification
- 1 Introduction
- 2 Related Work
- 3 Joint Metric and Classifier Learning
- 4 Generalization Bound for Joint Similarity Learning
- 5 Experiments
- 5.1 Experimental Setting
- 5.2 Experimental Results
- 6 Conclusion
- References
- Learning Compact and Effective Distance Metrics with Diversity Regularization
- 1 Introduction
- 2 Related Works
- 3 Diversify Distance Metric Learning
- 3.1 A Latent Space Modeling View of DML
- 3.2 Diversify DML
- 3.3 Optimization
- 4 Experiments
- 4.1 Datasets
- 4.2 Experimental Settings
- 4.3 Retrieval
- 4.4 Clustering
- 4.5 Classification
- 4.6 Sensitivity to Parameters
- 5 Conclusions
- References
- Scalable Metric Learning for Co-Embedding
- 1 Introduction
- 2 Metric Learning
- 3 Co-embedding as Metric Learning
- 4 Algorithm
- 5 Empirical Computational Complexity
- 6 Case Study: Multi-label Classification
- 7 Case Study: Tagging via Tensor Completion
- 8 Conclusion
- References
- A An Auxiliary Lemma
- Large Scale Learning and Big Data
- Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems
- 1 Introduction
- 2 Primal-dual Framework for Convex-Concave Saddle Point Problems
- 3 Adaptive Stochastic Primal-Dual Coordinate Descent
- 3.1 Convergence Analysis
- 3.2 More Comparison with SPDC
- 4 Empirical Results
- 4.1 Ridge Regression
- 4.2 Binary Classification on Real-world Datasets
- 5 Conclusion and Future Work
- References
- Hash Function Learning via Codewords
- 1 Introduction
- 2 Formulation
- 3 Learning Algorithm
- 4 Insights to Generalization Performance
- 5 Experiments
- 5.1 Supervised Hash Learning Results
- 5.2 Transductive Hash Learning Results
- 6 Conclusions
- References
- HierCost: Improving Large Scale Hierarchical Classification with Cost Sensitive Learning
- 1 Introduction
- 2 Definitions and Notations
- 3 Motivation and Related Work
- 4 Methods
- 4.1 Cost Calculations
- 4.2 Optimization
- 4.3 Dealing with Hierarchical Multi-label Classification
- 5 Experimental Evaluations
- 5.1 Datasets
- 5.2 Evaluation Metrics
- 5.3 Experimental Details
- 5.4 Methods for Comparison
- 5.5 Results
- 6 Conclusions
- References
- Large Scale Optimization with Proximal Stochastic Newton-Type Gradient Descent
- 1 Introduction and Problem Statement
- 1.1 Notations and Assumptions
- 2 The PROXTONE Method
- 2.1 The Regularized Quadratic Model in Algorithm 2
- 2.2 The Hessian Approximation
- 3 Convergence Analysis
- 4 Numerical Experiments
- 5 Conclusions
- A Proof of Theorem 1
- B Proof of Theorem 2
- References
- Erratum to: Bayesian Active Clustering with Pairwise Constraints
- Erratum to: Scalable Metric Learning for Co-Embedding
- Author Index
- Erratum to: Predicting Unseen Labels Using Label Hierarchies in Large-Scale Multi-label Learning
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.