
Statistical Spoken Language Understanding Systems
Novel Strategies
Wiley-ISTE (Publisher)
1st Edition
Will be published approx. on 14 December 2010
Book
Hardback
254 pages
978-1-84821-203-9 (ISBN)
Description
This book provides a detailed and up-to-date overview of classification and data mining methods. The first part focuses on supervised classification algorithms and their applications, including recent research on the combination of classifiers. The second part deals with unsupervised data mining and knowledge discovery, with special attention to text mining. Discovering the underlying structure of a data set has been a key research topic associated with unsupervised techniques, with multiple applications and challenges, from web-content mining to the inference of cancer subtypes in genomic microarray data. Among those, the book focuses on a new application for dialog systems, which can thereby be made adaptable and portable to different domains. Clustering evaluation metrics and new approaches, such as ensembles of clustering algorithms, are also described.
Product info
GB
Edition
1st edition
Language
English
Place of publication
London
United Kingdom
Target group
Professional and scholarly
Product notice
sewn/stitched
Cloth over boards
Dimensions
Height: 236 mm
Width: 155 mm
Thickness: 20 mm
Weight
499 g
ISBN-13
978-1-84821-203-9 (9781848212039)
Other editions

E-Book
01/2013
Wiley-ISTE
€139.99
Available for download
Persons
Amparo Albalate is a research assistant at the University of Ulm, Institute of Information Technology, Germany, pursuing her PhD on statistical language understanding for Spoken Language Dialog Systems.
Wolfgang Minker is Professor at the University of Ulm, Institute of Information Technology, Germany.
Content
PART 1. STATE OF THE ART
Chapter 1. Introduction
1.1. Organization of the book
1.2. Utterance corpus
1.3. Datasets from the UCI repository
1.4. Microarray dataset
1.5. Simulated datasets
Chapter 2. State of the Art in Clustering and Semi-Supervised Techniques
2.1. Introduction
2.2. Unsupervised machine learning (clustering)
2.3. A brief history of cluster analysis
2.4. Cluster algorithms
2.5. Applications of cluster analysis
2.6. Evaluation methods
2.7. Internal cluster evaluation
2.8. External cluster validation
2.9. Semi-supervised learning
2.10. Summary
PART 2. APPROACHES TO SEMI-SUPERVISED CLASSIFICATION
Chapter 3. Semi-Supervised Classification Using Prior Word Clustering
3.1. Introduction
3.2. Dataset
3.3. Utterance classification scheme
3.4. Semi-supervised approach based on term clustering
3.5. Disambiguation
3.6. Summary
Chapter 4. Semi-Supervised Classification Using Pattern Clustering
4.1. Introduction
4.2. New semi-supervised algorithm using the cluster and label strategy
4.3. Optimum cluster labeling
4.4. Supervised classification block
4.5. Datasets
4.6. An analysis of the bounds for the cluster and label approaches
4.7. Extension through cluster pruning
4.8. Simulations and results
4.9. Summary
PART 3. CONTRIBUTIONS TO UNSUPERVISED CLASSIFICATION - ALGORITHMS TO DETECT THE OPTIMAL NUMBER OF CLUSTERS
Chapter 5. Detection of the Number of Clusters through Non-Parametric Clustering Algorithms
5.1. Introduction
5.2. New hierarchical pole-based clustering algorithm
5.3. Evaluation
5.4. Datasets
5.5. Summary
Chapter 6. Detecting the Number of Clusters through Cluster Validation
6.1. Introduction
6.2. Cluster validation methods
6.3. Combination approach based on quantiles
6.4. Datasets
6.5. Results
6.6. Application of speech utterances
6.7. Summary
Bibliography
Index