
Unsupervised Learning
Content
1 Introduction
1.1 Part I: The Self-Organizing Method
1.2 Part II: Dynamic Self-Organization for Image Filtering and Multimedia Retrieval
1.3 Part III: Dynamic Self-Organization for Image Segmentation and Visualization
1.4 Future Directions
2 Unsupervised Learning
2.1 Introduction
2.2 Unsupervised Clustering
2.3 Distance Metrics for Unsupervised Clustering
2.4 Unsupervised Learning Approaches
2.4.1 Partitioning and Cluster Membership
2.4.2 Iterative Mean-Squared Error Approaches
2.4.3 Mixture Decomposition Approaches
2.4.4 Agglomerative Hierarchical Approaches
2.4.5 Graph-Theoretic Approaches
2.4.6 Evolutionary Approaches
2.4.7 Neural Network Approaches
2.5 Assessing Cluster Quality and Validity
2.5.1 Cost Function-Based Cluster Validity Indices
2.5.2 Density-Based Cluster Validity Indices
2.5.3 Geometric-Based Cluster Validity Indices
3 Self-Organization
3.1 Introduction
3.2 Principles of Self-Organization
3.2.1 Synaptic Self-Amplification and Competition
3.2.2 Cooperation
3.2.3 Knowledge Through Redundancy
3.3 Fundamental Architectures
3.3.1 Adaptive Resonance Theory
3.3.2 Self-Organizing Map
3.4 Other Fixed Architectures for Self-Organization
3.4.1 Neural Gas
3.4.2 Hierarchical Feature Map
3.5 Emerging Architectures for Self-Organization
3.5.1 Dynamic Hierarchical Architectures
3.5.2 Nonstationary Architectures
3.5.3 Hybrid Architectures
3.6 Conclusion
4 Self-Organizing Tree Map
4.1 Introduction
4.2 Architecture
4.3 Competitive Learning
4.4 Algorithm
4.5 Evolution
4.5.1 Dynamic Topology
4.5.2 Classification Capability
4.6 Practical Considerations, Extensions, and Refinements
4.6.1 The Hierarchical Control Function
4.6.2 Learning, Timing, and Convergence
4.6.3 Feature Normalization
4.6.4 Stop Criteria
4.7 Conclusions
5 Self-Organization in Impulse Noise Removal
5.1 Introduction
5.2 Review of Traditional Median-Type Filters
5.3 The Noise-Exclusive Adaptive Filtering
5.3.1 Feature Selection and Impulse Detection
5.3.2 Noise Removal Filters
5.4 Experimental Results
5.5 Detection-Guided Restoration and Real-Time Processing
5.5.1 Introduction
5.5.2 Iterative Filtering
5.5.3 Recursive Filtering
5.5.4 Real-Time Processing of Impulse Corrupted TV Pictures
5.5.5 Analysis of the Processing Time
5.6 Conclusions
6 Self-Organization in Image Retrieval
6.1 Retrieval of Visual Information
6.2 Visual Feature Descriptor
6.2.1 Color Histogram and Color Moment Descriptors
6.2.2 Wavelet Moment and Gabor Texture Descriptors
6.2.3 Fourier and Moment-based Shape Descriptors
6.2.4 Feature Normalization and Selection
6.3 User-Assisted Retrieval
6.3.1 Radial Basis Function Method
6.4 Self-Organization for Pseudo Relevance Feedback
6.5 Directed Self-Organization
6.5.1 Algorithm
6.6 Optimizing Self-Organization for Retrieval
6.6.1 Genetic Principles
6.6.2 System Architecture
6.6.3 Genetic Algorithm for Feature Weight Detection
6.7 Retrieval Performance
6.7.1 Directed Self-Organization
6.7.2 Genetic Algorithm Weight Detection
6.8 Summary
7 The Self-Organizing Hierarchical Variance Map
7.1 An Intuitive Basis
7.2 Model Formulation and Breakdown
7.2.1 Topology Extraction via Competitive Hebbian Learning
7.2.2 Local Variance via Hebbian Maximal Eigenfilters
7.2.3 Global and Local Variance Interplay for Map Growth and Termination
7.3 Algorithm
7.3.1 Initialization, Continuation, and Presentation
7.3.2 Updating Network Parameters
7.3.3 Vigilance Evaluation and Map Growth
7.3.4 Topology Adaptation
7.3.5 Node Adaptation
7.3.6 Optional Tuning Stage
7.4 Simulations and Evaluation
7.4.1 Observations of Evolution and Partitioning
7.4.2 Visual Comparisons with Popular Mean-Squared Error Architectures
7.4.3 Visual Comparison Against Growing Neural Gas
7.4.4 Comparing Hierarchical with Tree-Based Methods
7.5 Tests on Self-Determination and the Optional Tuning Stage
7.6 Cluster Validity Analysis on Synthetic and UCI Data
7.6.1 Performance vs. Popular Clustering Methods
7.6.2 IRIS Dataset
7.6.3 WINE Dataset
7.7 Summary
8 Microbiological Image Analysis Using Self-Organization
8.1 Image Analysis in the Biosciences
8.1.1 Segmentation: The Common Denominator
8.1.2 Semi-supervised versus Unsupervised Analysis
8.1.3 Confocal Microscopy and Its Modalities
8.2 Image Analysis Tasks Considered
8.2.1 Visualising Chromosomes During Mitosis
8.2.2 Segmenting Heterogeneous Biofilms
8.3 Microbiological Image Segmentation
8.3.1 Effects of Feature Space Definition
8.3.2 Fixed Weighting of Feature Space
8.3.3 Dynamic Feature Fusion During Learning
8.4 Image Segmentation Using Hierarchical Self-Organization
8.4.1 Gray-Level Segmentation of Chromosomes
8.4.2 Automated Multilevel Thresholding of Biofilm
8.4.3 Multidimensional Feature Segmentation
8.5 Harvesting Topologies to Facilitate Visualization
8.5.1 Topology Aware Opacity and Gray-Level Assignment
8.5.2 Visualization of Chromosomes During Mitosis
8.6 Summary
9 Closing Remarks and Future Directions
9.1 Summary of Main Findings
9.1.1 Dynamic Self-Organization: Effective Models for Efficient Feature Space Parsing
9.1.2 Improved Stability, Integrity, and Efficiency
9.1.3 Adaptive Topologies Promote Consistency and Uncover Relationships
9.1.4 Online Selection of Class Number
9.1.5 Topologies Represent a Useful Backbone for Visualization or Analysis
9.2 Future Directions
9.2.1 Dynamic Navigation for Information Repositories
9.2.2 Interactive Knowledge-Assisted Visualization
9.2.3 Temporal Data Analysis Using Trajectories
Appendix A
A.1 Global and Local Consistency Error
References
Index
CHAPTER 1
Introduction
With the explosion of information brought about by this Multimedia Age, the question of how such information might be effectively harvested, archived, and analysed remains a monumental challenge facing today’s research community. The processing of such information, however, is often fraught with the need for conceptual interpretation, a relatively simple task for humans, yet arduous for computers. In order to handle the oppressive volumes of information that are becoming readily accessible in consumer and industrial sectors, some level of automation is desirable.
Automation requires computational systems that exhibit some degree of intelligence, in terms of the ability of a system to formulate its own models of the data in question with little or no user intervention. Such systems must be able to make basic decisions about what information is actually important and what is not. In effect, like a human user, the system must be able to discover characteristic properties of the data in some appropriate manner, without a teacher. This process is known as unsupervised learning (sometimes referred to as clustering or unsupervised pattern classification; an essentially pure form of data mining).
This book primarily introduces a new approach to the general problem of unsupervised learning, based on the principles of dynamic self-organization. Inspired by the relative success of other popular research on self-organizing neural networks for data clustering and feature extraction, this book presents new members of the family of generative Self-Organizing Maps, namely the self-organizing tree map (SOTM) and its advanced form, the self-organizing hierarchical variance map (SOHVM). While the devised approach is essentially generic, the core applications considered in this book are automatic, unsupervised data clustering for multimedia applications and unsupervised segmentation of microbiological image data.
1.1 PART I: THE SELF-ORGANIZING METHOD
Computational technologies based on Artificial Neural Networks (ANN) have been the focus of much research into the problem of unsupervised learning, in particular network architectures that are based on principles of Self-Organization. Such principles are in many ways centered on Turing’s initial observation in 1952 [1], namely, that global order can arise from local interactions. With much support from neurobiological research, such mechanisms are believed to be analogous to the organization that takes place in the human brain.
Clustering algorithms use unsupervised learning rules to group unlabeled training data into similar or dense clusters. Unsupervised training algorithms depend upon internally generated error measures, which are derived solely from training data. The network has no knowledge of the correct answer during training and, consequently, must derive the errors and the necessary weight modifications directly from the statistics of the training data. As a result, input patterns are stored as a set of cluster prototypes or exemplars—representations or natural groupings of similar data. In forming a description of an unknown set of data, such network architectures are characterized by their adherence to four key properties [2]: synaptic self-amplification for mining correlated stimuli, competition over limited resources, cooperative encoding of information, and the implicit ability to encode pattern redundancy as knowledge. Such principles are, in many ways, a reflection of Turing’s observations previously discussed.
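The idea of deriving weight updates from the training data alone can be illustrated with plain winner-take-all competitive learning. The sketch below is not the SOTM or any algorithm from this book; the function name `competitive_learning`, the learning rate, and the epoch count are illustrative assumptions.

```python
import random

def competitive_learning(data, n_prototypes, epochs=20, lr=0.1, seed=0):
    """Winner-take-all clustering sketch: only the closest prototype adapts."""
    rng = random.Random(seed)
    # Initialise prototypes from randomly chosen training samples (assumption).
    prototypes = [list(p) for p in rng.sample(data, n_prototypes)]
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            # Competition: the prototype with the smallest squared distance wins.
            winner = min(
                prototypes,
                key=lambda w: sum((xi - wi) ** 2 for xi, wi in zip(x, w)),
            )
            # Adaptation: the error (x - w) comes from the data, not from labels.
            for i, xi in enumerate(x):
                winner[i] += lr * (xi - winner[i])
    return prototypes
```

Given unlabeled samples drawn from a few well-separated groups, the returned prototypes tend to settle near the group centers, which is the sense in which input patterns are "stored" as exemplars.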
Part I of this book consists of Chapters 2 and 3. It gives an extensive review of the general problems of unsupervised clustering, with emphasis placed on the inherent relationship that exists between unsupervised learning and Self-Organization. The unsupervised learning problem is first defined with respect to the concepts of similarity and distance. A survey of unsupervised techniques from the broader field is then conducted to establish the context for more focused surveys on self-organization-based principles and architectures. The issue of validating unsupervised clustering solutions in the absence of a ground truth is also addressed.
1.2 PART II: DYNAMIC SELF-ORGANIZATION FOR IMAGE FILTERING AND MULTIMEDIA RETRIEVAL
Multimedia processing has seen impressive growth in the past decade in terms of both theoretical development and applications. It represents a leading technology in a number of important areas that warrant significant need for data mining, namely, digital telecommunications, multimedia systems, high dimensional image analysis and visualization, information retrieval, biology, robotics and manufacturing, and intelligent sensing systems. Inherently unsupervised in nature, neural network architectures based on principles of Self-Organization appear to be a natural fit.
In Part II of this book, the SOTM and its recent successful applications in multimedia processing are presented. This neural network architecture incorporates hierarchical properties by virtue of its growth, in a manner that is flexible in terms of revealing the underlying data space without being constrained by an imposed topological framework. As such, the SOTM exhibits many desirable properties over traditional self-organizing feature map (SOFM) based strategies. Chapter 4 provides in-depth coverage of this architecture. Chapters 5 and 6 then cover a series of pertinent real-world applications in the processing of multimedia data. These include problems in image processing, such as the automated modeling and removal of impulse noise in digital images, and problems in image classification for multimedia indexing and retrieval.
In Chapter 4, the SOTM algorithm is developed, and a number of enhancements and modifications are proposed, justified, and tested, with the goal of rendering the SOTM more robust when applied to different datasets. Specifically, alternative modalities for hierarchical control and learning are considered, in addition to more appropriate stopping criteria linked to aspects of the input data. The SOTM is then applied to segmenting biofilm images, where its strengths and flexibility as a dynamic clustering model for segmentation are examined. Limitations and deficiencies of the SOTM are also identified.
In Chapter 5, the SOTM is applied to the automated modeling and removal of impulse noise in digital images. Improving the quality of images degraded by noise is a classic problem in image processing [3]. In the early stages of signal and image processing, linear filters were the primary tools for noise cleaning. Later, the development of nonlinear filtering techniques for signal and image processing was spurred by some drawbacks of linear filters [4]. However, one problem with nonlinear filters such as the median filter is that they remove the fine details in the image and change the signal structure. Improved nonlinear filters, such as the weighted median filter, multistage median filter, and nonlinear mean filters, preserve detail better, but at the expense of poorer noise suppression. Here, a novel approach for suppressing impulse noise in digital images is proposed that preserves more image detail than previously proposed methods. The noise removal system, shown in Figure 1.1a, consists of two steps: the detection of the noise and the reconstruction of the image. As the SOTM network has the capability to classify pixels in an image, it is employed to detect the impulses. A noise-exclusive median (NEM) filtering algorithm and a noise-exclusive arithmetic mean (NEAM) filtering algorithm are proposed to restore the image. This system is able to detect noise locations accurately, and thus achieves the best possible restoration of images corrupted by impulse noise.
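The two-step structure (detect, then reconstruct) can be sketched as follows. This is only one plausible reading of a "noise-exclusive" median; the actual NEM filter is defined in Chapter 5. The detector here is a hypothetical stand-in that flags saturated pixels, not the SOTM classifier, and both function names are assumptions.

```python
from statistics import median_low

def detect_impulses(img, low=0, high=255):
    """Stand-in detector (assumption): flag saturated pixels as impulses."""
    return [[p <= low or p >= high for p in row] for row in img]

def noise_exclusive_median(img, mask, radius=1):
    """Replace each flagged pixel with the median of its unflagged neighbours."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue  # clean pixels are left untouched to preserve detail
            # Collect window pixels, excluding any that were flagged as noise.
            clean = [
                img[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
                if not mask[rr][cc]
            ]
            if clean:  # with no clean neighbour, keep the original value
                out[r][c] = median_low(clean)
    return out
```

Because only flagged pixels are modified, and flagged neighbours never enter the median, fine detail in uncorrupted regions passes through unchanged. That is the property a plain median filter lacks.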
FIGURE 1.1 Unsupervised Learning–based framework for (a) automated modeling and removal of impulse noise in digital images and (b) image classification in multimedia indexing and retrieval.
In Chapter 6, the SOTM is applied to problems in image classification for multimedia indexing and retrieval. The system architecture is shown in Figure 1.1b. In multimedia database retrieval, relevance feedback (RF) is a popular and effective way to improve the performance of image re-ranking and retrieval. However, RF needs a high level of human participation, which often leads to excessive subjective errors. Here, an automatic RF is presented, using the SOTM, which minimizes user participation, providing a more user-friendly environment and avoiding errors caused by excessive human involvement. Unlike the conventional retrieval system, where the user’s direct input is required in the execution of the RF algorithm, SOTM estimation is now adopted to guide the adaptation of the RF parameters. As shown in Figure 1.1b, the initially retrieved samples are labeled by the unsupervised module, and image re-ranking is performed using the pseudo-labeled samples. As a result, instead of imposing a greater responsibility on the user, independent learning can be integrated to improve retrieval accuracy. This makes it possible to obtain either a fully automatic or a semiautomatic RF system suitable for practical applications.
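The flow of retrieving, pseudo-labeling without a user, and re-ranking can be sketched roughly as below. Everything specific here is an assumption made for illustration: a two-cluster k-means step stands in for the SOTM, features are plain numeric tuples, and re-ranking simply uses the centroid of the cluster nearest the query. The book's actual RF parameter adaptation is described in Chapter 6.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def rank(query, database):
    """Database indices sorted by increasing distance to the query."""
    return sorted(range(len(database)), key=lambda i: dist(query, database[i]))

def pseudo_relevance_feedback(query, database, top_k=6, iters=10):
    """Re-rank using pseudo-labels from a clustering stand-in (not the SOTM)."""
    top = rank(query, database)[:top_k]  # initial retrieval
    # Seed two clusters with the nearest and farthest of the top results.
    centers = [list(database[top[0]]), list(database[top[-1]])]
    for _ in range(iters):
        groups = [[], []]
        for i in top:
            v = database[i]
            groups[0 if dist(v, centers[0]) <= dist(v, centers[1]) else 1].append(v)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    # Pseudo-label: the cluster closer to the query is treated as relevant.
    relevant = min(centers, key=lambda c: dist(query, c))
    return rank(relevant, database)  # re-rank around the relevant cluster
```

No user input appears anywhere in the loop; the labels that would normally come from a person are replaced by the cluster assignment, which is the essence of the automatic RF described above.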
1.3 PART III: DYNAMIC SELF-ORGANIZATION FOR IMAGE SEGMENTATION AND VISUALIZATION
Much emphasis of this book is placed on Part III, on the developments of the SOHVM and its application in the unsupervised segmentation and visualization of microbiological...