Chapter 1: Nonlinear dimensionality reduction
Nonlinear dimensionality reduction, also known as manifold learning, refers to a family of related techniques that project high-dimensional data, which may lie on non-linear manifolds that linear decomposition methods cannot adequately capture, onto lower-dimensional latent manifolds. The goal is either to visualize the data in the low-dimensional space or to learn the mapping itself (either from the high-dimensional space to the low-dimensional embedding or vice versa). The approaches presented here can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as principal component analysis and singular value decomposition.
High-dimensional data can be difficult for computers to work with because of the substantial time and space required to process it. It also poses a challenge for humans, who find it hard to interpret or comprehend data distributed across more than three dimensions. Reducing the dimensionality of a data set while keeping its essential features largely intact can therefore make algorithms more efficient and allow analysts to visualize trends and patterns.
The lower-dimensional representations of data are often referred to as "intrinsic variables", a description that suggests these are the values from which the data were produced. As an illustration, consider a dataset containing images of the letter 'A' that have been rotated and scaled by varying amounts. Each image has 32×32 pixels and can therefore be represented as a vector of 1024 pixel values. Each such vector is a sample on a two-dimensional manifold in 1024-dimensional space (a Hamming space). The intrinsic dimensionality is two, because two variables (rotation and scale) were varied in order to generate the data. Information about the shape and appearance of the letter 'A' is not part of the intrinsic variables, since it is identical in every instance. Nonlinear dimensionality reduction discards the correlated information (the letter 'A') and recovers only the varying information (rotation and scale). The figure on the right shows sample images from this dataset (to conserve space, not all input images are shown), together with a plot of the two-dimensional points produced by applying a nonlinear dimensionality reduction (NLDR) algorithm (in this case, Manifold Sculpting) to reduce the data to just two dimensions.
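As a concrete sketch of how such a dataset might be constructed, the following snippet (an illustrative assumption, not taken from any particular benchmark) rotates and rescales a crude placeholder glyph with scipy.ndimage and flattens each 32×32 image into a 1024-dimensional vector; the two loop variables play the role of the intrinsic variables.

```python
import numpy as np
from scipy import ndimage

# Sketch of the kind of dataset described above: one base glyph is rotated
# and scaled by two generative parameters, and each resulting 32x32 image
# is flattened into a 1024-dimensional vector.  The base image is a simple
# placeholder pattern, not an actual letter 'A'.
base = np.zeros((32, 32))
base[8:24, 14:18] = 1.0   # crude vertical bar standing in for the glyph
base[14:18, 8:24] = 1.0   # crossbar

samples, params = [], []
for angle in np.linspace(0, 90, 10):         # intrinsic variable 1: rotation
    for scale in np.linspace(0.6, 1.4, 10):  # intrinsic variable 2: scale
        img = ndimage.rotate(base, angle, reshape=False, order=1)
        img = ndimage.zoom(img, scale, order=1)
        # pad or crop back to 32x32 so every vector has 1024 entries
        out = np.zeros((32, 32))
        h, w = img.shape
        if h >= 32:
            r0, c0 = (h - 32) // 2, (w - 32) // 2
            out = img[r0:r0 + 32, c0:c0 + 32]
        else:
            r0, c0 = (32 - h) // 2, (32 - w) // 2
            out[r0:r0 + h, c0:c0 + w] = img
        samples.append(out.ravel())
        params.append((angle, scale))

X = np.array(samples)   # 100 points on a 2-D manifold in R^1024
print(X.shape)          # (100, 1024)
```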
By comparison, when principal component analysis (PCA), a linear dimensionality reduction algorithm, is used to reduce the same dataset to two dimensions, the resulting values are not nearly as well organized. This demonstrates that the high-dimensional vectors sampling this manifold, each of which represents a letter 'A', vary in a non-linear manner.
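The rotated-letter dataset itself is not reproduced here, so the following hedged sketch uses a synthetic swiss roll from scikit-learn as a stand-in for data that varies non-linearly, and compares linear PCA with Isomap (substituted for Manifold Sculpting, which has no standard library implementation).

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# Linear PCA flattens the roll and mixes points from different turns,
# while a non-linear method (Isomap here) unrolls it.
X, color = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

Z_pca = PCA(n_components=2).fit_transform(X)                      # linear projection
Z_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)   # non-linear embedding

print(Z_pca.shape, Z_iso.shape)   # (1000, 2) (1000, 2)
```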
It should therefore be clear that NLDR has several applications in the field of computer vision. For example, consider a robot that uses a camera to navigate in a closed, static environment. The images captured by that camera can be regarded as samples on a manifold in high-dimensional space, and the intrinsic variables of that manifold describe the robot's position and orientation.
Invariant manifolds are of broad interest in dynamical systems for model order reduction. If an attracting invariant manifold exists in the phase space, nearby trajectories converge onto it and remain on it indefinitely, which makes the manifold a candidate for dimensionality reduction of the dynamical system. Although such manifolds are not guaranteed to exist in general, the theory of spectral submanifolds (SSM) gives conditions for the existence of unique attracting invariant objects in a broad class of dynamical systems. Active research in NLDR seeks to unfold the observation manifolds associated with dynamical systems in order to develop modeling techniques.
Some of the more prominent nonlinear dimensionality reduction techniques are listed below.
Sammon's mapping is one of the first and most widely used nonlinear dimensionality reduction techniques.
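As a rough illustration of the idea, here is a minimal, unoptimized gradient-descent sketch of Sammon's stress minimization in numpy; the classic algorithm uses a pseudo-Newton update instead, and the function name, learning rate and iteration count are illustrative choices.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Sammon's mapping minimizes the stress
#   E = (1 / sum_ij D_ij) * sum_ij (D_ij - d_ij)^2 / D_ij,
# where D_ij are pairwise distances in the input space and d_ij are
# distances in the low-dimensional embedding.
def sammon(X, n_iter=300, lr=0.1, seed=0):
    D = squareform(pdist(X))                 # input-space distances
    scale = 1.0 / D.sum()
    np.fill_diagonal(D, 1.0)                 # avoid division by zero
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=1e-2, size=(len(X), 2))   # random 2-D start
    for _ in range(n_iter):
        d = squareform(pdist(Y))
        np.fill_diagonal(d, 1.0)
        # gradient of the stress with respect to each embedded point
        ratio = (D - d) / (D * d)
        np.fill_diagonal(ratio, 0.0)
        grad = -2.0 * scale * (ratio[:, :, None] * (Y[:, None, :] - Y[None, :, :])).sum(axis=1)
        Y -= lr * grad
    return Y
```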
The self-organizing map (SOM, also called the Kohonen map) and its probabilistic counterpart, the generative topographic mapping (GTM), use a point representation in the embedded space to form a latent variable model based on a non-linear mapping from the embedded space to the high-dimensional space. These techniques are related to work on density networks, which are built around the same probabilistic model.
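The following is a minimal numpy sketch of SOM training under common default choices; the grid size, decay schedules and parameter names are assumptions for illustration, not a reference implementation.

```python
import numpy as np

# Minimal self-organizing map sketch: a 10x10 grid of code-book vectors is
# fitted to data X by repeatedly moving the best-matching unit (and, to a
# lesser degree, its grid neighbours) toward a randomly drawn sample.
def train_som(X, grid=(10, 10), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    n_units = grid[0] * grid[1]
    # grid coordinates of each unit, used by the neighbourhood function
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], dtype=float)
    W = rng.normal(size=(n_units, X.shape[1]))        # code-book vectors
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                       # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5           # decaying neighbourhood width
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))   # best-matching unit
        # Gaussian neighbourhood on the 2-D grid around the BMU
        h = np.exp(-((coords - coords[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W, coords
```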
One of the most widely used algorithms for dimensionality reduction is kernel principal component analysis (kernel PCA). Standard PCA begins by computing the covariance matrix of the $m \times n$ data matrix $\mathbf{X}$ and then projects the data onto the first $k$ eigenvectors of that matrix.
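For reference, here is a minimal numpy sketch of the plain PCA step just described: an eigendecomposition of the covariance matrix of a centred data matrix, followed by projection onto the leading eigenvectors. The function name is illustrative.

```python
import numpy as np

# Plain PCA: eigendecomposition of the covariance matrix of an m x n data
# matrix X, followed by projection onto the first k eigenvectors.
def pca_project(X, k=2):
    Xc = X - X.mean(axis=0)                    # centre the data
    C = np.cov(Xc, rowvar=False)               # n x n covariance matrix
    vals, vecs = np.linalg.eigh(C)             # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # k leading eigenvectors
    return Xc @ top                            # m x k projected data
```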
Kernel PCA (KPCA), by contrast, begins by computing the covariance matrix of the data after it has been mapped into a higher-dimensional space, and then, just as in PCA, projects the transformed data onto the first $k$ eigenvectors of that matrix. The kernel trick factors away much of the computation, so the entire process can be carried out without ever evaluating the feature map $\Phi(\mathbf{x})$ explicitly. Of course, $\Phi$ must be chosen so that it has a known corresponding kernel. Unfortunately, it is not trivial to find a good kernel for a given problem, so KPCA does not yield satisfactory results on some problems when standard kernels are used; for example, it is known to perform poorly with these kernels on the Swiss roll manifold. However, certain other methods that perform well in such settings, such as Laplacian eigenmaps and locally linear embedding (LLE), can be viewed as special cases of kernel PCA obtained by constructing a data-dependent kernel matrix.
Because KPCA has an internal model, it can also be used to map points onto its embedding that were not available at training time.
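A short sketch using scikit-learn's KernelPCA illustrates both points: a standard RBF kernel applied to a toy dataset, and the out-of-sample mapping provided by the fitted model. The dataset and kernel parameters are arbitrary choices for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Kernel PCA with an RBF kernel on concentric circles, a dataset that
# linear PCA cannot separate.  Because KPCA keeps an internal model,
# transform() can also embed points not seen during fitting.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_new = X[:300], X[300:]

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0)
Z_train = kpca.fit_transform(X_train)   # embedding of the training points
Z_new = kpca.transform(X_new)           # out-of-sample points mapped afterwards

print(Z_train.shape, Z_new.shape)       # (300, 2) (100, 2)
```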
Principal curves and manifolds provide the natural geometric framework for nonlinear dimensionality reduction. They extend the geometric interpretation of principal component analysis (PCA) by explicitly constructing an embedded manifold and by encoding data through ordinary geometric projection onto that manifold. This approach was originally proposed by Trevor Hastie in his 1984 thesis and formally introduced in 1989; the idea has since been explored by many authors.
How the "simplicity" of the manifold is defined depends on the problem at hand, but it is typically measured by the manifold's intrinsic dimensionality and/or smoothness. The principal manifold is usually defined as the solution to an optimization problem whose objective function combines a measure of data-approximation quality with penalty terms for the bending of the manifold. Popular initial approximations are produced by linear PCA and by Kohonen's self-organizing map (SOM).
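The following is a very simplified sketch in the spirit of principal curves and elastic maps, alternating a projection step with an averaging step and a bending penalty. The function name and parameters are illustrative assumptions, and real implementations differ in many details.

```python
import numpy as np

# Toy principal-curve fit: a polyline is adjusted by (1) assigning each
# data point to its nearest curve node and (2) moving each node toward
# the mean of its assigned points, with a penalty pulling interior nodes
# toward the midpoint of their neighbours (the "bending" penalty above).
def fit_principal_curve(X, n_nodes=20, n_iter=50, stiffness=0.3):
    mean = X.mean(axis=0)
    Xc = X - mean
    # initialise the curve along the first (linear) principal component
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    t = Xc @ Vt[0]
    nodes = mean + np.outer(np.linspace(t.min(), t.max(), n_nodes), Vt[0])
    for _ in range(n_iter):
        # projection step: nearest node for every data point
        d2 = ((X[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        # approximation step: move nodes toward the mean of their points
        for j in range(n_nodes):
            pts = X[assign == j]
            if len(pts):
                nodes[j] = (1 - stiffness) * pts.mean(axis=0) + stiffness * nodes[j]
        # bending penalty: pull interior nodes toward neighbour midpoints
        nodes[1:-1] = (1 - stiffness) * nodes[1:-1] + stiffness * 0.5 * (nodes[:-2] + nodes[2:])
    return nodes
```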
Laplacian eigenmaps use spectral techniques to perform dimensionality reduction. The fundamental assumption is that the data lie on a low-dimensional manifold within a high-dimensional space. The algorithm cannot embed out-of-sample points, but techniques based on reproducing kernel Hilbert space regularization exist for adding this capability, and they can also be applied to other nonlinear dimensionality reduction methods.
Conventional methods, such as principal component analysis, do not take the intrinsic geometry of the data into account. Laplacian eigenmaps build a graph from the neighborhood information of the data set: each data point is a node, and connectivity between nodes is determined by the proximity of neighboring points (for instance, via the k-nearest-neighbor algorithm). The graph constructed in this way can be viewed as a discrete approximation of the low-dimensional manifold embedded in the high-dimensional space. Minimizing a cost function based on the graph ensures that points close to each other on the manifold are mapped close to each other in the low-dimensional space, preserving local distances.
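A minimal sketch of this construction follows, assuming an unnormalized graph Laplacian and a dense eigensolver (adequate only for small datasets); scikit-learn's SpectralEmbedding provides a more complete implementation of the same idea.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

# Laplacian-eigenmaps sketch: build a k-nearest-neighbour graph, form the
# unnormalised graph Laplacian L = D - W, and embed the data with the
# eigenvectors of the generalised problem L v = lambda D v, skipping the
# trivial constant eigenvector.
def laplacian_eigenmaps(X, n_components=2, n_neighbors=10):
    # symmetrised 0/1 adjacency from the kNN graph
    A = kneighbors_graph(X, n_neighbors=n_neighbors, mode="connectivity")
    W = 0.5 * (A + A.T).toarray()
    W[W > 0] = 1.0
    D = np.diag(W.sum(axis=1))
    L = D - W
    # dense generalised eigendecomposition (fine for small datasets)
    vals, vecs = eigh(L, D)
    order = np.argsort(vals)
    return vecs[:, order[1:n_components + 1]]   # drop the constant eigenvector
```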