A kernel Principal Component Analysis (kPCA) digest with a new backward
mapping (pre-image reconstruction) strategy
- URL: http://arxiv.org/abs/2001.01958v2
- Date: Wed, 13 Jan 2021 18:28:20 GMT
- Title: A kernel Principal Component Analysis (kPCA) digest with a new backward
mapping (pre-image reconstruction) strategy
- Authors: Alberto García-González, Antonio Huerta, Sergio Zlotnik and Pedro
Díez
- Abstract summary: Principal Component Analysis (PCA) is very effective if data have linear structure, but it fails to identify a possible dimensionality reduction if the data belong to a nonlinear low-dimensional manifold.
For nonlinear dimensionality reduction, kernel Principal Component Analysis (kPCA) is appreciated because of its simplicity and ease of implementation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Methodologies for multidimensionality reduction aim at discovering
low-dimensional manifolds where the data range. Principal Component Analysis (PCA)
is very effective if data have linear structure, but it fails to identify a
possible dimensionality reduction if the data belong to a nonlinear low-dimensional
manifold. For nonlinear dimensionality reduction, kernel Principal Component
Analysis (kPCA) is appreciated because of its simplicity and ease of
implementation. The paper provides a concise review of the main ideas of PCA and kPCA,
trying to collect in a single document aspects that are often dispersed.
Moreover, a strategy to map the reduced-dimension representation back into the
original high-dimensional space is also devised, based on the minimization of a
discrepancy functional.
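The contrast between PCA and kPCA described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Gaussian kernel, the value of `gamma`, and the synthetic data are all assumptions made for illustration, and the paper's backward mapping (pre-image reconstruction) is not reproduced here. The sketch computes linear PCA from an SVD of the centered data and kPCA from an eigendecomposition of the centered Gram matrix.

```python
import numpy as np

def pca(X, n_components):
    """Linear PCA scores via SVD of the centered data (rows of X are samples)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def gaussian_kernel(X, Y, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2); the kernel choice is an assumption."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_pca(K, n_components):
    """kPCA scores of the training samples from a precomputed n x n Gram matrix K."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # Gram matrix of centered features
    w, V = np.linalg.eigh(Kc)                # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:n_components] # keep the largest ones
    w, V = w[idx], V[:, idx]
    # Score of sample i on component j is sqrt(w_j) * V[i, j].
    return V * np.sqrt(np.maximum(w, 0.0)), w

rng = np.random.default_rng(0)

# Sanity check: with a linear kernel, kPCA reduces to PCA (up to sign).
X = rng.normal(size=(50, 3))
Z_pca = pca(X, 2)
Z_lin, _ = kernel_pca(X @ X.T, 2)

# Nonlinear example: two concentric circles (synthetic, illustrative data).
t = rng.uniform(0.0, 2.0 * np.pi, size=200)
r = np.where(np.arange(200) < 100, 1.0, 3.0)
C = np.column_stack([r * np.cos(t), r * np.sin(t)])
Z_rbf, w_rbf = kernel_pca(gaussian_kernel(C, C, gamma=0.5), 2)
```

Working from the Gram matrix rather than the feature map is the design point: the same `kernel_pca` routine serves any kernel, and the linear-kernel case gives a direct check against ordinary PCA.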
Related papers
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Learning-Augmented K-Means Clustering Using Dimensional Reduction [1.7243216387069678]
We propose a solution to reduce the dimensionality of the dataset using Principal Component Analysis (PCA).
PCA is well-established in the literature and has become one of the most useful tools for data modeling, compression, and visualization.
arXiv Detail & Related papers (2024-01-06T12:02:33Z)
- Entropic Wasserstein Component Analysis [8.744017403796406]
A key requirement for dimension reduction (DR) is to incorporate global dependencies among original and embedded samples.
We combine the principles of optimal transport (OT) and principal component analysis (PCA).
Our method seeks the best linear subspace that minimizes reconstruction error using entropic OT, which naturally encodes the neighborhood information of the samples.
arXiv Detail & Related papers (2023-03-09T08:59:33Z)
- DimenFix: A novel meta-dimensionality reduction method for feature preservation [64.0476282000118]
We propose a novel meta-method, DimenFix, which can be operated upon any base dimensionality reduction method that involves a gradient-descent-like process.
By allowing users to define the importance of different features, which is considered in dimensionality reduction, DimenFix creates new possibilities to visualize and understand a given dataset.
arXiv Detail & Related papers (2022-11-30T05:35:22Z)
- Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons [6.71092092685492]
Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling.
This paper reviews selected techniques, extends some of them, and compares their performance through simulations.
Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.
arXiv Detail & Related papers (2021-09-09T17:57:25Z)
- Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z)
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points.
Experiments on data classification and clustering with eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
- Robust Principal Component Analysis: A Median of Means Approach [17.446104539598895]
Principal Component Analysis is a tool for data visualization, denoising, and dimensionality reduction.
Recent supervised learning methods have shown great success in dealing with outlying observations.
This paper proposes a PCA procedure based on the Median of Means (MoM) principle.
arXiv Detail & Related papers (2021-02-05T19:59:05Z)
- A Linearly Convergent Algorithm for Distributed Principal Component Analysis [12.91948651812873]
This paper introduces a feedforward neural network-based one time-scale distributed PCA algorithm termed Distributed Sanger's Algorithm (DSA).
The proposed algorithm is shown to converge linearly to a neighborhood of the true solution.
arXiv Detail & Related papers (2021-01-05T00:51:14Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.