Entropic Wasserstein Component Analysis
- URL: http://arxiv.org/abs/2303.05119v1
- Date: Thu, 9 Mar 2023 08:59:33 GMT
- Title: Entropic Wasserstein Component Analysis
- Authors: Antoine Collas, Titouan Vayer, Rémi Flamary, Arnaud Breloy
- Abstract summary: A key requirement for dimension reduction (DR) is to incorporate global dependencies among original and embedded samples.
We combine the principles of optimal transport (OT) and principal component analysis (PCA).
Our method seeks the best linear subspace that minimizes reconstruction error using entropic OT, which naturally encodes the neighborhood information of the samples.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dimension reduction (DR) methods provide systematic approaches for analyzing
high-dimensional data. A key requirement for DR is to incorporate global
dependencies among original and embedded samples while preserving clusters in
the embedding space. To achieve this, we combine the principles of optimal
transport (OT) and principal component analysis (PCA). Our method seeks the
best linear subspace that minimizes reconstruction error using entropic OT,
which naturally encodes the neighborhood information of the samples. From an
algorithmic standpoint, we propose an efficient block-majorization-minimization
solver over the Stiefel manifold. Our experimental results demonstrate that our
approach can effectively preserve high-dimensional clusters, leading to more
interpretable and effective embeddings. Python code of the algorithms and
experiments is available online.
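The abstract above is concrete enough to sketch: project the samples onto a k-dimensional subspace, compute an entropic OT plan between the samples and their reconstructions, and update the orthonormal basis on the Stiefel manifold. The sketch below assumes uniform sample weights and uses plain gradient steps with a QR retraction; it is an illustrative alternating scheme, not the paper's block-majorization-minimization solver, and all function names are hypothetical.

```python
import numpy as np

def sinkhorn(C, eps, n_iter=200):
    """Entropic OT plan between two uniform discrete measures (cost matrix C)."""
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ewca_sketch(X, k, eps=1.0, n_outer=30, lr=0.01, seed=0):
    """Alternate: (1) entropic OT plan between samples and their subspace
    reconstructions, (2) gradient step on the basis U followed by a QR
    retraction back onto the Stiefel manifold."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for _ in range(n_outer):
        R = X @ U @ U.T                                   # reconstructions
        C = ((X[:, None, :] - R[None, :, :]) ** 2).sum(-1)
        P = sinkhorn(C / C.max(), eps)                    # cost rescaled for stability
        q = P.sum(axis=0)                                 # column marginals of the plan
        # Euclidean gradient of sum_ij P_ij ||x_i - U U^T x_j||^2 w.r.t. U
        M = X.T @ P @ X - (R * q[:, None]).T @ X
        G = -2.0 * (M + M.T) @ U
        U, _ = np.linalg.qr(U - lr * G)                   # retraction step
    return U
```

The QR step is the simplest retraction onto the Stiefel manifold; it keeps the basis exactly orthonormal after each gradient update.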
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal from magnitude-only measurements.
Our proposed method avoids the need for computationally expensive spectral initialization, using a simple gradient step while remaining robust to outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z) - Entropic Neural Optimal Transport via Diffusion Processes [105.34822201378763]
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions.
Our algorithm is based on the saddle point reformulation of the dynamic version of EOT which is known as the Schrödinger Bridge problem.
In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step.
arXiv Detail & Related papers (2022-11-02T14:35:13Z)
- Entropic Descent Archetypal Analysis for Blind Hyperspectral Unmixing [45.82374977939355]
We introduce a new algorithm based on archetypal analysis for blind hyperspectral unmixing.
By using six standard real datasets, we show that our approach outperforms state-of-the-art matrix factorization and recent deep learning methods.
arXiv Detail & Related papers (2022-09-22T13:34:21Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
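To illustrate the objective this entry describes, the sketch below evaluates a reconstruction-error term plus an $l_{2,p}$ penalty on the rows of the projection matrix, under the common convention $\|W\|_{2,p} = (\sum_i \|w_i\|_2^p)^{1/p}$; the function names and the exact form of the objective are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def l2p_norm(W, p):
    """Row-wise l_{2,p} mixed norm: (sum_i ||w_i||_2^p)^(1/p)."""
    row_norms = np.linalg.norm(W, axis=1)
    return (row_norms ** p).sum() ** (1.0 / p)

def sparse_pca_objective(X, W, p=0.5, lam=0.1):
    """Reconstruction error plus the p-th power of the l_{2,p} norm,
    i.e. sum_i ||w_i||_2^p; rows of W driven to zero correspond to
    discarded features."""
    recon_err = np.linalg.norm(X - X @ W @ W.T, 'fro') ** 2
    return recon_err + lam * l2p_norm(W, p) ** p
```

With $0 < p < 1$ the penalty is nonconvex and promotes whole zero rows in $W$, which is what makes the method usable for feature selection.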
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
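A minimal sketch of the per-sample weighting idea: one natural instantiation assigns each mini-batch sample a self-normalized exponential weight of its loss, so higher-loss samples dominate the update. This is an assumed simplification (plain SGD, no momentum), not the authors' exact ABSGD rule.

```python
import numpy as np

def absgd_style_weights(losses, lam=1.0):
    """Self-normalized exponential weights: samples with larger loss get
    larger weight (lam > 0); subtracting the max keeps exp() in range."""
    z = (losses - losses.max()) / lam
    w = np.exp(z)
    return w / w.sum()

def weighted_sgd_step(params, per_sample_grads, losses, lr=0.1, lam=1.0):
    """One plain-SGD step using the weighted average of per-sample
    gradients instead of the uniform mini-batch mean."""
    w = absgd_style_weights(losses, lam)
    g = (w[:, None] * per_sample_grads).sum(axis=0)
    return params - lr * g
```

Because the weights are computed from the losses already available in the forward pass, the reweighting adds essentially no cost over standard mini-batch SGD.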
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Unsupervised learning of disentangled representations in deep restricted kernel machines with orthogonality constraints [15.296955630621566]
Constr-DRKM is a deep kernel method for the unsupervised learning of disentangled data representations.
We quantitatively evaluate the proposed method's effectiveness in disentangled feature learning.
arXiv Detail & Related papers (2020-11-25T11:40:10Z)
- Deep Shells: Unsupervised Shape Correspondence with Optimal Transport [52.646396621449]
We propose a novel unsupervised learning approach to 3D shape correspondence.
We show that the proposed method significantly improves over the state-of-the-art on multiple datasets.
arXiv Detail & Related papers (2020-10-28T22:24:07Z)
- A Manifold Proximal Linear Method for Sparse Spectral Clustering with Application to Single-Cell RNA Sequencing Data Analysis [9.643152256249884]
This paper considers the widely adopted sparse spectral clustering (SSC) model as an optimization problem with a nonsmooth and nonconvex objective.
We propose a new manifold proximal linear method (ManPL) that solves the original SSC problem.
Convergence results for the proposed method are established.
arXiv Detail & Related papers (2020-07-18T22:05:00Z)
- IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation [33.424663018395684]
We propose a simple and effective feature selection algorithm to enhance sample similarity preservation.
The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data.
arXiv Detail & Related papers (2020-04-02T23:05:00Z)
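To illustrate the distance-preservation objective in the entry above, here is a naive greedy baseline that forward-selects the features whose induced pairwise distance matrix best matches that of the full data. This is not the IVFS algorithm itself (the abstract does not specify it); the names and the greedy strategy are assumptions for illustration.

```python
import numpy as np

def pairwise_dists(X):
    """Euclidean pairwise distance matrix of the rows of X."""
    sq = (X ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.sqrt(np.maximum(D2, 0.0))

def distance_preserving_selection(X, k):
    """Greedy forward selection: at each step add the feature whose
    inclusion best reproduces the full-data pairwise distances."""
    D_full = pairwise_dists(X)
    chosen, remaining = [], list(range(X.shape[1]))
    while len(chosen) < k:
        best, best_err = None, np.inf
        for j in remaining:
            D = pairwise_dists(X[:, chosen + [j]])
            err = np.linalg.norm(D - D_full)
            if err < best_err:
                best, best_err = j, err
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

On data where one feature dominates the geometry, this baseline picks that feature first, which is the behavior a topology-preserving selector should exhibit.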
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.