Linearized Wasserstein dimensionality reduction with approximation
guarantees
- URL: http://arxiv.org/abs/2302.07373v1
- Date: Tue, 14 Feb 2023 22:12:16 GMT
- Title: Linearized Wasserstein dimensionality reduction with approximation
guarantees
- Authors: Alexander Cloninger, Keaton Hamm, Varun Khurana, Caroline Moosm\"uller
- Abstract summary: LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
- Score: 65.16758672591365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce LOT Wassmap, a computationally feasible algorithm to uncover
low-dimensional structures in the Wasserstein space. The algorithm is motivated
by the observation that many datasets are naturally interpreted as probability
measures rather than points in $\mathbb{R}^n$, and that finding low-dimensional
descriptions of such datasets requires manifold learning algorithms in the
Wasserstein space. Most available algorithms are based on computing the
pairwise Wasserstein distance matrix, which can be computationally challenging
for large datasets in high dimensions. Our algorithm leverages approximation
schemes such as Sinkhorn distances and linearized optimal transport to speed-up
computations, and in particular, avoids computing a pairwise distance matrix.
We provide guarantees on the embedding quality under such approximations,
including when explicit descriptions of the probability measures are not
available and one must deal with finite samples instead. Experiments
demonstrate that LOT Wassmap attains correct embeddings and that the quality
improves with increased sample size. We also show how LOT Wassmap significantly
reduces the computational cost when compared to algorithms that depend on
pairwise distance computations.
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for computationally spectral descent, using a simple gradient step and outliers.
arXiv Detail & Related papers (2024-09-07T06:37:23Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Efficient and Accurate Optimal Transport with Mirror Descent and
Conjugate Gradients [15.128885770407132]
We design a novel algorithm for optimal transport by drawing from the entropic optimal transport, mirror descent and conjugate gradients literatures.
Our scalable and GPU parallelizable algorithm is able to compute the Wasserstein distance with extreme precision, reaching relative error rates of $10-8$ without numerical stability issues.
arXiv Detail & Related papers (2023-07-17T14:09:43Z) - Near-optimal estimation of smooth transport maps with kernel
sums-of-squares [81.02564078640275]
Under smoothness conditions, the squared Wasserstein distance between two distributions could be efficiently computed with appealing statistical error upper bounds.
The object of interest for applications such as generative modeling is the underlying optimal transport map.
We propose the first tractable algorithm for which the statistical $L2$ error on the maps nearly matches the existing minimax lower-bounds for smooth map estimation.
arXiv Detail & Related papers (2021-12-03T13:45:36Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - Projection Robust Wasserstein Barycenter [36.97843660480747]
approximating the Wasserstein barycenter is numerically challenging because of the curse of dimensionality.
This paper proposes the projection robust Wasserstein barycenter (PRWB) that mitigates the curse of dimensionality.
arXiv Detail & Related papers (2021-02-05T19:23:35Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - A Study of Performance of Optimal Transport [16.847501106437534]
We show that network simplex and augmenting path based algorithms can consistently outperform numerical matrix-scaling based methods.
We present a new algorithm that improves upon the classical Kuhn-Munkres algorithm.
arXiv Detail & Related papers (2020-05-03T20:37:05Z) - Scalable Distributed Approximation of Internal Measures for Clustering
Evaluation [5.144809478361603]
Internal measure for clustering evaluation is the silhouette coefficient, whose computation requires a quadratic number of distance calculations.
We present the first scalable algorithm to compute such a rigorous approximation for the evaluation of clusterings based on any metric distances.
We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation.
arXiv Detail & Related papers (2020-03-03T10:28:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.