Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers
- URL: http://arxiv.org/abs/2404.09411v4
- Date: Tue, 4 Jun 2024 00:09:59 GMT
- Title: Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers
- Authors: Doron Haviv, Russell Zhang Kunes, Thomas Dougherty, Cassandra Burdziak, Tal Nawy, Anna Gilbert, Dana Pe'er
- Abstract summary: We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space.
We show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances.
Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.
- Score: 8.86135871860412
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Optimal transport (OT) and the related Wasserstein metric (W) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.
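The scaling problem the abstract describes can be illustrated with a minimal sketch. This is not the Wormhole method itself; it only shows the brute-force pairwise baseline, restricted to the one-dimensional case where the exact Wasserstein distance between two equal-size samples reduces to matching sorted values. The function names (`wasserstein_1d`, `pairwise_distances`) and the toy Gaussian cohort are hypothetical and chosen for illustration only.

```python
import itertools
import math
import random


def wasserstein_1d(xs, ys, p=2):
    """Exact p-Wasserstein distance between two equal-size 1-D samples.

    In one dimension, the optimal coupling of two uniform empirical
    distributions with the same number of points matches sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)


def pairwise_distances(cohort, p=2):
    """All pairwise distances for a cohort of N samples.

    This requires N * (N - 1) / 2 separate OT computations, which is the
    quadratic growth that motivates learning an embedding instead.
    """
    n = len(cohort)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        dist[i][j] = dist[j][i] = wasserstein_1d(cohort[i], cohort[j], p)
    return dist


rng = random.Random(0)
# Hypothetical cohort: 5 Gaussian samples of 200 points with shifted means.
cohort = [[rng.gauss(mu, 1.0) for _ in range(200)] for mu in range(5)]
dist = pairwise_distances(cohort)
num_solves = math.comb(len(cohort), 2)  # number of OT computations performed
```

In higher dimensions each pairwise computation is itself far more expensive than a sort, so the quadratic number of pairs dominates quickly as the cohort grows. An embedding in which Euclidean distances approximate OT distances replaces each of those solves with a cheap vector comparison.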
Related papers
- Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds [13.851780805245477]
We derive general constructions of Sliced-Wasserstein distances on Cartan-Hadamard manifolds.
We also propose non-parametric schemes to minimize these new distances by approximating their Wasserstein gradient flows.
arXiv Detail & Related papers (2024-03-11T10:01:21Z) - Federated Wasserstein Distance [16.892296712204597]
We introduce a principled way of computing the Wasserstein distance between two distributions in a federated manner.
We show how to estimate the Wasserstein distance between two samples stored and kept on different devices/clients whilst a central entity/server orchestrates the computations.
arXiv Detail & Related papers (2023-10-03T11:30:50Z) - Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Markovian Sliced Wasserstein Distances: Beyond Independent Projections [51.80527230603978]
We introduce a new family of SW distances, named Markovian sliced Wasserstein (MSW) distance, which imposes a first-order Markov structure on projecting directions.
We compare distances with previous SW variants in various applications such as flows, color transfer, and deep generative modeling to demonstrate the favorable performance of MSW.
arXiv Detail & Related papers (2023-01-10T01:58:15Z) - Unbalanced Optimal Transport, from Theory to Numerics [0.0]
We argue that unbalanced OT, entropic regularization and Gromov-Wasserstein (GW) can work hand-in-hand to turn OT into efficient geometric loss functions for data sciences.
The main motivation for this review is to explain how unbalanced OT, entropic regularization and GW can work hand-in-hand to turn OT into efficient geometric loss functions for data sciences.
arXiv Detail & Related papers (2022-11-16T09:02:52Z) - Geodesic Sinkhorn for Fast and Accurate Optimal Transport on Manifolds [53.110934987571355]
We propose Geodesic Sinkhorn -- based on a heat kernel on a manifold graph.
We apply our method to the computation of barycenters of several distributions of high dimensional single cell data from patient samples undergoing chemotherapy.
arXiv Detail & Related papers (2022-11-02T00:51:35Z) - Spherical Sliced-Wasserstein [14.98994743486746]
Sliced-Wasserstein distance (SW) is restricted to data living in Euclidean spaces.
We focus more specifically on the sphere, for which we define a novel SW discrepancy, which we call spherical Sliced-Wasserstein.
Our construction is notably based on closed-form solutions of the Wasserstein distance on the circle, together with a new spherical Radon transform.
arXiv Detail & Related papers (2022-06-17T13:48:50Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z) - Theoretical Guarantees for Bridging Metric Measure Embedding and Optimal Transport [18.61019008000831]
We consider a method that embeds metric measure spaces in a common Euclidean space and computes an optimal transport (OT) on the embedded distributions.
This leads to what we call a sub-embedding robust Wasserstein (SERW) distance.
arXiv Detail & Related papers (2020-02-19T17:52:01Z) - Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations.
Our main contribution is to propose a sweep line algorithm to compute AE exactly in log-quadratic time, where a naive implementation would be cubic.
We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.