Related papers: Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers

URL: http://arxiv.org/abs/2404.09411v4
Date: Tue, 4 Jun 2024 00:09:59 GMT
Title: Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers
Authors: Doron Haviv, Russell Zhang Kunes, Thomas Dougherty, Cassandra Burdziak, Tal Nawy, Anna Gilbert, Dana Pe'er
Abstract summary: We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space. We show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.
Score: 8.86135871860412
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Optimal transport (OT) and the related Wasserstein metric (W) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.
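The MDS analogy in the abstract can be made concrete with a small sketch. This is not the Wormhole model (there is no transformer, encoder, or decoder here); it is classical MDS applied to an exact pairwise Wasserstein matrix, which is the quadratic-cost baseline that the paper aims to avoid. The toy cohort, the helper names `w2_distance` and `classical_mds`, and the restriction to equal-size, uniformly weighted point clouds are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist, pdist, squareform


def w2_distance(x, y):
    """Exact 2-Wasserstein distance between two equal-size, uniformly weighted point clouds."""
    cost = cdist(x, y, metric="sqeuclidean")
    # With equal sizes and uniform weights, OT reduces to an assignment problem.
    rows, cols = linear_sum_assignment(cost)
    return np.sqrt(cost[rows, cols].mean())


def classical_mds(dist, dim):
    """Classical MDS: embed a distance matrix into `dim` Euclidean dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Negative eigenvalues (non-Euclidean part of the metric) are clipped to zero.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


rng = np.random.default_rng(0)
# Hypothetical toy cohort: 12 point clouds of 30 points in 2-D, each a shifted Gaussian.
shifts = rng.normal(scale=3.0, size=(12, 2))
clouds = [rng.normal(size=(30, 2)) + s for s in shifts]

# Pairwise OT distances: this O(n^2) loop is the step that becomes intractable for large cohorts.
n = len(clouds)
w2 = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        w2[i, j] = w2[j, i] = w2_distance(clouds[i], clouds[j])

embedding = classical_mds(w2, dim=2)
embedded = squareform(pdist(embedding))
rel_error = np.abs(embedded - w2).sum() / w2.sum()
print(f"relative error of embedded distances: {rel_error:.3f}")
```

Once every distribution has a fixed embedding, comparing any two of them costs one Euclidean distance. A learned encoder, as described in the abstract, would additionally embed new distributions without recomputing the pairwise matrix.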

Related papers

Slicing Wasserstein Over Wasserstein Via Functional Optimal Transport [2.649859884914447]
Wasserstein distances define a metric between probability measures on arbitrary metric spaces. Existing sliced WoW accelerations rely on parametric meta-measures or the existence of high-order moments. We show that DSW minimization is equivalent to WoW minimization for discretized meta-measures.
arXiv Detail & Related papers (2025-09-26T09:59:14Z)
Fast Estimation of Wasserstein Distances via Regression on Sliced Wasserstein Distances [70.94157767200342]
We propose a fast estimation method based on regressing Wasserstein distance on sliced Wasserstein distances. We show that accurate models can be learned from a small number of distribution pairs. Our method consistently provides a better approximation of Wasserstein distance than the state-of-the-art Wasserstein embedding model, Wasserstein Wormhole.
arXiv Detail & Related papers (2025-09-24T19:30:53Z)
Differentiable Generalized Sliced Wasserstein Plans [10.764247782316984]
Optimal Transport (OT) has attracted significant interest in the machine learning community. A novel slicing scheme, dubbed min-SWGG, lifts a single one-dimensional plan back to the original multidimensional space. We show that min-SWGG inherits typical limitations of slicing methods. We propose a differentiable approximation scheme to efficiently identify the optimal slice, even in high-dimensional settings.
arXiv Detail & Related papers (2025-05-28T07:18:08Z)
Wasserstein Distances Made Explainable: Insights into Dataset Shifts and Transport Phenomena [3.4991519098475843]
Wasserstein distances provide a powerful framework for comparing data distributions. We propose a novel solution based on Explainable AI that allows us to efficiently and accurately attribute Wasserstein distances to various data components.
arXiv Detail & Related papers (2025-05-09T15:26:38Z)
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds [13.851780805245477]
We derive general constructions of Sliced-Wasserstein distances on Cartan-Hadamard manifolds. We also propose non-parametric schemes to minimize these new distances by approximating their Wasserstein gradient flows.
arXiv Detail & Related papers (2024-03-11T10:01:21Z)
Federated Wasserstein Distance [16.892296712204597]
We introduce a principled way of computing the Wasserstein distance between two distributions in a federated manner. We show how to estimate the Wasserstein distance between two samples stored and kept on different devices/clients whilst a central entity/server orchestrates the computations.
arXiv Detail & Related papers (2023-10-03T11:30:50Z)
Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space. We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size. We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
Markovian Sliced Wasserstein Distances: Beyond Independent Projections [51.80527230603978]
We introduce a new family of SW distances, named Markovian sliced Wasserstein (MSW) distance, which imposes a first-order Markov structure on projecting directions. We compare distances with previous SW variants in various applications such as flows, color transfer, and deep generative modeling to demonstrate the favorable performance of MSW.
arXiv Detail & Related papers (2023-01-10T01:58:15Z)
Unbalanced Optimal Transport, from Theory to Numerics [0.0]
The main motivation for this review is to explain how unbalanced OT, entropic regularization and Gromov-Wasserstein (GW) can work hand-in-hand to turn OT into efficient geometric loss functions for data sciences.
arXiv Detail & Related papers (2022-11-16T09:02:52Z)
Geodesic Sinkhorn for Fast and Accurate Optimal Transport on Manifolds [53.110934987571355]
We propose Geodesic Sinkhorn -- based on a heat kernel on a manifold graph. We apply our method to the computation of barycenters of several distributions of high dimensional single cell data from patient samples undergoing chemotherapy.
arXiv Detail & Related papers (2022-11-02T00:51:35Z)
Spherical Sliced-Wasserstein [14.98994743486746]
Sliced-Wasserstein distance (SW) is restricted to data living in Euclidean spaces. We focus more specifically on the sphere, for which we define a novel SW discrepancy, which we call spherical Sliced-Wasserstein. Our construction is notably based on closed-form solutions of the Wasserstein distance on the circle, together with a new spherical Radon transform.
arXiv Detail & Related papers (2022-06-17T13:48:50Z)
Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions. By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic. We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected. Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances. Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
Theoretical Guarantees for Bridging Metric Measure Embedding and Optimal Transport [18.61019008000831]
We consider a method that embeds the metric measure spaces in a common Euclidean space and computes an optimal transport (OT) on the embedded distributions. This leads to what we call a sub-embedding robust Wasserstein (SERW) distance.
arXiv Detail & Related papers (2020-02-19T17:52:01Z)
Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations. Our main contribution is to propose a sweep line algorithm to compute AE exactly in log-quadratic time, where a naive implementation would be cubic. We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)
Max-Sliced Wasserstein Distance and its use for GANs [55.09958914575673]
Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities. We show that the sample complexity of the distance metrics remains one of the factors affecting GAN training. We show that a proposed distance trains GANs on high-dimensional images up to a resolution of 256x256 easily.
arXiv Detail & Related papers (2019-04-11T17:59:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.