Approximating 1-Wasserstein Distance with Trees
- URL: http://arxiv.org/abs/2206.12116v1
- Date: Fri, 24 Jun 2022 07:19:50 GMT
- Title: Approximating 1-Wasserstein Distance with Trees
- Authors: Makoto Yamada, Yuki Takezawa, Ryoma Sato, Han Bao, Zornitsa Kozareva,
Sujith Ravi
- Abstract summary: Wasserstein distance shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications.
One of the challenges in estimating Wasserstein distance is that it is computationally expensive and does not scale well for many distribution comparison tasks.
We propose a simple yet efficient L1-regularized approach to learning the weights of the edges in a tree.
- Score: 41.77145868123863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein distance, which measures the discrepancy between distributions,
shows efficacy in various types of natural language processing (NLP) and
computer vision (CV) applications. One of the challenges in estimating
Wasserstein distance is that it is computationally expensive and does not scale
well for many distribution comparison tasks. In this paper, we aim to
approximate the 1-Wasserstein distance by the tree-Wasserstein distance (TWD),
where TWD is a 1-Wasserstein distance with tree-based embedding and can be
computed in linear time with respect to the number of nodes on a tree. More
specifically, we propose a simple yet efficient L1-regularized approach to
learning the weights of the edges in a tree. To this end, we first show that
the 1-Wasserstein approximation problem can be formulated as a distance
approximation problem using the shortest path distance on a tree. We then show
that the shortest path distance can be represented by a linear model and can be
formulated as a Lasso-based regression problem. Owing to the convex
formulation, we can obtain a globally optimal solution efficiently. Moreover,
we propose a tree-sliced variant of these methods. Through experiments, we
demonstrated that the weighted TWD can accurately approximate the original
1-Wasserstein distance.
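The linear-time claim in the abstract rests on a well-known closed form: on a rooted tree, the 1-Wasserstein distance between two distributions is the weighted sum, over all edges, of the absolute net mass imbalance in the subtree below each edge. The following is a minimal sketch of that computation (not the authors' code; function and variable names are hypothetical, and it assumes nodes are numbered so that every parent index is smaller than its child's, e.g. by BFS order):

```python
def tree_wasserstein(parent, weight, mu, nu):
    """1-Wasserstein distance between mu and nu supported on a weighted tree.

    parent[v] is the parent of node v (parent[root] == -1); weight[v] is the
    length of the edge (v, parent[v]). Runs in O(n) time because the optimal
    transport across each edge equals the net mass imbalance of the subtree
    hanging below it. Assumes parent[v] < v for every non-root v.
    """
    n = len(parent)
    diff = [mu[v] - nu[v] for v in range(n)]   # local mass imbalance per node
    total = 0.0
    for v in range(n - 1, 0, -1):              # visit children before parents
        total += weight[v] * abs(diff[v])      # mass carried over edge (v, parent[v])
        diff[parent[v]] += diff[v]             # fold subtree imbalance into parent
    return total

# Toy tree: node 0 is the root; 1, 2 hang off 0; 3, 4 hang off 1.
parent = [-1, 0, 0, 1, 1]
weight = [0.0, 1.0, 1.0, 2.0, 2.0]  # weight[0] unused (root has no parent edge)
mu = [0, 0, 0, 1, 0]                # all mass at node 3
nu = [0, 0, 1, 0, 0]                # all mass at node 2
print(tree_wasserstein(parent, weight, mu, nu))  # path 3-1-0-2: 2 + 1 + 1 = 4.0
```

This also makes the paper's Lasso formulation concrete: since each pairwise tree distance is a fixed 0/1 combination of edge weights (the edges on the connecting path), stacking those indicator rows into a matrix $B$ gives $d_T = Bw$, and the edge weights $w$ can be fit to exact 1-Wasserstein distances by L1-regularized regression, with the L1 penalty pruning unneeded edges.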
Related papers
- Fast unsupervised ground metric learning with tree-Wasserstein distance [14.235762519615175]
Unsupervised ground metric learning approaches have been introduced.
We propose to augment the WSV method by embedding samples and features on trees, on which we compute the tree-Wasserstein distance (TWD).
We demonstrate theoretically and empirically that the algorithm converges to a better approximation of the full WSV approach than the best known alternatives, and does so with $\mathcal{O}(n^3)$ complexity.
arXiv Detail & Related papers (2024-11-11T23:21:01Z)
- Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$).
We show that $RW_p$ distances are also true distance metrics defined on the quotient set $\mathcal{P}_p(\mathbb{R}^n)/\sim$ and are invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
- Sharper Bounds for Chebyshev Moment Matching with Applications to Differential Privacy and Beyond [26.339024618084476]
We prove that accurate recovery in the Wasserstein distance is possible with more noise than previously known.
As a main application, our result yields a simple "linear query" algorithm for constructing a differentially private synthetic data distribution.
We illustrate a second application of our new moment-based recovery bound in numerical linear algebra.
arXiv Detail & Related papers (2024-08-22T13:26:41Z)
- A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions [8.176521989807748]
We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW, which is based on computing the partial $2$-Wasserstein distance.
Our distance function achieves higher accuracy in comparison to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.
arXiv Detail & Related papers (2024-05-06T17:41:13Z)
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
- Robust Multi-Object Tracking by Marginal Inference [92.48078680697311]
Multi-object tracking in videos requires solving a fundamental problem of one-to-one assignment between objects in adjacent frames.
We present an efficient approach to compute a marginal probability for each pair of objects in real time.
It achieves competitive results on MOT17 and MOT20 benchmarks.
arXiv Detail & Related papers (2022-08-07T14:04:45Z)
- Fixed Support Tree-Sliced Wasserstein Barycenter [40.087553363884396]
We propose a barycenter under the tree-Wasserstein distance, called the fixed support tree-Wasserstein barycenter (FS-TWB), and its extension, the fixed support tree-sliced Wasserstein barycenter (FS-TSWB).
We show that the FS-TWB and FS-TSWB can be solved two orders of magnitude faster than the original Wasserstein barycenter.
arXiv Detail & Related papers (2021-09-08T04:50:33Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
- Supervised Tree-Wasserstein Distance [21.9998734051455]
We propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric.
We show that the STW distance can be computed fast, and improves the accuracy of document classification tasks.
arXiv Detail & Related papers (2021-01-27T16:24:51Z)
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.