Approximating 1-Wasserstein Distance with Trees
- URL: http://arxiv.org/abs/2206.12116v1
- Date: Fri, 24 Jun 2022 07:19:50 GMT
- Title: Approximating 1-Wasserstein Distance with Trees
- Authors: Makoto Yamada, Yuki Takezawa, Ryoma Sato, Han Bao, Zornitsa Kozareva,
Sujith Ravi
- Abstract summary: Wasserstein distance shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications.
One of the challenges in estimating Wasserstein distance is that it is computationally expensive and does not scale well for many distribution comparison tasks.
We propose a simple yet efficient L1-regularized approach to learning the weights of the edges in a tree.
- Score: 41.77145868123863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein distance, which measures the discrepancy between distributions,
shows efficacy in various types of natural language processing (NLP) and
computer vision (CV) applications. One of the challenges in estimating
Wasserstein distance is that it is computationally expensive and does not scale
well for many distribution comparison tasks. In this paper, we aim to
approximate the 1-Wasserstein distance by the tree-Wasserstein distance (TWD),
where TWD is a 1-Wasserstein distance with tree-based embedding and can be
computed in linear time with respect to the number of nodes on a tree. More
specifically, we propose a simple yet efficient L1-regularized approach to
learning the weights of the edges in a tree. To this end, we first show that
the 1-Wasserstein approximation problem can be formulated as a distance
approximation problem using the shortest path distance on a tree. We then show
that the shortest path distance can be represented by a linear model and can be
formulated as a Lasso-based regression problem. Owing to the convex
formulation, we can obtain a globally optimal solution efficiently. Moreover,
we propose a tree-sliced variant of these methods. Through experiments, we
demonstrated that the weighted TWD can accurately approximate the original
1-Wasserstein distance.
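The pipeline the abstract describes, representing shortest-path distances on a tree as a linear model in the edge weights and fitting those weights with Lasso, can be sketched concretely. The toy tree, distributions, and regularization strength below are illustrative assumptions, not the paper's experimental setup, and scikit-learn's `Lasso` stands in for the proposed L1-regularized solver.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical toy tree over 4 leaves:
#        root
#       /    \
#      a      b
#     / \    / \
#    0   1  2   3
# Edge order: root-a, root-b, a-0, a-1, b-2, b-3.
# B[e, i] = 1 if leaf i lies in the subtree below edge e.
B = np.array([
    [1, 1, 0, 0],   # edge root-a: subtree {0, 1}
    [0, 0, 1, 1],   # edge root-b: subtree {2, 3}
    [1, 0, 0, 0],   # edge a-0
    [0, 1, 0, 0],   # edge a-1
    [0, 0, 1, 0],   # edge b-2
    [0, 0, 0, 1],   # edge b-3
], dtype=float)

def twd(mu, nu, w):
    """Tree-Wasserstein distance: a weighted L1 norm of subtree mass
    differences, computable in time linear in the number of edges."""
    return float(w @ np.abs(B @ (mu - nu)))

# Learning edge weights: regress target 1-Wasserstein distances d_ij on
# features z_ij = |B (mu_i - mu_j)| with an L1 penalty (Lasso). Here the
# targets are synthesized from a stand-in weight vector for illustration.
rng = np.random.default_rng(0)
dists = rng.dirichlet(np.ones(4), size=8)            # 8 toy distributions
pairs = [(i, j) for i in range(8) for j in range(i + 1, 8)]
Z = np.array([np.abs(B @ (dists[i] - dists[j])) for i, j in pairs])
true_w = np.array([1.0, 1.0, 0.5, 0.5, 0.5, 0.5])    # stand-in targets
y = Z @ true_w
model = Lasso(alpha=1e-5, positive=True, fit_intercept=False).fit(Z, y)
w_hat = model.coef_
```

Note that because each distribution sums to one, sibling subtree features can be exactly collinear (here, the two root edges); the convex Lasso formulation still yields a global optimum and tends to sparsify such redundant edges. Once `w_hat` is fit, every subsequent distance evaluation is a single pass over the edges.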
Related papers
- A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions [8.176521989807748]
We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW, that is based on computing the partial $2$-Wasserstein distance.
Our distance function achieves higher accuracy in comparison to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.
arXiv Detail & Related papers (2024-05-06T17:41:13Z) - Linearized Wasserstein dimensionality reduction with approximation
guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Robust Multi-Object Tracking by Marginal Inference [92.48078680697311]
Multi-object tracking in videos requires solving a fundamental problem of one-to-one assignment between objects in adjacent frames.
We present an efficient approach to compute a marginal probability for each pair of objects in real time.
It achieves competitive results on MOT17 and MOT20 benchmarks.
arXiv Detail & Related papers (2022-08-07T14:04:45Z) - Fixed Support Tree-Sliced Wasserstein Barycenter [40.087553363884396]
We propose a barycenter under the tree-Wasserstein distance, called the fixed support tree-Wasserstein barycenter (FS-TWB), and its extension, called the fixed support tree-sliced Wasserstein barycenter (FS-TSWB).
We show that the FS-TWB and FS-TSWB can be solved two orders of magnitude faster than the original Wasserstein barycenter.
arXiv Detail & Related papers (2021-09-08T04:50:33Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - Supervised Tree-Wasserstein Distance [21.9998734051455]
We propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric.
We show that the STW distance can be computed fast, and improves the accuracy of document classification tasks.
arXiv Detail & Related papers (2021-01-27T16:24:51Z) - Two-sample Test using Projected Wasserstein Distance [18.46110328123008]
We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning.
A key contribution is to couple optimal projection to find the low dimensional linear mapping to maximize the Wasserstein distance between projected probability distributions.
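The one-dimensional reduction underlying projected Wasserstein distances is easy to illustrate: after projecting onto a direction, the 1-Wasserstein distance between two equal-size empirical samples is the mean absolute difference of their sorted projections. The sketch below fixes the projection by hand; the paper's contribution, optimizing over projections, is not reproduced here, and all sample sizes and distributions are illustrative assumptions.

```python
import numpy as np

def projected_w1(X, Y, theta):
    """1-Wasserstein distance between two equal-size samples after
    projecting onto direction theta. In 1D, W1 reduces to the mean
    absolute difference of sorted projections."""
    px = np.sort(X @ theta)
    py = np.sort(Y @ theta)
    return float(np.mean(np.abs(px - py)))

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(500, 10))
Y = rng.normal(0.5, 1.0, size=(500, 10))   # mean-shifted in every coordinate

# The paper optimizes the projection; here we just evaluate a fixed one.
e0 = np.eye(10)[0]                  # axis-aligned direction (sees the shift)
d_shift = projected_w1(X, Y, e0)    # roughly the mean shift of 0.5
d_same = projected_w1(X, X, e0)     # identical samples, so distance 0
```

Sorting makes each evaluation O(n log n), which is what makes projection-based tests practical at scale compared with solving a full high-dimensional transport problem.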
arXiv Detail & Related papers (2020-10-22T18:08:58Z) - The Unbalanced Gromov Wasserstein Distance: Conic Formulation and
Relaxation [0.0]
Comparing metric measure spaces (i.e., a metric space endowed with a probability distribution) is at the heart of many machine learning problems.
The most popular distance between such spaces is the Gromov-Wasserstein (GW) distance, which is a quadratic program.
The unbalanced GW formulation enables the comparison of metric spaces equipped with arbitrary positive measures up to isometries.
arXiv Detail & Related papers (2020-09-09T12:38:14Z) - On Projection Robust Optimal Transport: Sample Complexity and Model
Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z) - Fast and Robust Comparison of Probability Measures in Heterogeneous
Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations.
Our main contribution is to propose a sweep line algorithm to compute AE exactly in log-quadratic time, where a naive implementation would be cubic.
We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.