Fast and Robust Comparison of Probability Measures in Heterogeneous
Spaces
- URL: http://arxiv.org/abs/2002.01615v3
- Date: Wed, 10 Feb 2021 09:47:53 GMT
- Title: Fast and Robust Comparison of Probability Measures in Heterogeneous
Spaces
- Authors: Ryoma Sato, Marco Cuturi, Makoto Yamada, Hisashi Kashima
- Abstract summary: We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations.
Our main contribution is to propose a sweep line algorithm to compute AE emphexactly in log-quadratic time, where a naive implementation would be cubic.
We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
- Score: 62.35667646858558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comparing two probability measures supported on heterogeneous spaces is an
increasingly important problem in machine learning. Such problems arise when
comparing for instance two populations of biological cells, each described with
its own set of features, or when looking at families of word embeddings trained
across different corpora/languages. For such settings, the Gromov Wasserstein
(GW) distance is often presented as the gold standard. GW is intuitive, as it
quantifies whether one measure can be isomorphically mapped to the other.
However, its exact computation is intractable, and most algorithms that claim
to approximate it remain expensive. Building on \cite{memoli-2011}, who
proposed to represent each point in each distribution as the 1D distribution of
its distances to all other points, we introduce in this paper the Anchor Energy
(AE) and Anchor Wasserstein (AW) distances, which are respectively the energy
and Wasserstein distances instantiated on such representations. Our main
contribution is to propose a sweep line algorithm to compute AE \emph{exactly}
in log-quadratic time, where a naive implementation would be cubic. This is
quasi-linear w.r.t. the description of the problem itself. Our second
contribution is the proposal of robust variants of AE and AW that uses rank
statistics rather than the original distances. We show that AE and AW perform
well in various experimental settings at a fraction of the computational cost
of popular GW approximations. Code is available at
\url{https://github.com/joisino/anchor-energy}.
Related papers
- Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$)
We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z) - Properties of Discrete Sliced Wasserstein Losses [11.280151521887076]
The Sliced Wasserstein (SW) distance has become a popular alternative to the Wasserstein distance for comparing probability measures.
Widespread applications include image processing, domain adaptation and generative modelling, where it is common to optimise some parameters in order to minimise SW.
We investigate the regularity and optimisation properties of this energy, as well as its Monte-Carlo approximation $mathcalE_p$.
arXiv Detail & Related papers (2023-07-19T21:21:18Z) - Energy-Based Sliced Wasserstein Distance [47.18652387199418]
A key component of the sliced Wasserstein (SW) distance is the slicing distribution.
We propose to design the slicing distribution as an energy-based distribution that is parameter-free.
We then derive a novel sliced Wasserstein metric, energy-based sliced Waserstein (EBSW) distance.
arXiv Detail & Related papers (2023-04-26T14:28:45Z) - Linearized Wasserstein dimensionality reduction with approximation
guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Markovian Sliced Wasserstein Distances: Beyond Independent Projections [51.80527230603978]
We introduce a new family of SW distances, named Markovian sliced Wasserstein (MSW) distance, which imposes a first-order Markov structure on projecting directions.
We compare distances with previous SW variants in various applications such as flows, color transfer, and deep generative modeling to demonstrate the favorable performance of MSW.
arXiv Detail & Related papers (2023-01-10T01:58:15Z) - Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and
Costs [45.87981728307819]
The ability to compare and align related datasets living in heterogeneous spaces plays an increasingly important role in machine learning.
The Gromov-Wasserstein (GW) formalism can help tackle this problem.
arXiv Detail & Related papers (2021-06-02T12:50:56Z) - Quantized Gromov-Wasserstein [10.592277756185046]
Quantized Gromov Wasserstein (qGW) is a metric that treats parts as fundamental objects and fits into a hierarchy of theoretical upper bounds for the problem.
We develop an algorithm for approximating optimal GW matchings which yields algorithmic speedups and reductions in memory complexity.
We are able to go beyond outperforming state-of-the-art and apply GW matching at scales that are an order of magnitude larger than in the existing literature.
arXiv Detail & Related papers (2021-04-05T17:03:20Z) - The Unbalanced Gromov Wasserstein Distance: Conic Formulation and
Relaxation [0.0]
Comparing metric measure spaces (i.e. a metric space endowed with aprobability distribution) is at the heart of many machine learning problems.
The most popular distance between such metric spaces is the metric measureGro-Wasserstein (GW) distance of which is a quadratic.
The GW formulation alleviates the comparison of metric spaces equipped with arbitrary positive measures up to isometries.
arXiv Detail & Related papers (2020-09-09T12:38:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.