Augmented Sliced Wasserstein Distances
- URL: http://arxiv.org/abs/2006.08812v7
- Date: Thu, 17 Mar 2022 12:14:25 GMT
- Title: Augmented Sliced Wasserstein Distances
- Authors: Xiongjie Chen, Yongxin Yang, Yunpeng Li
- Abstract summary: We propose a new family of distance metrics, called augmented sliced Wasserstein distances (ASWDs).
ASWDs are constructed by first mapping samples to higher-dimensional hypersurfaces parameterized by neural networks.
Numerical results demonstrate that the ASWD significantly outperforms other Wasserstein variants for both synthetic and real-world problems.
- Score: 55.028065567756066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While theoretically appealing, the application of the Wasserstein distance to
large-scale machine learning problems has been hampered by its prohibitive
computational cost. The sliced Wasserstein distance and its variants improve
the computational efficiency through the random projection, yet they suffer
from low accuracy if the number of projections is not sufficiently large,
because the majority of projections result in trivially small values. In this
work, we propose a new family of distance metrics, called augmented sliced
Wasserstein distances (ASWDs), constructed by first mapping samples to
higher-dimensional hypersurfaces parameterized by neural networks. It is
derived from a key observation that (random) linear projections of samples
residing on these hypersurfaces would translate to much more flexible nonlinear
projections in the original sample space, so they can capture complex
structures of the data distribution. We show that the hypersurfaces can be
optimized by gradient ascent efficiently. We provide the condition under which
the ASWD is a valid metric and show that this can be obtained by an injective
neural network architecture. Numerical results demonstrate that the ASWD
significantly outperforms other Wasserstein variants for both synthetic and
real-world problems.
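For intuition, here is a minimal NumPy sketch (not the authors' implementation) of the sliced Wasserstein estimator and of the augmentation idea. The mapping g(x) = [x, f(x)] below is one simple injective choice consistent with the injectivity condition mentioned in the abstract, with f taken as a fixed random ReLU layer rather than a network optimized by gradient ascent as the paper proposes.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=128, p=2, rng=None):
    """Monte Carlo sliced Wasserstein distance between two equal-size samples
    X, Y of shape (n, d), using random directions on the unit sphere."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(n_projections, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)      # unit directions
    # In 1D, optimal transport between equal-size empirical measures reduces
    # to matching sorted projections (order statistics).
    x_proj = np.sort(X @ theta.T, axis=0)
    y_proj = np.sort(Y @ theta.T, axis=0)
    return (np.abs(x_proj - y_proj) ** p).mean() ** (1.0 / p)

def augment(X, weight, bias):
    """Hypothetical injective augmentation g(x) = [x, relu(W x + b)]:
    concatenating the input keeps the map injective, and linear slices of
    g(x) act as nonlinear slices of x."""
    return np.concatenate([X, np.maximum(X @ weight + bias, 0.0)], axis=1)

# Toy usage: slice in the augmented space instead of the original space.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 2))
Y = rng.normal(loc=1.0, size=(256, 2))
W, b = rng.normal(size=(2, 8)), rng.normal(size=8)
print(sliced_wasserstein(X, Y))                                # plain SW
print(sliced_wasserstein(augment(X, W, b), augment(Y, W, b)))  # SW on the hypersurface
```

In the paper the augmentation network is trained (by gradient ascent) to make the projections discriminative; the random weights here only illustrate where that network sits in the computation.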
Related papers
- Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels [78.6096486885658]
We introduce lower bounds to the linearized Laplace approximation of the marginal likelihood.
These bounds are amenable to gradient-based optimization and allow trading off estimation accuracy against computational complexity.
arXiv Detail & Related papers (2023-06-06T19:02:57Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
- Projected Sliced Wasserstein Autoencoder-based Hyperspectral Images Anomaly Detection [42.585075865267946]
We propose the Projected Sliced Wasserstein (PSW) autoencoder-based anomaly detection method.
In particular, a computation-friendly eigen-decomposition is leveraged to find the principal components along which the high-dimensional data are sliced.
Comprehensive experiments conducted on various real-world hyperspectral anomaly detection benchmarks demonstrate the superior performance of the proposed method.
arXiv Detail & Related papers (2021-12-20T09:21:02Z)
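As a rough illustration of the slicing step described in the entry above (and not of the full PSW autoencoder pipeline), the following NumPy sketch replaces random slicing directions with the leading eigenvectors of the pooled sample covariance; the function name and the choice of k are illustrative.

```python
import numpy as np

def pca_sliced_w2(X, Y, k=4):
    """Slice two equal-size samples along the top-k principal directions of the
    pooled data (eigen-decomposition of the covariance), then average the
    squared 1D Wasserstein distances over those slices."""
    pooled = np.vstack([X, Y])
    eigvals, eigvecs = np.linalg.eigh(np.cov(pooled, rowvar=False))
    directions = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # top-k eigenvectors
    x_proj = np.sort(X @ directions, axis=0)                 # 1D OT = sorted matching
    y_proj = np.sort(Y @ directions, axis=0)
    return np.sqrt(((x_proj - y_proj) ** 2).mean())
```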
- Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections [19.987683989865708]
The Sliced-Wasserstein distance (SW) is being increasingly used in machine learning applications.
We propose a new perspective to approximate SW by making use of the concentration of measure phenomenon.
Our method does not require sampling random projections, and is therefore both accurate and easy to use compared to the usual Monte Carlo approximation.
arXiv Detail & Related papers (2021-06-29T13:56:19Z)
- Two-sample Test using Projected Wasserstein Distance [18.46110328123008]
We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning.
A key contribution is to couple the search for an optimal projection with optimal transport, finding the low-dimensional linear mapping that maximizes the Wasserstein distance between the projected probability distributions.
arXiv Detail & Related papers (2020-10-22T18:08:58Z)
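A crude stand-in for the projection search described in the entry above is the following NumPy sketch: subgradient ascent for a single unit direction (max-sliced style) rather than the coupled low-dimensional optimization the paper studies, and without the permutation calibration a two-sample test would need. Names and step sizes are illustrative.

```python
import numpy as np

def max_projected_direction(X, Y, steps=200, lr=0.1, rng=None):
    """Subgradient ascent on the unit sphere for a direction theta maximizing
    the squared 1D Wasserstein distance between two equal-size projected samples."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=X.shape[1])
    theta /= np.linalg.norm(theta)
    for _ in range(steps):
        ix, iy = np.argsort(X @ theta), np.argsort(Y @ theta)   # current 1D matching
        diff = (X[ix] - Y[iy]) @ theta                          # matched differences
        grad = 2.0 * (X[ix] - Y[iy]).T @ diff / len(diff)       # matching held fixed
        theta = theta + lr * grad
        theta /= np.linalg.norm(theta)                          # project back to sphere
    diff = np.sort(X @ theta) - np.sort(Y @ theta)
    return theta, np.sqrt(np.mean(diff ** 2))
```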
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, obtained by averaging over subspaces rather than optimizing over them.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
- Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
Projection robust Wasserstein (PRW) distance, also called Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance.
This paper provides a first step into the computation of the PRW distance and provides links between its theory and experiments on synthetic and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z)
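To make the subspace idea from the PRW/IPRW entries above concrete, here is a hedged NumPy/SciPy sketch that samples random k-dimensional subspaces and computes the exact 2-Wasserstein distance between the projected (equal-size) samples via an assignment solver. Averaging the values loosely mirrors the IPRW idea, while taking the maximum is only a rough Monte Carlo proxy for the PRW optimization, not the Riemannian algorithm studied above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def w2_empirical(X, Y):
    """Exact 2-Wasserstein distance between equal-size empirical measures,
    solved as a linear assignment problem on squared Euclidean costs."""
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)
    return np.sqrt(cost[rows, cols].mean())

def subspace_w2(X, Y, k=2, n_subspaces=50, rng=None):
    """Project onto random k-dimensional subspaces (orthonormalized Gaussian
    frames) and compute W2 of the projections."""
    rng = np.random.default_rng(rng)
    vals = []
    for _ in range(n_subspaces):
        Q, _ = np.linalg.qr(rng.normal(size=(X.shape[1], k)))  # random subspace basis
        vals.append(w2_empirical(X @ Q, Y @ Q))
    vals = np.array(vals)
    return vals.mean(), vals.max()   # (IPRW-style average, crude PRW proxy)
```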
- Distributional Sliced-Wasserstein and Applications to Generative Modeling [27.014748003733544]
Sliced-Wasserstein distance (SW) and its variant, Max Sliced-Wasserstein distance (Max-SW), have been widely used in recent years.
We propose a novel distance, named Distributional Sliced-Wasserstein distance (DSW).
We show that the DSW is a generalization of Max-SW, and it can be computed efficiently by searching for the optimal push-forward measure.
arXiv Detail & Related papers (2020-02-18T04:35:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.