Two-sample Test using Projected Wasserstein Distance
- URL: http://arxiv.org/abs/2010.11970v4
- Date: Fri, 29 Mar 2024 14:40:04 GMT
- Title: Two-sample Test using Projected Wasserstein Distance
- Authors: Jie Wang, Rui Gao, Yao Xie
- Abstract summary: We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning.
A key contribution is to couple optimal projection to find the low dimensional linear mapping to maximize the Wasserstein distance between projected probability distributions.
- Score: 18.46110328123008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning: given two sets of samples, to determine whether they are from the same distribution. In particular, we aim to circumvent the curse of dimensionality in Wasserstein distance: when the dimension is high, it has diminishing testing power, which is inherently due to the slow concentration property of Wasserstein metrics in the high dimension space. A key contribution is to couple optimal projection to find the low dimensional linear mapping to maximize the Wasserstein distance between projected probability distributions. We characterize the theoretical property of the finite-sample convergence rate on IPMs and present practical algorithms for computing this metric. Numerical examples validate our theoretical results.
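The idea in the abstract (project both samples to a low dimension, take the largest Wasserstein distance over projections, then test) can be illustrated with a rough sketch. This is not the authors' algorithm. It assumes a one-dimensional projection, replaces the optimal-projection step with a random search over unit directions, requires equal sample sizes, and calibrates with a permutation test. All function names and parameters here are hypothetical.

```python
import math
import random


def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size 1-D samples (mean gap of sorted values)."""
    assert len(xs) == len(ys), "this sketch assumes equal sample sizes"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def project(points, direction):
    """Project each point (a list of floats) onto a direction via the dot product."""
    return [sum(p * d for p, d in zip(point, direction)) for point in points]


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere by normalizing a Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def projected_w1(x, y, directions):
    """Largest 1-D W1 distance over the candidate directions.

    Assumption: a crude stand-in for the optimized projection in the paper.
    """
    return max(wasserstein_1d(project(x, d), project(y, d)) for d in directions)


def permutation_test(x, y, n_directions=50, n_permutations=200, seed=0):
    """Return (statistic, p-value) for H0: x and y share one distribution."""
    rng = random.Random(seed)
    # Directions are drawn independently of the data and reused for every
    # permutation, so the permutation p-value stays valid.
    directions = [random_unit_vector(len(x[0]), rng) for _ in range(n_directions)]
    observed = projected_w1(x, y, directions)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = projected_w1(pooled[: len(x)], pooled[len(x):], directions)
        if stat >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)
```

For example, calling `permutation_test(x, y)` on two lists of equal-length coordinate lists returns the observed statistic and a p-value; a small p-value suggests the two samples come from different distributions. A random search is used only to keep the sketch short; it does not come with the guarantees discussed in the paper.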
Related papers
- Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances [9.608373793625107]
The kernel max-sliced (KMS) Wasserstein distance is developed for this purpose.
We show that computing the KMS $2$-Wasserstein distance is NP-hard.
arXiv Detail & Related papers (2024-05-24T11:14:56Z) - Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Optimal 1-Wasserstein Distance for WGANs [2.1174215880331775]
We provide a thorough analysis of Wasserstein GANs (WGANs) in both the finite sample and asymptotic regimes.
We derive in passing new results on optimal transport theory in the semi-discrete setting.
arXiv Detail & Related papers (2022-01-08T13:04:03Z) - Depth-based pseudo-metrics between probability distributions [1.1470070927586016]
We propose two new pseudo-metrics between continuous probability measures based on data depth and its associated central regions.
In contrast to the Wasserstein distance, the proposed pseudo-metrics do not suffer from the curse of dimensionality.
The regions-based pseudo-metric appears to be robust w.r.t. both outliers and heavy tails.
arXiv Detail & Related papers (2021-03-23T17:33:18Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z) - On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z) - Augmented Sliced Wasserstein Distances [55.028065567756066]
We propose a new family of distance metrics, called augmented sliced Wasserstein distances (ASWDs).
ASWDs are constructed by first mapping samples to higher-dimensional hypersurfaces parameterized by neural networks.
Numerical results demonstrate that the ASWD significantly outperforms other Wasserstein variants for both synthetic and real-world problems.
arXiv Detail & Related papers (2020-06-15T23:00:08Z) - Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
The projection robust Wasserstein (PRW) distance, also known as Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance.
This paper provides a first step into the computation of the PRW distance and links the theory to experiments on synthetic and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.