Statistical, Robustness, and Computational Guarantees for Sliced
Wasserstein Distances
- URL: http://arxiv.org/abs/2210.09160v1
- Date: Mon, 17 Oct 2022 15:04:51 GMT
- Title: Statistical, Robustness, and Computational Guarantees for Sliced
Wasserstein Distances
- Authors: Sloan Nietert, Ritwik Sadhu, Ziv Goldfeld, and Kengo Kato
- Abstract summary: Sliced Wasserstein distances preserve properties of classic Wasserstein distances while being more scalable for computation and estimation in high dimensions.
We quantify this scalability from three key aspects: (i) empirical convergence rates; (ii) robustness to data contamination; and (iii) efficient computational methods.
- Score: 18.9717974398864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sliced Wasserstein distances preserve properties of classic Wasserstein
distances while being more scalable for computation and estimation in high
dimensions. The goal of this work is to quantify this scalability from three
key aspects: (i) empirical convergence rates; (ii) robustness to data
contamination; and (iii) efficient computational methods. For empirical
convergence, we derive fast rates with explicit dependence of constants on
dimension, subject to log-concavity of the population distributions. For
robustness, we characterize minimax optimal, dimension-free robust estimation
risks, and show an equivalence between robust sliced 1-Wasserstein estimation
and robust mean estimation. This enables lifting statistical and algorithmic
guarantees available for the latter to the sliced 1-Wasserstein setting. Moving
on to computational aspects, we analyze the Monte Carlo estimator for the
average-sliced distance, demonstrating that larger dimension can result in
faster convergence of the numerical integration error. For the max-sliced
distance, we focus on a subgradient-based local optimization algorithm that is
frequently used in practice, albeit without formal guarantees, and establish an
$O(\epsilon^{-4})$ computational complexity bound for it. Our theory is
validated by numerical experiments, which altogether provide a comprehensive
quantitative account of the scalability question.
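The Monte Carlo estimator for the average-sliced distance analyzed in the abstract can be sketched in a few lines: draw directions uniformly on the sphere, project both samples, and average the closed-form one-dimensional distances. The sketch below uses p = 1 and equal sample sizes for simplicity; the function names (`sliced_w1_mc`, `wasserstein1_1d`) are illustrative assumptions, not the paper's code.

```python
import numpy as np

def wasserstein1_1d(x, y):
    """W1 between two 1-D empirical measures with equal sample sizes:
    mean absolute difference of the sorted samples (order statistics)."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def sliced_w1_mc(X, Y, n_proj=500, rng=None):
    """Monte Carlo estimate of the average-sliced W1 distance:
    average the 1-D W1 over directions drawn uniformly on the sphere."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Uniform directions on S^{d-1}: normalize standard Gaussian vectors.
    theta = rng.standard_normal((n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    return float(np.mean([wasserstein1_1d(X @ t, Y @ t) for t in theta]))
```

Each projection costs only a 1-D sort, which is what makes the sliced distance scalable; the paper's result that larger dimension can speed up the numerical-integration convergence concerns exactly the `n_proj` averaging step above.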
Related papers
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results in both the finite-sample and asymptotic regimes.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Near-optimal estimation of smooth transport maps with kernel sums-of-squares [81.02564078640275]
Under smoothness conditions, the squared Wasserstein distance between two distributions can be efficiently computed with appealing statistical error upper bounds.
The object of interest for applications such as generative modeling is the underlying optimal transport map.
We propose the first tractable algorithm for which the statistical $L^2$ error on the maps nearly matches the existing minimax lower bounds for smooth map estimation.
arXiv Detail & Related papers (2021-12-03T13:45:36Z)
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z)
- Efficient Interpolation of Density Estimators [23.154249845820306]
We study the problem of space and time efficient evaluation of a nonparametric estimator that approximates an unknown density.
Our result gives a new statistical perspective on the problem of fast evaluation of kernel density estimators in the presence of underlying smoothness.
arXiv Detail & Related papers (2020-11-10T06:05:00Z)
- Faster Wasserstein Distance Estimation with the Sinkhorn Divergence [0.0]
The squared Wasserstein distance is a quantity to compare probability distributions in a non-parametric setting.
In this work, we propose instead to estimate it with the Sinkhorn divergence.
We show that, for smooth densities, this estimator has a comparable sample complexity but allows higher regularization levels.
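The Sinkhorn divergence mentioned in this entry can be sketched with log-domain Sinkhorn iterations plus the standard debiasing by the two self-terms. This is a minimal sketch, assuming uniform empirical measures, a squared-Euclidean ground cost, and the sharp cost $\langle P, C\rangle$ variant; the function names are illustrative, not from the paper.

```python
import numpy as np

def _logsumexp(A, axis):
    """Numerically stable log-sum-exp along one axis (avoids overflow for small eps)."""
    m = A.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(A - m).sum(axis=axis, keepdims=True)), axis=axis)

def sinkhorn_cost(X, Y, eps=0.1, n_iter=300):
    """Entropic OT cost <P, C> between uniform empirical measures on X and Y,
    computed via log-domain Sinkhorn fixed-point updates of the potentials f, g."""
    n, m = len(X), len(Y)
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)      # squared-Euclidean cost
    loga, logb = -np.log(n) * np.ones(n), -np.log(m) * np.ones(m)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        f = -eps * _logsumexp((g[None, :] - C) / eps + logb[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - C) / eps + loga[:, None], axis=0)
    logP = loga[:, None] + logb[None, :] + (f[:, None] + g[None, :] - C) / eps
    return float((np.exp(logP) * C).sum())

def sinkhorn_divergence(X, Y, eps=0.1):
    """Debiased divergence: S_eps(a, b) = OT_eps(a, b) - (OT_eps(a, a) + OT_eps(b, b)) / 2."""
    return sinkhorn_cost(X, Y, eps) - 0.5 * (sinkhorn_cost(X, X, eps) + sinkhorn_cost(Y, Y, eps))
```

The debiasing step is what makes the divergence vanish between identical samples, which is the property that permits the higher regularization levels the entry refers to.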
arXiv Detail & Related papers (2020-06-15T06:58:16Z)
- $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z)
- Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
We show that the projection robust Wasserstein (PRW) distance is a robust variant of the Wasserstein projection pursuit (WPP) model.
This paper provides a first step into the computation of the PRW distance and links the theory to experiments on synthetic and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z)
- Scalable Distributed Approximation of Internal Measures for Clustering Evaluation [5.144809478361603]
A popular internal measure for clustering evaluation is the silhouette coefficient, whose computation requires a quadratic number of distance calculations.
We present the first scalable algorithm to compute such a rigorous approximation for the evaluation of clusterings based on any metric distance.
We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation.
arXiv Detail & Related papers (2020-03-03T10:28:14Z)
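The quadratic cost that the last entry's approximation avoids is visible in a direct silhouette computation: every point needs its mean distance to its own cluster, a(i), and to the nearest other cluster, b(i). A minimal exact sketch, assuming Euclidean distances and no singleton clusters (the function name is illustrative):

```python
import numpy as np

def silhouette(X, labels):
    """Exact mean silhouette coefficient; builds the full O(n^2)
    pairwise-distance matrix, which is the bottleneck at scale."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    s = np.zeros(n)
    for i in range(n):
        same = labels == labels[i]
        # a(i): mean distance to the other points in i's own cluster
        a = D[i, same].sum() / max(same.sum() - 1, 1)
        # b(i): smallest mean distance to the points of any other cluster
        b = min(D[i, labels == c].mean() for c in set(labels) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return float(s.mean())
```

Values near 1 indicate tight, well-separated clusters; the paper's contribution is a rigorous approximation of this quantity without the full distance matrix.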
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.