Robust Estimation under the Wasserstein Distance
- URL: http://arxiv.org/abs/2302.01237v1
- Date: Thu, 2 Feb 2023 17:20:25 GMT
- Title: Robust Estimation under the Wasserstein Distance
- Authors: Sloan Nietert, Rachel Cummings, and Ziv Goldfeld
- Abstract summary: We introduce a new outlier-robust Wasserstein distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass to be removed from its input distributions.
We show that minimum distance estimation under $\mathsf{W}_p^\varepsilon$ achieves minimax optimal robust estimation risk.
- Score: 28.792608997509376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of robust distribution estimation under the Wasserstein
metric, a popular discrepancy measure between probability distributions rooted
in optimal transport (OT) theory. We introduce a new outlier-robust Wasserstein
distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass
to be removed from its input distributions, and show that minimum distance
estimation under $\mathsf{W}_p^\varepsilon$ achieves minimax optimal robust
estimation risk. Our analysis is rooted in several new results for partial OT,
including an approximate triangle inequality, which may be of independent
interest. To address computational tractability, we derive a dual formulation
for $\mathsf{W}_p^\varepsilon$ that adds a simple penalty term to the classic
Kantorovich dual objective. As such, $\mathsf{W}_p^\varepsilon$ can be
implemented via an elementary modification to standard, duality-based OT
solvers. Our results are extended to sliced OT, where distributions are
projected onto low-dimensional subspaces, and applications to homogeneity and
independence testing are explored. We illustrate the virtues of our framework
via applications to generative modeling with contaminated datasets.
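To make the idea of removing $\varepsilon$ outlier mass concrete, here is a minimal sketch for the simplest case only: two one-dimensional samples of equal size $n$ with uniform weights, and $\varepsilon = k/n$. It is not the paper's duality-based solver. The function name `trimmed_wasserstein_1d`, the dynamic program, and the normalization by $n$ are assumptions made for illustration; the paper's exact normalization of $\mathsf{W}_p^\varepsilon$ may differ. In this setting the partial OT problem reduces to choosing $n-k$ pairs of points, and for the cost $|x-y|^p$ with $p \ge 1$ an optimal pairing can be taken to be order-preserving, so a dynamic program over sorted points suffices.

```python
def trimmed_wasserstein_1d(xs, ys, k, p=1):
    """Illustrative partial-OT cost between two equal-size 1D samples.

    Up to k points may be left unmatched in each sample (eps = k / n).
    Hypothetical helper for illustration, not the paper's algorithm.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    n = len(xs)
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < n")
    xs, ys = sorted(xs), sorted(ys)
    m = n - k  # number of pairs that must be matched
    inf = float("inf")
    # prev[j][t]: min cost using the first i points of xs and the first j
    # points of ys with exactly t matched pairs (here i = 0).
    prev = [[0.0 if t == 0 else inf for t in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        cur = [[inf] * (m + 1) for _ in range(n + 1)]
        for j in range(n + 1):
            for t in range(m + 1):
                best = prev[j][t]  # leave xs[i-1] unmatched
                if j > 0:
                    best = min(best, cur[j - 1][t])  # leave ys[j-1] unmatched
                    if t > 0:
                        # match xs[i-1] with ys[j-1]
                        step = abs(xs[i - 1] - ys[j - 1]) ** p
                        best = min(best, prev[j - 1][t - 1] + step)
                cur[j][t] = best
        prev = cur
    # Each point carries mass 1/n; take the p-th root of the transport cost.
    return (prev[n][m] / n) ** (1.0 / p)


clean = [0.0, 1.0, 2.0, 3.0]
contaminated = [0.0, 1.0, 2.0, 100.0]  # one outlier replaces the point 3.0
print(trimmed_wasserstein_1d(clean, contaminated, k=0))  # classic W_1
print(trimmed_wasserstein_1d(clean, contaminated, k=1))  # one point trimmed per side
```

With `k=0` the function reduces to the classic empirical Wasserstein distance, which the single outlier inflates; with `k=1` the outlier and its counterpart are discarded and the distance collapses to zero. The cubic-time dynamic program is only meant for small samples.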
Related papers
- Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$).
We show that $RW_p$ distances are also real distance metrics defined on the quotient set $\mathcal{P}_p(\mathbb{R}^n)/\sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
- Statistical Efficiency of Distributional Temporal Difference Learning and Freedman's Inequality in Hilbert Spaces [24.03281329962804]
In this paper, we focus on the non-asymptotic statistical rates of distributional temporal difference learning.
We show that for NTD with a generative model, we need $\tilde{O}(\varepsilon^{-2} \mu_{\pi,\min}^{-1} (1-\gamma)^{-3} + t_{mix} \mu_{\pi,\min}^{-1} (1-\gamma)^{-1})$ sample complexity bounds in the case of the $1$-Wasserstein distance.
We establish a novel Freedman's inequality
arXiv Detail & Related papers (2024-03-09T06:19:53Z)
- Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching [111.78179839856293]
We propose Primal Wasserstein DICE to minimize the primal Wasserstein distance between the learner and expert state occupancies.
Our framework is a generalization of SMODICE, and is the first work that unifies $f$-divergence and Wasserstein minimization.
arXiv Detail & Related papers (2023-11-02T15:41:57Z)
- Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation [82.0801585843835]
We propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization (MStein) for sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
arXiv Detail & Related papers (2023-01-28T13:38:48Z)
- Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
- Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks [71.95722100511627]
We consider the off-policy evaluation problem of reinforcement learning using deep neural networks.
We show that, by choosing network size appropriately, one can leverage the low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2022-06-06T20:25:20Z)
- Outlier-Robust Optimal Transport: Duality, Structure, and Statistical Applications [25.410110072480187]
Wasserstein distances are sensitive to outliers in the considered distributions.
We propose a new outlier-robust Wasserstein distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass to be removed from each contaminated distribution.
arXiv Detail & Related papers (2021-11-02T04:05:45Z)
- Limit Distribution Theory for the Smooth 1-Wasserstein Distance with Applications [18.618590805279187]
The smooth 1-Wasserstein distance (SWD) $W_1^\sigma$ was recently proposed as a means to mitigate the curse of dimensionality in empirical approximation.
This work conducts a thorough statistical study of the SWD, including a high-dimensional limit distribution result.
arXiv Detail & Related papers (2021-07-28T17:02:24Z)
- Distributionally Robust Prescriptive Analytics with Wasserstein Distance [10.475438374386886]
This paper proposes a new distributionally robust approach under Wasserstein ambiguity sets.
We show that the nominal distribution converges to the actual conditional distribution under the Wasserstein distance.
arXiv Detail & Related papers (2021-06-10T13:08:17Z)
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)