Robust Estimation under the Wasserstein Distance
- URL: http://arxiv.org/abs/2302.01237v1
- Date: Thu, 2 Feb 2023 17:20:25 GMT
- Title: Robust Estimation under the Wasserstein Distance
- Authors: Sloan Nietert, Rachel Cummings, and Ziv Goldfeld
- Abstract summary: We introduce a new outlier-robust Wasserstein distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass to be removed from its input distributions.
We show that minimum distance estimation under $\mathsf{W}_p^\varepsilon$ achieves minimax optimal robust estimation risk.
- Score: 28.792608997509376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of robust distribution estimation under the Wasserstein
metric, a popular discrepancy measure between probability distributions rooted
in optimal transport (OT) theory. We introduce a new outlier-robust Wasserstein
distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass
to be removed from its input distributions, and show that minimum distance
estimation under $\mathsf{W}_p^\varepsilon$ achieves minimax optimal robust
estimation risk. Our analysis is rooted in several new results for partial OT,
including an approximate triangle inequality, which may be of independent
interest. To address computational tractability, we derive a dual formulation
for $\mathsf{W}_p^\varepsilon$ that adds a simple penalty term to the classic
Kantorovich dual objective. As such, $\mathsf{W}_p^\varepsilon$ can be
implemented via an elementary modification to standard, duality-based OT
solvers. Our results are extended to sliced OT, where distributions are
projected onto low-dimensional subspaces, and applications to homogeneity and
independence testing are explored. We illustrate the virtues of our framework
via applications to generative modeling with contaminated datasets.
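To make the mass-removal idea concrete, the sketch below (ours, not the authors' dual-penalty solver) computes an $\varepsilon$-trimmed OT cost between two discrete distributions by solving the corresponding partial-OT linear program directly with SciPy; the function name trimmed_ot, the 1-D cost, and the renormalization by $1-\varepsilon$ are illustrative assumptions.

```python
# Minimal sketch (assumption-laden): epsilon-trimmed OT between two discrete
# 1-D distributions via a partial-OT linear program. It illustrates the
# "remove epsilon outlier mass" idea behind W_p^eps; it is NOT the paper's
# penalized Kantorovich dual and makes no claim to match its estimator.
import numpy as np
from scipy.optimize import linprog

def trimmed_ot(x, y, a, b, eps=0.1, p=2):
    """Transport only (1 - eps) total mass; leftover mass is simply discarded."""
    n, m = len(a), len(b)
    C = np.abs(x[:, None] - y[None, :]) ** p          # |x_i - y_j|^p cost matrix
    A_ub, b_ub = [], []
    for i in range(n):                                # row sums of pi <= a_i
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_ub.append(row); b_ub.append(a[i])
    for j in range(m):                                # column sums of pi <= b_j
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_ub.append(col); b_ub.append(b[j])
    res = linprog(C.ravel(),                          # minimize <pi, C>
                  A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.ones((1, n * m)), b_eq=[1.0 - eps],  # total transported mass
                  bounds=(0, None), method="highs")
    return (res.fun / (1.0 - eps)) ** (1.0 / p)       # renormalize, take p-th root

# Toy check: one gross outlier carrying mass 1/50 in y is ignored once eps >= 1/50.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 50)
y = np.concatenate([rng.normal(0.0, 1.0, 49), [50.0]])
a = np.full(50, 1.0 / 50); b = np.full(50, 1.0 / 50)
print(trimmed_ot(x, y, a, b, eps=0.0))                # inflated by the outlier
print(trimmed_ot(x, y, a, b, eps=0.02))               # outlier mass trimmed away
```

With eps = 0.0 the equality constraint forces all mass, including the outlier's, to be transported, so the cost is dominated by the 50-unit displacement; with eps = 0.02 the program simply leaves that mass untransported.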
Related papers
- Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples.
We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
arXiv Detail & Related papers (2024-05-30T18:20:35Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Learning Intersections of Halfspaces with Distribution Shift: Improved Algorithms and SQ Lower Bounds [9.036777309376697]
Klivans, Stavropoulos, and Vasilyan initiated the study of testable learning with distribution shift.
Their model deviates from all prior work in that no assumptions are made on $\mathcal{D}'$.
arXiv Detail & Related papers (2024-04-02T23:34:39Z) - Near Minimax-Optimal Distributional Temporal Difference Algorithms and The Freedman Inequality in Hilbert Spaces [24.03281329962804]
We propose a non-parametric distributional TD algorithm (NTD) for a $\gamma$-discounted infinite-horizon Markov decision process.
We establish a novel Freedman's inequality in Hilbert spaces, which would be of independent interest.
arXiv Detail & Related papers (2024-03-09T06:19:53Z) - Robust computation of optimal transport by $\beta$-potential
regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $\varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP).
We prove an upper bound of $\widetilde{O}(H \varepsilon^{-3} \ln \frac{1}{\delta})$ samples per state-action pair, where $H := \mathrm{sp}(h^*)$ is the span of bias of any optimal policy, $\varepsilon$ is the accuracy and $\delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - The Sketched Wasserstein Distance for mixture distributions [13.643197515573029]
The Sketched Wasserstein Distance ($W^S$) is a new probability distance specifically tailored to finite mixture distributions.
We show that $W^S$ is the most discriminative extension of this metric to the space $\mathcal{S} = \mathrm{conv}(\mathcal{A})$ of mixtures of elements of $\mathcal{A}$.
arXiv Detail & Related papers (2022-06-26T02:33:40Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove an $\Omega(\sqrt{n/p})$ lower bound on the Lipschitz constant of interpolating neural networks with $p$ parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Outlier-Robust Optimal Transport: Duality, Structure, and Statistical
Applications [25.410110072480187]
Wasserstein distances are sensitive to outliers in the considered distributions.
We propose a new outlier-robust Wasserstein distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass to be removed from each contaminated distribution.
arXiv Detail & Related papers (2021-11-02T04:05:45Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - On Projection Robust Optimal Transport: Sample Complexity and Model
Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces (both constructions are written out schematically after this list).
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
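Schematically, and in notation of our own choosing rather than a quotation from the papers above (with $\pi_E$ the orthogonal projection onto a $k$-dimensional subspace $E$ and $\sigma$ the uniform distribution over such subspaces), the two constructions contrasted in the last entry read

$$\mathsf{PRW}_p(\mu,\nu) \;=\; \sup_{E}\, \mathsf{W}_p\big((\pi_E)_\#\mu,\,(\pi_E)_\#\nu\big),
\qquad
\mathsf{IPRW}_p(\mu,\nu) \;=\; \Big(\int \mathsf{W}_p^p\big((\pi_E)_\#\mu,\,(\pi_E)_\#\nu\big)\,d\sigma(E)\Big)^{1/p}.$$

The sliced OT extension mentioned in the main abstract roughly corresponds to the averaged construction with one-dimensional projections ($k=1$).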