Robust Estimation under the Wasserstein Distance
- URL: http://arxiv.org/abs/2302.01237v2
- Date: Tue, 24 Sep 2024 17:18:09 GMT
- Title: Robust Estimation under the Wasserstein Distance
- Authors: Sloan Nietert, Rachel Cummings, Ziv Goldfeld,
- Abstract summary: Given $n$ samples, of which $varepsilon n$ are adversarially corrupted, we seek an estimate for $mu$ with minimal Wasserstein error.
We prove new structural properties for POT and use them to show that MDE under a partial Wasserstein distance achieves the minimax-optimal robust estimation risk.
Since the popular Wasserstein generative adversarial network (WGAN) framework implements Wasserstein MDE via Kantorovich duality, our penalized dual enables large-scale generative modeling with contaminated datasets.
- Score: 27.382258439576606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of robust distribution estimation under the Wasserstein distance, a popular discrepancy measure between probability distributions rooted in optimal transport (OT) theory. Given $n$ samples from an unknown distribution $\mu$, of which $\varepsilon n$ are adversarially corrupted, we seek an estimate for $\mu$ with minimal Wasserstein error. To address this task, we draw upon two frameworks from OT and robust statistics: partial OT (POT) and minimum distance estimation (MDE). We prove new structural properties for POT and use them to show that MDE under a partial Wasserstein distance achieves the minimax-optimal robust estimation risk in many settings. Along the way, we derive a novel dual form for POT that adds a sup-norm penalty to the classic Kantorovich dual for standard OT. Since the popular Wasserstein generative adversarial network (WGAN) framework implements Wasserstein MDE via Kantorovich duality, our penalized dual enables large-scale generative modeling with contaminated datasets via an elementary modification to WGAN. Numerical experiments demonstrating the efficacy of our approach in mitigating the impact of adversarial corruptions are provided.
Related papers
- Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$)
We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z) - Offline Imitation from Observation via Primal Wasserstein State Occupancy Matching [111.78179839856293]
We propose Primal Wasserstein DICE to minimize the primal Wasserstein distance between the learner and expert state occupancies.
Our framework is a generalization of SMODICE, and is the first work that unifies $f$-divergence and Wasserstein minimization.
arXiv Detail & Related papers (2023-11-02T15:41:57Z) - Mutual Wasserstein Discrepancy Minimization for Sequential
Recommendation [82.0801585843835]
We propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization MStein for the sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
arXiv Detail & Related papers (2023-01-28T13:38:48Z) - Robust computation of optimal transport by $\beta$-potential
regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the beta-potential term associated with the so-called $beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
arXiv Detail & Related papers (2022-12-26T18:37:28Z) - Sample Complexity of Nonparametric Off-Policy Evaluation on
Low-Dimensional Manifolds using Deep Networks [71.95722100511627]
We consider the off-policy evaluation problem of reinforcement learning using deep neural networks.
We show that, by choosing network size appropriately, one can leverage the low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2022-06-06T20:25:20Z) - Outlier-Robust Optimal Transport: Duality, Structure, and Statistical
Applications [25.410110072480187]
Wasserstein distances are sensitive to outliers in the considered distributions.
We propose a new outlier-robust Wasserstein distance $mathsfW_pvarepsilon$ which allows for $varepsilon$ outlier mass to be removed from each contaminated distribution.
arXiv Detail & Related papers (2021-11-02T04:05:45Z) - Limit Distribution Theory for the Smooth 1-Wasserstein Distance with
Applications [18.618590805279187]
smooth 1-Wasserstein distance (SWD) $W_1sigma$ was recently proposed as a means to mitigate the curse of dimensionality in empirical approximation.
This work conducts a thorough statistical study of the SWD, including a high-dimensional limit distribution result.
arXiv Detail & Related papers (2021-07-28T17:02:24Z) - Distributionally Robust Prescriptive Analytics with Wasserstein Distance [10.475438374386886]
This paper proposes a new distributionally robust approach under Wasserstein ambiguity sets.
We show that the nominal distribution converges to the actual conditional distribution under the Wasserstein distance.
arXiv Detail & Related papers (2021-06-10T13:08:17Z) - On Projection Robust Optimal Transport: Sample Complexity and Model
Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.