On Generalization and Regularization via Wasserstein Distributionally
Robust Optimization
- URL: http://arxiv.org/abs/2212.05716v1
- Date: Mon, 12 Dec 2022 05:48:07 GMT
- Title: On Generalization and Regularization via Wasserstein Distributionally
Robust Optimization
- Authors: Qinyu Wu, Jonathan Yu-Meng Li, Tiantian Mao
- Abstract summary: Wasserstein distributionally robust optimization (DRO) has found success in operations research and machine learning applications.
We show that by focusing on Wasserstein DRO problems with affine decision rules, it is possible to obtain generalization bounds and the equivalency to regularization.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wasserstein distributionally robust optimization (DRO) has found success in
operations research and machine learning applications as a powerful means to
obtain solutions with favourable out-of-sample performances. Two compelling
explanations for the success are the generalization bounds derived from
Wasserstein DRO and the equivalency between Wasserstein DRO and the
regularization scheme commonly applied in machine learning. Existing results on
generalization bounds and the equivalency to regularization are largely limited
to the setting where the Wasserstein ball is of a certain type and the decision
criterion takes certain forms of an expected function. In this paper, we show
that by focusing on Wasserstein DRO problems with affine decision rules, it is
possible to obtain generalization bounds and the equivalency to regularization
in a significantly broader setting where the Wasserstein ball can be of a
general type and the decision criterion can be a general measure of risk, i.e.,
nonlinear in distributions. This allows for accommodating many important
classification, regression, and risk minimization applications that have not
been addressed to date using Wasserstein DRO. Our results are strong in that
the generalization bounds do not suffer from the curse of dimensionality and
the equivalency to regularization is exact. As a byproduct, our regularization
results broaden considerably the class of Wasserstein DRO models that can be
solved efficiently via regularization formulations.
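The exact equivalence to regularization described above is easiest to see in the classical special case that prior work covered: an affine decision rule with absolute loss and a type-1 Wasserstein ball, where the worst-case expected loss equals the empirical loss plus the radius times a dual norm of the coefficient vector. A minimal numerical sketch of that identity (the data, the coefficients theta, and the radius eps are all illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples, d features, absolute-loss affine regression.
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
theta = rng.normal(size=d)
eps = 0.3  # Wasserstein ball radius (illustrative)

def empirical_loss(X, y, theta):
    return np.mean(np.abs(y - X @ theta))

# Classical identity (type-1 Wasserstein ball, l2 transport cost on (x, y)):
#   sup_{W(Q, P_hat) <= eps} E_Q |y - theta^T x|
#     = E_{P_hat} |y - theta^T x| + eps * ||(theta, -1)||_2
lipschitz = np.linalg.norm(np.append(theta, -1.0))
robust_value = empirical_loss(X, y, theta) + eps * lipschitz

# Sanity check: any perturbation of the sample whose average l2 size
# is within the budget eps cannot push the loss above the regularized value,
# because the loss is lipschitz-Lipschitz in (x, y).
delta = rng.normal(size=(n, d + 1))
delta *= eps / np.mean(np.linalg.norm(delta, axis=1))  # spend the full budget
perturbed = empirical_loss(X + delta[:, :d], y + delta[:, d], theta)
assert perturbed <= robust_value + 1e-9
```

The paper's contribution is to extend this kind of exact identity beyond the expected-loss, type-1 setting sketched here to general Wasserstein balls and general (nonlinear-in-distribution) risk measures.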
Related papers
- Universal generalization guarantees for Wasserstein distributionally robust models [10.036727981085223]
Distributionally robust optimization has emerged as an attractive way to train robust machine learning models.
Recent statistical analyses have proved that generalization guarantees of robust models based on the Wasserstein distance do not suffer from the curse of dimensionality.
We establish exact generalization guarantees that cover a wide range of cases, with arbitrary transport costs and parametric loss functions.
arXiv Detail & Related papers (2024-02-19T09:27:47Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first study of the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Tangential Wasserstein Projections [0.4297070083645048]
We develop a notion of projections between sets of probability measures using the geometric properties of the 2-Wasserstein space.
The idea is to work on regular tangent cones of the Wasserstein space using generalized geodesics.
arXiv Detail & Related papers (2022-07-29T14:59:58Z) - A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
We provide a general sample complexity analysis of "vanilla" PG, improving its theoretical understanding.
arXiv Detail & Related papers (2021-07-23T19:38:17Z) - Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtaining generalization bounds on the solution of distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
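The maximum mean discrepancy mentioned here can be estimated directly from two samples. A small sketch of the (biased) empirical estimator with a Gaussian kernel (the bandwidth, sample sizes, and distributions are illustrative choices, not from the paper):

```python
import numpy as np

def mmd_squared(X, Y, bandwidth=1.0):
    """Biased empirical MMD^2 between samples X and Y, Gaussian kernel."""
    def k(A, B):
        # Pairwise squared distances, then Gaussian kernel values.
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-sq / (2 * bandwidth**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(1)
same = mmd_squared(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
shifted = mmd_squared(rng.normal(size=(200, 2)),
                      rng.normal(size=(200, 2)) + 2.0)
assert same < shifted  # distributions that actually differ score higher
```

Using such a kernel-based discrepancy as the ambiguity-set distance is what allows the cited bound to depend only on properties of the true loss function rather than on the data dimension.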
arXiv Detail & Related papers (2021-06-21T15:19:52Z) - Policy Mirror Descent for Regularized Reinforcement Learning: A
Generalized Framework with Linear Convergence [60.20076757208645]
This paper proposes a general policy mirror descent (GPMD) algorithm for solving regularized RL.
We demonstrate that our algorithm converges linearly, in a dimension-free fashion, to the global solution over an entire range of learning rates.
arXiv Detail & Related papers (2021-05-24T02:21:34Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Finite-Sample Guarantees for Wasserstein Distributionally Robust
Optimization: Breaking the Curse of Dimensionality [5.888646114353371]
Wasserstein distributionally robust optimization aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance.
Existing performance guarantees for generic loss functions are either overly conservative due to the curse of dimensionality, or valid only in the large-sample regime.
We develop a non-asymptotic framework for analyzing the out-of-sample performance for Wasserstein robust learning and the generalization bound for its related Lipschitz and gradient regularization problems.
arXiv Detail & Related papers (2020-09-09T16:02:57Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with "external uncertainty" in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.