On Generalization and Regularization via Wasserstein Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2212.05716v2
- Date: Thu, 19 Dec 2024 20:39:38 GMT
- Title: On Generalization and Regularization via Wasserstein Distributionally Robust Optimization
- Authors: Qinyu Wu, Jonathan Yu-Meng Li, Tiantian Mao,
- Abstract summary: Wasserstein distributionally robust optimization (DRO) has gained prominence in operations research and machine learning.
We show that generalization bounds and regularization equivalence can be obtained in a significantly broader setting.
- Score: 0.1843404256219181
- License:
- Abstract: Wasserstein distributionally robust optimization (DRO) has gained prominence in operations research and machine learning as a powerful method for achieving solutions with favorable out-of-sample performance. Two compelling explanations for its success are the generalization bounds derived from Wasserstein DRO and its equivalence to regularization schemes commonly used in machine learning. However, existing results on generalization bounds and regularization equivalence are largely limited to settings where the Wasserstein ball is of a specific type, and the decision criterion takes certain forms of expected functions. In this paper, we show that generalization bounds and regularization equivalence can be obtained in a significantly broader setting, where the Wasserstein ball is of a general type and the decision criterion accommodates any form, including general risk measures. This not only addresses important machine learning and operations management applications but also expands to general decision-theoretical frameworks previously unaddressed by Wasserstein DRO. Our results are strong in that the generalization bounds do not suffer from the curse of dimensionality and the equivalency to regularization is exact. As a by-product, we show that Wasserstein DRO coincides with the recent max-sliced Wasserstein DRO for {\it any} decision criterion under affine decision rules -- resulting in both being efficiently solvable as convex programs via our general regularization results. These general assurances provide a strong foundation for expanding the application of Wasserstein DRO across diverse domains of data-driven decision problems.
Related papers
- Universal generalization guarantees for Wasserstein distributionally robust models [10.036727981085223]
Distributionally robust optimization has emerged as an attractive way to train robust machine learning models.
Recent statistical analyses have proved that generalization guarantees of robust models based on the Wasserstein distance have generalization guarantees that do not suffer from the curse of dimensionality.
We establish exact generalization guarantees that cover a wide range of cases, with arbitrary transport costs and parametric loss functions.
arXiv Detail & Related papers (2024-02-19T09:27:47Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Tangential Wasserstein Projections [0.4297070083645048]
We develop a notion of projections between sets of probability measures using the geometric properties of the 2-Wasserstein space.
The idea is to work on regular tangent cones of the Wasserstein space using generalized geodesics.
arXiv Detail & Related papers (2022-07-29T14:59:58Z) - A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular reinforcement learning (RL) problems.
"vanilla" theoretical understanding of PG trajectory is one of the most popular methods for solving RL problems.
arXiv Detail & Related papers (2021-07-23T19:38:17Z) - Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtain generalization bounds on the solution from distributionally robust optimization (DRO)
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z) - Policy Mirror Descent for Regularized Reinforcement Learning: A
Generalized Framework with Linear Convergence [60.20076757208645]
This paper proposes a general policy mirror descent (GPMD) algorithm for solving regularized RL.
We demonstrate that our algorithm converges linearly over an entire range learning rates, in a dimension-free fashion, to the global solution.
arXiv Detail & Related papers (2021-05-24T02:21:34Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Finite-Sample Guarantees for Wasserstein Distributionally Robust
Optimization: Breaking the Curse of Dimensionality [5.888646114353371]
Wasserstein distributionally robust optimization aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance.
Existing performance guarantees for generic loss functions are either overly conservative due to the curse of dimensionality, or plausible only in large sample complexitys.
We develop a non-asymptotic framework for analyzing the out-of-sample performance for Wasserstein robust learning and the generalization bound for its related Lipschitz and gradient regularization problems.
arXiv Detail & Related papers (2020-09-09T16:02:57Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with $textitexternal uncertainty$ in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.