Hedging Complexity in Generalization via a Parametric Distributionally
Robust Optimization Framework
- URL: http://arxiv.org/abs/2212.01518v2
- Date: Sun, 24 Sep 2023 15:18:25 GMT
- Title: Hedging Complexity in Generalization via a Parametric Distributionally
Robust Optimization Framework
- Authors: Garud Iyengar, Henry Lam, Tianyu Wang
- Abstract summary: Empirical risk minimization (ERM) and distributionally robust optimization (DRO) are popular approaches for solving stochastic optimization problems.
We propose a simple approach in which the distribution of random perturbations is approximated using a parametric family of distributions.
We show that this new source of error can be controlled by suitable DRO formulations.
- Score: 18.6306170209029
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empirical risk minimization (ERM) and distributionally robust optimization
(DRO) are popular approaches for solving stochastic optimization problems that
appear in operations management and machine learning. Existing generalization
error bounds for these methods depend on either the complexity of the cost
function or dimension of the random perturbations. Consequently, the
performance of these methods can be poor for high-dimensional problems with
complex objective functions. We propose a simple approach in which the
distribution of random perturbations is approximated using a parametric family
of distributions. This mitigates both sources of complexity; however, it
introduces a model misspecification error. We show that this new source of
error can be controlled by suitable DRO formulations. Our proposed parametric
DRO approach has significantly improved generalization bounds over existing ERM
and DRO methods and parametric ERM for a wide variety of settings. Our method
is particularly effective under distribution shifts and works broadly in
contextual optimization. We also illustrate the superior performance of our
approach on both synthetic and real-data portfolio optimization and regression
tasks.
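To make the idea above concrete, the following is a minimal numerical sketch (not the authors' exact formulation): fit a parametric model to the observed perturbations, then hedge the resulting misspecification by minimizing the worst-case expected cost over a KL-divergence ball around the fitted model, using the standard exponential-tilting dual of the KL-constrained worst case. The Gaussian family, the newsvendor-style loss, and the radius rho are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Step 1: fit a parametric (Gaussian) model to observed perturbations.
    # The true data are lognormal, so the Gaussian fit is deliberately misspecified.
    data = rng.lognormal(mean=0.0, sigma=0.5, size=200)
    mu_hat, sigma_hat = data.mean(), data.std(ddof=1)

    # Step 2: draw Monte Carlo scenarios from the fitted parametric distribution.
    xi = rng.normal(mu_hat, sigma_hat, size=5000)

    def loss(x, xi):
        # Newsvendor-style cost with assumed overage cost 1 and underage cost 4.
        return np.maximum(x - xi, 0.0) + 4.0 * np.maximum(xi - x, 0.0)

    def kl_dro_objective(x, rho=0.1):
        # Dual of the worst-case expectation over a KL ball of radius rho around
        # the fitted model:  inf_{lam > 0}  lam*rho + lam*log E[exp(loss/lam)].
        ell = loss(x, xi)
        def dual(log_lam):
            lam = np.exp(log_lam)
            m = ell.max() / lam                      # log-sum-exp stabilization
            return lam * rho + lam * (m + np.log(np.mean(np.exp(ell / lam - m))))
        return minimize(lambda z: dual(z[0]), x0=[0.0], method="Nelder-Mead").fun

    # Step 3: pick the decision by minimizing the parametric-DRO objective,
    # and compare with plain parametric ERM on the same fitted model.
    dro = minimize(lambda z: kl_dro_objective(z[0]), x0=[data.mean()],
                   method="Nelder-Mead").x[0]
    erm = minimize(lambda z: loss(z[0], xi).mean(), x0=[data.mean()],
                   method="Nelder-Mead").x[0]
    print("parametric DRO decision:", dro, " parametric ERM decision:", erm)

The DRO decision hedges against the (deliberate) Gaussian misspecification of the lognormal data; varying rho trades off nominal performance under the fitted model against robustness to that misspecification.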
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth probability distribution and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
- Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [61.580419063416734]
A recent stream of structured learning approaches has improved the practical state of the art for a range of optimization problems.
The key idea is to exploit the statistical distribution over instances instead of dealing with instances separately.
In this article, we investigate methods that smooth the risk by perturbing the policy, which eases optimization and improves the generalization error.
arXiv Detail & Related papers (2024-07-24T12:00:30Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Scalable Distributional Robustness in a Class of Non Convex Optimization
with Guarantees [7.541571634887807]
Distributionally robust optimization (DRO) has shown promise in providing robustness in learning as well as sample-based optimization problems.
We propose a mixed-integer second-order cone program (MISOCP) formulation, which does not scale well enough to solve problems with real-world datasets.
arXiv Detail & Related papers (2022-05-31T09:07:01Z) - Learning Distributionally Robust Models at Scale via Composite
Optimization [45.47760229170775]
We show how different variants of DRO are simply instances of a finite-sum composite optimization for which we provide scalable methods.
We also provide empirical results demonstrating the effectiveness of our proposed algorithm relative to prior art in learning robust models from very large datasets.
arXiv Detail & Related papers (2022-03-17T20:47:42Z) - Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtain generalization bounds on the solution from distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy (MMD) as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function (a minimal MMD sketch appears after this related-papers list).
arXiv Detail & Related papers (2021-06-21T15:19:52Z) - Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z) - Wasserstein Distributionally Robust Inverse Multiobjective Optimization [14.366265951396587]
We formulate a Wasserstein distributionally robust inverse multiobjective optimization problem (WRO-IMOP).
We show that the excess risk of the WRO-IMOP estimator has a sub-linear convergence rate.
We demonstrate the effectiveness of our method on both a synthetic multiobjective quadratic program and a real world portfolio optimization problem.
arXiv Detail & Related papers (2020-09-30T10:44:07Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of neural networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
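As referenced in the "Complexity-Free Generalization via Distributionally Robust Optimization" entry above, maximum mean discrepancy (MMD) can serve as the distance defining a DRO ambiguity set. Below is a minimal sketch of the standard unbiased MMD^2 estimator between two samples; the RBF kernel, its bandwidth, and the toy data are illustrative assumptions and this is not code from any of the papers listed.

    import numpy as np

    def rbf_kernel(A, B, sigma=1.0):
        # Gaussian (RBF) kernel matrix between the rows of A and B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def mmd2_unbiased(X, Y, sigma=1.0):
        # Unbiased estimate of the squared MMD between samples X and Y.
        m, n = len(X), len(Y)
        Kxx = rbf_kernel(X, X, sigma)
        Kyy = rbf_kernel(Y, Y, sigma)
        Kxy = rbf_kernel(X, Y, sigma)
        term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
        term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
        return term_x + term_y - 2.0 * Kxy.mean()

    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(200, 2))   # e.g. empirical samples
    Y = rng.normal(0.3, 1.0, size=(200, 2))   # e.g. samples from a candidate model
    print("MMD^2 estimate:", mmd2_unbiased(X, Y))

An MMD ambiguity set then takes the form of all distributions whose MMD distance to the empirical (or fitted) distribution is at most a chosen radius.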