A Primal-Dual Algorithm for Faster Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2403.10763v1
- Date: Sat, 16 Mar 2024 02:06:14 GMT
- Title: A Primal-Dual Algorithm for Faster Distributionally Robust Optimization
- Authors: Ronak Mehta, Jelena Diakonikolas, Zaid Harchaoui
- Abstract summary: We present Drago, a primal-dual algorithm that achieves a state-of-the-art linear convergence rate on strongly convex-strongly concave DRO problems.
We support our theoretical results with numerical benchmarks in classification and regression.
- Score: 12.311794669976047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the penalized distributionally robust optimization (DRO) problem with a closed, convex uncertainty set, a setting that encompasses the $f$-DRO, Wasserstein-DRO, and spectral/$L$-risk formulations used in practice. We present Drago, a stochastic primal-dual algorithm that achieves a state-of-the-art linear convergence rate on strongly convex-strongly concave DRO problems. The method combines both randomized and cyclic components with mini-batching, which effectively handles the unique asymmetric nature of the primal and dual problems in DRO. We support our theoretical results with numerical benchmarks in classification and regression.
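For concreteness, below is a minimal sketch of the problem class the abstract describes: a penalized DRO objective, strongly convex in the model weights and strongly concave in the adversarial distribution, solved with plain projected gradient descent-ascent. This is an illustrative baseline under assumed squared losses and a squared-Euclidean penalty toward the uniform distribution; it is not the Drago algorithm itself, and every name and step size here is a hypothetical choice.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def penalized_dro_gda(X, y, mu=0.1, nu=0.1, eta_w=1e-2, eta_q=1e-2, iters=2000):
    """Projected gradient descent-ascent on the penalized DRO saddle problem
        min_w max_{q in simplex}  sum_i q_i * l_i(w) + (mu/2)||w||^2 - (nu/2)||q - 1/n||^2
    with squared losses l_i(w) = 0.5 * (x_i^T w - y_i)^2.  A generic baseline
    for this problem class, not the Drago algorithm from the paper."""
    n, d = X.shape
    w = np.zeros(d)
    q = np.full(n, 1.0 / n)  # start from the empirical (uniform) distribution
    for _ in range(iters):
        r = X @ w - y
        losses = 0.5 * r ** 2
        # primal descent on the q-weighted loss plus the ridge penalty
        w -= eta_w * (X.T @ (q * r) + mu * w)
        # dual ascent; the nu-penalty pulls q back toward uniform
        q = project_simplex(q + eta_q * (losses - nu * (q - 1.0 / n)))
    return w, q
```

With mu, nu > 0 the objective is strongly convex-strongly concave, the regime in which the abstract states Drago's linear rate; generic descent-ascent of this kind converges far more slowly, which is the gap the paper targets.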
Related papers
- Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization [23.029511473335145]
This paper focuses on constrained DRO, which admits an explicit characterization of the robustness of its performance.
For $\chi^2$-divergence uncertainty sets, the complexity of our algorithm at each iteration is independent of the overall dataset size, and thus is suitable for large-scale applications (see the mini-batch sketch below).
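To illustrate a dataset-size-independent iteration cost of this kind, the following sketch takes a stochastic gradient of the Cressie-Read dual form of the $\chi^2$-constrained objective, $\sup_{\chi^2(Q,P) \le \rho} \mathbb{E}_Q[\ell] = \min_{\eta} \sqrt{1+2\rho}\,\sqrt{\mathbb{E}_P[(\ell-\eta)_+^2]} + \eta$, using only a mini-batch per step. The squared losses and all names are assumptions for illustration; this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def chi2_dual_minibatch_grad(w, eta, X, y, rho, batch=64):
    """Mini-batch gradients of the Cressie-Read dual of chi^2-constrained DRO,
        F(w, eta) = sqrt(1 + 2*rho) * sqrt(E[(l(w) - eta)_+^2]) + eta,
    with squared losses l_i(w) = 0.5 * (x_i^T w - y_i)^2.  The cost of one
    call scales with `batch`, not with the full dataset size."""
    idx = rng.choice(len(y), size=batch, replace=False)
    Xb, yb = X[idx], y[idx]
    r = Xb @ w - yb
    slack = np.maximum(0.5 * r ** 2 - eta, 0.0)       # (l_i - eta)_+
    c = np.sqrt(1.0 + 2.0 * rho)
    denom = np.sqrt(np.mean(slack ** 2)) + 1e-12      # guard against division by zero
    g_w = (c / denom) * (Xb.T @ (slack * r)) / batch  # chain rule through l_i(w)
    g_eta = 1.0 - (c / denom) * np.mean(slack)
    return g_w, g_eta
```

Jointly descending on $(w, \eta)$ with these estimates keeps the per-iteration work proportional to the batch size, the property this entry highlights.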
arXiv Detail & Related papers (2024-04-01T15:56:58Z) - Analysis of Kernel Mirror Prox for Measure Optimization [4.6080589718744305]
We study in a unified framework a class of functional saddle-point optimization problems, which we term the Mixed Functional Nash Equilibrium (MFNE).
We model the saddle-point optimization dynamics as an interacting Fisher-Rao-RKHS gradient flow.
We provide a unified convergence analysis of KMP in an infinite-dimensional setting for this class of MFNE problems.
arXiv Detail & Related papers (2024-02-29T21:55:17Z) - ALEXR: An Optimal Single-Loop Algorithm for Convex Finite-Sum Coupled Compositional Stochastic Optimization [53.14532968909759]
We introduce an efficient single-loop primal-dual block-coordinate algorithm, dubbed ALEXR.
We establish the convergence rates of ALEXR in both convex and strongly convex cases under smoothness and non-smoothness conditions.
We present lower complexity bounds to demonstrate that the convergence rates of ALEXR are optimal among first-order block-coordinate algorithms for the considered class of cFCCO problems.
arXiv Detail & Related papers (2023-12-04T19:00:07Z) - Stochastic Constrained DRO with a Complexity Independent of Sample Size [38.56406595022129]
We propose and analyze algorithms that apply to both non-convex and convex losses for solving the Kullback-Leibler divergence constrained DRO problem.
We establish a nearly optimal complexity for finding an $\epsilon$-stationary solution for non-convex losses and an optimal batch complexity for finding an optimal solution for broad applications (see the dual-form sketch below).
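As a reference point, the KL-constrained worst-case expectation mentioned above admits the scalar dual $\sup_{\mathrm{KL}(Q\|P) \le \rho} \mathbb{E}_Q[\ell] = \min_{\lambda > 0} \lambda\rho + \lambda \log \mathbb{E}_P[e^{\ell/\lambda}]$. The snippet below evaluates this dual numerically; it is an illustrative evaluation of the objective, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kl_robust_loss(losses, rho):
    """Scalar dual of the KL-constrained worst-case expectation:
        sup_{KL(Q||P) <= rho} E_Q[loss] = min_{lam > 0} lam*rho + lam*log(E_P[exp(loss/lam)])."""
    z_max = losses.max()
    def dual(lam):
        # log-mean-exp shifted by the max loss for numerical stability
        return lam * rho + lam * np.log(np.mean(np.exp((losses - z_max) / lam))) + z_max
    return minimize_scalar(dual, bounds=(1e-6, 1e6), method="bounded").fun
```

For example, `kl_robust_loss(np.array([0.1, 0.5, 2.0]), rho=0.5)` returns a value between the mean and the maximum loss, upweighting the hardest examples.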
arXiv Detail & Related papers (2022-10-11T19:11:19Z) - When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee [51.527543027813344]
We propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC) maximization.
For both one-way and two-way pAUC, we propose two algorithms and prove their convergence for optimizing their two formulations, respectively.
arXiv Detail & Related papers (2022-03-01T01:59:53Z) - Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis [16.499651513178012]
Distributionally robust optimization (DRO) is a widely-used approach to learn models that are robust against distribution shift.
We provide non-asymptotic convergence guarantees for normalized gradient descent even though the objective function is possibly non-convex and non-smooth.
arXiv Detail & Related papers (2021-10-24T14:56:38Z) - Sinkhorn Distributionally Robust Optimization [15.194516549163245]
We derive a convex programming dual reformulation for general nominal distributions, transport costs, and loss functions.
Compared with Wasserstein DRO, our proposed approach offers enhanced computational tractability for a broader class of loss functions.
arXiv Detail & Related papers (2021-09-24T12:40:48Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and the number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z) - An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of neural networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z) - Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.