Related papers: An Inexact Halpern Iteration with Application to Distributionally Robust Optimization

URL: http://arxiv.org/abs/2402.06033v3
Date: Tue, 27 May 2025 01:58:14 GMT
Title: An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Authors: Ling Liang, Zusen Xu, Kim-Chuan Toh, Jia-Jie Zhu
Abstract summary: We show that by choosing the inexactness appropriately, the inexact schemes admit an $O(k^{-1})$ convergence rate in terms of the (expected) residue norm. We demonstrate how the proposed methods can be applied for solving two classes of data-driven distributionally robust optimization problems.
Score: 8.722877733571796
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Halpern iteration for solving monotone inclusion problems has gained increasing interest in recent years due to its simple form and appealing convergence properties. In this paper, we investigate the inexact variants of the scheme in both deterministic and stochastic settings. We conduct extensive convergence analysis and show that by choosing the inexactness tolerances appropriately, the inexact schemes admit an $O(k^{-1})$ convergence rate in terms of the (expected) residue norm. Our results relax the state-of-the-art inexactness conditions employed in the literature while sharing the same competitive convergence properties. We then demonstrate how the proposed methods can be applied for solving two classes of data-driven Wasserstein distributionally robust optimization problems that admit convex-concave min-max optimization reformulations. We highlight their capability of performing inexact computations for distributionally robust learning with stochastic first-order methods and for general nonlinear convex-concave loss functions, which are competitive in the literature.
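
As a rough illustration of the kind of scheme the abstract describes, the sketch below runs a Halpern iteration x_{k+1} = b_k x_0 + (1 - b_k) T(x_k) with b_k = 1/(k+2) on a toy nonexpansive map, where T is only evaluated approximately. The toy operator `rotate90`, the function name `inexact_halpern`, and the tolerance schedule `tol_scale / (k + 1)**2` are assumptions made for this example; they are not taken from the paper, whose precise inexactness conditions are not stated on this page.

```python
import math


def rotate90(x):
    """Toy nonexpansive map on R^2 (rotation by 90 degrees); fixed point is the origin."""
    return (-x[1], x[0])


def inexact_halpern(T, x0, num_iters, tol_scale=0.0):
    """Halpern iteration with an approximately evaluated operator.

    Update: x_{k+1} = b_k * x0 + (1 - b_k) * T_approx(x_k), with b_k = 1 / (k + 2).
    T_approx(x_k) is T(x_k) plus a perturbation of norm tol_scale / (k + 1)**2.
    This summable tolerance schedule is an assumption for illustration only.
    Returns the final iterate and the exact residual norms ||x_k - T(x_k)||.
    """
    x = x0
    residuals = []
    for k in range(num_iters):
        tx = T(x)
        residuals.append(math.dist(x, tx))  # residual measured with the exact T
        err = tol_scale / (k + 1) ** 2
        tx_approx = (tx[0] + err, tx[1])  # inject an error of norm err
        beta = 1.0 / (k + 2)
        x = tuple(beta * a + (1.0 - beta) * b for a, b in zip(x0, tx_approx))
    return x, residuals


x_final, res = inexact_halpern(rotate90, (1.0, 0.0), 200, tol_scale=0.1)
```

Plain fixed-point iteration x_{k+1} = T(x_k) would circle forever on this rotation, whereas the anchored Halpern update pulls the iterates toward the fixed point. Printing `res` for a few values of k should show the residual shrinking roughly in proportion to 1/k in this toy setting, which is the behaviour the $O(k^{-1})$ rate refers to.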

Related papers

Unregularized limit of stochastic gradient method for Wasserstein distributionally robust optimization [8.784017987697688]
Distributionally robust optimization offers a compelling framework for model fitting in machine learning. We investigate the regularized problem where entropic smoothing yields a sampling-based approximation of the original objective.
arXiv Detail & Related papers (2025-06-05T12:21:44Z)
Stochastic Optimization with Optimal Importance Sampling [49.484190237840714]
We propose an iterative algorithm that jointly updates the decision and the IS distribution without requiring time-scale separation between the two. Our method achieves the lowest possible asymptotic variance and guarantees global convergence under convexity of the objective and mild assumptions on the IS distribution family.
arXiv Detail & Related papers (2025-04-04T16:10:18Z)
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems. We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
Contextual Optimization under Covariate Shift: A Robust Approach by Intersecting Wasserstein Balls [18.047245099229325]
We propose a distributionally robust approach that uses an ambiguity set by the intersection of two Wasserstein balls. We demonstrate the strong empirical performance of our proposed models.
arXiv Detail & Related papers (2024-06-04T15:46:41Z)
High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions. These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
Distributed Stochastic Optimization under a General Variance Condition [13.911633636387059]
Distributed optimization has drawn great attention recently due to its effectiveness in solving large-scale machine learning problems. We revisit the classical Federated Averaging (FedAvg) algorithm and establish convergence results under only a mild variance condition for smooth nonconvex objective functions. Almost sure convergence to a stationary point is also established under the same condition.
arXiv Detail & Related papers (2023-01-30T05:48:09Z)
RIGID: Robust Linear Regression with Missing Data [7.638042073679073]
We present a robust framework to perform linear regression with missing entries in the features. We show that the proposed formulation, which naturally takes into account the dependency between different variables, reduces to a convex program. In addition to a detailed analysis, we also analyze the behavior of the proposed framework, and present technical discussions.
arXiv Detail & Related papers (2022-05-26T21:10:17Z)
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO. We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size. Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random order model of online optimization proposed by Garber et al. (2020), where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order. We show that the algorithms of Garber et al. (2020) achieve the optimal bounds, and we significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large. Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
Variance-Reduced Splitting Schemes for Monotone Stochastic Generalized Equations [0.0]
We consider monotone inclusion problems where the operators may be expectation-valued. A direct application of splitting schemes is complicated by the need to resolve problems with expectation-valued maps at each step. We propose an avenue for addressing uncertainty in the mapping: Variance-reduced modified forward-backward splitting scheme.
arXiv Detail & Related papers (2020-08-26T02:33:27Z)
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees. Our results demonstrate a high gain compared to the full problem solver for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.