Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions
- URL: http://arxiv.org/abs/2310.02987v2
- Date: Thu, 26 Oct 2023 17:47:10 GMT
- Title: Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions
- Authors: Xufeng Cai, Ahmet Alacaoglu, Jelena Diakonikolas
- Abstract summary: We study finite-sum monotone inclusion problems, which model broad classes of equilibrium problems.
Our main contributions are variants of the classical Halpern iteration that employ variance reduction to obtain improved complexity guarantees.
We argue that, up to poly-logarithmic factors, this complexity is unimprovable in the monotone Lipschitz setting.
- Score: 18.086061048484616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning approaches relying on such criteria as adversarial
robustness or multi-agent settings have raised the need for solving
game-theoretic equilibrium problems. Of particular relevance to these
applications are methods targeting finite-sum structure, which generically
arises in empirical variants of learning problems in these contexts. Further,
methods with computable approximation errors are highly desirable, as they
provide verifiable exit criteria. Motivated by these applications, we study
finite-sum monotone inclusion problems, which model broad classes of
equilibrium problems. Our main contributions are variants of the classical
Halpern iteration that employ variance reduction to obtain improved complexity
guarantees in settings where the $n$ component operators in the finite sum are
``on average'' either cocoercive or Lipschitz continuous and monotone, with
parameter $L$. The resulting oracle complexity of our methods, which provide
guarantees for the last iterate and for a (computable) operator norm residual,
is $\widetilde{\mathcal{O}}( n + \sqrt{n}L\varepsilon^{-1})$, which improves
upon existing methods by a factor up to $\sqrt{n}$. This constitutes the first
variance reduction-type result for general finite-sum monotone inclusions and
for more specific problems such as convex-concave optimization when operator
norm residual is the optimality measure. We further argue that, up to
poly-logarithmic factors, this complexity is unimprovable in the monotone
Lipschitz setting; i.e., the provided result is near-optimal.
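As a rough illustration of the ideas in the abstract, the sketch below combines the classical Halpern anchoring step $x_{k+1} = \lambda_{k+1} x_0 + (1-\lambda_{k+1})(x_k - \eta \widetilde{F}(x_k))$ with a generic SVRG-style variance-reduced estimate $\widetilde{F}$ of a finite-sum monotone operator $F(x) = \frac{1}{n}\sum_i F_i(x)$, and uses the computable operator norm residual $\|F(x_k)\|$ as the exit criterion. This is a minimal sketch, not the paper's exact algorithm: the component operators, step size, epoch length, and tolerance below are synthetic placeholders chosen only for the example.

```python
# Illustrative sketch (NOT the paper's exact algorithm): a Halpern-type
# iteration with an SVRG-style variance-reduced estimator of a finite-sum
# monotone operator, stopped via the computable residual ||F(x_k)||.
# All components F_i(x) = M_i @ x, the step size eta, and the epoch length
# are synthetic choices made up for this example.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 10

def random_component():
    S = rng.standard_normal((d, d))
    skew = S - S.T                       # skew-symmetric part: monotone
    P = rng.standard_normal((d, d))
    psd = 0.1 * (P @ P.T)                # small PSD part: monotone
    return skew + psd

Ms = [random_component() for _ in range(n)]

def F_i(i, x):                           # single-component oracle F_i(x) = M_i x
    return Ms[i] @ x

def F_full(x):                           # full operator F(x) = (1/n) sum_i F_i(x)
    return sum(M @ x for M in Ms) / n

x0 = rng.standard_normal(d)
x = x0.copy()
L = max(np.linalg.norm(M, 2) for M in Ms)   # crude per-component Lipschitz bound
eta = 0.5 / L                            # heuristic step size
eps = 1e-4                               # target residual
epoch = max(1, int(np.sqrt(n)))          # full pass roughly every sqrt(n) steps

for k in range(200_000):
    if k % epoch == 0:                   # refresh snapshot and check the residual
        snapshot, F_snap = x.copy(), F_full(x)
        res = np.linalg.norm(F_snap)
        if res <= eps:
            print(f"stopped at iteration {k}, residual {res:.2e}")
            break
    lam = 1.0 / (k + 2)                  # Halpern anchoring weight
    i = rng.integers(n)
    g = F_snap + F_i(i, x) - F_i(i, snapshot)    # SVRG-style estimate of F(x)
    x = lam * x0 + (1.0 - lam) * (x - eta * g)   # anchored forward step
else:
    print(f"budget exhausted, residual {np.linalg.norm(F_full(x)):.2e}")
```

In this sketch, each inner step touches only one component operator, while the full operator is evaluated once per epoch to refresh the snapshot and to compute the verifiable exit criterion mentioned in the abstract.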
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Sparsity via Sparse Group $k$-max Regularization [22.05774771336432]
In this paper, we propose a novel and concise regularization, namely the sparse group $k$-max regularization.
We verify the effectiveness and flexibility of the proposed method through numerical experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-02-13T14:41:28Z) - An Inexact Halpern Iteration with Application to Distributionally Robust
Optimization [9.529117276663431]
We investigate the inexact variants of the scheme in both deterministic and stochastic convergence settings.
We show that by choosing the inexactness appropriately, the inexact schemes admit an $O(k^{-1})$ convergence rate in terms of the (expected) residue norm.
arXiv Detail & Related papers (2024-02-08T20:12:47Z) - Conditional expectation using compactification operators [0.0]
This paper describes an operator theoretic approach to estimating the conditional expectation.
Kernel integral operators are used as a compactification tool, to set up the estimation problem as a linear inverse problem in a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2023-06-18T16:11:40Z) - High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of the last iterate of the method decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z) - A Stochastic Halpern Iteration with Variance Reduction for Stochastic
Monotone Inclusion Problems [17.597000481404883]
We study monotone inclusion problems, which widely appear in machine learning applications.
Our algorithm attains $\epsilon$ norm of the operator with $\mathcal{O}(\frac{1}{\epsilon^{3}})$ operator evaluations.
arXiv Detail & Related papers (2022-03-17T16:48:57Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the $\mathcal{O}(1/\epsilon)$ complexity lower bound for optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization of the extragradient algorithm which provably converges to a stationary point.
The algorithm applies not only to Euclidean spaces, but also to general $\ell_p$-normed finite-dimensional real vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - Variance-Reduced Splitting Schemes for Monotone Stochastic Generalized
Equations [0.0]
We consider monotone inclusion problems where the operators may be expectation-valued.
A direct application of splitting schemes is complicated by the need to resolve problems with expectation-valued maps at each step.
We propose an avenue for addressing uncertainty in the mapping: a variance-reduced modified forward-backward splitting scheme.
arXiv Detail & Related papers (2020-08-26T02:33:27Z)