The Last-Iterate Convergence Rate of Optimistic Mirror Descent in
Stochastic Variational Inequalities
- URL: http://arxiv.org/abs/2107.01906v1
- Date: Mon, 5 Jul 2021 09:54:47 GMT
- Title: The Last-Iterate Convergence Rate of Optimistic Mirror Descent in
Stochastic Variational Inequalities
- Authors: Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos
- Abstract summary: We show an intricate relation between the algorithm's rate of convergence and the local geometry induced by the method's underlying Bregman function.
We quantify this relation by the Legendre exponent of the Bregman function, which determines both the optimal step-size policy of the algorithm and the optimal rates attained.
- Score: 29.0058976973771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we analyze the local convergence rate of optimistic mirror
descent methods in stochastic variational inequalities, a class of optimization
problems with important applications to learning theory and machine learning.
Our analysis reveals an intricate relation between the algorithm's rate of
convergence and the local geometry induced by the method's underlying Bregman
function. We quantify this relation by means of the Legendre exponent, a notion
that we introduce to measure the growth rate of the Bregman divergence relative
to the ambient norm near a solution. We show that this exponent determines both
the optimal step-size policy of the algorithm and the optimal rates attained,
explaining in this way the differences observed for some popular Bregman
functions (Euclidean projection, negative entropy, fractional power, etc.).
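To ground the abstract, here is a minimal Python sketch of one optimistic mirror descent template (shown in its extra-gradient form; the optimistic variant reuses the previous gradient) with a negative-entropy Bregman setup, applied to a toy bilinear game. The schedule gamma_t = 0.1/sqrt(t) is a generic placeholder, since the paper's point is that the optimal schedule depends on the Bregman function's Legendre exponent; all names here are illustrative, not the paper's notation.

```python
import numpy as np

def entropy_prox(x, v, step):
    """Bregman proximal step for the negative-entropy mirror map on the
    simplex: argmin_{x'} step*<v, x'> + D_KL(x', x), which reduces to a
    multiplicative-weights update."""
    w = x * np.exp(-step * v)
    return w / w.sum()

def euclidean_prox(x, v, step):
    """Euclidean alternative: a plain (unconstrained) gradient step."""
    return x - step * v

def optimistic_md(V, z0, prox, gamma, n_iters=1000):
    """Extra-gradient form of (optimistic) mirror descent for the VI with
    operator V: a leading half-step, then an update from the base point
    using the operator evaluated at the half-step."""
    z = z0
    for t in range(1, n_iters + 1):
        step = gamma(t)
        z_lead = prox(z, V(z), step)       # extrapolation step
        z = prox(z, V(z_lead), step)       # update step
    return z

# Toy VI: the bilinear zero-sum game min_p max_q p^T A q on the simplex,
# whose operator is V(p, q) = (A q, -A^T p).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

def V(z):
    p, q = z[:3], z[3:]
    return np.concatenate([A @ q, -A.T @ p])

def prox(z, v, step):
    return np.concatenate([entropy_prox(z[:3], v[:3], step),
                           entropy_prox(z[3:], v[3:], step)])

z_hat = optimistic_md(V, np.ones(6) / 3.0, prox,
                      gamma=lambda t: 0.1 / np.sqrt(t))
```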
Related papers
- Distributed Optimization via Energy Conservation Laws in Dilated Coordinates [5.35599092568615]
This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates.
Convergence rates can be explicitly expressed in terms of the inverse time-dilation factor.
Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.
arXiv Detail & Related papers (2024-09-28T08:02:43Z)
- Analytical Approximation of the ELBO Gradient in the Context of the Clutter Problem [0.0]
We propose an analytical solution for approximating the gradient of the Evidence Lower Bound (ELBO) in variational inference problems.
The proposed method demonstrates good accuracy and rate of convergence together with linear computational complexity.
arXiv Detail & Related papers (2024-04-16T13:19:46Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
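The ADMM-based sampler above is beyond a short snippet, but its optimization backbone is standard consensus ADMM. The sketch below shows that classical deterministic template on a distributed least-squares problem; it is not the paper's sampling scheme, and all names are illustrative.

```python
import numpy as np

def consensus_admm(As, bs, rho=1.0, n_iters=100):
    """Consensus ADMM for the distributed least-squares problem
    min_x sum_i 0.5*||A_i x - b_i||^2, split across len(As) workers."""
    d = As[0].shape[1]
    m = len(As)
    us = [np.zeros(d) for _ in range(m)]
    z = np.zeros(d)
    # Pre-compute each worker's local solver (A_i^T A_i + rho*I)^{-1}.
    solves = [np.linalg.inv(A.T @ A + rho * np.eye(d)) for A in As]
    for _ in range(n_iters):
        # Local primal updates (run in parallel in a real deployment).
        xs = [S @ (A.T @ b + rho * (z - u))
              for S, A, b, u in zip(solves, As, bs, us)]
        # Consensus (averaging) step.
        z = np.mean([x + u for x, u in zip(xs, us)], axis=0)
        # Dual (running residual) updates.
        us = [u + x - z for u, x in zip(us, xs)]
    return z

# Toy check: two workers sharing the same ground truth.
rng = np.random.default_rng(0)
x_true = rng.standard_normal(5)
As = [rng.standard_normal((20, 5)) for _ in range(2)]
bs = [A @ x_true for A in As]
x_hat = consensus_admm(As, bs)  # should approach x_true
```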
- Stochastic Mirror Descent for Large-Scale Sparse Recovery [13.500750042707407]
We discuss an application of stochastic approximation to statistical estimation of high-dimensional sparse parameters.
We show that the proposed algorithm attains the optimal rate of convergence of the estimation error under weak assumptions on the regressor distribution.
arXiv Detail & Related papers (2022-10-23T23:23:23Z)
- Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality [8.771678221101368]
We analyze a stochastic approximation algorithm for decision-dependent problems, wherein the data distribution used by the algorithm evolves along the iterate sequence.
We show that, under mild assumptions, the deviation between the iterates of the algorithm and the solution is asymptotically normal.
We also show that the performance of the algorithm with averaging is locally minimax optimal.
arXiv Detail & Related papers (2022-07-09T01:44:17Z)
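As a complement to the normality result above, the sketch below shows the classical stochastic approximation recipe it builds on: a slowly decaying step size plus Polyak-Ruppert iterate averaging, for which the averaged iterate (unlike the raw one) is asymptotically normal with the optimal covariance. The decision-dependent distribution is omitted, and all names are illustrative.

```python
import numpy as np

def sgd_with_averaging(grad, x0, n_iters=10_000, c=1.0, alpha=0.75):
    """Stochastic gradient descent with Polyak-Ruppert iterate averaging.

    Uses the classical slowly decaying step size c / t**alpha with
    alpha in (1/2, 1), and maintains a running average of the iterates.
    """
    x = x0.copy()
    x_bar = np.zeros_like(x0)
    for t in range(1, n_iters + 1):
        x -= (c / t**alpha) * grad(x)   # noisy gradient step
        x_bar += (x - x_bar) / t        # running average of the iterates
    return x, x_bar

# Toy example: minimize E[0.5*||x - xi||^2] with xi ~ N(mu, I); the
# solution is mu and a stochastic gradient is x - xi.
rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0])
x_last, x_avg = sgd_with_averaging(
    lambda x: x - (mu + rng.standard_normal(2)), np.zeros(2))
```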
- Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows these algorithms to match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
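The warm-start idea summarized above is easy to sketch: since the outer variable drifts slowly, re-initializing the inner solver at its previous solution amortizes its cost, so a few inner steps per outer iteration suffice. The snippet below is a schematic of that trick on a toy instance with a hand-supplied hypergradient, not the paper's algorithm; all names are illustrative.

```python
import numpy as np

def inner_solve(x, y_init, grad_y, n_steps=10, lr=0.1):
    """Approximately solve the inner problem min_y g(x, y) by gradient
    descent, warm-started at the previous inner solution y_init."""
    y = y_init.copy()
    for _ in range(n_steps):
        y -= lr * grad_y(x, y)
    return y

def warm_start_bilevel(grad_y, hypergrad, x0, y0, n_outer=100, lr=0.05):
    """Skeleton of a warm-started bilevel loop: because x changes slowly,
    restarting the inner solver from the last y keeps each inner solve
    cheap, which is the amortization summarized above."""
    x, y = x0.copy(), y0.copy()
    for _ in range(n_outer):
        y = inner_solve(x, y, grad_y)   # cheap thanks to the warm start
        x -= lr * hypergrad(x, y)       # approximate hypergradient step
    return x, y

# Toy instance: inner y*(x) = x (from g(x,y) = 0.5*||y - x||^2) and outer
# f = 0.5*||y*(x)||^2, so the hypergradient is y and x converges to 0.
x, y = warm_start_bilevel(lambda x, y: y - x, lambda x, y: y,
                          np.ones(3), np.zeros(3))
```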
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common stochastic optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
- Sparse recovery by reduced variance stochastic approximation [5.672132510411465]
We discuss the application of iterative stochastic optimization routines to the problem of sparse signal recovery from noisy observations.
We show how one can straightforwardly enhance the reliability of the corresponding solution by using Median-of-Means-like techniques.
arXiv Detail & Related papers (2020-06-11T12:31:20Z)
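The Median-of-Means device mentioned above is compact enough to state in full. Below is the standard estimator itself, not the paper's recovery routine: split the sample into blocks, average within each block, and take the median of the block means, which is robust to heavy-tailed noise.

```python
import numpy as np

def median_of_means(samples, n_blocks=10):
    """Median-of-Means mean estimator: partition the sample into blocks,
    average within each block, and return the (coordinate-wise) median
    of the block means."""
    blocks = np.array_split(np.asarray(samples), n_blocks)
    block_means = np.array([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)

# Heavy-tailed demo: Student-t samples with mean 0; the MoM estimate is
# typically closer to 0 than the plain sample mean.
rng = np.random.default_rng(2)
x = rng.standard_t(df=2.1, size=10_000)
print(median_of_means(x), x.mean())
```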
- A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms [67.67377846416106]
We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.
We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions.
arXiv Detail & Related papers (2020-03-27T05:13:29Z)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis [102.29671176698373]
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$ error under a generative model.
We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms.
arXiv Detail & Related papers (2020-03-16T17:15:28Z)
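To make the last entry's setting concrete, here is a minimal tabular TD(0) policy-evaluation sketch under a generative model. The step-size choice is a generic one, not the instance-optimal schedule suggested by the paper's analysis; all names are illustrative.

```python
import numpy as np

def td0_generative(P, R, gamma, n_iters=50_000, rng=None):
    """Tabular TD(0) for policy evaluation with a generative model: sample
    a state uniformly, draw a next state from P, and apply the TD update
    toward r + gamma * V[s']. P is the policy's transition matrix and R
    the expected reward vector."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(R)
    V = np.zeros(n)
    for t in range(1, n_iters + 1):
        s = rng.integers(n)                  # generative-model query
        s_next = rng.choice(n, p=P[s])
        alpha = 1.0 / (1 + t * (1 - gamma))  # a standard step-size choice
        V[s] += alpha * (R[s] + gamma * V[s_next] - V[s])
    return V

# Two-state toy chain; the exact value is V* = (I - gamma*P)^{-1} R.
P = np.array([[0.9, 0.1], [0.2, 0.8]])
R = np.array([1.0, 0.0])
V_hat = td0_generative(P, R, gamma=0.9)
V_star = np.linalg.solve(np.eye(2) - 0.9 * P, R)
```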