Last-Iterate Convergence of Saddle-Point Optimizers via High-Resolution
Differential Equations
- URL: http://arxiv.org/abs/2112.13826v3
- Date: Mon, 31 Jul 2023 15:12:29 GMT
- Title: Last-Iterate Convergence of Saddle-Point Optimizers via High-Resolution
Differential Equations
- Authors: Tatjana Chavdarova, Michael I. Jordan and Manolis Zampetakis
- Abstract summary: Several widely-used first-order saddle-point optimization methods yield an identical continuous-time ordinary differential equation (ODE) when derived naively.
However, the convergence properties of these methods are qualitatively different, even on simple bilinear games.
We adopt a framework studied in fluid dynamics to design differential equation models for several saddle-point optimization methods.
- Score: 83.3201889218775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several widely-used first-order saddle-point optimization methods,
when derived naively, yield the same continuous-time ordinary differential
equation (ODE) as the Gradient Descent Ascent (GDA) method. However, the
convergence properties of these methods are qualitatively
different, even on simple bilinear games. Thus the ODE perspective, which has
proved powerful in analyzing single-objective optimization methods, has not
played a similar role in saddle-point optimization.
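To make the degeneracy concrete, here is the standard discrete-to-continuous sketch (our notation, not the paper's: step size $\gamma$, stacked vector field $F(z) = (\nabla_x f(x,y),\, -\nabla_y f(x,y))$ for $z = (x,y)$):

    \text{GDA:}\quad z_{k+1} = z_k - \gamma F(z_k)
    \text{OGDA:}\quad z_{k+1} = z_k - \gamma \bigl( 2F(z_k) - F(z_{k-1}) \bigr)

Setting $z_k = z(k\gamma)$ and letting $\gamma \to 0$, the lagged term $F(z_{k-1})$ coincides with $F(z_k)$ in the limit, so both recurrences collapse to the same naive ODE $\dot{z}(t) = -F(z(t))$.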
We adopt a framework studied in fluid dynamics -- known as High-Resolution
Differential Equations (HRDEs) -- to design differential equation models for
several saddle-point optimization methods. Critically, these HRDEs are distinct
for various saddle-point optimization methods. Moreover, in bilinear games, the
convergence properties of the HRDEs match the qualitative features of the
corresponding discrete methods. Additionally, we show that the HRDE of
Optimistic Gradient Descent Ascent (OGDA) exhibits \emph{last-iterate
convergence} for general monotone variational inequalities. Finally, we provide
convergence rates for the \emph{best iterate} of the OGDA method, relying
solely on first-order smoothness of the monotone operator.
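As a back-of-the-envelope illustration of why retaining higher-order terms matters (a sketch via backward error analysis, not the paper's exact HRDE derivation), keeping the $O(\gamma)$ correction that the naive limit discards yields distinct models:

    \text{GDA:}\quad \dot{z} = -F(z) - \tfrac{\gamma}{2}\,\nabla F(z)\,F(z)
    \text{OGDA:}\quad \dot{z} = -F(z) + \tfrac{\gamma}{2}\,\nabla F(z)\,F(z)

On the bilinear field $F(x,y) = (y,-x)$ these give $\tfrac{d}{dt}\|z\|^2 = +\gamma\|z\|^2$ (divergence) and $-\gamma\|z\|^2$ (convergence), respectively, matching the qualitative behavior of the discrete methods. That behavior is easy to reproduce; the following minimal Python simulation (ours, not the paper's code) shows the GDA iterate spiraling outward while the OGDA iterate converges:

    import numpy as np

    # Bilinear game  min_x max_y  f(x, y) = x * y.
    # Stacked vector field F(z) = (grad_x f, -grad_y f) = (y, -x) for z = (x, y).
    def F(z):
        x, y = z
        return np.array([y, -x])

    gamma = 0.1                    # step size
    z_gda = np.array([1.0, 1.0])   # GDA iterate
    z_ogda = np.array([1.0, 1.0])  # OGDA iterate
    F_prev = F(z_ogda)             # OGDA needs F at the previous iterate; init z_{-1} = z_0

    for _ in range(2000):
        # GDA: z_{k+1} = z_k - gamma * F(z_k); its norm grows by sqrt(1 + gamma^2) per step.
        z_gda = z_gda - gamma * F(z_gda)
        # OGDA: z_{k+1} = z_k - gamma * (2 F(z_k) - F(z_{k-1})); its last iterate converges.
        F_curr = F(z_ogda)
        z_ogda = z_ogda - gamma * (2.0 * F_curr - F_prev)
        F_prev = F_curr

    print("||z_GDA||  =", np.linalg.norm(z_gda))   # grows unboundedly (about 3e4 here)
    print("||z_OGDA|| =", np.linalg.norm(z_ogda))  # decays toward the saddle point (0, 0)

Note that OGDA reuses the previous gradient at essentially no extra per-iteration cost; the $+\tfrac{\gamma}{2}\nabla F(z)F(z)$ correction in its modified equation is what this reuse buys.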
Related papers
- A Methodology Establishing Linear Convergence of Adaptive Gradient Methods under PL Inequality [5.35599092568615]
We show that AdaGrad and Adam converge linearly when the cost function is smooth and satisfies the Polyak-Łojasiewicz inequality.
Our framework can potentially be utilized in analyzing linear convergence of other variants of Adam.
arXiv Detail & Related papers (2024-07-17T14:56:21Z)
- ODE-based Learning to Optimize [28.380622776436905]
We present a comprehensive framework integrating the inertial system with Hessian-driven damping (ISHD) equation.
We formulate a novel learning to optimize (L2O) problem aimed at minimizing the stopping time subject to the convergence and stability condition.
Empirical validation of our framework is conducted through extensive numerical experiments across a diverse set of optimization problems.
arXiv Detail & Related papers (2024-06-04T06:39:45Z)
- Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent [1.534667887016089]
We propose a class of differential equations that approximate the dynamics of general optimization methods more closely than the original gradient flow.
We study the stability of the modified equation in the case of coordinate descent.
arXiv Detail & Related papers (2023-09-05T09:39:56Z)
- A Homogenization Approach for Gradient-Dominated Stochastic Optimization [6.1144486886258065]
We propose a stochastic homogeneous second-order descent method (SHSODM) for functions enjoying gradient dominance.
Our findings show that SHSODM matches the best-known sample complexity achieved by other second-order methods for gradient-dominated optimization.
arXiv Detail & Related papers (2023-08-21T11:03:04Z)
- Comparison of Single- and Multi-Objective Optimization Quality for Evolutionary Equation Discovery [77.34726150561087]
Evolutionary differential equation discovery has proved to be a tool for obtaining equations with fewer a priori assumptions.
The proposed comparison approach is demonstrated on classical model examples -- the Burgers equation, the wave equation, and the Korteweg-de Vries equation.
arXiv Detail & Related papers (2023-06-29T15:37:19Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities [91.46841922915418]
We present a unified approach for the theoretical analysis of first-order methods under Markovian noise.
Our approach covers both non-convex and strongly convex minimization problems.
We provide bounds that match the oracle complexity in the case of strongly convex optimization problems.
arXiv Detail & Related papers (2023-05-25T11:11:31Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.