A Convergent and Dimension-Independent Min-Max Optimization Algorithm
- URL: http://arxiv.org/abs/2006.12376v6
- Date: Thu, 30 Jun 2022 19:53:45 GMT
- Title: A Convergent and Dimension-Independent Min-Max Optimization Algorithm
- Authors: Vijay Keswani, Oren Mangoubi, Sushant Sachdeva, Nisheeth K. Vishnoi
- Abstract summary: Our equilibrium definition depends on a proposal distribution which the min-player uses to update its parameters, given a smooth and bounded nonconvex-nonconcave objective function.
Our algorithm converges to an approximate local equilibrium in a number of iterations that does not depend on the dimension.
- Score: 32.492526162436405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a variant of a recently introduced min-max optimization framework
where the max-player is constrained to update its parameters in a greedy manner
until it reaches a first-order stationary point. Our equilibrium definition for
this framework depends on a proposal distribution which the min-player uses to
choose directions in which to update its parameters. We show that, given a
smooth and bounded nonconvex-nonconcave objective function, access to any
proposal distribution for the min-player's updates, and a stochastic gradient
oracle for the max-player, our algorithm converges to the aforementioned
approximate local equilibrium in a number of iterations that does not depend on
the dimension. The equilibrium point found by our algorithm depends on the
proposal distribution, and when applying our algorithm to train GANs we choose
the proposal distribution to be a distribution of stochastic gradients. We
empirically evaluate our algorithm on challenging nonconvex-nonconcave
test-functions and loss functions arising in GAN training. Our algorithm
converges on these test functions and, when used to train GANs, trains stably
on synthetic and real-world datasets and avoids mode collapse.
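To make the high-level procedure above concrete, the following is a minimal sketch (not the authors' exact algorithm) of the structure the abstract describes: the max-player greedily ascends until it reaches an approximate first-order stationary point, while the min-player proposes update directions drawn from a proposal distribution, chosen here to be a noisy stochastic gradient as in the GAN setting. All names, step sizes, tolerances, and the acceptance rule below are illustrative assumptions.

```python
import numpy as np

def greedy_max_player(y, grad_y, lr=1e-2, tol=1e-3, max_steps=1000):
    """Ascend in y until the stochastic gradient norm falls below tol
    (an approximate first-order stationary point for the max-player)."""
    for _ in range(max_steps):
        g = grad_y(y)  # stochastic gradient oracle for the max-player
        if np.linalg.norm(g) <= tol:
            break
        y = y + lr * g
    return y

def min_max_step(x, y, f, grad_x, grad_y, lr_x=1e-2, noise_std=1e-2):
    """One outer iteration: the min-player samples a direction from its
    proposal distribution (here a noisy stochastic gradient, as when
    training GANs), and the max-player responds with greedy updates."""
    direction = grad_x(x, y) + noise_std * np.random.randn(*x.shape)
    x_new = x - lr_x * direction
    y_new = greedy_max_player(y, lambda yy: grad_y(x_new, yy))
    # Accept the proposal only if it does not increase the objective after the
    # max-player's response; the paper's actual acceptance rule may differ.
    if f(x_new, y_new) <= f(x, y):
        return x_new, y_new
    return x, y
```

The structural point the abstract emphasizes is that the min-player only ever moves along directions sampled from its proposal distribution, and the max-player is re-run to approximate stationarity after each proposal, which is what makes the iteration count depend on the proposal distribution rather than on the dimension.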
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization [8.670873561640903]
We investigate the finite-time analysis of finding $(\delta, \epsilon)$-stationary points for nonsmooth nonconvex objectives in decentralized settings.
This is the first finite-time guarantee for decentralized nonsmooth optimization.
arXiv Detail & Related papers (2024-06-03T16:09:34Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonconvex optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time
Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local superlinear convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Distributed stochastic proximal algorithm with random reshuffling for
non-smooth finite-sum optimization [28.862321453597918]
Non-smooth finite-sum minimization is a fundamental problem in machine learning.
This paper develops a distributed proximal-gradient algorithm with random reshuffling to solve the problem.
arXiv Detail & Related papers (2021-11-06T07:29:55Z) - A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max
Optimization Problems [9.653157073271021]
We develop a decentralized adaptive momentum (ADAM)-type algorithm for solving a class of min-max optimization problems.
We obtain non-asymptotic rates of convergence of the proposed algorithm for finding a (stochastic) first-order Nash equilibrium point.
arXiv Detail & Related papers (2021-06-10T22:32:01Z) - Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order
Coordinate Method [3.860616339202303]
This paper investigates the distributed nonconvex optimization problem of minimizing a global cost function formed by the summation of $n$ local cost functions.
Each agent uses a stochastic zeroth-order (ZO) coordinate method to solve the problem (a generic sketch of a two-point ZO gradient estimator appears after this list).
arXiv Detail & Related papers (2021-03-24T03:07:46Z) - Riemannian Langevin Algorithm for Solving Semidefinite Programs [9.340611077939828]
We propose a Langevin-based algorithm for nonconvex optimization and sampling on a product manifold of spheres.
We show that the proposed Langevin algorithm achieves $\epsilon$-accuracy with high probability.
arXiv Detail & Related papers (2020-10-21T17:51:08Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
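Several of the related entries above (the zeroth-order and ZO coordinate methods) estimate gradients from function values alone. The following is a minimal, generic sketch of a standard two-point zeroth-order gradient estimator; it illustrates the idea only and is not the estimator of any specific paper listed here. The objective, smoothing radius, and sample count are illustrative assumptions.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-4, num_samples=20):
    """Two-point zeroth-order gradient estimate of f at x via random smoothing."""
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_samples):
        u = np.random.randn(d)  # random search direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_samples

# Usage: gradient descent driven only by (possibly noisy) function evaluations.
f = lambda x: np.sum(x ** 2)
x = np.ones(5)
for _ in range(200):
    x = x - 0.05 * zo_gradient_estimate(f, x)
```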
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.