A Convergent and Dimension-Independent Min-Max Optimization Algorithm
- URL: http://arxiv.org/abs/2006.12376v6
- Date: Thu, 30 Jun 2022 19:53:45 GMT
- Title: A Convergent and Dimension-Independent Min-Max Optimization Algorithm
- Authors: Vijay Keswani, Oren Mangoubi, Sushant Sachdeva, Nisheeth K. Vishnoi
- Abstract summary: We show that a distribution which the min-player uses to update its parameters depends on a smooth and bounded nonnon-concave function.
Our algorithm converges to an approximate local equilibrium in a number of iterations that does not depend on the iterations.
- Score: 32.492526162436405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a variant of a recently introduced min-max optimization framework
where the max-player is constrained to update its parameters in a greedy manner
until it reaches a first-order stationary point. Our equilibrium definition for
this framework depends on a proposal distribution which the min-player uses to
choose directions in which to update its parameters. We show that, given a
smooth and bounded nonconvex-nonconcave objective function, access to any
proposal distribution for the min-player's updates, and stochastic gradient
oracle for the max-player, our algorithm converges to the aforementioned
approximate local equilibrium in a number of iterations that does not depend on
the dimension. The equilibrium point found by our algorithm depends on the
proposal distribution, and when applying our algorithm to train GANs we choose
the proposal distribution to be a distribution of stochastic gradients. We
empirically evaluate our algorithm on challenging nonconvex-nonconcave
test-functions and loss functions arising in GAN training. Our algorithm
converges on these test functions and, when used to train GANs, trains stably
on synthetic and real-world datasets and avoids mode collapse
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization [8.670873561640903]
We investigate the finite-time analysis of finding ($delta,,ilon$)-stationary points for nonsmooth nonsmooth objectives in decentralized settings.
$O is the first finite-time guarantee for decentralized nonsmooth optimization.
arXiv Detail & Related papers (2024-06-03T16:09:34Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonIBS optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time
Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider a smooth and strongly convex objective function using a Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Distributed stochastic proximal algorithm with random reshuffling for
non-smooth finite-sum optimization [28.862321453597918]
Non-smooth finite-sum minimization is a fundamental problem in machine learning.
This paper develops a distributed proximal-gradient algorithm with random reshuffling to solve the problem.
arXiv Detail & Related papers (2021-11-06T07:29:55Z) - A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max
Optimization Problems [9.653157073271021]
We develop a decentralized adaptive momentum (ADAM)-type algorithm for solving min-max optimization problem.
We obtain non-asymptotic rates of convergence of the proposed algorithm for finding a (stochastic) first-order Nash equilibrium point.
arXiv Detail & Related papers (2021-06-10T22:32:01Z) - Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order
Coordinate Method [3.860616339202303]
This paper investigates the distributed non optimization problem of minimizing a global cost function formed by the summation of $ZOn$ local cost functions.
Agents approximate their own ZO coordinate method to solve the problem.
arXiv Detail & Related papers (2021-03-24T03:07:46Z) - Riemannian Langevin Algorithm for Solving Semidefinite Programs [9.340611077939828]
We propose a Langevin-based algorithm for non- optimization and sampling on a product manifold of spheres.
We show that the proposed Langevin algorithm achieves $epsilon accuracy with high probability.
arXiv Detail & Related papers (2020-10-21T17:51:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.