HessianFR: An Efficient Hessian-based Follow-the-Ridge Algorithm for
Minimax Optimization
- URL: http://arxiv.org/abs/2205.11030v1
- Date: Mon, 23 May 2022 04:28:52 GMT
- Title: HessianFR: An Efficient Hessian-based Follow-the-Ridge Algorithm for
Minimax Optimization
- Authors: Yihang Gao, Huafeng Liu, Michael K. Ng and Mingjie Zhou
- Abstract summary: HessianFR is an efficient Hessian-based Follow-the-Ridge algorithm with theoretical guarantees.
Experiments on training generative adversarial networks (GANs) have been conducted on both synthetic and real-world large-scale image datasets.
- Score: 18.61046317693516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wide applications of differentiable two-player sequential games (e.g., image
generation by GANs) have drawn considerable attention from researchers to the
study of efficient and fast algorithms. Most existing algorithms are developed
for simultaneous games with favorable structure, i.e., convex-concave payoff
functions, and are not applicable to sequential games with different settings.
Conventional gradient descent ascent algorithms can fail, both theoretically
and numerically, to find the local Nash equilibrium of a simultaneous game or
the local minimax point (i.e., local Stackelberg equilibrium) of a sequential
game. In this paper, we propose HessianFR, an efficient Hessian-based
Follow-the-Ridge algorithm with theoretical guarantees. Furthermore, the
convergence of the stochastic algorithm is established, and an approximation
of the Hessian inverse is exploited to improve efficiency. A series of
experiments on training generative adversarial networks (GANs) has been
conducted on both synthetic and real-world large-scale image datasets (e.g.,
MNIST, CIFAR-10, and CelebA). The experimental results demonstrate that the
proposed HessianFR outperforms baselines in terms of convergence and image
generation quality.
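For intuition, below is a minimal NumPy sketch of the deterministic Follow-the-Ridge (FR) update that HessianFR builds on, applied to a toy quadratic minimax problem. The payoff matrices, step sizes, initialization, and exact linear solve are illustrative assumptions; the paper's contributions (stochastic training and an efficient approximation of the Hessian inverse) are deliberately omitted here.

```python
# Minimal sketch of a Follow-the-Ridge (FR) step on the toy quadratic
# minimax problem  min_x max_y f(x, y) = x^T A x + x^T B y - y^T C y.
# All matrices and hyperparameters below are illustrative assumptions,
# not values from the HessianFR paper.
import numpy as np

A = np.eye(2)                            # f is strongly convex in x
B = np.array([[1.0, 2.0], [-2.0, 1.0]])  # coupling between the players
C = np.eye(2)                            # f is strongly concave in y

def grad_x(x, y):
    return 2.0 * A @ x + B @ y           # df/dx

def grad_y(x, y):
    return B.T @ x - 2.0 * C @ y         # df/dy

H_yy = -2.0 * C   # d2f/dy2 (constant for a quadratic payoff)
H_yx = B.T        # d2f/dydx

eta_x, eta_y = 0.05, 0.05
x, y = np.array([1.0, -1.0]), np.array([0.5, 1.0])

for _ in range(1000):
    gx, gy = grad_x(x, y), grad_y(x, y)
    # Ridge correction: move y along dy*/dx = -H_yy^{-1} H_yx as x moves,
    # so the follower keeps tracking argmax_y f(x, y).
    correction = np.linalg.solve(H_yy, H_yx @ gx)
    x = x - eta_x * gx
    y = y + eta_y * gy + eta_x * correction

print(x, y)  # both approach the local minimax point (0, 0)
```

At the scale of GAN training the Hessian blocks are never formed explicitly; methods in this family typically rely on Hessian-vector products and iterative solvers, which is the kind of cost the paper's Hessian-inverse approximation targets.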
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
- Efficient Last-iterate Convergence Algorithms in Solving Games [20.00785679098103]
No-regret algorithms are popular for learning Nash equilibria in two-player zero-sum normal-form games (NFGs) and extensive-form games (EFGs).
Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.
We show that the bottleneck of RT-based algorithms is the speed of solving convex-concave optimization problems (SCCPs).
arXiv Detail & Related papers (2023-08-22T07:59:49Z)
- Ordering for Non-Replacement SGD [7.11967773739707]
We seek an ordering that improves the convergence rate of the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batching and further apply it to more complex neural networks.
arXiv Detail & Related papers (2023-06-28T00:46:58Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Differentiable Bilevel Programming for Stackelberg Congestion Games [47.60156422249365]
In a Stackelberg congestion game (SCG), a leader aims to maximize their own gain by anticipating and manipulating the equilibrium state at which the followers settle by playing a congestion game.
Here, we attempt to tackle the computational challenge of solving SCGs by marrying traditional methodologies with the latest differentiable programming techniques in machine learning.
We propose two new local search algorithms for SCGs. The first is a gradient descent algorithm that obtains the derivatives by unrolling the followers' imitative logit dynamics (ILD) via differentiable programming.
The second algorithm adds a twist by cutting short the followers' evolution trajectory.
arXiv Detail & Related papers (2022-09-15T21:32:23Z)
- Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games (a minimal OGDA sketch follows this list).
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z)
- Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player to play against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (having no limit cycles) and to terminate within an appropriate number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
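Several of the entries above study last-iterate convergence of optimistic methods. As a concrete reference point, the sketch below runs optimistic gradient descent ascent (OGDA) on a bilinear game f(x, y) = x^T M y, a standard example where the last iterate of plain simultaneous gradient descent ascent fails to converge while OGDA's does. The matrix M, step size, and iteration budget are illustrative assumptions, not taken from any of the cited papers.

```python
# Minimal sketch of optimistic gradient descent ascent (OGDA) on the
# bilinear game  min_x max_y f(x, y) = x^T M y.  Plain simultaneous GDA
# spirals away from the equilibrium (0, 0) here; OGDA's extrapolation
# with the previous gradient damps the rotation, so its last iterate
# converges. M, eta, and the iteration count are illustrative choices.
import numpy as np

M = np.array([[3.0, 1.0], [1.0, 2.0]])
eta = 0.05

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
gx_prev, gy_prev = M @ y, M.T @ x        # gradients at the initial point

for _ in range(5000):
    gx, gy = M @ y, M.T @ x              # df/dx and df/dy at the current point
    # Optimistic step: 2 * (current gradient) - (previous gradient).
    x = x - eta * (2.0 * gx - gx_prev)
    y = y + eta * (2.0 * gy - gy_prev)
    gx_prev, gy_prev = gx, gy

print(x, y)  # the last iterate approaches the equilibrium (0, 0)
```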
This list is automatically generated from the titles and abstracts of the papers on this site.