Training Generative Adversarial Networks with Adaptive Composite Gradient
- URL: http://arxiv.org/abs/2111.05508v1
- Date: Wed, 10 Nov 2021 03:13:53 GMT
- Title: Training Generative Adversarial Networks with Adaptive Composite Gradient
- Authors: Huiqing Qi, Fang Li, Shengli Tan, Xiangyun Zhang
- Abstract summary: This paper proposes the Adaptive Composite Gradients (ACG) method, which is linearly convergent in bilinear games.
ACG is a semi-gradient-free algorithm since it does not need to calculate the gradient at each step.
Results show that ACG is competitive with previous algorithms.
- Score: 2.471982349512685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The wide application of Generative Adversarial Networks benefits
from successful training methods that guarantee an objective function converges
to a local minimum. Nevertheless, designing an efficient and competitive
training method remains challenging because of the cyclic behaviors of some
gradient-based approaches and the expensive computational cost of methods based
on the Hessian matrix. This paper proposes the Adaptive Composite Gradients
(ACG) method, which is linearly convergent in bilinear games under suitable
settings. Theory and toy-function experiments suggest that our approach can
alleviate cyclic behaviors and converge faster than recently proposed
algorithms. Significantly, the ACG method can be used to find stable fixed
points not only in bilinear games but also in general games. ACG is a novel
semi-gradient-free algorithm since it does not need to calculate the gradient
at every step, reducing the computational cost of gradient and Hessian
evaluations by utilizing predictive information from future iterations. We
conducted two mixture-of-Gaussians experiments by integrating ACG into existing
algorithms with Linear GANs. Results show that ACG is competitive with previous
algorithms. Realistic experiments on four prevalent data sets (MNIST,
Fashion-MNIST, CIFAR-10, and CelebA) with DCGANs show that our ACG method
outperforms several baselines, which illustrates the superiority and efficacy
of our method.
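The cyclic behavior that motivates ACG is easy to reproduce on a toy bilinear game. The sketch below is only an illustration, not the ACG update itself: it contrasts simultaneous gradient descent-ascent (which spirals away from the equilibrium) with an extragradient-style step that reuses a predictive look-ahead gradient, the same kind of predictive information ACG exploits to avoid recomputing gradients at every step. The game f(x, y) = x^T A y, the step size, and the random matrix A are placeholder assumptions.

```python
import numpy as np

# Toy bilinear game: min_x max_y f(x, y) = x^T A y, equilibrium at (0, 0).
# Simultaneous gradient descent-ascent (GDA) cycles/diverges here, while an
# extragradient-style update that evaluates the gradient at a predictive
# look-ahead point converges. Illustrative only; this is not the ACG update.

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))   # placeholder game matrix
eta = 0.1                         # placeholder step size

def grads(x, y):
    # df/dx = A y,  df/dy = A^T x
    return A @ y, A.T @ x

def gda_step(x, y):
    gx, gy = grads(x, y)
    return x - eta * gx, y + eta * gy

def extragradient_step(x, y):
    gx, gy = grads(x, y)
    x_h, y_h = x - eta * gx, y + eta * gy   # predictive half-step
    gxh, gyh = grads(x_h, y_h)              # gradient at the look-ahead point
    return x - eta * gxh, y + eta * gyh

x1, y1 = np.ones(2), np.ones(2)
x2, y2 = np.ones(2), np.ones(2)
for _ in range(500):
    x1, y1 = gda_step(x1, y1)
    x2, y2 = extragradient_step(x2, y2)

print("GDA distance to equilibrium:          ", np.linalg.norm(np.concatenate([x1, y1])))
print("Extragradient distance to equilibrium:", np.linalg.norm(np.concatenate([x2, y2])))
```

Running the script shows the GDA iterates drifting away from (0, 0) while the look-ahead variant shrinks toward it, which is the qualitative gap ACG targets.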
Related papers
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1 - \frac{1}{\alpha}}\big)$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Stochastic Average Gradient : A Simple Empirical Investigation [0.0]
Stochastic Average Gradient (SAG) is a method for optimizing the sum of a finite number of smooth functions.
SAG converges faster than other iterative methods on simple toy problems and performs better than many of them on simple machine learning problems.
We also propose a combination of SAG with the momentum algorithm and Adam.
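For context, here is a minimal SAG sketch on a least-squares toy problem: a table of per-example gradients is kept, one entry is refreshed per iteration, and the step follows the average of the stored gradients. The problem, the commonly recommended 1/(16L) step size, and the iteration count are illustrative assumptions, not details from the paper above.

```python
import numpy as np

# Minimal SAG sketch for least squares: f(w) = (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2.
# One stored per-example gradient is refreshed per iteration and the step
# follows the running average of the stored gradients.

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)

L = np.max(np.sum(X ** 2, axis=1))   # per-example smoothness constant
lr = 1.0 / (16.0 * L)                # commonly recommended SAG step size

w = np.zeros(d)
grad_table = np.zeros((n, d))        # last stored gradient for each example
grad_sum = np.zeros(d)               # running sum of the stored gradients

for _ in range(20000):
    i = rng.integers(n)
    g_new = (X[i] @ w - y[i]) * X[i]   # fresh gradient of example i
    grad_sum += g_new - grad_table[i]  # keep the running sum consistent
    grad_table[i] = g_new
    w -= lr * grad_sum / n             # step along the average stored gradient

print("final mean squared error:", np.mean((X @ w - y) ** 2))
```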
arXiv Detail & Related papers (2023-07-27T17:34:26Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Adapting Step-size: A Unified Perspective to Analyze and Improve
Gradient-based Methods for Adversarial Attacks [21.16546620434816]
We provide a unified theoretical interpretation of gradient-based adversarial learning methods.
We show that each of these algorithms is, in fact, a specific reformulation of its original gradient method.
We present a broad class of adaptive gradient-based algorithms based on the regular gradient methods.
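As a concrete example of reading an attack as a reformulated gradient method, the sketch below writes an iterative FGSM-style step as L-infinity-normalized gradient ascent on the loss, followed by a projection back onto the epsilon-ball. The function names, the loss_grad placeholder, and the default alpha and eps are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

# Iterative FGSM-style attack step viewed as normalized gradient ascent:
# sign(.) gives the L_inf-steepest-ascent direction, alpha is the step size,
# and the clip projects back onto the eps-ball around the clean input.
def ifgsm_step(x, x_clean, loss_grad, alpha=0.01, eps=0.03):
    x_adv = x + alpha * np.sign(loss_grad(x))             # normalized ascent step
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)  # stay in the eps-ball
    return np.clip(x_adv, 0.0, 1.0)                       # keep a valid pixel range

# Toy usage with a quadratic "loss" 0.5 * ||x||^2 standing in for a model.
x_clean = np.full(4, 0.5)
loss_grad = lambda x: x           # gradient of 0.5 * ||x||^2
x = x_clean.copy()
for _ in range(10):
    x = ifgsm_step(x, x_clean, loss_grad)
print("perturbation:", x - x_clean)
```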
arXiv Detail & Related papers (2023-01-27T06:17:51Z) - Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to Non-uniform Smoothness (NS) and the Non-uniform Łojasiewicz inequality (NL).
New definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z) - Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate
in Gradient Descent [20.47598828422897]
We propose Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order descent methods.
Our approach modifies the objective function by adding a regularization term and casts the updating of the parameters and the learning rate as a joint process.
arXiv Detail & Related papers (2021-04-12T13:13:34Z) - A Distributed Training Algorithm of Generative Adversarial Networks with
Quantized Gradients [8.202072658184166]
We propose a distributed GANs training algorithm with quantized gradient, dubbed DQGAN, which is the first distributed training method with quantized gradient for GANs.
The new method trains GANs based on a single-machine algorithm called Optimistic Mirror Descent (OMD), and is applicable to any gradient compression method that satisfies the general $\delta$-approximate compressor condition.
Theoretically, we establish the non-asymptotic convergence of the DQGAN algorithm to a first-order stationary point, which shows that the proposed algorithm can achieve a linear speedup in the
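For reference, the standard notion of a $\delta$-approximate compressor from the gradient-compression literature, which is presumably the condition intended above (the paper's exact statement may differ): a possibly randomized operator $\mathcal{C}:\mathbb{R}^d \to \mathbb{R}^d$ is a $\delta$-approximate compressor if

```latex
\[
  \mathbb{E}\,\lVert \mathcal{C}(x) - x \rVert^2 \;\le\; (1 - \delta)\,\lVert x \rVert^2
  \qquad \text{for all } x \in \mathbb{R}^d,\ \delta \in (0, 1].
\]
```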
arXiv Detail & Related papers (2020-10-26T06:06:43Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Optimization of Graph Total Variation via Active-Set-based Combinatorial
Reconditioning [48.42916680063503]
We propose a novel adaptive preconditioning strategy for proximal algorithms on this problem class.
We show that nested-forest decomposition of the inactive edges yields a guaranteed local linear convergence rate.
Our results suggest that local convergence analysis can serve as a guideline for selecting variable metrics in proximal algorithms.
arXiv Detail & Related papers (2020-02-27T16:33:09Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for non-concave min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
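To make the "history of gradients" point concrete, here is a minimal AdaGrad-style update that accumulates squared gradients and scales each coordinate's step by that history. It is a generic textbook sketch under assumed defaults, not the specific variant analyzed in the paper above.

```python
import numpy as np

# Minimal AdaGrad-style update: the accumulated squared-gradient history
# shrinks the step for coordinates that have seen large gradients.
def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    accum = accum + grad ** 2                   # gradient history
    w = w - lr * grad / (np.sqrt(accum) + eps)  # per-coordinate scaled step
    return w, accum

# Toy usage on f(w) = 0.5 * ||w||^2, whose gradient at w is w.
w, accum = np.ones(3), np.zeros(3)
for _ in range(200):
    w, accum = adagrad_step(w, w, accum)
print("w after 200 AdaGrad steps:", w)
```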