A method for escaping limit cycles in training GANs
- URL: http://arxiv.org/abs/2010.03322v3
- Date: Fri, 11 Aug 2023 16:28:40 GMT
- Title: A method for escaping limit cycles in training GANs
- Authors: Li Keke and Yang Xinmin
- Abstract summary: We first derive the upper and lower bounds on the last-iterate convergence rates of PCAA for the general bilinear game.
We then combine PCAA with the adaptive moment estimation algorithm (Adam) to propose PCAA-Adam, a practical approach for training GANs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper conducts further research to alleviate the issue of limit
cycling behavior in training generative adversarial networks (GANs) through the
proposed predictive centripetal acceleration algorithm (PCAA). Specifically, we
first derive the upper and lower bounds on the last-iterate convergence rates
of PCAA for the general bilinear game, with the upper bound notably improving
upon previous results. Then, we combine PCAA with the adaptive moment
estimation algorithm (Adam) to propose PCAA-Adam, a practical approach for
training GANs. Finally, we validate the effectiveness of the proposed algorithm
through experiments conducted on bilinear games, multivariate Gaussian
distributions, and the CelebA dataset, respectively.
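For intuition, the sketch below shows a centripetal-acceleration-style update on a small bilinear game. The exact PCAA recursion, including its prediction step and step-size conditions, is given in the paper, so the matrix A and the coefficients alpha and beta below are placeholder assumptions.

```python
import numpy as np

# Hedged sketch: a centripetal-acceleration-style update on the
# bilinear game min_x max_y x^T A y. The exact PCAA recursion (its
# prediction step and step-size conditions) is in the paper; the
# matrix A and the coefficients alpha, beta are illustrative only.
A = np.array([[0.8, -0.2], [0.3, 0.6]])
x, y = np.array([1.0, -1.0]), np.array([0.5, 2.0])
gx_prev, gy_prev = A @ y, A.T @ x            # previous gradients
alpha, beta = 0.05, 0.5                      # placeholder step sizes

for _ in range(2000):
    gx, gy = A @ y, A.T @ x                  # simultaneous gradients
    # Descend in x, ascend in y, plus a "centripetal" correction
    # built from the change of the gradient field between steps;
    # plain simultaneous gradient steps would cycle on this game.
    x = x - alpha * gx - beta * (gx - gx_prev)
    y = y + alpha * gy + beta * (gy - gy_prev)
    gx_prev, gy_prev = gx, gy

print(np.linalg.norm(x), np.linalg.norm(y))  # shrinks toward the equilibrium at 0
```

The last-iterate rates derived in the paper quantify how fast such iterates approach the equilibrium of the bilinear game.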
Related papers
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1-\frac{1}{\alpha}}\big)$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
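As a generic illustration of the linear-interpolation mechanism (not the paper's exact algorithm), the sketch below runs a fast inner optimizer and pulls a slow iterate partway toward it; the toy loss, inner step count, and weight lam are assumptions.

```python
import numpy as np

# Hedged sketch of stabilization by linear interpolation: run k fast
# gradient steps, then move the slow iterate part of the way toward
# the fast one. Loss, step sizes, and lam are illustrative only.
def grad(w):                        # gradient of a toy quadratic loss ||w||^2
    return 2.0 * w

slow = np.array([5.0, -3.0])
k, lr, lam = 5, 0.1, 0.5            # inner steps, step size, interpolation weight
for _ in range(50):
    fast = slow.copy()
    for _ in range(k):              # fast inner updates
        fast -= lr * grad(fast)
    slow += lam * (fast - slow)     # linear interpolation toward the fast iterate
print(slow)                         # approaches the minimizer at 0
```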
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
- Improving Generalization of Complex Models under Unbounded Loss Using PAC-Bayes Bounds [10.94126149188336]
PAC-Bayes learning theory has focused extensively on establishing tight upper bounds for test errors.
A recently proposed training procedure, called PAC-Bayes training, updates the model toward minimizing these bounds.
Although this approach is theoretically sound, in practice it has not achieved a test error as low as those obtained by empirical risk minimization (ERM).
We introduce a new PAC-Bayes training algorithm with improved performance and reduced reliance on prior tuning.
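For context, a classical bound of this kind (a McAllester-style PAC-Bayes bound for a loss in [0, 1]; the unbounded-loss setting treated by this paper requires different tools) reads:

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all posteriors Q, given a data-independent prior P:
L(Q) \;\le\; \widehat{L}(Q) + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

PAC-Bayes training minimizes the right-hand side over Q, trading the empirical risk against the KL complexity term.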
arXiv Detail & Related papers (2023-05-30T17:31:25Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- Federated Compositional Deep AUC Maximization [58.25078060952361]
We develop a novel federated learning method for imbalanced data by directly optimizing the area under the curve (AUC) score.
To the best of our knowledge, this is the first work to achieve such favorable theoretical results.
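Purely to illustrate what directly optimizing the AUC can mean (the paper's federated compositional formulation is more involved), the sketch below minimizes a pairwise squared-hinge surrogate of the AUC on toy data.

```python
import numpy as np

# Hedged sketch: AUC counts how often positives outscore negatives, so a
# common surrogate penalizes positive-negative pairs with margin below 1.
# This generic pairwise loss is not the paper's federated method.
rng = np.random.default_rng(0)
X_pos = rng.standard_normal((30, 4)) + 1.0     # toy positive class
X_neg = rng.standard_normal((70, 4)) - 1.0     # toy negative class
w = np.zeros(4)

for _ in range(200):
    margins = (X_pos @ w)[:, None] - (X_neg @ w)[None, :]  # all pairs
    viol = np.maximum(0.0, 1.0 - margins)      # squared-hinge violations
    grad = (-2.0 * viol.sum(axis=1) @ X_pos
            + 2.0 * viol.sum(axis=0) @ X_neg) / viol.size
    w -= 0.01 * grad

print(np.mean((X_pos @ w)[:, None] > (X_neg @ w)[None, :]))  # empirical AUC
```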
arXiv Detail & Related papers (2023-04-20T05:49:41Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating the Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Offline Policy Optimization in RL with Variance Regularization [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections.
We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer.
The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
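The double-sampling issue and the Fenchel-dual workaround can be stated generically (this is the standard trick; the paper's exact formulation may differ):

```latex
% The variance contains a squared expectation, whose gradient naively
% requires two independent samples ("double sampling"):
\mathrm{Var}(X) \;=\; \mathbb{E}[X^2] - \big(\mathbb{E}[X]\big)^2
% Fenchel duality replaces the square by a maximization over a scalar
% dual variable \nu, so a single sample yields an unbiased stochastic
% gradient of the inner objective:
\big(\mathbb{E}[X]\big)^2 \;=\; \max_{\nu \in \mathbb{R}} \big( 2\nu\,\mathbb{E}[X] - \nu^2 \big)
```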
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
- Computational Complexity of Sub-linear Convergent Algorithms [0.0]
We explore a scheme that starts with a small sample, geometrically increases its size, and uses the solution from the previous sample to compute the new ERM.
This makes it possible to solve problems with first-order optimization algorithms of sublinear convergence, but at lower computational complexity.
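A hedged sketch of this doubling-with-warm-start scheme on least squares (the linear model, the inner solver, and the growth factor of 2 are illustrative assumptions):

```python
import numpy as np

# Sketch: solve the ERM on a small sample, double the sample size, and
# reuse the previous solution as the starting point. The linear model,
# step counts, and growth factor are illustrative assumptions.
rng = np.random.default_rng(0)
X_full = rng.standard_normal((4096, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_full = X_full @ w_true + 0.1 * rng.standard_normal(4096)

w = np.zeros(5)                       # initial iterate
n = 64                                # initial sample size
while n <= len(X_full):
    X, y = X_full[:n], y_full[:n]
    for _ in range(25):               # a few first-order steps per stage
        g = 2.0 * X.T @ (X @ w - y) / n
        w -= 0.1 * g                  # warm-started from the previous stage
    n *= 2                            # geometric increase of the sample
print(w)                              # close to w_true
```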
arXiv Detail & Related papers (2022-09-29T05:38:06Z)
- Training Generative Adversarial Networks with Adaptive Composite Gradient [2.471982349512685]
This paper proposes the Adaptive Composite Gradients (ACG) method, which is linearly convergent in bilinear games.
ACG is a semi-gradient-free algorithm since it does not need to calculate the gradient at each step.
Results show ACG is competitive with the previous algorithms.
arXiv Detail & Related papers (2021-11-10T03:13:53Z)
- On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms [33.96864594479152]
We analyze the convergence of prior-guided ZO algorithms under a greedy descent framework with various gradient estimators.
We also present a new accelerated random search (ARS) algorithm that incorporates prior information, together with a convergence analysis.
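A minimal sketch of a prior-guided two-point zeroth-order estimator (how the prior direction is formed and mixed is an illustrative assumption; the paper specifies the actual estimators and the ARS algorithm):

```python
import numpy as np

# Two-point zeroth-order gradient estimate along a direction u:
#   g = (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u
# A prior direction is mixed with a fresh random direction; the weight
# lam and the choice of prior are illustrative assumptions.
def f(x):                                    # toy black-box objective
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
prior = np.ones(10) / np.sqrt(10.0)          # assumed prior direction
mu, lr, lam = 1e-4, 0.1, 0.5
for _ in range(500):
    rnd = rng.standard_normal(10)
    u = lam * prior + (1.0 - lam) * rnd / np.linalg.norm(rnd)
    u /= np.linalg.norm(u)
    g = (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    x -= lr * g
    prior = u                                # reuse the last direction as prior
print(f(x))                                  # decreases toward 0
```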
arXiv Detail & Related papers (2021-07-21T14:39:40Z)
- Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtaining sparse principal direction loadings have been proposed; the resulting problem is termed sparse principal component analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
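A hedged sketch of the thresholding idea (the fixed cutoff below is an illustrative choice; the paper gives the provable rule and its guarantees):

```python
import numpy as np

# Sketch of thresholding for sparse PCA: take the leading eigenvector
# of the sample covariance, zero out small loadings, and renormalize.
# The cutoff 0.3 is illustrative, not the paper's provable rule.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
X[:, 0] += 3.0 * rng.standard_normal(200)       # plant a dominant sparse direction

cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
v = eigvecs[:, -1]                              # leading principal direction
v_sparse = np.where(np.abs(v) >= 0.3, v, 0.0)   # threshold small loadings
v_sparse /= np.linalg.norm(v_sparse)            # renormalize to unit length
print(v_sparse)                                 # sparse loading vector
```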
arXiv Detail & Related papers (2020-06-23T04:25:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.