Conjugate Gradient Method for Generative Adversarial Networks
- URL: http://arxiv.org/abs/2203.14495v1
- Date: Mon, 28 Mar 2022 04:44:45 GMT
- Title: Conjugate Gradient Method for Generative Adversarial Networks
- Authors: Hiroki Naganuma, Hideaki Iiduka
- Abstract summary: It is not feasible to calculate the Jensen-Shannon divergence between the density function of the data and the density function of a deep-neural-network model.
Generative adversarial networks (GANs) can be used to formulate this problem as a discriminative problem with two models, a generator and a discriminator.
We propose to apply the conjugate gradient method to solve the local Nash equilibrium problem in GANs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the generative model has many advantages, it is not feasible
to calculate the Jensen-Shannon divergence between the density function of the
data and the density function of a deep-neural-network model; for this reason,
various alternative approaches have been developed. Generative adversarial
networks (GANs) formulate this problem as a discriminative problem with two
models, a generator and a discriminator, whose learning can be cast in the
context of game theory as the search for a local Nash equilibrium. Since this
optimization is more difficult than minimizing a single objective function, we
propose to apply the conjugate gradient method to solve the local Nash
equilibrium problem in GANs. We give a proof and convergence analysis, under
mild assumptions, showing that the proposed method converges to a local Nash
equilibrium under three different learning-rate schedules, including a
constant learning rate. Furthermore, we demonstrate convergence to a local
Nash equilibrium on a simple toy problem and compare the proposed method with
other optimization methods in experiments on real-world data, finding that it
outperforms stochastic gradient descent (SGD) and momentum SGD.
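To make the setting concrete: a local Nash equilibrium is a pair of parameters (theta*, phi*) at which the generator's parameters locally minimize the game objective for fixed phi*, while the discriminator's parameters locally maximize it for fixed theta*. Below is a minimal NumPy sketch of conjugate-gradient-style simultaneous updates on a quadratic two-player toy game in that spirit; the toy payoff, the Polak-Ribiere+ coefficient, the restart rule, and the constant step size are illustrative assumptions, not the paper's exact algorithm or learning-rate schedules.

```python
"""Minimal sketch: conjugate-gradient-style simultaneous updates for a
two-player zero-sum toy game with a unique local Nash equilibrium at (0, 0).
Illustrative only; not the paper's exact algorithm or step-size schedules."""
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))  # coupling matrix of the toy payoff

def payoff_grads(theta, phi):
    """Gradients of f(theta, phi) = theta^T A phi + 0.5|theta|^2 - 0.5|phi|^2.
    The generator-like player minimizes f over theta; the
    discriminator-like player maximizes f over phi."""
    return A @ phi + theta, A.T @ theta - phi

def cg_direction(g, g_prev, d, sign):
    """Polak-Ribiere+ conjugate direction for one player
    (sign = -1.0 for the minimizer, +1.0 for the maximizer)."""
    beta = max(0.0, g @ (g - g_prev) / (g_prev @ g_prev + 1e-12))
    d_new = sign * g + beta * d
    if sign * (g @ d_new) <= 0.0:  # not a descent/ascent direction: restart
        d_new = sign * g
    return d_new

def train(steps=3000, lr=0.02):
    theta, phi = rng.standard_normal(3), rng.standard_normal(3)
    g_t, g_p = payoff_grads(theta, phi)
    d_t, d_p = -g_t, g_p  # initial steepest descent/ascent directions
    for _ in range(steps):
        theta, phi = theta + lr * d_t, phi + lr * d_p  # simultaneous updates
        g_t_new, g_p_new = payoff_grads(theta, phi)
        d_t = cg_direction(g_t_new, g_t, d_t, sign=-1.0)
        d_p = cg_direction(g_p_new, g_p, d_p, sign=+1.0)
        g_t, g_p = g_t_new, g_p_new
    return theta, phi

if __name__ == "__main__":
    theta, phi = train()
    # For this strongly convex-concave toy the unique local Nash equilibrium
    # is (0, 0), so both norms should end up close to zero.
    print(np.linalg.norm(theta), np.linalg.norm(phi))
```

The restart rule falls back to the plain (signed) gradient whenever the conjugate direction stops being a descent (resp. ascent) direction, which keeps the sketch close to simultaneous gradient descent-ascent while still illustrating the conjugate-direction update.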
Related papers
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z) - Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density by transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
arXiv Detail & Related papers (2024-07-10T17:39:50Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - A Homogenization Approach for Gradient-Dominated Stochastic Optimization [6.1144486886258065]
We propose a stochastic homogeneous second-order descent method (SHSODM) for functions enjoying gradient dominance.
Our findings show that SHSODM matches the best-known sample complexity achieved by other second-order methods for gradient-dominated optimization.
arXiv Detail & Related papers (2023-08-21T11:03:04Z) - Resource-Adaptive Newton's Method for Distributed Learning [16.588456212160928]
This paper introduces a novel and efficient algorithm called RANL, which overcomes the limitations of Newton's method.
Unlike traditional first-order methods, RANL exhibits remarkable independence from the condition number of the problem.
arXiv Detail & Related papers (2023-08-20T04:01:30Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - An Exponentially Converging Particle Method for the Mixed Nash Equilibrium of Continuous Games [0.0]
We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function.
This problem arises for example in game-inspired machine learning applications, such as distributionally-robust learning.
We introduce and analyze a particle-based method that enjoys guaranteed local convergence for this problem.
arXiv Detail & Related papers (2022-11-02T17:03:40Z) - Statistical optimality and stability of tangent transform algorithms in logit models [6.9827388859232045]
We provide conditions on the data-generating process to derive non-asymptotic upper bounds on the risk incurred by the logistic optima.
In particular, we establish local variation of the algorithm without any assumptions on the data-generating process.
We explore a special case involving a semi-orthogonal design under which a global convergence is obtained.
arXiv Detail & Related papers (2020-10-25T05:15:13Z) - SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
arXiv Detail & Related papers (2020-08-19T19:11:25Z)