A Forward Propagation Algorithm for Online Optimization of Nonlinear
Stochastic Differential Equations
- URL: http://arxiv.org/abs/2207.04496v1
- Date: Sun, 10 Jul 2022 16:06:42 GMT
- Title: A Forward Propagation Algorithm for Online Optimization of Nonlinear
Stochastic Differential Equations
- Authors: Ziheng Wang and Justin Sirignano
- Abstract summary: We study the convergence of the forward propagation algorithm for nonlinear dissipative SDEs.
We prove bounds on the solution of a partial differential equation (PDE) for the expected time integral of the algorithm's fluctuations around the direction of steepest descent.
Our main result is a convergence theorem for the forward propagation algorithm for nonlinear dissipative SDEs.
- Score: 1.116812194101501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimizing over the stationary distribution of stochastic differential
equations (SDEs) is computationally challenging. A new forward propagation
algorithm has been recently proposed for the online optimization of SDEs. The
algorithm solves an SDE, derived using forward differentiation, which provides
a stochastic estimate for the gradient. The algorithm continuously updates the
SDE model's parameters and the gradient estimate simultaneously. This paper
studies the convergence of the forward propagation algorithm for nonlinear
dissipative SDEs. We leverage the ergodicity of this class of nonlinear SDEs to
characterize the convergence rate of the transition semi-group and its
derivatives. Then, we prove bounds on the solution of a Poisson partial
differential equation (PDE) for the expected time integral of the algorithm's
stochastic fluctuations around the direction of steepest descent. We then
re-write the algorithm using the PDE solution, which allows us to characterize
the parameter evolution around the direction of steepest descent. Our main
result is a convergence theorem for the forward propagation algorithm for
nonlinear dissipative SDEs.
Related papers
- A Natural Primal-Dual Hybrid Gradient Method for Adversarial Neural Network Training on Solving Partial Differential Equations [9.588717577573684]
We propose a scalable preconditioned primal hybrid gradient algorithm for solving partial differential equations (PDEs)
We compare the performance of the proposed method with several commonly used deep learning algorithms.
The numerical results suggest that the proposed method performs efficiently and robustly and converges more stably.
arXiv Detail & Related papers (2024-11-09T20:39:10Z) - Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z) - Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - A Deep-Genetic Algorithm (Deep-GA) Approach for High-Dimensional
Nonlinear Parabolic Partial Differential Equations [0.0]
We propose a new method, called a deep-genetic algorithm (deep-GA) to accelerate the performance of the so-called deep-BSDE method.
Recognizing the sensitivity of the solver to the initial guess selection, we embed a genetic algorithm (GA) into the solver to optimize the selection.
We show that our method provides comparable accuracy with significantly improved computational efficiency.
arXiv Detail & Related papers (2023-11-20T06:35:23Z) - Parameter-free projected gradient descent [0.0]
We consider the problem of minimizing a convex function over a closed convex set, with Projected Gradient Descent (PGD)
We propose a fully parameter-free version of AdaGrad, which is adaptive to the distance between the initialization and the optimum, and to the sum of the square norm of the subgradients.
Our algorithm is able to handle projection steps, does not involve restarts, reweighing along the trajectory or additional evaluations compared to the classical PGD.
arXiv Detail & Related papers (2023-05-31T07:22:44Z) - Continuous-time stochastic gradient descent for optimizing over the
stationary distribution of stochastic differential equations [7.65995376636176]
We develop a new continuous-time gradient descent method for optimizing over the stationary distribution oficity differential equation (SDE) models.
We rigorously prove convergence of the online forward propagation algorithm for linear SDE models and present its numerical results for nonlinear examples.
arXiv Detail & Related papers (2022-02-14T11:45:22Z) - Last-Iterate Convergence of Saddle-Point Optimizers via High-Resolution
Differential Equations [83.3201889218775]
Several widely-used first-order saddle-point optimization methods yield an identical continuous-time ordinary differential equation (ODE) when derived naively.
However, the convergence properties of these methods are qualitatively different, even on simple bilinear games.
We adopt a framework studied in fluid dynamics to design differential equation models for several saddle-point optimization methods.
arXiv Detail & Related papers (2021-12-27T18:31:34Z) - Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous gradient descent (DASGD)
We show that DASGD converges to a global optimal implementation model under same delay assumptions.
arXiv Detail & Related papers (2021-07-06T21:59:49Z) - Solving and Learning Nonlinear PDEs with Gaussian Processes [11.09729362243947]
We introduce a simple, rigorous, and unified framework for solving nonlinear partial differential equations.
The proposed approach provides a natural generalization of collocation kernel methods to nonlinear PDEs and IPs.
For IPs, while the traditional approach has been to iterate between the identifications of parameters in the PDE and the numerical approximation of its solution, our algorithm tackles both simultaneously.
arXiv Detail & Related papers (2021-03-24T03:16:08Z) - Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart
for Nonconvex Optimization [73.38702974136102]
Various types of parameter restart schemes have been proposed for accelerated algorithms to facilitate their practical convergence in rates.
In this paper, we propose an algorithm for solving nonsmooth problems.
arXiv Detail & Related papers (2020-02-26T16:06:27Z) - Stochastic Normalizing Flows [52.92110730286403]
We introduce normalizing flows for maximum likelihood estimation and variational inference (VI) using differential equations (SDEs)
Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs.
These SDEs can be used for constructing efficient chains to sample from the underlying distribution of a given dataset.
arXiv Detail & Related papers (2020-02-21T20:47:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.