Neural Actor-Critic Methods for Hamilton-Jacobi-Bellman PDEs: Asymptotic Analysis and Numerical Studies
- URL: http://arxiv.org/abs/2507.06428v1
- Date: Tue, 08 Jul 2025 22:20:22 GMT
- Title: Neural Actor-Critic Methods for Hamilton-Jacobi-Bellman PDEs: Asymptotic Analysis and Numerical Studies
- Authors: Samuel N. Cohen, Jackson Hebner, Deqing Jiang, Justin Sirignano
- Abstract summary: We mathematically analyze and numerically study an actor-critic machine learning algorithm for solving Hamilton-Jacobi-Bellman equations. In our numerical studies, we demonstrate that the algorithm can solve control problems accurately in up to 200 dimensions.
- Score: 3.566534591413616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We mathematically analyze and numerically study an actor-critic machine learning algorithm for solving high-dimensional Hamilton-Jacobi-Bellman (HJB) partial differential equations from stochastic control theory. The architecture of the critic (the estimator for the value function) is structured so that the boundary condition is always perfectly satisfied (rather than being included in the training loss) and utilizes a biased gradient which reduces computational cost. The actor (the estimator for the optimal control) is trained by minimizing the integral of the Hamiltonian over the domain, where the Hamiltonian is estimated using the critic. We show that the training dynamics of the actor and critic neural networks converge in a Sobolev-type space to a certain infinite-dimensional ordinary differential equation (ODE) as the number of hidden units in the actor and critic $\rightarrow \infty$. Further, under a convexity-like assumption on the Hamiltonian, we prove that any fixed point of this limit ODE is a solution of the original stochastic control problem. This provides an important guarantee for the algorithm's performance in light of the fact that finite-width neural networks may only converge to local minimizers (and not optimal solutions) due to the non-convexity of their loss functions. In our numerical studies, we demonstrate that the algorithm can solve stochastic control problems accurately in up to 200 dimensions. In particular, we construct a series of increasingly complex stochastic control problems with known analytic solutions and study the algorithm's numerical performance on them. These problems range from a linear-quadratic regulator equation to highly challenging equations with non-convex Hamiltonians, allowing us to identify and analyze the strengths and limitations of this neural actor-critic method for solving HJB equations.
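Below is a minimal, self-contained PyTorch sketch of the kind of actor-critic scheme the abstract describes, written for a hypothetical infinite-horizon HJB equation $\rho V = \min_u\,[\,f(x,u) + b(x,u)\cdot\nabla V + \tfrac{1}{2}\sigma^2 \Delta V\,]$ on the cube $(-1,1)^d$ with $V = g$ on the boundary. The running cost, drift, diffusion, discount, domain, and network sizes are illustrative assumptions, and the critic here uses a plain squared-residual loss rather than the paper's biased-gradient construction.

```python
import torch
import torch.nn as nn

d = 5                          # state dimension (illustrative; the paper goes up to 200)
rho, sigma = 0.1, 0.5          # discount rate and constant scalar diffusion (assumed)

def f(x, u):                   # running cost (assumed quadratic)
    return (x**2).sum(-1, keepdim=True) + (u**2).sum(-1, keepdim=True)

def b(x, u):                   # controlled drift (assumed: control acts directly)
    return u

def g(x):                      # boundary data (assumed zero)
    return torch.zeros(x.shape[0], 1)

def bdry_factor(x):            # vanishes on the boundary of (-1, 1)^d
    return torch.prod(1.0 - x**2, dim=-1, keepdim=True)

critic_net = nn.Sequential(nn.Linear(d, 128), nn.Tanh(), nn.Linear(128, 1))
actor = nn.Sequential(nn.Linear(d, 128), nn.Tanh(), nn.Linear(128, d))

def V(x):
    # critic: the boundary condition V = g is satisfied exactly by construction
    return g(x) + bdry_factor(x) * critic_net(x)

def hamiltonian_and_residual(x, u):
    x.requires_grad_(True)
    v = V(x)
    grad_v = torch.autograd.grad(v.sum(), x, create_graph=True)[0]
    lap_v = sum(torch.autograd.grad(grad_v[:, i].sum(), x, create_graph=True)[0][:, i]
                for i in range(d)).unsqueeze(-1)
    ham = f(x, u) + (b(x, u) * grad_v).sum(-1, keepdim=True) + 0.5 * sigma**2 * lap_v
    return ham, ham - rho * v          # Hamiltonian and HJB residual

opt_critic = torch.optim.Adam(critic_net.parameters(), lr=1e-3)
opt_actor = torch.optim.Adam(actor.parameters(), lr=1e-3)

for step in range(5000):
    x = torch.rand(256, d) * 2.0 - 1.0             # Monte Carlo points in (-1, 1)^d
    # critic step: drive the HJB residual at the current control towards zero
    _, res = hamiltonian_and_residual(x, actor(x).detach())
    critic_loss = (res**2).mean()
    opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()
    # actor step: minimise the Monte Carlo estimate of the integral of the Hamiltonian
    ham, _ = hamiltonian_and_residual(x, actor(x))
    actor_loss = ham.mean()
    opt_actor.zero_grad(); actor_loss.backward(); opt_actor.step()
```

The factor multiplying the critic network vanishes on the boundary, so the boundary condition is enforced by construction rather than through a penalty term, mirroring the design choice highlighted in the abstract.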
Related papers
- Solving nonconvex Hamilton--Jacobi--Isaacs equations with PINN-based policy iteration [1.3654846342364308]
We present a framework that combines classical dynamic programming with physics-informed neural networks (PINNs) to solve nonconvex Hamilton-Jacobi-Isaacs equations. Our results suggest that integrating PINNs with policy iteration is a practical and theoretically grounded method for solving high-dimensional, nonconvex HJI equations.
arXiv Detail & Related papers (2025-07-21T10:06:53Z) - Quantum algorithm for solving nonlinear differential equations based on physics-informed effective Hamiltonians [14.379311972506791]
We propose a distinct approach to solving differential equations on quantum computers by encoding the problem into ground states of effective Hamiltonian operators. Our algorithm relies on constructing such operators in the Chebyshev space, where an effective Hamiltonian is a sum of global differential and data constraints.
arXiv Detail & Related papers (2025-04-17T17:59:33Z) - An Iterative Deep Ritz Method for Monotone Elliptic Problems [0.29792392019703945]
We present a novel iterative deep Ritz method (IDRM) for solving a general class of elliptic problems. The algorithm is applicable to elliptic problems involving a monotone operator. We establish a convergence rate for the method using tools from the geometry of Banach spaces and the theory of monotone operators.
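As context for the entry above, here is a minimal PyTorch sketch of the classical deep Ritz energy minimisation that IDRM builds on, for $-\Delta u = f$ on $(0,1)^2$ with $u = 0$ on the boundary; the source term, the hard boundary factor, and the network are assumptions, and the paper's iterative scheme and monotone-operator analysis are not reproduced here.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))

def u(x):
    # multiply by a factor vanishing on the boundary of (0,1)^2 to enforce u = 0 there
    bc = x[:, :1] * (1 - x[:, :1]) * x[:, 1:] * (1 - x[:, 1:])
    return bc * net(x)

def f(x):
    return torch.ones(x.shape[0], 1)       # constant source term (assumed)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.rand(512, 2, requires_grad=True)
    ux = u(x)
    grad_u = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    # Ritz energy E(u) = ∫ ( ½|∇u|² − f u ) dx, estimated by Monte Carlo
    energy = (0.5 * (grad_u**2).sum(-1, keepdim=True) - f(x) * ux).mean()
    opt.zero_grad(); energy.backward(); opt.step()
```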
arXiv Detail & Related papers (2025-01-25T11:50:24Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametrized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by a magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Deep Graphic FBSDEs for Opinion Dynamics Stochastic Control [27.38625075499457]
We present a scalable deep learning approach to solve opinion dynamics optimal control problems with mean field term coupling in the dynamics and cost function.
The proposed framework opens up the possibility for future applications on extremely large-scale problems.
arXiv Detail & Related papers (2022-04-05T22:07:32Z) - Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators.
We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes.
We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
arXiv Detail & Related papers (2022-02-07T17:47:46Z) - Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x)\,u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
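A minimal numerical sketch of the linear-control idea in this entry: integrate $\dot{x} = \sum_i F_i(x)\,u_i$ with piecewise-constant controls by explicit Euler steps and fit the controls so that the time-one flow pushes an ensemble of points towards the image of a target diffeomorphism. The vector fields, the target map, the discretisation, and all sizes are assumptions for illustration.

```python
import torch

# two simple vector fields on R^2 (assumed): a translation field and a rotation field
def F1(x):
    return torch.ones_like(x)

def F2(x):
    return torch.stack([-x[:, 1], x[:, 0]], dim=1)

K, dt = 20, 0.05                            # K Euler steps of size dt (time horizon 1)
u = torch.zeros(K, 2, requires_grad=True)   # piecewise-constant controls u_1, u_2

def flow(x):
    for k in range(K):
        x = x + dt * (u[k, 0] * F1(x) + u[k, 1] * F2(x))
    return x

def target(x):                              # diffeomorphism to imitate (assumed): a rotation
    c, s = 0.5, 0.866
    return x @ torch.tensor([[c, s], [-s, c]])

opt = torch.optim.Adam([u], lr=1e-2)
ensemble = torch.randn(256, 2)              # compact ensemble of points (sampled once)
for step in range(500):
    loss = ((flow(ensemble) - target(ensemble))**2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```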
arXiv Detail & Related papers (2021-10-24T08:57:46Z) - Analysis and Optimisation of Bellman Residual Errors with Neural Function Approximation [0.0]
Recent developments in Deep Reinforcement Learning have demonstrated the superior performance of neural networks in solving challenging problems with large or even continuous state spaces.
One specific approach is to deploy neural networks to approximate value functions by minimising the Mean Squared Bellman Error.
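A minimal sketch of the quantity this entry analyses: fitting a Q-network by minimising a sampled Bellman-residual loss on a batch of transitions. The toy transition data, discount, and network are assumptions; the full residual gradient used here (no target detaching) is the residual-minimisation variant, and with stochastic transitions the sampled squared residual is a biased estimate of the true MSBE (the double-sampling issue).

```python
import torch
import torch.nn as nn

gamma = 0.99
q = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))  # Q(s, a), s in R^2, a in R

def bellman_residual(s, a, r, s_next, a_next):
    # residual = Q(s, a) - (r + gamma * Q(s', a')); gradients flow through both terms
    target = r + gamma * q(torch.cat([s_next, a_next], dim=1))
    return q(torch.cat([s, a], dim=1)) - target

opt = torch.optim.Adam(q.parameters(), lr=1e-3)
for step in range(1000):
    # synthetic batch of (s, a, r, s', a') tuples standing in for a replay buffer
    s, a = torch.randn(128, 2), torch.randn(128, 1)
    s_next, a_next = torch.randn(128, 2), torch.randn(128, 1)
    r = -(s**2).sum(-1, keepdim=True)
    loss = (bellman_residual(s, a, r, s_next, a_next)**2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```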
arXiv Detail & Related papers (2021-06-16T13:35:14Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in a structure suited to neural networks.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
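A minimal one-dimensional sketch of the control-variate idea behind this entry: learn $g \approx f$ with a known integral and estimate $\int_0^1 f\,dx$ as $\int_0^1 g\,dx + \mathbb{E}[f - g]$. Here $g$ is the derivative of an antiderivative network so its integral is exact; the integrand, the network, and the squared-error training loss are illustrative assumptions and do not reproduce the paper's rendering-specific construction or its variance-minimising loss.

```python
import torch
import torch.nn as nn

f = lambda x: torch.exp(torch.sin(6.0 * x))      # integrand (assumed)

Gnet = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))  # antiderivative G

def g(x):
    # control variate g = G', so that ∫_0^1 g dx = G(1) - G(0) exactly
    x = x.requires_grad_(True)
    return torch.autograd.grad(Gnet(x).sum(), x, create_graph=True)[0]

opt = torch.optim.Adam(Gnet.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.rand(256, 1)
    # squared-error surrogate for the variance of the corrected estimator
    loss = ((f(x) - g(x))**2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    endpoints = torch.tensor([[0.0], [1.0]])
    G_int = (Gnet(endpoints)[1] - Gnet(endpoints)[0]).item()   # ∫_0^1 g dx, exact
x = torch.rand(100_000, 1)
estimate = G_int + (f(x) - g(x)).detach().mean().item()        # control-variate estimate
```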
arXiv Detail & Related papers (2020-06-02T11:17:55Z)