Backpropagation through Time and Space: Learning Numerical Methods with
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2203.08937v2
- Date: Fri, 18 Mar 2022 14:36:36 GMT
- Title: Backpropagation through Time and Space: Learning Numerical Methods with
Multi-Agent Reinforcement Learning
- Authors: Elliot Way, Dheeraj S.K. Kapilavai, Yiwei Fu, Lei Yu
- Abstract summary: We treat the numerical schemes underlying partial differential equations (PDEs) as a Partially Observable Markov Game (POMG) in Reinforcement Learning (RL).
Similar to numerical solvers, our agent acts at each discrete location of a computational space for efficient and generalizable learning.
To learn higher-order spatial methods by acting on local states, the agent must discern how its actions at a given spatiotemporal location affect the future evolution of the state.
- Score: 6.598324641949299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Backpropagation Through Time and Space (BPTTS), a method for
training a recurrent spatio-temporal neural network, that is used in a
homogeneous multi-agent reinforcement learning (MARL) setting to learn
numerical methods for hyperbolic conservation laws. We treat the numerical
schemes underlying partial differential equations (PDEs) as a Partially
Observable Markov Game (POMG) in Reinforcement Learning (RL). Similar to
numerical solvers, our agent acts at each discrete location of a computational
space for efficient and generalizable learning. To learn higher-order spatial
methods by acting on local states, the agent must discern how its actions at a
given spatiotemporal location affect the future evolution of the state. The
manifestation of this non-stationarity is addressed by BPTTS, which allows for
the flow of gradients across both space and time. The learned numerical
policies are comparable to the SOTA numerics in two settings, the Burgers'
Equation and the Euler Equations, and generalize well to other simulation
set-ups.
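The core mechanism is easiest to see in code. Below is a minimal sketch of the BPTTS idea in PyTorch, not the authors' implementation: a single policy network is shared across all grid cells (the homogeneous agents), the discretized PDE is unrolled for several steps, and one backward pass sends gradients through both the time unrolling and the spatial stencil coupling. The Burgers' flux, stencil width, and reference target are illustrative assumptions.

```python
# Minimal sketch of the BPTTS setup (not the authors' code; the flux, stencil
# width, and reference target are illustrative assumptions). One policy network
# is shared by every grid cell, the scheme is unrolled in time, and a single
# backward pass propagates gradients through space and time.
import torch
import torch.nn as nn

class LocalPolicy(nn.Module):
    """Maps the local stencil around a cell to a correction of its flux."""
    def __init__(self, stencil=5, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(stencil, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, stencils):                  # (num_cells, stencil)
        return self.net(stencils).squeeze(-1)     # (num_cells,)

def gather_stencils(u, width=5):
    """Collect a periodic stencil around every cell: (N,) -> (N, width)."""
    offsets = range(-(width // 2), width // 2 + 1)
    return torch.stack([torch.roll(u, k) for k in offsets], dim=-1)

def rollout(policy, u0, steps, dt=1e-3, dx=1e-2):
    u = u0
    for _ in range(steps):
        flux = 0.5 * u ** 2                       # Burgers' flux f(u) = u^2 / 2
        flux = flux + policy(gather_stencils(u))  # one agent per cell, shared weights
        u = u - (dt / dx) * (flux - torch.roll(flux, 1))  # conservative update
    return u

policy = LocalPolicy()
u0 = torch.sin(torch.linspace(0.0, 2.0 * torch.pi, 128))
u_ref = torch.roll(u0, 3)                         # stand-in for a reference solution
loss = ((rollout(policy, u0, steps=50) - u_ref) ** 2).mean()
loss.backward()  # gradients flow through all 50 steps and all 128 cells (BPTTS)
```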
Related papers
- TimewarpVAE: Simultaneous Time-Warping and Representation Learning of Trajectories [15.28090738928877]
TimewarpVAE is a manifold-learning algorithm that simultaneously learns timing variations and latent factors of spatial variation.
We show how the algorithm learns appropriate time alignments and meaningful representations of spatial variations in handwriting and fork manipulation datasets.
arXiv Detail & Related papers (2023-10-24T17:43:16Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
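As a hedged illustration of what an adaptive quantization scheme can look like (the paper's own scheme differs; the k-means construction here is our stand-in), the discrete bins can be fit to the action distribution of the offline dataset rather than spaced on a uniform grid:

```python
# Our stand-in for "adaptive" action quantization (the paper's scheme differs):
# fit the discrete action bins to the offline dataset with k-means, so dense
# regions of the action distribution get finer resolution than a uniform grid.
import numpy as np

def fit_action_codebook(actions, n_bins=16, iters=50, seed=0):
    """actions: (N, action_dim) array collected in the offline dataset."""
    rng = np.random.default_rng(seed)
    codebook = actions[rng.choice(len(actions), n_bins, replace=False)]
    for _ in range(iters):
        # assign every action to its nearest code ...
        dists = np.linalg.norm(actions[:, None] - codebook[None], axis=-1)
        assign = dists.argmin(axis=1)
        # ... and move each code to the mean of its assigned actions
        for k in range(n_bins):
            if (assign == k).any():
                codebook[k] = actions[assign == k].mean(axis=0)
    return codebook

def quantize(action, codebook):
    """Snap a continuous action to the nearest learned bin."""
    return codebook[np.linalg.norm(codebook - action, axis=-1).argmin()]

rng = np.random.default_rng(1)
actions = np.concatenate([0.1 * rng.standard_normal((900, 2)),   # dense cluster
                          1.0 * rng.standard_normal((100, 2))])  # sparse tail
codebook = fit_action_codebook(actions)
print(quantize(np.array([0.05, -0.02]), codebook))
```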
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching.
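A minimal sketch of the named ingredient, under our own assumptions about the details (not the paper's exact algorithm): an SGLD loop in which each epoch draws minibatches from a fresh permutation of the data, i.e., without replacement.

```python
# Sketch of without-replacement SGLD under our own assumptions (not the
# paper's exact algorithm): every epoch consumes one random permutation of
# the data, so no sample is revisited within an epoch.
import numpy as np

def sgld_without_replacement(grad_fn, theta, data, lr=1e-3, temp=1e-4,
                             batch_size=32, epochs=10, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        order = rng.permutation(len(data))            # one without-replacement pass
        for start in range(0, len(data), batch_size):
            batch = data[order[start:start + batch_size]]
            noise = rng.normal(size=theta.shape)
            # Langevin update: gradient step plus temperature-scaled Gaussian noise
            theta = theta - lr * grad_fn(theta, batch) + np.sqrt(2.0 * lr * temp) * noise
    return theta

# toy example: sample the posterior mean of unit-variance Gaussian data
data = np.random.default_rng(1).normal(loc=2.0, size=(1000, 1))
grad_nll = lambda th, batch: len(data) * (th - batch.mean(axis=0))  # minibatch NLL gradient
print(sgld_without_replacement(grad_nll, np.zeros(1), data))
```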
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
- Locally Regularized Neural Differential Equations: Some Black Boxes Were Meant to Remain Closed! [3.222802562733787]
Implicit layer deep learning techniques, like Neural Differential Equations, have become an important modeling framework.
We develop two sampling strategies to trade off between performance and training time.
Our method reduces the number of function evaluations to 0.556-0.733x and accelerates predictions by 1.3-2x.
arXiv Detail & Related papers (2023-03-03T23:31:15Z)
- Diversity Through Exclusion (DTE): Niche Identification for Reinforcement Learning through Value-Decomposition [63.67574523750839]
We propose a generic reinforcement learning (RL) algorithm that performs better than baseline deep Q-learning algorithms in environments with multiple variably-valued niches.
We show that agents trained this way can escape poor-but-attractive local optima to instead converge to harder-to-discover higher value strategies.
arXiv Detail & Related papers (2023-02-02T16:00:19Z)
- Semi-supervised Learning of Partial Differential Operators and Dynamical Flows [68.77595310155365]
We present a novel method that combines a hyper-network solver with a Fourier Neural Operator architecture.
We test our method on various time evolution PDEs, including nonlinear fluid flows in one, two, and three spatial dimensions.
The results show that the new method improves the learning accuracy at the supervised time points, and is able to interpolate the solutions to any intermediate time.
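The Fourier Neural Operator side of such an architecture is well documented; a minimal 1D spectral convolution layer, the standard FNO building block, is sketched below (the hyper-network pairing is the paper's contribution and is not reproduced here).

```python
# Standard FNO building block, sketched for 1D fields: keep the lowest Fourier
# modes of the input and multiply them by learned complex weights. The
# hyper-network half of the paper's architecture is not reproduced here.
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Learned pointwise multiplication on the lowest Fourier modes."""
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, x):                            # x: (batch, channels, grid)
        x_ft = torch.fft.rfft(x)                     # to Fourier space
        out_ft = torch.zeros_like(x_ft)
        out_ft[..., :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[..., :self.modes], self.weight)
        return torch.fft.irfft(out_ft, n=x.size(-1))  # back to physical space

layer = SpectralConv1d(channels=8, modes=12)
u = torch.randn(4, 8, 64)                            # batch of 1D fields
print(layer(u).shape)                                # torch.Size([4, 8, 64])
```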
arXiv Detail & Related papers (2022-07-28T19:59:14Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
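The general recipe behind such methods can be sketched as follows; this is our illustration of policy learning through a differentiable simulator, not SHAC itself (the smooth critic and parallel simulation are omitted).

```python
# Our illustration of the general recipe (not SHAC itself): with a
# differentiable simulator, the return of a short rollout is backpropagated
# directly into the policy parameters instead of being estimated from samples.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

def sim_step(state, action, dt=0.05):
    """Differentiable point mass: state = (position, velocity)."""
    pos, vel = state[..., :1], state[..., 1:]
    vel = vel + dt * action
    pos = pos + dt * vel
    return torch.cat([pos, vel], dim=-1)

for _ in range(200):
    state = torch.tensor([[1.0, 0.0]])            # start away from the origin
    ret = 0.0
    for _t in range(20):                          # short-horizon rollout
        state = sim_step(state, policy(state))
        ret = ret - (state[..., 0] ** 2).sum()    # reward: stay near position 0
    opt.zero_grad()
    (-ret).backward()                             # analytic gradient through the rollout
    opt.step()
```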
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics [0.0]
We describe a novel regularization method that uses the internal cost of adaptive differential equation solvers combined with discrete sensitivities to guide the training process.
This approach opens up the blackbox numerical analysis behind the differential equation solver's algorithm and uses its local error estimates and stiffness as cheap and accurate cost estimates.
We demonstrate how our approach can halve the prediction time and showcase how this can increase the training time by an order of magnitude.
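The quantity being exploited can be illustrated with an embedded pair; the following is our reading, not the paper's code: each step of the pair yields a local error estimate, and the accumulated estimates form a cheap cost proxy that can be added to the training loss as a regularizer.

```python
# Our illustration (not the paper's code) of the solver-internal quantity:
# an embedded pair produces a local error estimate at every step for free,
# and the running sum is a cheap stiffness/cost proxy for regularization.
import numpy as np

def heun_with_error(f, y0, t0, t1, steps=100):
    """Heun's method with an embedded Euler step for local error estimates."""
    y, t = np.asarray(y0, dtype=float), t0
    h = (t1 - t0) / steps
    cost_estimate = 0.0
    for _ in range(steps):
        k1 = f(t, y)
        y_euler = y + h * k1                    # 1st-order (embedded) solution
        k2 = f(t + h, y_euler)
        y_heun = y + 0.5 * h * (k1 + k2)        # 2nd-order solution
        cost_estimate += np.linalg.norm(y_heun - y_euler)  # local error estimate
        y, t = y_heun, t + h
    return y, cost_estimate

# stiff-ish test problem: fast decay inflates the accumulated error estimate
y_final, reg = heun_with_error(lambda t, y: -50.0 * y, y0=[1.0], t0=0.0, t1=1.0)
# during training one would minimize   task_loss + lam * reg
print(y_final, reg)
```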
arXiv Detail & Related papers (2021-05-09T12:03:03Z)
- Deep learning approaches to surrogates for solving the diffusion equation for mechanistic real-world simulations [0.0]
In medical, biological, physical, and engineered models, the numerical solution of partial differential equations (PDEs) can make simulations impractically slow.
Machine learning surrogates, neural networks trained to provide approximate solutions to such complicated numerical problems, can often provide speed-ups of several orders of magnitude compared to direct calculation.
We use a Convolutional Neural Network to approximate the stationary solution to the diffusion equation in the case of two equal-diameter, circular, constant-value sources.
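A minimal surrogate in the spirit described, with architecture details that are our assumptions (a practical surrogate would likely use a deeper encoder-decoder, since diffusion couples distant points, but the training loop is the same):

```python
# Minimal CNN surrogate sketch (architecture and sizes are our assumptions):
# a small CNN maps an image of source locations to the stationary diffusion
# field, trained against precomputed PDE solutions.
import torch
import torch.nn as nn

surrogate = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

sources = torch.zeros(8, 1, 64, 64)     # batch of source maps
sources[:, :, 20, 20] = 1.0             # two equal, constant-value point sources
sources[:, :, 44, 44] = 1.0
targets = torch.rand(8, 1, 64, 64)      # placeholder for precomputed PDE solutions

opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(sources), targets)
    loss.backward()
    opt.step()
```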
arXiv Detail & Related papers (2021-02-10T16:15:17Z)
- A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces [53.47210316424326]
KeRNS is an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes.
We prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time.
arXiv Detail & Related papers (2020-07-09T21:37:13Z)