Gradient-free online learning of subgrid-scale dynamics with neural
emulators
- URL: http://arxiv.org/abs/2310.19385v3
- Date: Thu, 7 Dec 2023 19:00:57 GMT
- Title: Gradient-free online learning of subgrid-scale dynamics with neural
emulators
- Authors: Hugo Frezat, Ronan Fablet, Guillaume Balarac, Julien Le Sommer
- Abstract summary: We propose a generic algorithm to train machine learning-based subgrid parametrizations online.
We are able to train a parametrization that recovers most of the benefits of online strategies without having to compute the gradient of the original solver.
- Score: 5.77219319717314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a generic algorithm to train machine learning-based
subgrid parametrizations online, i.e., with a posteriori loss functions, but
for non-differentiable numerical solvers. The proposed approach leverages a
neural emulator to approximate the reduced state-space solver, which is then
used to allow gradient propagation through temporal integration steps. We apply
this methodology to a single-layer quasi-geostrophic system with topography, a
configuration known to become highly unstable within around 500 temporal
iterations under offline strategies. Using our algorithm, we are able to train a parametrization that
recovers most of the benefits of online strategies without having to compute
the gradient of the original solver. It is demonstrated that training the
neural emulator and parametrization components separately with different loss
quantities is necessary to minimize the propagation of approximation biases.
Experiments on emulator architectures of different complexities also indicate
that emulator performance is key to learning an accurate parametrization. This
work is a step towards learning parametrizations with online strategies for
realistic climate models.
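As a rough illustration of the two-stage strategy described above, the sketch below first fits a differentiable neural emulator to the (non-differentiable) reduced solver, then trains the subgrid parametrization with an a posteriori loss unrolled through the frozen emulator. Everything here (solver_step, the MLPs, shapes, the toy dynamics) is an illustrative assumption, not the authors' code.

```python
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def solver_step(x):
    # Stand-in for the non-differentiable reduced-state solver (black box).
    return x + 0.01 * jnp.roll(x, 1)

# Stage 1: fit the emulator to solver outputs; no solver gradient is needed.
def emulator_loss(e_params, x):
    return jnp.mean((mlp(e_params, x) - solver_step(x)) ** 2)

# Stage 2: a posteriori (online) loss over an unrolled trajectory; gradients
# reach the parametrization by backpropagating through the frozen emulator.
def online_loss(p_params, e_params, x0, reference):
    def step(x, ref):
        x_next = mlp(e_params, x) + mlp(p_params, x)  # emulated step + subgrid term
        return x_next, jnp.mean((x_next - ref) ** 2)
    _, errors = jax.lax.scan(step, x0, reference)
    return jnp.mean(errors)

key = jax.random.PRNGKey(0)
e_params = init_mlp(key, [64, 128, 64])
p_params = init_mlp(jax.random.PRNGKey(1), [64, 128, 64])
x0 = jax.random.normal(key, (64,))
reference = jnp.stack([x0] * 10)  # placeholder reference trajectory
p_grads = jax.grad(online_loss)(p_params, e_params, x0, reference)
```

Keeping emulator_loss and online_loss separate mirrors the paper's observation that the two components should be trained with different loss quantities.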
Related papers
- Hierarchical deep learning-based adaptive time-stepping scheme for
multiscale simulations [0.0]
This study proposes a new method for simulating multiscale problems using deep neural networks.
By leveraging the hierarchical learning of neural network time steppers, the method adapts time steps to approximate dynamical system flow maps across timescales.
This approach achieves state-of-the-art performance in less computational time than fixed-step neural network solvers.
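A minimal sketch of the hierarchical composition idea (the dyadic step sizes, make_stepper, and the toy dx/dt = -x flow map below are illustrative assumptions, not the paper's scheme):

```python
import jax.numpy as jnp

def make_stepper(dt):
    # Stand-in for a learned flow map x(t) -> x(t + dt).
    return lambda x: x + dt * (-x)  # emulates dx/dt = -x

steppers = {dt: make_stepper(dt) for dt in (1.0, 0.5, 0.25)}

def advance(x, horizon):
    # Greedily take the largest learned step that fits the remaining horizon.
    for dt in sorted(steppers, reverse=True):
        while horizon >= dt:
            x, horizon = steppers[dt](x), horizon - dt
    return x

x = advance(jnp.ones(3), 1.75)  # one 1.0-step, one 0.5-step, one 0.25-step
```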
arXiv Detail & Related papers (2023-11-10T09:47:58Z)
- Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn the optimal source placement in large-scale networks online.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive performance guarantees that depend on network parameters, which in turn influence the learning curve of the sequential decision strategy.
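For context, the generic UCB arm-selection rule underlying such bandit algorithms looks like the sketch below; Grab-UCB's graph-kernel reward model and spectral representations are omitted, and all names here are illustrative:

```python
import jax.numpy as jnp

def ucb_pick(means, counts, t, c=2.0):
    # Unplayed arms get an infinite bonus; otherwise mean + exploration bonus.
    bonus = jnp.where(counts > 0, jnp.sqrt(c * jnp.log(t) / counts), jnp.inf)
    return jnp.argmax(means + bonus)

arm = ucb_pick(jnp.array([0.2, 0.5, 0.4]), jnp.array([3, 1, 0]), t=5)
```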
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
- Smoothed Online Learning for Prediction in Piecewise Affine Systems [43.64498536409903]
This paper builds on the recently developed smoothed online learning framework.
It provides the first algorithms for prediction and simulation in piecewise affine systems.
arXiv Detail & Related papers (2023-01-26T15:54:14Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
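The basic forward-gradient estimator is easy to state with a Jacobian-vector product. The sketch below uses weight-space perturbations for brevity, whereas the paper's variance reduction comes from perturbing activations and using local losses; the toy loss and names are illustrative assumptions:

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    return jnp.sum(jnp.tanh(x @ w) ** 2)

def forward_gradient(w, x, key):
    # Sample a random tangent v and push it through the computation forward;
    # (grad . v) * v is an unbiased estimate of the true gradient since
    # E[v v^T] = I for v ~ N(0, I).
    v = jax.random.normal(key, w.shape)
    _, directional = jax.jvp(lambda w_: loss(w_, x), (w,), (v,))
    return directional * v  # scalar directional derivative times the tangent

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (8, 4))
x = jnp.ones((2, 8))
g_hat = forward_gradient(w, x, jax.random.PRNGKey(1))
```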
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation [110.4255414234771]
Existing solutions require massive training data or lack generalizability to unknown rendering configurations.
We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem.
Our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
arXiv Detail & Related papers (2022-05-11T17:59:51Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
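In spirit, the first-order policy gradients come from differentiating an unrolled simulation, as in this toy sketch (SHAC's smooth critic, truncated windows, and parallelism are omitted; sim_step and the policy are illustrative assumptions):

```python
import jax
import jax.numpy as jnp

def sim_step(state, action):
    return state + 0.1 * action  # toy differentiable dynamics

def rollout_loss(theta, state):
    total = 0.0
    for _ in range(10):                 # gradient flows through every step
        action = jnp.tanh(theta * state)  # toy one-parameter policy
        state = sim_step(state, action)
        total += jnp.sum(state ** 2)      # cost to minimize
    return total

policy_grad = jax.grad(rollout_loss)(0.5, jnp.ones(3))
```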
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Adaptive Learning Rate and Momentum for Training Deep Neural Networks [0.0]
We develop a fast training method motivated by the nonlinear Conjugate Gradient (CG) framework.
Experiments in image classification datasets show that our method yields faster convergence than other local solvers.
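For reference, the standard Polak-Ribière(+) conjugate-gradient direction that this line of work builds on looks as follows; the paper's exact adaptive learning-rate and momentum rules are not reproduced here:

```python
import jax.numpy as jnp

def cg_direction(g, g_prev, d_prev):
    # beta_PR = <g, g - g_prev> / <g_prev, g_prev>, clipped at zero (PR+).
    beta = jnp.maximum(jnp.vdot(g, g - g_prev) / jnp.vdot(g_prev, g_prev), 0.0)
    return -g + beta * d_prev

g_prev = jnp.array([1.0, -2.0])
d_prev = -g_prev
g = jnp.array([0.5, -1.0])
d = cg_direction(g, g_prev, d_prev)  # next search direction
```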
arXiv Detail & Related papers (2021-06-22T05:06:56Z)
- Analytically Tractable Bayesian Deep Q-Learning [0.0]
We adapt the temporal difference Q-learning framework to make it compatible with the tractable approximate Gaussian inference (TAGI).
We demonstrate that TAGI can reach a performance comparable to backpropagation-trained networks.
arXiv Detail & Related papers (2021-06-21T13:11:52Z)
- GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the variance of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
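The heuristic can be sketched as a tiny meta-objective: learn one scale per layer such that the loss after a single optimizer step is minimal. Everything below (the two-layer model, data, and learning rate) is an illustrative assumption:

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    w1, w2 = params
    return jnp.mean((jnp.tanh(x @ w1) @ w2 - y) ** 2)

def gradinit_objective(scales, params, x, y, lr=0.1):
    scaled = [s * w for s, w in zip(scales, params)]   # rescale each layer
    g = jax.grad(loss)(scaled, x, y)
    stepped = [w - lr * gw for w, gw in zip(scaled, g)]
    return loss(stepped, x, y)  # loss AFTER one SGD step from the scaled init

key = jax.random.PRNGKey(0)
params = [jax.random.normal(key, (4, 8)), jax.random.normal(key, (8, 1))]
x, y = jnp.ones((16, 4)), jnp.zeros((16, 1))
scale_grads = jax.grad(gradinit_objective)(jnp.ones(2), params, x, y)
```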
arXiv Detail & Related papers (2021-02-16T11:45:35Z)
- Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
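The fixed-point view can be demonstrated in a few lines: a leaky error dynamics whose equilibrium is exactly the backpropagated gradient. One toy layer is shown; all quantities are illustrative, not the paper's implementation:

```python
import jax
import jax.numpy as jnp

w = jnp.array([[0.5, -0.3], [0.2, 0.8]])
x = jnp.array([1.0, -1.0])
h = jnp.tanh(x @ w)                      # forward pass
e_next = 2.0 * h                         # output error dL/dh for L = ||h||^2
jac = jax.jacobian(lambda x_: jnp.tanh(x_ @ w))(x)

e = jnp.zeros_like(x)
for _ in range(50):                      # de/dt = -e + J^T e_next
    e = e + 0.1 * (-e + jac.T @ e_next)
# e now approximates the true backprop gradient dL/dx:
true_grad = jax.grad(lambda x_: jnp.sum(jnp.tanh(x_ @ w) ** 2))(x)
```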
arXiv Detail & Related papers (2020-09-11T11:56:34Z)
- Randomized Automatic Differentiation [22.95414996614006]
We develop a general framework and approach for randomized automatic differentiation (RAD).
RAD allows unbiased gradient estimates to be computed with reduced memory in return for increased variance.
We show that RAD converges in fewer iterations than using a small batch size for feedforward networks, and in a similar number for recurrent networks.
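As a crude stand-in for RAD's trade-off, the sketch below subsamples terms of the loss and reweights so the gradient estimate stays unbiased while less of the computation graph has to be stored; RAD itself samples inside the linearized computation graph rather than over data points, and all names here are illustrative:

```python
import jax
import jax.numpy as jnp

def full_loss(w, x):
    return jnp.sum(jnp.tanh(x @ w) ** 2)            # sum over all samples

def sampled_grad(w, x, key, k=4):
    idx = jax.random.choice(key, x.shape[0], (k,), replace=False)
    scale = x.shape[0] / k                          # reweight -> unbiased
    return jax.grad(lambda w_: scale * full_loss(w_, x[idx]))(w)

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (8, 2))
x = jax.random.normal(key, (32, 8))
g_hat = sampled_grad(w, x, jax.random.PRNGKey(1))   # E[g_hat] = full gradient
```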
arXiv Detail & Related papers (2020-07-20T19:03:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.