How to train your neural ODE: the world of Jacobian and kinetic
regularization
- URL: http://arxiv.org/abs/2002.02798v3
- Date: Tue, 23 Jun 2020 15:54:19 GMT
- Title: How to train your neural ODE: the world of Jacobian and kinetic
regularization
- Authors: Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, Adam M. Oberman
- Abstract summary: Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values.
We introduce a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well.
- Score: 7.83405844354125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training neural ODEs on large datasets has not been tractable due to the
necessity of allowing the adaptive numerical ODE solver to refine its step size
to very small values. In practice this leads to dynamics equivalent to many
hundreds or even thousands of layers. In this paper, we overcome this apparent
difficulty by introducing a theoretically-grounded combination of both optimal
transport and stability regularizations which encourage neural ODEs to prefer
simpler dynamics out of all the dynamics that solve a problem well. Simpler
dynamics lead to faster convergence and to fewer discretizations of the solver,
considerably decreasing wall-clock time without loss in performance. Our
approach allows us to train neural ODE-based generative models to the same
performance as the unregularized dynamics, with significant reductions in
training time. This brings neural ODEs closer to practical relevance in
large-scale applications.
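The two penalties are straightforward to estimate in practice. Below is a minimal PyTorch-style sketch of the kinetic term ||f||^2 and a Hutchinson-style estimate of the Jacobian Frobenius norm ||df/dz||_F^2 at a single sampled state; in the paper both quantities are integrated along the solver trajectory, and the function and variable names here are illustrative assumptions, not the authors' released code.

```python
import torch

def ode_penalties(f, t, z):
    """Monte-Carlo estimates of the kinetic and Jacobian penalties at a
    single state z of shape (batch, dim), for dynamics dz/dt = f(t, z)."""
    z = z.detach().requires_grad_(True)
    dz = f(t, z)
    # Kinetic (optimal-transport) penalty: ||f(t, z)||^2
    kinetic = dz.pow(2).sum(dim=1).mean()
    # Jacobian penalty: E_eps ||eps^T (df/dz)||^2 = ||df/dz||_F^2 for
    # eps ~ N(0, I), computed with a single vector-Jacobian product
    eps = torch.randn_like(dz)
    vjp, = torch.autograd.grad(dz, z, grad_outputs=eps, create_graph=True)
    jacobian = vjp.pow(2).sum(dim=1).mean()
    return kinetic, jacobian
```

The training objective would then add these terms with small weights, e.g. loss = task_loss + lambda_k * kinetic + lambda_j * jacobian.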
Related papers
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to solve integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
arXiv Detail & Related papers (2023-08-21T11:31:15Z)
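For readers unfamiliar with the numerical tool this paper builds on, here is a minimal NumPy sketch of Gauss-Legendre quadrature itself (an illustration of the building block, not the paper's training method; names are assumptions):

```python
import numpy as np

def gauss_legendre(f, t0, t1, deg=8):
    """Integrate f over [t0, t1] with deg-point Gauss-Legendre quadrature;
    exact for polynomials of degree up to 2*deg - 1."""
    nodes, weights = np.polynomial.legendre.leggauss(deg)
    t = 0.5 * (t1 - t0) * nodes + 0.5 * (t1 + t0)  # map [-1, 1] -> [t0, t1]
    return 0.5 * (t1 - t0) * np.sum(weights * f(t))

# Sanity check: gauss_legendre(np.sin, 0.0, np.pi) is 2 to machine precision.
```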
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning tasks into several coarser-resolution subtasks; one possible reading of this decomposition is sketched below.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
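One hedged reading of the coarser-resolution decomposition is interleaved spatial striding (illustrative NumPy; the stride s and function names are assumptions, not the NeuralStagger reference implementation):

```python
import numpy as np

def spatial_stagger(field, s=2):
    """Split a 2D field of shape (H, W) into s*s coarser sub-fields of
    shape (H//s, W//s) by interleaved sampling; each sub-field defines a
    cheaper subtask that can be learned in parallel."""
    return [field[i::s, j::s] for i in range(s) for j in range(s)]

def spatial_merge(subfields, s=2):
    """Reassemble the fine grid from the staggered sub-fields."""
    h, w = subfields[0].shape
    field = np.empty((h * s, w * s), dtype=subfields[0].dtype)
    for k, sub in enumerate(subfields):
        field[k // s::s, k % s::s] = sub
    return field
```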
- Homotopy-based training of NeuralODEs for accurate dynamics discovery [0.0]
We develop a new training method for NeuralODEs, based on synchronization and homotopy optimization.
We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape.
Our method achieves competitive or better training loss while often requiring less than half the number of training epochs.
arXiv Detail & Related papers (2022-10-04T06:32:45Z)
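A minimal sketch of the synchronization idea, under the assumption that the coupling takes the usual linear feedback form (the gain schedule and names are illustrative, not the paper's exact algorithm):

```python
def synchronized_rhs(f, z, t, x_data, k):
    """Coupled right-hand side: dz/dt = f(z) + k * (x_data(t) - z).

    f      : learned dynamics
    x_data : interpolant of the observed trajectory
    k      : coupling gain, annealed toward 0 over the homotopy schedule,
             so the final model is the plain, uncoupled NeuralODE
    """
    return f(z) + k * (x_data(t) - z)
```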
- On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver [59.13397937903832]
We introduce a deep learning-based corrector called Neural Vector (NeurVec).
NeurVec can compensate for integration errors and enable larger time step sizes in simulations.
Our experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability.
arXiv Detail & Related papers (2022-08-07T09:02:18Z)
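The corrector idea fits in one line; the additive form below follows the abstract's description, while the corrector's architecture is left abstract (names are assumptions, not the released code):

```python
def neurvec_step(f, corrector, z, dt):
    """One coarse explicit-Euler step plus a learned correction:
    z_next = z + dt * f(z) + corrector(z), where corrector is trained so
    the coarse step with large dt matches a fine-step reference solution."""
    return z + dt * f(z) + corrector(z)
```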
- AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation [19.909858354874547]
We propose adaptive momentum estimation neural ODEs (AdamNODEs) that adaptively control the acceleration of the classical momentum-based approach.
In evaluation, we show that AdamNODEs achieve the lowest training loss and the best efficacy among existing neural ODE variants.
arXiv Detail & Related papers (2022-07-13T09:20:38Z)
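For context, here is the classical momentum (heavy-ball) ODE that this line of work builds on; AdamNODEs replace the fixed damping with Adam-style adaptive moment estimates (the sketch shows only the baseline idea, not AdamNODEs' exact equations):

```python
def heavy_ball_rhs(f, x, m, gamma=0.9):
    """Momentum-augmented state (x, m): dx/dt = m, dm/dt = -gamma * m + f(x)."""
    return m, -gamma * m + f(x)
```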
- Stabilized Neural Ordinary Differential Equations for Long-Time Forecasting of Dynamical Systems [1.001737665513683]
We present a data-driven modeling method that accurately captures shocks and chaotic dynamics.
We learn the right-hand side (RHS) of an ODE by adding the outputs of two neural networks, one learning a linear term and the other a nonlinear term, as sketched below.
Specifically, we implement this by training a sparse linear convolutional NN to learn the linear term and a dense fully-connected nonlinear NN to learn the nonlinear term.
arXiv Detail & Related papers (2022-03-29T16:10:34Z)
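A minimal PyTorch sketch of the split right-hand side described above (layer sizes, kernel width, and activations are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SplitRHS(nn.Module):
    """du/dt = (sparse linear convolution)(u) + (dense nonlinear MLP)(u)."""
    def __init__(self, n_grid, hidden=128, kernel=5):
        super().__init__()
        # Linear term: convolution with no bias and no activation
        self.linear = nn.Conv1d(1, 1, kernel, padding=kernel // 2, bias=False)
        # Nonlinear term: dense fully-connected network
        self.nonlinear = nn.Sequential(
            nn.Linear(n_grid, hidden), nn.Tanh(), nn.Linear(hidden, n_grid)
        )

    def forward(self, t, u):  # u: (batch, n_grid)
        lin = self.linear(u.unsqueeze(1)).squeeze(1)
        return lin + self.nonlinear(u)
```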
- Accelerating Neural ODEs Using Model Order Reduction [0.0]
We show that mathematical model order reduction methods can be used for compressing and accelerating Neural ODEs.
We implement our novel compression method by developing Neural ODEs that integrate the necessary subspace-projection operations as layers of the neural network.
arXiv Detail & Related papers (2021-05-28T19:27:09Z)
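The generic projection step can be sketched as follows (here V is assumed to come from, e.g., proper orthogonal decomposition of training trajectories; this shows the idea, not the paper's exact compression pipeline):

```python
import torch

def reduced_rhs(f, V, zr):
    """Reduced dynamics: dzr/dt = V^T f(V zr), with V of shape (n, r) and
    r << n, so the ODE is solved in r dimensions instead of n."""
    return V.T @ f(V @ zr)
```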
- Neural ODE Processes [64.10282200111983]
We introduce Neural ODE Processes (NDPs), a new class of processes determined by a distribution over Neural ODEs.
We show that our model can successfully capture the dynamics of low-dimensional systems from just a few data points.
arXiv Detail & Related papers (2021-03-23T09:32:06Z)
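A minimal sketch of the "distribution over Neural ODEs" idea (the encoder, latent shape, and names are assumptions): a context set is encoded into a distribution over a latent code, and each sample of that code selects one member of the ODE family.

```python
import torch

def sample_dynamics(encoder, f, context):
    """Draw one dynamics function from the learned distribution."""
    mu, log_sigma = encoder(context)                  # amortized posterior
    l = mu + log_sigma.exp() * torch.randn_like(mu)   # reparameterized sample
    return lambda t, z: f(z, l)                       # one ODE from the family
```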
- STEER: Simple Temporal Regularization For Neural ODEs [80.80350769936383]
We propose a new regularization technique: randomly sampling the end time of the ODE during training.
The proposed regularization is simple to implement, has negligible overhead and is effective across a wide variety of tasks.
We show through experiments on normalizing flows, time series models and image recognition that the proposed regularization can significantly decrease training time and even improve performance over baseline models.
arXiv Detail & Related papers (2020-06-18T17:44:50Z)
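The regularizer fits in a few lines; the sampling width b below is a tunable assumption:

```python
import torch

def sample_end_time(T, b):
    """STEER: sample the integration end time t1 ~ Uniform(T - b, T + b)
    for each training step, instead of always integrating to a fixed T."""
    return T - b + 2 * b * torch.rand(())

# e.g. t1 = sample_end_time(T=1.0, b=0.5), then solve the ODE on [0, t1]
```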
- Time Dependence in Non-Autonomous Neural ODEs [74.78386661760662]
We propose a novel family of Neural ODEs with time-varying weights.
We outperform previous Neural ODE variants in both speed and representational capacity.
arXiv Detail & Related papers (2020-05-05T01:41:46Z)
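One simple way to realize time-varying weights is a small temporal basis expansion, sketched below (an illustrative parameterization, not necessarily the one used in the paper):

```python
import torch
import torch.nn as nn

class TimeVaryingLinear(nn.Module):
    """Dynamics f(t, z) = tanh(W(t) z) with W(t) = sum_k cos(k*t) * W_k."""
    def __init__(self, dim, n_basis=3):
        super().__init__()
        self.W = nn.Parameter(0.1 * torch.randn(n_basis, dim, dim))

    def forward(self, t, z):  # z: (batch, dim), t: scalar
        k = torch.arange(self.W.shape[0], dtype=z.dtype)
        b = torch.cos(k * t)                      # basis values, (n_basis,)
        Wt = (b[:, None, None] * self.W).sum(0)   # (dim, dim)
        return torch.tanh(z @ Wt.T)
```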