Heavy Ball Neural Ordinary Differential Equations
- URL: http://arxiv.org/abs/2110.04840v1
- Date: Sun, 10 Oct 2021 16:11:11 GMT
- Title: Heavy Ball Neural Ordinary Differential Equations
- Authors: Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi,
Stanley J. Osher, Bao Wang
- Abstract summary: We propose heavy ball neural ordinary differential equations (HBNODEs) to improve neural ODEs (NODEs) training and inference.
HBNODEs have two properties that imply practical advantages over NODEs.
We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling.
- Score: 12.861233366398162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose heavy ball neural ordinary differential equations (HBNODEs),
leveraging the continuous limit of the classical momentum accelerated gradient
descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have
two properties that imply practical advantages over NODEs: (i) The adjoint
state of an HBNODE also satisfies an HBNODE, accelerating both forward and
backward ODE solvers, thus significantly reducing the number of function
evaluations (NFEs) and improving the utility of the trained models. (ii) The
spectrum of HBNODEs is well structured, enabling effective learning of
long-term dependencies from complex sequential data. We verify the advantages
of HBNODEs over NODEs on benchmark tasks, including image classification,
learning complex dynamics, and sequential modeling. Our method requires
remarkably fewer forward and backward NFEs, is more accurate, and learns
long-term dependencies more effectively than the other ODE-based neural network
models. Code is available at \url{https://github.com/hedixia/HeavyBallNODE}.
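As a minimal sketch of the construction (assuming the torchdiffeq package; the drift network, damping parameter, and shapes below are illustrative, not the authors' exact implementation from the linked repository), the heavy ball ODE h'' + gamma * h' = f(h, t) is rewritten as a first-order system in the state and a momentum variable:

```python
# Illustrative HBNODE sketch (not the authors' exact code; see the linked repo).
# The heavy ball ODE  h'' + gamma * h' = f(h, t)  is rewritten as the
# first-order system  h' = m,  m' = -gamma * m + f(h, t).
import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumed dependency

class HeavyBallODEFunc(nn.Module):
    def __init__(self, dim, gamma=0.5):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.gamma = nn.Parameter(torch.tensor(gamma))  # learnable damping

    def forward(self, t, state):
        h, m = state  # state and momentum components
        dh = m
        dm = -self.gamma * m + self.f(h)
        return dh, dm

func = HeavyBallODEFunc(dim=16)
h0 = torch.randn(8, 16)
m0 = torch.zeros_like(h0)            # momentum initialized at zero
t = torch.linspace(0.0, 1.0, 10)
h_traj, m_traj = odeint(func, (h0, m0), t)  # solve forward in time
```

The momentum term smooths the learned trajectories, which is why solvers tend to need fewer function evaluations than on a plain first-order field; the linked repository contains the authors' actual models.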
Related papers
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to solve integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
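For context, a toy NumPy example of Gauss-Legendre quadrature (the integrand and node count are arbitrary, not from the paper): the integral is evaluated as a weighted sum at the Legendre nodes, with no ODE solver in the loop.

```python
# Toy Gauss-Legendre quadrature: approximate an integral as a weighted sum
# at Legendre nodes, rather than by stepping an ODE solver.
import numpy as np

def gauss_legendre(f, a, b, n=16):
    x, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)       # affine map to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(t))

print(gauss_legendre(np.exp, 0.0, 1.0))  # ~ e - 1 = 1.7182818...
```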
arXiv Detail & Related papers (2023-08-21T11:31:15Z)
- Learning Neural Constitutive Laws From Motion Observations for Generalizable PDE Dynamics [97.38308257547186]
Many NN approaches learn an end-to-end model that implicitly models both the governing PDE and material models.
We argue that the governing PDEs are often well-known and should be explicitly enforced rather than learned.
We introduce a new framework termed "Neural Constitutive Laws" (NCLaw) which utilizes a network architecture that strictly guarantees standard priors.
arXiv Detail & Related papers (2023-04-27T17:42:24Z)
- Neural Delay Differential Equations: System Reconstruction and Image Classification [14.59919398960571]
We propose a new class of continuous-depth neural networks with delay, named Neural Delay Differential Equations (NDDEs).
Compared to NODEs, NDDEs have a stronger capacity for nonlinear representation.
We achieve lower loss and higher accuracy not only on synthetically produced data but also on CIFAR-10, a well-known image dataset.
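A hedged toy illustration of delayed dynamics h'(t) = f(h(t), h(t - tau)) (a fixed-step Euler scheme with a history buffer; not the authors' solver or training method):

```python
# Toy fixed-step Euler integration of a delay ODE
#   h'(t) = f(h(t), h(t - tau)),
# keeping a history buffer so h(t - tau) can be looked up.
import numpy as np

def integrate_dde(f, h0, tau, T, dt=1e-2):
    n_delay = int(round(tau / dt))
    hist = [h0] * (n_delay + 1)          # constant history on [-tau, 0]
    for _ in range(int(T / dt)):
        h_now, h_lag = hist[-1], hist[-1 - n_delay]
        hist.append(h_now + dt * f(h_now, h_lag))
    return np.array(hist)

# Example: scalar dynamics driven by the delayed state.
traj = integrate_dde(lambda h, hlag: -hlag, h0=1.0, tau=1.0, T=5.0)
```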
arXiv Detail & Related papers (2023-04-11T16:09:28Z)
- Neural Laplace: Learning diverse classes of differential equations in the Laplace domain [86.52703093858631]
We propose a unified framework for learning diverse classes of differential equations (DEs) including all the aforementioned ones.
Instead of modelling the dynamics in the time domain, we model them in the Laplace domain, where history dependencies and discontinuities in time can be represented as summations of complex exponentials.
In the experiments, Neural Laplace shows superior performance in modelling and extrapolating the trajectories of diverse classes of DEs.
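The standard Laplace-transform identities behind this idea (textbook facts, not notation from the paper) show why delays and derivatives become algebraic factors:

```latex
% Standard Laplace-transform identities: for f vanishing on t < 0,
% a time delay becomes multiplication by a complex exponential.
F(s) = \int_0^\infty f(t)\, e^{-st}\, dt,
\qquad
\mathcal{L}\{f(t-\tau)\}(s) = e^{-s\tau} F(s),
\qquad
\mathcal{L}\{\dot f\}(s) = s F(s) - f(0).
```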
arXiv Detail & Related papers (2022-06-10T02:14:59Z)
- On the balance between the training time and interpretability of neural ODE for time series modelling [77.34726150561087]
The paper shows that modern neural ODEs cannot be reduced to simpler models for time-series modelling applications.
The complexity of neural ODEs is comparable to, or exceeds, that of conventional time-series modelling tools.
We propose a new view on time-series modelling using combined neural networks and an ODE system approach.
arXiv Detail & Related papers (2022-06-07T13:49:40Z)
- Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules [10.924226420146626]
We introduce a novel combination of learning rules and Neural ODEs to build continuous-time sequence processing nets.
This yields continuous-time counterparts of Fast Weight Programmers and linear Transformers.
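A rough sketch of this flavor of model, assuming the familiar discrete fast-weight update W_t = W_{t-1} + v_t k_t^T is replaced by its continuous limit dW/dt = v(t) k(t)^T (an illustrative rule, not the paper's exact formulation):

```python
# Illustrative continuous-time fast-weight update: the discrete rule
#   W_t = W_{t-1} + v_t k_t^T   becomes   dW/dt = v(t) k(t)^T,
# integrated here with a crude Euler step.
import numpy as np

rng = np.random.default_rng(0)
d, dt, steps = 4, 0.1, 50
W = np.zeros((d, d))                    # fast weight matrix
for _ in range(steps):
    k, v, q = rng.normal(size=(3, d))   # key, value, query signals
    W += dt * np.outer(v, k)            # Euler step of dW/dt = v k^T
    y = W @ q                           # read-out, as in linear attention
```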
arXiv Detail & Related papers (2022-06-03T15:48:53Z)
- Learning POD of Complex Dynamics Using Heavy-ball Neural ODEs [7.388910452780173]
We leverage the recently proposed heavy-ball neural ODEs (HBNODEs) for learning data-driven reduced-order models.
HBNODE enjoys several practical advantages for learning reduced-order models (ROMs) based on proper orthogonal decomposition (POD), with theoretical guarantees.
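For context, POD amounts to an SVD of a snapshot matrix; a minimal NumPy sketch with arbitrary dimensions:

```python
# Minimal POD: the leading left singular vectors of a snapshot matrix
# give the reduced basis; full states are projected onto it.
import numpy as np

X = np.random.randn(1000, 200)     # snapshots: 1000-dim state, 200 time steps
U, S, _ = np.linalg.svd(X, full_matrices=False)
r = 10                             # reduced dimension
basis = U[:, :r]                   # POD modes
z = basis.T @ X                    # reduced coordinates (r x 200)
X_approx = basis @ z               # rank-r reconstruction
```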
arXiv Detail & Related papers (2022-02-24T22:00:25Z)
- Neural Delay Differential Equations [9.077775405204347]
We propose a new class of continuous-depth neural networks with delay, named Neural Delay Differential Equations (NDDEs).
For computing the corresponding gradients, we use the adjoint sensitivity method to obtain the delayed dynamics of the adjoint.
Our results reveal that appropriately articulating the elements of dynamical systems into the network design is truly beneficial to promoting the network performance.
arXiv Detail & Related papers (2021-02-22T06:53:51Z)
- On Second Order Behaviour in Augmented Neural ODEs [69.8070643951126]
We consider Second Order Neural ODEs (SONODEs).
We show how the adjoint sensitivity method can be extended to SONODEs.
We extend the theoretical understanding of the broader class of Augmented NODEs (ANODEs).
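For reference, the first-order NODE adjoint system that SONODEs generalize (the standard equations from Chen et al.'s neural ODE paper, not this one), with a(t) the gradient of the loss with respect to the state:

```latex
% Standard NODE adjoint (Chen et al., 2018), which SONODE extends:
\frac{dh}{dt} = f(h, t, \theta),
\qquad
\frac{da}{dt} = -a^\top \frac{\partial f}{\partial h},
\qquad
\frac{dL}{d\theta} = -\int_{t_1}^{t_0} a^\top \frac{\partial f}{\partial \theta}\, dt.
```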
arXiv Detail & Related papers (2020-06-12T14:25:31Z)
- Neural Ordinary Differential Equation based Recurrent Neural Network Model [0.7233897166339269]
Neural ordinary differential equations are a promising new member of the neural network family.
This paper explores the strength of the ordinary differential equation (ODE) with a new extension.
Two new ODE-based RNN models (GRU-ODE and LSTM-ODE) can compute the hidden state and cell state at any point in time using an ODE solver.
Experiments show that these new ODE based RNN models require less training time than Latent ODEs and conventional Neural ODEs.
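A hedged sketch of the general ODE-RNN pattern such models follow (assuming torchdiffeq; the drift network and GRU cell are placeholders, not the paper's exact architecture): the hidden state evolves continuously between observations and is updated discretely at each one.

```python
# Generic ODE-RNN pattern: evolve the hidden state with an ODE between
# observations, then apply a discrete GRU update at each observation.
import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumed dependency

class ODEGRU(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Tanh())
        self.cell = nn.GRUCell(input_dim, hidden_dim)

    def forward(self, xs, ts):
        # xs: (seq_len, batch, input_dim); ts: (seq_len,) observation times
        h = torch.zeros(xs.size(1), self.cell.hidden_size)
        for i in range(xs.size(0)):
            if i > 0:  # continuous evolution between observation times
                span = torch.stack([ts[i - 1], ts[i]])
                h = odeint(lambda t, y: self.drift(y), h, span)[-1]
            h = self.cell(xs[i], h)  # discrete update at the observation
        return h

model = ODEGRU(input_dim=3, hidden_dim=16)
h_final = model(torch.randn(5, 2, 3), torch.linspace(0.0, 1.0, 5))
```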
arXiv Detail & Related papers (2020-05-20T01:02:29Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both deterministic and stochastic versions of the same model.
However, the improvements obtained by data augmentation completely eliminate the empirical gains from stochastic regularization, making the performance gap between neural ODEs and neural SDEs negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content and is not responsible for any consequences of its use.