Faster Training of Neural ODEs Using Gauß-Legendre Quadrature
- URL: http://arxiv.org/abs/2308.10644v1
- Date: Mon, 21 Aug 2023 11:31:15 GMT
- Title: Faster Training of Neural ODEs Using Gauß-Legendre Quadrature
- Authors: Alexander Norcliffe, Marc Peter Deisenroth
- Abstract summary: We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to solve integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
- Score: 68.9206193762751
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural ODEs demonstrate strong performance in generative and time-series
modelling. However, training them via the adjoint method is slow compared to
discrete models due to the requirement of numerically solving ODEs. To speed
neural ODEs up, a common approach is to regularise the solutions. However, this
approach may affect the expressivity of the model; when the trajectory itself
matters, this is particularly important. In this paper, we propose an
alternative way to speed up the training of neural ODEs. The key idea is to
speed up the adjoint method by using Gauß-Legendre quadrature to solve
integrals faster than ODE-based methods while remaining memory efficient. We
also extend the idea to training SDEs using the Wong-Zakai theorem, by training
a corresponding ODE and transferring the parameters. Our approach leads to
faster training of neural ODEs, especially for large models. It also presents a
new way to train SDE-based models.
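To make the key idea concrete, below is a minimal NumPy sketch of n-point Gauß-Legendre quadrature on an interval [t0, t1], the building block used to evaluate the adjoint method's parameter-gradient integral, which has the form ∫ a(t)^T ∂f(z(t), t, θ)/∂θ dt. The integrand g and the helper name gauss_legendre_integral below are illustrative stand-ins, not the authors' code.

```python
import numpy as np

def gauss_legendre_integral(g, t0, t1, n_nodes=8):
    # n-point Gauss-Legendre quadrature of the integral of g over [t0, t1].
    # leggauss returns nodes and weights on [-1, 1]; rescale them to [t0, t1].
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    t = 0.5 * (t1 - t0) * nodes + 0.5 * (t1 + t0)
    return 0.5 * (t1 - t0) * np.sum(weights * g(t))

# Stand-in integrand: in the adjoint method this slot is taken by the
# vector-Jacobian product a(t)^T df(z(t), t, theta)/dtheta, evaluated only
# at the quadrature nodes (hypothetical example, not the paper's code).
g = lambda t: np.exp(-t) * np.sin(3.0 * t)

approx = gauss_legendre_integral(g, 0.0, 2.0, n_nodes=8)
exact = (3.0 - np.exp(-2.0) * (np.sin(6.0) + 3.0 * np.cos(6.0))) / 10.0
print(approx, exact)  # the two values agree to several decimal places
```

Because the nodes and weights are precomputed, the gradient integral only requires evaluating the integrand at a fixed, small set of times, which is where the claimed speed-up and memory efficiency relative to solving an augmented ODE come from.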
Related papers
- AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation [19.909858354874547]
We propose adaptive momentum estimation neural ODEs (AdamNODEs) that adaptively control the acceleration of the classical momentum-based approach.
In evaluation, we show that AdamNODEs achieve the lowest training loss and greater efficacy than existing neural ODEs.
arXiv Detail & Related papers (2022-07-13T09:20:38Z) - On the balance between the training time and interpretability of neural
ODE for time series modelling [77.34726150561087]
The paper shows that modern neural ODEs cannot be reduced to simpler models for time-series modelling applications.
The complexity of neural ODEs is comparable to, or exceeds, that of conventional time-series modelling tools.
We propose a new view on time-series modelling using combined neural networks and an ODE system approach.
arXiv Detail & Related papers (2022-06-07T13:49:40Z) - Heavy Ball Neural Ordinary Differential Equations [12.861233366398162]
We propose heavy ball neural ordinary differential equations (HBNODEs) to improve neural ODEs (NODEs) training and inference.
HBNODEs have two properties that imply practical advantages over NODEs.
We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling.
arXiv Detail & Related papers (2021-10-10T16:11:11Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z) - STEER: Simple Temporal Regularization For Neural ODEs [80.80350769936383]
We propose a new regularization technique: randomly sampling the end time of the ODE during training.
The proposed regularization is simple to implement, has negligible overhead and is effective across a wide variety of tasks.
We show through experiments on normalizing flows, time series models and image recognition that the proposed regularization can significantly decrease training time and even improve performance over baseline models.
arXiv Detail & Related papers (2020-06-18T17:44:50Z) - Neural Ordinary Differential Equation based Recurrent Neural Network
Model [0.7233897166339269]
Neural differential equations are a promising new member of the neural network family.
This paper explores the strength of the ordinary differential equation (ODE) with a new extension.
Two new ODE-based RNN models (GRU-ODE and LSTM-ODE) can compute the hidden state and cell state at any point in time using an ODE solver.
Experiments show that these new ODE-based RNN models require less training time than Latent ODEs and conventional Neural ODEs.
arXiv Detail & Related papers (2020-05-20T01:02:29Z) - Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z) - Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvements obtained by data augmentation completely eliminate the empirical regularization gains, making the difference in performance between neural ODEs and neural SDEs negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z) - How to train your neural ODE: the world of Jacobian and kinetic
regularization [7.83405844354125]
Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values.
We introduce a theoretically grounded combination of optimal-transport and stability regularizations that encourages neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well (a rough sketch of these penalty terms follows this list).
arXiv Detail & Related papers (2020-02-07T14:15:02Z)
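Relating to the last entry above ("How to train your neural ODE: the world of Jacobian and kinetic regularization"), the following is a rough Python/PyTorch sketch of the two penalty terms that work combines: a kinetic-energy term ||f(z, t)||^2 and a Frobenius-norm penalty on the Jacobian ∂f/∂z, estimated here with a Hutchinson-style estimator. The function name kinetic_and_jacobian_penalties and the toy dynamics network are hypothetical, the penalties are evaluated at a single time point rather than integrated along the trajectory, and this is a sketch of the general technique rather than the authors' implementation.

```python
import torch

def kinetic_and_jacobian_penalties(f, z, t, n_samples=1):
    # f: callable(z, t) -> dz/dt, a (hypothetical) ODE dynamics network.
    # Returns the batch-averaged kinetic energy ||f||^2 and a Hutchinson
    # estimate of the squared Frobenius norm of the Jacobian df/dz.
    z = z.detach().requires_grad_(True)
    dz = f(z, t)
    kinetic = (dz ** 2).sum(dim=1).mean()

    # Hutchinson estimator: E_e ||(df/dz)^T e||^2 = ||df/dz||_F^2 for e ~ N(0, I).
    frob = 0.0
    for _ in range(n_samples):
        e = torch.randn_like(dz)
        (vjp,) = torch.autograd.grad(dz, z, grad_outputs=e, create_graph=True)
        frob = frob + (vjp ** 2).sum(dim=1).mean()
    return kinetic, frob / n_samples

# Usage sketch: add lambda_k * kinetic + lambda_j * frob to the training loss,
# typically averaged over the time points visited by the ODE solver.
net = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.Tanh(), torch.nn.Linear(64, 4))
z0 = torch.randn(32, 4)
kinetic, frob = kinetic_and_jacobian_penalties(lambda z, t: net(z), z0, t=0.0)
```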