AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation
- URL: http://arxiv.org/abs/2207.06066v1
- Date: Wed, 13 Jul 2022 09:20:38 GMT
- Title: AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation
- Authors: Suneghyeon Cho, Sanghyun Hong, Kookjin Lee, Noseong Park
- Abstract summary: We propose adaptive momentum estimation neural ODEs (AdamNODEs) that adaptively control the acceleration of the classical momentum-based approach.
In our evaluation, we show that AdamNODEs achieve the lowest training loss and the highest efficacy among existing neural ODEs.
- Score: 19.909858354874547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work by Xia et al. leveraged the continuous limit of classical
momentum-accelerated gradient descent and proposed heavy-ball neural ODEs.
While this model offers better computational efficiency and utility than vanilla
neural ODEs, the approach often causes the internal dynamics to overshoot,
leading to unstable training. Prior work addresses this issue by
using ad-hoc approaches, e.g., bounding the internal dynamics using specific
activation functions, but the resulting models do not satisfy the exact
heavy-ball ODE. In this work, we propose adaptive momentum estimation neural
ODEs (AdamNODEs) that adaptively control the acceleration of the classical
momentum-based approach. We find that its adjoint states also satisfy the AdamNODE
formulation and do not require the ad-hoc solutions that prior work employs. In our
evaluation, we show that AdamNODEs achieve the lowest training loss and the highest
efficacy among existing neural ODEs. We also show that AdamNODEs have better training
stability than classical momentum-based neural ODEs. This result sheds some
light on adapting the techniques proposed in the optimization community to
further improve the training and inference of neural ODEs. Our code is
available at https://github.com/pmcsh04/AdamNODE.
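To make the idea concrete, below is a minimal PyTorch sketch of what an Adam-style augmented neural ODE state can look like: the hidden state is driven by a velocity that is rescaled by a continuously estimated second moment, mirroring Adam's adaptive step size. This is a sketch under assumed dynamics, not the exact formulation from the paper or the repository above; the names AdamStyleDynamics, evolve, log_beta_m, and log_beta_v are hypothetical.
```python
import torch
import torch.nn as nn

class AdamStyleDynamics(nn.Module):
    """Augmented dynamics over (h, m, v): hidden state plus first/second moment estimates.
    The specific form of the moment equations here is an assumption for illustration."""
    def __init__(self, dim, hidden=64, eps=1e-8):
        super().__init__()
        # a small MLP playing the role of the learned vector field f(h, t)
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        # learnable (positive) relaxation rates for the moment estimates
        self.log_beta_m = nn.Parameter(torch.zeros(1))
        self.log_beta_v = nn.Parameter(torch.zeros(1))
        self.eps = eps

    def forward(self, t, state):
        h, m, v = state.chunk(3, dim=-1)
        g = self.f(h)                                   # drift term
        dh = m / (v.clamp_min(0.0).sqrt() + self.eps)   # velocity rescaled Adam-style
        dm = self.log_beta_m.exp() * (g - m)            # first-moment relaxation
        dv = self.log_beta_v.exp() * (g * g - v)        # second-moment relaxation
        return torch.cat([dh, dm, dv], dim=-1)

def evolve(dynamics, h0, t0=0.0, t1=1.0, steps=20):
    """Fixed-step Euler integration of the augmented state; an actual neural ODE
    would typically call an adaptive solver instead."""
    state = torch.cat([h0, torch.zeros_like(h0), torch.zeros_like(h0)], dim=-1)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        state = state + dt * dynamics(t, state)
        t += dt
    return state.chunk(3, dim=-1)[0]  # evolved hidden state h(t1)

h = torch.randn(8, 16)                          # a batch of hidden states
print(evolve(AdamStyleDynamics(16), h).shape)   # torch.Size([8, 16])
```
In this sketch, the square-root rescaling of the velocity is what tempers the acceleration of the plain heavy-ball dynamics; the paper and repository above give the actual AdamNODE equations.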
Related papers
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to evaluate integrals faster than ODE-based methods (a minimal sketch of the quadrature rule appears after this list).
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
arXiv Detail & Related papers (2023-08-21T11:31:15Z) - Experimental study of Neural ODE training with adaptive solver for dynamical systems modeling [72.84259710412293]
Some ODE solvers, called adaptive solvers, can adapt their evaluation strategy depending on the complexity of the problem at hand.
This paper describes a simple set of experiments to show why adaptive solvers cannot be seamlessly leveraged as a black-box for dynamical systems modelling.
arXiv Detail & Related papers (2022-11-13T17:48:04Z) - Standalone Neural ODEs with Sensitivity Analysis [5.565364597145569]
This paper presents a continuous-depth neural ODE model capable of describing a full deep neural network.
We present a general formulation of the neural sensitivity problem and show how it is used in the NCG training.
Our evaluations demonstrate that our novel formulations lead to increased robustness and performance as compared to ResNet models.
arXiv Detail & Related papers (2022-05-27T12:16:53Z) - Heavy Ball Neural Ordinary Differential Equations [12.861233366398162]
We propose heavy ball neural ordinary differential equations (HBNODEs) to improve neural ODEs (NODEs) training and inference.
HBNODEs have two properties that imply practical advantages over NODEs.
We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling.
arXiv Detail & Related papers (2021-10-10T16:11:11Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z) - Time Dependence in Non-Autonomous Neural ODEs [74.78386661760662]
We propose a novel family of Neural ODEs with time-varying weights.
We outperform previous Neural ODE variants in both speed and representational capacity.
arXiv Detail & Related papers (2020-05-05T01:41:46Z) - Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvements obtained with data augmentation completely eliminate the empirical gains of stochastic regularization, making the difference in performance between the neural ODE and neural SDE negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z) - How to train your neural ODE: the world of Jacobian and kinetic regularization [7.83405844354125]
Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values.
We introduce a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well.
arXiv Detail & Related papers (2020-02-07T14:15:02Z)
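For reference on the "Faster Training of Neural ODEs Using Gauß-Legendre Quadrature" entry above, here is a minimal NumPy sketch of the Gauss-Legendre rule itself; it illustrates only the quadrature (nodes and weights rescaled from [-1, 1] to [a, b]), not that paper's training procedure, and the function name gauss_legendre is hypothetical.
```python
import numpy as np

def gauss_legendre(f, a, b, n=10):
    """Approximate the integral of f over [a, b] with an n-point Gauss-Legendre rule."""
    x, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)       # map nodes onto [a, b]
    return 0.5 * (b - a) * np.sum(w * f(t))

print(gauss_legendre(np.exp, 0.0, 1.0))  # ~1.718281828, i.e. e - 1
```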