AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation
- URL: http://arxiv.org/abs/2207.06066v1
- Date: Wed, 13 Jul 2022 09:20:38 GMT
- Title: AdamNODEs: When Neural ODE Meets Adaptive Moment Estimation
- Authors: Suneghyeon Cho, Sanghyun Hong, Kookjin Lee, Noseong Park
- Abstract summary: We propose adaptive momentum estimation neural ODEs (AdamNODEs) that adaptively control the acceleration of the classical momentum-based approach.
In our evaluation, we show that AdamNODEs achieve the lowest training loss and the highest efficacy among existing neural ODEs.
- Score: 19.909858354874547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work by Xia et al. leveraged the continuous limit of classical
momentum-accelerated gradient descent and proposed heavy-ball neural ODEs.
While this model offers better computational efficiency and utility than vanilla
neural ODEs, the approach often causes the internal dynamics to overshoot,
leading to unstable training. Prior work addresses this issue by
using ad-hoc approaches, e.g., bounding the internal dynamics using specific
activation functions, but the resulting models do not satisfy the exact
heavy-ball ODE. In this work, we propose adaptive momentum estimation neural
ODEs (AdamNODEs) that adaptively control the acceleration of the classical
momentum-based approach. We find that its adjoint states also satisfy the AdamNODE
formulation and do not require the ad-hoc solutions that prior work employs. In our
evaluation, we show that AdamNODEs achieve the lowest training loss and the highest
efficacy among existing neural ODEs. We also show that AdamNODEs have better training
stability than classical momentum-based neural ODEs. This result sheds some
light on adapting the techniques proposed in the optimization community to
further improve the training and inference of neural ODEs. Our code is
available at https://github.com/pmcsh04/AdamNODE.
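To make the idea concrete, below is a minimal PyTorch sketch of what an Adam-style augmented neural ODE state can look like: the hidden state is driven by a velocity that is rescaled by a continuously estimated second moment, mirroring Adam's adaptive step size. This is a sketch under assumed dynamics, not the exact formulation from the paper or the repository above; the names AdamStyleDynamics, evolve, log_beta_m, and log_beta_v are hypothetical.
```python
import torch
import torch.nn as nn

class AdamStyleDynamics(nn.Module):
    """Augmented dynamics over (h, m, v): hidden state plus first/second moment estimates.
    The specific form of the moment equations here is an assumption for illustration."""
    def __init__(self, dim, hidden=64, eps=1e-8):
        super().__init__()
        # a small MLP playing the role of the learned vector field f(h, t)
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        # learnable (positive) relaxation rates for the moment estimates
        self.log_beta_m = nn.Parameter(torch.zeros(1))
        self.log_beta_v = nn.Parameter(torch.zeros(1))
        self.eps = eps

    def forward(self, t, state):
        h, m, v = state.chunk(3, dim=-1)
        g = self.f(h)                                   # drift term
        dh = m / (v.clamp_min(0.0).sqrt() + self.eps)   # velocity rescaled Adam-style
        dm = self.log_beta_m.exp() * (g - m)            # first-moment relaxation
        dv = self.log_beta_v.exp() * (g * g - v)        # second-moment relaxation
        return torch.cat([dh, dm, dv], dim=-1)

def evolve(dynamics, h0, t0=0.0, t1=1.0, steps=20):
    """Fixed-step Euler integration of the augmented state; an actual neural ODE
    would typically call an adaptive solver instead."""
    state = torch.cat([h0, torch.zeros_like(h0), torch.zeros_like(h0)], dim=-1)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        state = state + dt * dynamics(t, state)
        t += dt
    return state.chunk(3, dim=-1)[0]  # evolved hidden state h(t1)

h = torch.randn(8, 16)                          # a batch of hidden states
print(evolve(AdamStyleDynamics(16), h).shape)   # torch.Size([8, 16])
```
In this sketch, the square-root rescaling of the velocity is what tempers the acceleration of the plain heavy-ball dynamics; the paper and repository above give the actual AdamNODE equations.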
Related papers
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to evaluate integrals faster than ODE-based methods (a minimal sketch of the quadrature rule appears after this list).
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
arXiv Detail & Related papers (2023-08-21T11:31:15Z) - Experimental study of Neural ODE training with adaptive solver for dynamical systems modeling [72.84259710412293]
Some ODE solvers, called adaptive solvers, can adapt their evaluation strategy depending on the complexity of the problem at hand.
This paper describes a simple set of experiments to show why adaptive solvers cannot be seamlessly leveraged as a black-box for dynamical systems modelling.
arXiv Detail & Related papers (2022-11-13T17:48:04Z) - Standalone Neural ODEs with Sensitivity Analysis [5.565364597145569]
This paper presents a continuous-depth neural ODE model capable of describing a full deep neural network.
We present a general formulation of the neural sensitivity problem and show how it is used in the NCG training.
Our evaluations demonstrate that our novel formulations lead to increased robustness and performance as compared to ResNet models.
arXiv Detail & Related papers (2022-05-27T12:16:53Z) - Heavy Ball Neural Ordinary Differential Equations [12.861233366398162]
We propose heavy ball neural ordinary differential equations (HBNODEs) to improve neural ODEs (NODEs) training and inference.
HBNODEs have two properties that imply practical advantages over NODEs.
We verify the advantages of HBNODEs over NODEs on benchmark tasks, including image classification, learning complex dynamics, and sequential modeling.
arXiv Detail & Related papers (2021-10-10T16:11:11Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z) - Time Dependence in Non-Autonomous Neural ODEs [74.78386661760662]
We propose a novel family of Neural ODEs with time-varying weights.
We outperform previous Neural ODE variants in both speed and representational capacity.
arXiv Detail & Related papers (2020-05-05T01:41:46Z) - Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvements obtained with data augmentation completely eliminate the empirical gains of stochastic regularization, making the difference in performance between the neural ODE and neural SDE negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z) - How to train your neural ODE: the world of Jacobian and kinetic regularization [7.83405844354125]
Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values.
We introduce a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well.
arXiv Detail & Related papers (2020-02-07T14:15:02Z)
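For reference on the "Faster Training of Neural ODEs Using Gauß-Legendre Quadrature" entry above, here is a minimal NumPy sketch of the Gauss-Legendre rule itself; it illustrates only the quadrature (nodes and weights rescaled from [-1, 1] to [a, b]), not that paper's training procedure, and the function name gauss_legendre is hypothetical.
```python
import numpy as np

def gauss_legendre(f, a, b, n=10):
    """Approximate the integral of f over [a, b] with an n-point Gauss-Legendre rule."""
    x, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    t = 0.5 * (b - a) * x + 0.5 * (b + a)       # map nodes onto [a, b]
    return 0.5 * (b - a) * np.sum(w * f(t))

print(gauss_legendre(np.exp, 0.0, 1.0))  # ~1.718281828, i.e. e - 1
```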