ResNet After All? Neural ODEs and Their Numerical Solution
- URL: http://arxiv.org/abs/2007.15386v2
- Date: Sun, 10 Sep 2023 20:14:39 GMT
- Title: ResNet After All? Neural ODEs and Their Numerical Solution
- Authors: Katharina Ott, Prateek Katiyar, Philipp Hennig, Michael Tiemann
- Abstract summary: We show that trained Neural Ordinary Differential Equation models actually depend on the specific numerical method used during training.
We propose a method that monitors the behavior of the ODE solver during training to adapt its step size.
- Score: 28.954378025052925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key appeal of the recently proposed Neural Ordinary Differential Equation
(ODE) framework is that it seems to provide a continuous-time extension of
discrete residual neural networks. As we show herein, though, trained Neural
ODE models actually depend on the specific numerical method used during
training. If the trained model is supposed to be a flow generated from an ODE,
it should be possible to choose another numerical solver with equal or smaller
numerical error without loss of performance. We observe that if training relies
on a solver with overly coarse discretization, then testing with another solver
of equal or smaller numerical error results in a sharp drop in accuracy. In
such cases, the combination of vector field and numerical method cannot be
interpreted as a flow generated from an ODE, which arguably poses a fatal
breakdown of the Neural ODE concept. We observe, however, that there exists a
critical step size beyond which the training yields a valid ODE vector field.
We propose a method that monitors the behavior of the ODE solver during
training to adapt its step size, aiming to ensure a valid ODE without
unnecessarily increasing computational cost. We verify this adaptation
algorithm on a common benchmark dataset as well as a synthetic dataset.
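The following is a minimal sketch (my own illustration, not the authors' code) of the consistency check the abstract describes: integrate the trained vector field once with the coarse fixed-step solver assumed to have been used during training and once with a finer, higher-order solver, and compare the outputs. All names (`VectorField`, `odeint_fixed`) and the setup are hypothetical.

```python
# Minimal sketch (illustration only, not the authors' code) of the solver-consistency
# check described in the abstract: if the trained model is a genuine ODE flow, refining
# the solver should not change its predictions.
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Hypothetical learned vector field f_theta(t, y) of a Neural ODE."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, y):
        return self.net(y)

def odeint_fixed(f, y0, t0=0.0, t1=1.0, n_steps=4, method="euler"):
    """Fixed-step integration with explicit Euler or classical RK4."""
    y, t, h = y0, t0, (t1 - t0) / n_steps
    for _ in range(n_steps):
        if method == "euler":
            y = y + h * f(t, y)
        else:  # classical fourth-order Runge-Kutta
            k1 = f(t, y)
            k2 = f(t + h / 2, y + h / 2 * k1)
            k3 = f(t + h / 2, y + h / 2 * k2)
            k4 = f(t + h, y + h * k3)
            y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t = t + h
    return y

field = VectorField()               # assume this was trained with the coarse solver below
y0 = torch.randn(128, 2)

z_coarse = odeint_fixed(field, y0, n_steps=2, method="euler")   # solver used during training
z_fine = odeint_fixed(field, y0, n_steps=64, method="rk4")      # equal or smaller numerical error

# A large gap means the "flow" is an artifact of the coarse discretization rather than
# a property of the learned vector field.
print((z_coarse - z_fine).norm(dim=-1).mean().item())
```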
Related papers
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to evaluate integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem, by training a corresponding ODE and transferring the parameters.
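A hedged toy illustration of the quadrature idea above (my own example, not the paper's training pipeline): Gauss-Legendre quadrature recovers an integral from a handful of integrand evaluations, whereas a fixed-step solver needs many sequential steps for comparable accuracy.

```python
# Gauss-Legendre quadrature vs. a sequential Euler sweep for the same integral.
import numpy as np

def g(t):
    return np.exp(-t) * np.sin(3.0 * t)   # example integrand

a, b = 0.0, 2.0

# Gauss-Legendre with 8 nodes, rescaled from [-1, 1] to [a, b].
nodes, weights = np.polynomial.legendre.leggauss(8)
t = 0.5 * (b - a) * nodes + 0.5 * (b + a)
quad = 0.5 * (b - a) * np.dot(weights, g(t))

# The same integral via explicit Euler on y' = g(t), y(a) = 0 (many sequential steps).
n = 10_000
h = (b - a) / n
euler = h * sum(g(a + i * h) for i in range(n))

print(f"Gauss-Legendre (8 evaluations): {quad:.8f}")
print(f"Euler ({n} steps):              {euler:.8f}")
```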
arXiv Detail & Related papers (2023-08-21T11:31:15Z)
- Implementation and (Inverse Modified) Error Analysis for implicitly-templated ODE-nets [0.0]
We focus on learning unknown dynamics from data using ODE-nets templated on implicit numerical initial value problem solvers.
We perform Inverse Modified error analysis of the ODE-nets using unrolled implicit schemes for ease of interpretation.
We formulate an adaptive algorithm which monitors the level of error and adapts the number of (unrolled) implicit solution iterations.
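A minimal sketch of an unrolled implicit step with an adapted iteration count, in the spirit of the summary above (my own example; function names, tolerances, and the test problem are assumptions):

```python
# One implicit-Euler step solved by an unrolled fixed-point iteration; the number of
# inner iterations is adapted by monitoring the residual of the implicit equation.
import torch

def implicit_euler_step(f, y_n, h, max_iters=20, tol=1e-6):
    y = y_n + h * f(y_n)          # explicit predictor as the starting guess
    for k in range(max_iters):
        y_next = y_n + h * f(y)   # fixed-point map for y_{n+1} = y_n + h f(y_{n+1})
        residual = (y_next - y).norm()
        y = y_next
        if residual < tol:        # stop unrolling once the implicit equation is solved
            break
    return y, k + 1               # report how many unrolled iterations were used

# Example: the linear test problem y' = -5 y (implicit-Euler answer is y0 / (1 + 5h)).
f = lambda y: -5.0 * y
y, iters = implicit_euler_step(f, torch.tensor([1.0]), h=0.05)
print(y.item(), iters)
```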
arXiv Detail & Related papers (2023-03-31T06:47:02Z)
- Eigen-informed NeuralODEs: Dealing with stability and convergence issues of NeuralODEs [0.0]
We present a technique to add knowledge of ODE properties based on eigenvalues to the training objective of a NeuralODE.
We show that the presented training process is far more robust against local minima, instabilities, and sparse data samples, and improves training convergence and performance.
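One way eigenvalue knowledge could enter the training objective, sketched under my own assumptions (the paper's exact loss may differ): penalise eigenvalues of the Jacobian of the learned vector field whose real parts exceed a prescribed bound.

```python
# Hypothetical eigenvalue penalty added to a NeuralODE training loss.
import torch
import torch.nn as nn

field = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))

def eigen_penalty(field, y, max_real_part=0.0):
    # Jacobian of the vector field at a single state y of shape (2,).
    J = torch.autograd.functional.jacobian(field, y, create_graph=True)
    eigvals = torch.linalg.eigvals(J)
    # Penalise real parts above the target, e.g. to encourage a stable (decaying) system.
    return torch.relu(eigvals.real - max_real_part).sum()

y = torch.randn(2)
data_loss = torch.tensor(0.0)            # placeholder for the usual trajectory-fitting loss
loss = data_loss + 1e-2 * eigen_penalty(field, y)
loss.backward()                          # gradients flow through the eigen penalty as well
```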
arXiv Detail & Related papers (2023-02-07T14:45:39Z)
- Experimental study of Neural ODE training with adaptive solver for dynamical systems modeling [72.84259710412293]
Some ODE solvers, called adaptive solvers, can adapt their evaluation strategy to the complexity of the problem at hand.
This paper describes a simple set of experiments to show why adaptive solvers cannot be seamlessly leveraged as a black-box for dynamical systems modelling.
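For context, a textbook sketch of what makes a solver "adaptive" (a generic step-size controller, not the specific solvers studied in the paper): an embedded pair of methods estimates the local error and the step size is adjusted to keep that estimate below a tolerance.

```python
# Embedded Euler/Heun pair with a simple accept/reject step-size controller.
import numpy as np

def heun_adaptive(f, y0, t0, t1, h0=0.1, atol=1e-6, safety=0.9):
    t, y, h = t0, np.asarray(y0, dtype=float), h0
    while t < t1:
        h = min(h, t1 - t)
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_euler = y + h * k1                    # first-order solution
        y_heun = y + 0.5 * h * (k1 + k2)        # second-order solution
        err = np.linalg.norm(y_heun - y_euler)  # local error estimate
        if err <= atol:                         # accept the step
            t, y = t + h, y_heun
        # Grow or shrink the step based on the error estimate (classic controller).
        h = h * min(4.0, max(0.1, safety * np.sqrt(atol / max(err, 1e-16))))
    return y

# Example: y' = -y, whose exact solution at t = 2 is exp(-2).
y_end = heun_adaptive(lambda t, y: -y, y0=[1.0], t0=0.0, t1=2.0)
print(y_end, np.exp(-2.0))
```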
arXiv Detail & Related papers (2022-11-13T17:48:04Z)
- On Numerical Integration in Neural Ordinary Differential Equations [0.0]
We propose the inverse modified differential equations (IMDE) to clarify the influence of numerical integration on training Neural ODE models.
It is shown that training a Neural ODE model actually returns a close approximation of the IMDE, rather than the true ODE.
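A first-order illustration of the IMDE idea for the explicit Euler method (my own sketch; see the paper for the general construction and precise assumptions):

```latex
% If the training data come from the exact flow of y' = f(y) and the model is trained
% with explicit Euler steps of size h, the recovered vector field f_h satisfies, to
% leading order,
\[
  f_h(y) = f(y) + \tfrac{h}{2}\, f'(y)\, f(y) + \mathcal{O}(h^2),
\]
% since one Euler step y + h f_h(y) then matches the exact flow of f up to O(h^3)
% over a single step.
```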
arXiv Detail & Related papers (2022-06-15T07:39:01Z)
- Proximal Implicit ODE Solvers for Accelerating Learning Neural ODEs [16.516974867571175]
This paper considers learning neural ODEs using implicit ODE solvers of different orders leveraging proximal operators.
The proximal implicit solver guarantees superiority over explicit solvers in numerical stability and computational efficiency.
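For intuition, a special case supplied by me (the paper treats general neural ODEs and higher-order schemes): for a gradient flow, the implicit Euler step is exactly a proximal step, which is the kind of structure proximal implicit solvers exploit.

```latex
% Implicit Euler for the gradient flow y' = -\nabla V(y) reads
%   y_{n+1} = y_n - h\,\nabla V(y_{n+1}),
% which is the stationarity condition of the proximal problem
\[
  y_{n+1} = \operatorname{prox}_{hV}(y_n)
          = \arg\min_{y}\Big\{ V(y) + \tfrac{1}{2h}\,\lVert y - y_n\rVert^2 \Big\},
\]
% so each implicit step can be computed with an optimisation routine instead of a
% general nonlinear root-finder.
```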
arXiv Detail & Related papers (2022-04-19T02:55:10Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
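A generic sketch of implicit differentiation at an equilibrium (a deep-equilibrium-style recipe supplied by me; the paper applies this idea to feedback spiking networks): find the fixed point without tracking gradients, then obtain parameter gradients from the implicit function theorem rather than by unrolling the solver.

```python
# Implicit differentiation at an equilibrium z* = f(z*): parameter gradients come from
# the implicit function theorem instead of backpropagating through the solver loop.
import torch
import torch.nn as nn

W = nn.Parameter(0.3 * torch.randn(4, 4))   # parameters of the equilibrium map
x = torch.randn(4)                          # fixed input

def f(z):
    return torch.tanh(W @ z + x)            # a contraction for small W, so z* exists

# 1) Solve for the equilibrium without building a computation graph.
with torch.no_grad():
    z = torch.zeros(4)
    for _ in range(100):
        z = f(z)

# 2) Implicit function theorem: with J = df/dz at z*, the loss gradient w.r.t. the
#    parameters is (df/dW)^T u, where u solves (I - J)^T u = dL/dz*.
z_star = z.detach().requires_grad_(True)
z_next = f(z_star)                          # one differentiable application of f
J = torch.autograd.functional.jacobian(f, z_star)

loss_grad = 2 * (z_star.detach() - torch.ones(4))   # dL/dz* for L = ||z* - 1||^2
u = torch.linalg.solve((torch.eye(4) - J).T, loss_grad)

# 3) Push u through f once; only the parameter path contributes to W.grad.
z_next.backward(u)
print(W.grad.norm().item())
```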
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect the robustness of neural ODE models to adversarial attacks.
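A hedged example of a "solver parameterization" (a standard one-parameter family of second-order Runge-Kutta methods; whether this matches the paper's exact choice is an assumption on my part): alpha = 0.5 gives the midpoint method, alpha = 1.0 gives Heun's method, and the parameter itself can be tuned or searched over.

```python
# One-parameter family of explicit second-order Runge-Kutta steps.
import torch

def rk2_step(f, t, y, h, alpha=0.5):
    k1 = f(t, y)
    k2 = f(t + alpha * h, y + alpha * h * k1)
    # Weights chosen so the method is second-order accurate for any alpha != 0.
    return y + h * ((1 - 1 / (2 * alpha)) * k1 + (1 / (2 * alpha)) * k2)

# Integrate y' = -y to t = 1 with two members of the family; exact answer is exp(-1).
f = lambda t, y: -y
for alpha in (0.5, 1.0):
    y = torch.tensor([1.0])
    for i in range(10):
        y = rk2_step(f, 0.1 * i, y, h=0.1, alpha=alpha)
    print(alpha, y.item(), torch.exp(torch.tensor(-1.0)).item())
```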
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
- STEER: Simple Temporal Regularization For Neural ODEs [80.80350769936383]
We propose a new regularization technique: randomly sampling the end time of the ODE during training.
The proposed regularization is simple to implement, has negligible overhead and is effective across a wide variety of tasks.
We show through experiments on normalizing flows, time series models and image recognition that the proposed regularization can significantly decrease training time and even improve performance over baseline models.
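A sketch of the end-time sampling summarized above (illustrative names; the exact sampling interval used in the paper is an assumption here): each training step integrates the ODE up to a randomly sampled end time instead of a fixed T.

```python
# Randomly sampled ODE end time as a drop-in change to a Neural ODE training loop.
import torch

T, b = 1.0, 0.5                     # nominal end time and half-width of the interval, b < T

def sample_end_time():
    """Draw T' uniformly from [T - b, T + b]."""
    return (T - b) + 2.0 * b * torch.rand(()).item()

# Schematic use inside an otherwise unchanged training loop:
#   t1 = sample_end_time()
#   y_pred = odeint(field, y0, t0=0.0, t1=t1)   # any solver; only the end time changes
#   loss = criterion(y_pred, y_target)
print([round(sample_end_time(), 3) for _ in range(3)])
```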
arXiv Detail & Related papers (2020-06-18T17:44:50Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvements obtained by data augmentation completely eliminate the empirical gains of the stochastic regularization, making the difference in performance between neural ODEs and neural SDEs negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided (including this list) and is not responsible for any consequences of its use.