On Tuning Neural ODE for Stability, Consistency and Faster Convergence
- URL: http://arxiv.org/abs/2312.01657v1
- Date: Mon, 4 Dec 2023 06:18:10 GMT
- Title: On Tuning Neural ODE for Stability, Consistency and Faster Convergence
- Authors: Sheikh Waqas Akhtar
- Abstract summary: We propose a first-order Nesterov's accelerated gradient (NAG) based ODE solver that is provably tuned with respect to the consistency, convergence and stability (CCS) conditions.
We empirically demonstrate the efficacy of our approach, training faster while achieving better or comparable performance compared to neural ODEs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural ODEs parameterize a differential equation with a continuous-depth neural
network and solve it using a numerical ODE integrator. These models offer a constant memory
cost, in contrast to models with a discrete sequence of hidden layers, whose memory cost grows
linearly with the number of layers. Beyond memory efficiency, neural ODEs can adapt their
evaluation strategy to the input and offer the flexibility to trade numerical precision for
faster training. Despite these benefits, they still have limitations. We identify the ODE
integrator (also called the ODE solver) as the weakest link in the chain: it may have
consistency, convergence and stability (CCS) issues, and may converge slowly or not at all.
We propose a first-order Nesterov's accelerated gradient (NAG) based ODE solver that is
provably tuned with respect to the CCS conditions. We empirically demonstrate the efficacy of
our approach, which trains faster while achieving better or comparable performance than neural
ODEs employing other fixed-step explicit ODE solvers, as well as discrete-depth models such as
ResNet, on three different tasks: supervised classification, density estimation, and
time-series modelling.
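As a concrete illustration of the idea of using a Nesterov-style momentum update as a fixed-step ODE solver, the sketch below unrolls such an integrator inside a neural ODE forward pass in PyTorch. The dynamics network `ODEFunc`, the momentum coefficient `mu`, the step size, and the update rule itself are illustrative assumptions, not the paper's exact NAG-based solver or its CCS tuning.

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Continuous-depth dynamics f(t, z) parameterized by a small MLP."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # Feed time as an extra input channel alongside the state.
        tcol = t.expand(z.shape[0], 1)
        return self.net(torch.cat([z, tcol], dim=1))

def nag_odeint(f, z0, t0=0.0, t1=1.0, n_steps=20, mu=0.9):
    """Fixed-step integration of dz/dt = f(t, z) with a NAG-style
    momentum / look-ahead update (illustrative, not the paper's exact scheme)."""
    h = (t1 - t0) / n_steps
    z, v = z0, torch.zeros_like(z0)
    t = torch.tensor(t0)
    for _ in range(n_steps):
        z_look = z + mu * v               # look-ahead point (Nesterov-style)
        v = mu * v + h * f(t, z_look)     # momentum accumulates the vector field
        z = z + v                         # state update
        t = t + h
    return z

# Usage: a toy forward pass; gradients flow through the unrolled solver.
func = ODEFunc(dim=2)
z0 = torch.randn(8, 2, requires_grad=True)
zT = nag_odeint(func, z0)
zT.sum().backward()
```

Because this sketch is a fixed-step explicit scheme, gradients are obtained by simply backpropagating through the unrolled loop, as in the usage example above.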
Related papers
- Efficient, Accurate and Stable Gradients for Neural ODEs [3.79830302036482]
We present a class of algebraically reversible solvers that are both high-order and numerically stable.
This construction naturally extends to numerical schemes for Neural CDEs and SDEs.
arXiv Detail & Related papers (2024-10-15T14:36:05Z)
- Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations [0.0]
Variational Autoencoders (VAEs) are a powerful framework for learning compact latent representations.
NeuralODEs excel in learning transient system dynamics.
This work combines the strengths of both to create fast surrogate models with adjustable complexity.
arXiv Detail & Related papers (2024-10-14T05:45:52Z)
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to solve integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem by training a corresponding ODE and transferring the parameters (a minimal quadrature sketch follows this entry).
arXiv Detail & Related papers (2023-08-21T11:31:15Z)
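The entry above replaces repeated ODE solves with direct numerical quadrature. As a minimal, self-contained sketch of the quadrature building block only (not the cited paper's training procedure), the snippet below uses NumPy's Gauss-Legendre nodes and weights to approximate an integral on [0, 1]; the toy integrand stands in for quantities that would be integrated along a trajectory.

```python
import numpy as np

def gauss_legendre_integral(f, a=0.0, b=1.0, degree=8):
    """Approximate the integral of f over [a, b] with Gauss-Legendre quadrature."""
    # Nodes and weights on the reference interval [-1, 1].
    nodes, weights = np.polynomial.legendre.leggauss(degree)
    # Affine map from [-1, 1] to [a, b].
    t = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    return 0.5 * (b - a) * np.sum(weights * f(t))

# Toy integrand standing in for a learned vector field along a trajectory.
f = lambda t: np.exp(-t) * np.sin(3.0 * t)
approx = gauss_legendre_integral(f, 0.0, 1.0, degree=8)
exact = (3.0 - np.exp(-1.0) * (np.sin(3.0) + 3.0 * np.cos(3.0))) / 10.0
print(f"quadrature: {approx:.8f}  exact: {exact:.8f}")
```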
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (a generic implicit-update sketch follows this entry).
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
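The ISGD entry above replaces the explicit SGD step with an implicit update w_{k+1} = w_k - lr * grad L(w_{k+1}), which tends to be more stable for stiff or badly scaled losses. The sketch below illustrates the generic idea on a toy least-squares problem, where the implicit step has a closed form; the problem, step size, and batch size are assumptions, and this is not the cited paper's exact PINN algorithm.

```python
import numpy as np

# Toy least-squares problem L(w) = 0.5 * ||X w - y||^2 standing in for a PINN loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def implicit_sgd_step(w, Xb, yb, lr=0.5):
    """One implicit update w_new = w - lr * grad L(w_new).
    For a quadratic loss this has the closed form below and stays stable even for
    step sizes that would make explicit SGD diverge. (Generic illustration only.)"""
    d = w.shape[0]
    A = np.eye(d) + lr * (Xb.T @ Xb)           # (I + lr * X^T X)
    b = w + lr * (Xb.T @ yb)
    return np.linalg.solve(A, b)

w = np.zeros(5)
for step in range(300):
    idx = rng.integers(0, X.shape[0], size=8)  # sample a mini-batch
    w = implicit_sgd_step(w, X[idx], y[idx], lr=0.5)
print("parameter error:", np.linalg.norm(w - w_true))
```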
- Eigen-informed NeuralODEs: Dealing with stability and convergence issues of NeuralODEs [0.0]
We present a technique to add knowledge of ODE properties based on eigenvalues to the training objective of a NeuralODE.
We show that the presented training process is far more robust against local minima, instabilities, and sparse data samples, and improves training convergence and performance (an eigenvalue-penalty sketch follows this entry).
arXiv Detail & Related papers (2023-02-07T14:45:39Z)
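The eigen-informed entry above augments the training objective with knowledge of ODE properties expressed through eigenvalues. A rough, hypothetical sketch of one such term is given below: it penalizes positive real parts of the eigenvalues of the Jacobian of f at sampled states, nudging the learned dynamics toward stability. The penalty form, its weight, and the state sampling are assumptions, not the formulation used in the cited paper.

```python
import torch
import torch.nn as nn
from torch.autograd.functional import jacobian

# A small dynamics network f(z) for an autonomous ODE dz/dt = f(z).
f = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))

def eigen_penalty(f, z_samples, margin=0.0):
    """Penalize positive real parts of the eigenvalues of the Jacobian df/dz
    at sampled states (a rough stand-in for an eigenvalue-informed loss term)."""
    penalty = torch.tensor(0.0)
    for z in z_samples:
        J = jacobian(f, z, create_graph=True)     # (2, 2) Jacobian at state z
        eigvals = torch.linalg.eigvals(J)         # complex eigenvalues of the Jacobian
        penalty = penalty + torch.relu(eigvals.real + margin).sum()
    return penalty / len(z_samples)

z_samples = torch.randn(16, 2)
data_loss = torch.tensor(0.0)                     # placeholder for the usual fitting loss
loss = data_loss + 0.1 * eigen_penalty(f, z_samples)  # eigen-informed objective
loss.backward()
```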
- A memory-efficient neural ODE framework based on high-level adjoint differentiation [4.063868707697316]
We present a new neural ODE framework, PNODE, based on high-level discrete algorithmic differentiation.
We show that PNODE achieves the highest memory efficiency when compared with other reverse-accurate methods.
arXiv Detail & Related papers (2022-06-02T20:46:26Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions to describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how variability in the solver space can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect the robustness of neural ODE models to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
- ResNet After All? Neural ODEs and Their Numerical Solution [28.954378025052925]
We show that trained Neural Ordinary Differential Equation models actually depend on the specific numerical method used during training.
We propose a method that monitors the behavior of the ODE solver during training to adapt its step size.
arXiv Detail & Related papers (2020-07-30T11:24:05Z)
- Time Dependence in Non-Autonomous Neural ODEs [74.78386661760662]
We propose a novel family of Neural ODEs with time-varying weights.
We outperform previous Neural ODE variants in both speed and representational capacity.
arXiv Detail & Related papers (2020-05-05T01:41:46Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and the stochastic versions of the same model.
However, the improvements obtained by data augmentation completely eliminate the empirical gains from stochastic regularization, making the performance difference between the neural ODE and the neural SDE negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)