On Tuning Neural ODE for Stability, Consistency and Faster Convergence
- URL: http://arxiv.org/abs/2312.01657v1
- Date: Mon, 4 Dec 2023 06:18:10 GMT
- Title: On Tuning Neural ODE for Stability, Consistency and Faster Convergence
- Authors: Sheikh Waqas Akhtar
- Abstract summary: We propose a first-order Nesterov's accelerated gradient (NAG) based ODE solver that is provably tuned with respect to the consistency, convergence and stability (CCS) conditions.
We empirically demonstrate the efficacy of our approach, training faster while achieving better or comparable performance compared to neural ODEs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural ODEs parameterize a differential equation with a continuous-depth neural
network and solve it using a numerical ODE integrator. These models offer a constant memory
cost, in contrast to models with a discrete sequence of hidden layers, whose memory cost grows
linearly with the number of layers. Beyond memory efficiency, neural ODEs can adapt their
evaluation strategy to the input and offer the flexibility to trade numerical precision for
faster training. Despite these benefits, they still have limitations. We identify the ODE
integrator (also called the ODE solver) as the weakest link in the chain: it may have
consistency, convergence and stability (CCS) issues, and may converge slowly or not at all.
We propose a first-order Nesterov's accelerated gradient (NAG) based ODE solver that is
provably tuned with respect to the CCS conditions. We empirically demonstrate the efficacy of
our approach, which trains faster while achieving better or comparable performance than neural
ODEs employing other fixed-step explicit ODE solvers, as well as discrete-depth models such as
ResNet, on three different tasks: supervised classification, density estimation, and
time-series modelling.
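As a concrete illustration of the idea of using a Nesterov-style momentum update as a fixed-step ODE solver, the sketch below unrolls such an integrator inside a neural ODE forward pass in PyTorch. The dynamics network `ODEFunc`, the momentum coefficient `mu`, the step size, and the update rule itself are illustrative assumptions, not the paper's exact NAG-based solver or its CCS tuning.

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Continuous-depth dynamics f(t, z) parameterized by a small MLP."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # Feed time as an extra input channel alongside the state.
        tcol = t.expand(z.shape[0], 1)
        return self.net(torch.cat([z, tcol], dim=1))

def nag_odeint(f, z0, t0=0.0, t1=1.0, n_steps=20, mu=0.9):
    """Fixed-step integration of dz/dt = f(t, z) with a NAG-style
    momentum / look-ahead update (illustrative, not the paper's exact scheme)."""
    h = (t1 - t0) / n_steps
    z, v = z0, torch.zeros_like(z0)
    t = torch.tensor(t0)
    for _ in range(n_steps):
        z_look = z + mu * v               # look-ahead point (Nesterov-style)
        v = mu * v + h * f(t, z_look)     # momentum accumulates the vector field
        z = z + v                         # state update
        t = t + h
    return z

# Usage: a toy forward pass; gradients flow through the unrolled solver.
func = ODEFunc(dim=2)
z0 = torch.randn(8, 2, requires_grad=True)
zT = nag_odeint(func, z0)
zT.sum().backward()
```

Because this sketch is a fixed-step explicit scheme, gradients are obtained by simply backpropagating through the unrolled loop, as in the usage example above.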
Related papers
- Efficient, Accurate and Stable Gradients for Neural ODEs [3.79830302036482]
We present a class of algebraically reversible solvers that are both high-order and numerically stable.
This construction naturally extends to numerical schemes for Neural CDEs and SDEs.
arXiv Detail & Related papers (2024-10-15T14:36:05Z)
- Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations [0.0]
Variational Autoencoders (VAEs) are a powerful framework for learning compact latent representations.
NeuralODEs excel in learning transient system dynamics.
This work combines the strengths of both to create fast surrogate models with adjustable complexity.
arXiv Detail & Related papers (2024-10-14T05:45:52Z)
- Faster Training of Neural ODEs Using Gauß-Legendre Quadrature [68.9206193762751]
We propose an alternative way to speed up the training of neural ODEs.
We use Gauss-Legendre quadrature to solve integrals faster than ODE-based methods.
We also extend the idea to training SDEs using the Wong-Zakai theorem by training a corresponding ODE and transferring the parameters (a minimal quadrature sketch follows this entry).
arXiv Detail & Related papers (2023-08-21T11:31:15Z)
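The entry above replaces repeated ODE solves with direct numerical quadrature. As a minimal, self-contained sketch of the quadrature building block only (not the cited paper's training procedure), the snippet below uses NumPy's Gauss-Legendre nodes and weights to approximate an integral on [0, 1]; the toy integrand stands in for quantities that would be integrated along a trajectory.

```python
import numpy as np

def gauss_legendre_integral(f, a=0.0, b=1.0, degree=8):
    """Approximate the integral of f over [a, b] with Gauss-Legendre quadrature."""
    # Nodes and weights on the reference interval [-1, 1].
    nodes, weights = np.polynomial.legendre.leggauss(degree)
    # Affine map from [-1, 1] to [a, b].
    t = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    return 0.5 * (b - a) * np.sum(weights * f(t))

# Toy integrand standing in for a learned vector field along a trajectory.
f = lambda t: np.exp(-t) * np.sin(3.0 * t)
approx = gauss_legendre_integral(f, 0.0, 1.0, degree=8)
exact = (3.0 - np.exp(-1.0) * (np.sin(3.0) + 3.0 * np.cos(3.0))) / 10.0
print(f"quadrature: {approx:.8f}  exact: {exact:.8f}")
```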
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (a generic implicit-update sketch follows this entry).
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
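The ISGD entry above replaces the explicit SGD step with an implicit update w_{k+1} = w_k - lr * grad L(w_{k+1}), which tends to be more stable for stiff or badly scaled losses. The sketch below illustrates the generic idea on a toy least-squares problem, where the implicit step has a closed form; the problem, step size, and batch size are assumptions, and this is not the cited paper's exact PINN algorithm.

```python
import numpy as np

# Toy least-squares problem L(w) = 0.5 * ||X w - y||^2 standing in for a PINN loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def implicit_sgd_step(w, Xb, yb, lr=0.5):
    """One implicit update w_new = w - lr * grad L(w_new).
    For a quadratic loss this has the closed form below and stays stable even for
    step sizes that would make explicit SGD diverge. (Generic illustration only.)"""
    d = w.shape[0]
    A = np.eye(d) + lr * (Xb.T @ Xb)           # (I + lr * X^T X)
    b = w + lr * (Xb.T @ yb)
    return np.linalg.solve(A, b)

w = np.zeros(5)
for step in range(300):
    idx = rng.integers(0, X.shape[0], size=8)  # sample a mini-batch
    w = implicit_sgd_step(w, X[idx], y[idx], lr=0.5)
print("parameter error:", np.linalg.norm(w - w_true))
```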
- Eigen-informed NeuralODEs: Dealing with stability and convergence issues of NeuralODEs [0.0]
We present a technique to add knowledge of ODE properties based on eigenvalues to the training objective of a NeuralODE.
We show that the presented training process is far more robust against local minima, instabilities, and sparse data samples, and improves training convergence and performance (an eigenvalue-penalty sketch follows this entry).
arXiv Detail & Related papers (2023-02-07T14:45:39Z)
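The eigen-informed entry above augments the training objective with knowledge of ODE properties expressed through eigenvalues. A rough, hypothetical sketch of one such term is given below: it penalizes positive real parts of the eigenvalues of the Jacobian of f at sampled states, nudging the learned dynamics toward stability. The penalty form, its weight, and the state sampling are assumptions, not the formulation used in the cited paper.

```python
import torch
import torch.nn as nn
from torch.autograd.functional import jacobian

# A small dynamics network f(z) for an autonomous ODE dz/dt = f(z).
f = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))

def eigen_penalty(f, z_samples, margin=0.0):
    """Penalize positive real parts of the eigenvalues of the Jacobian df/dz
    at sampled states (a rough stand-in for an eigenvalue-informed loss term)."""
    penalty = torch.tensor(0.0)
    for z in z_samples:
        J = jacobian(f, z, create_graph=True)     # (2, 2) Jacobian at state z
        eigvals = torch.linalg.eigvals(J)         # complex eigenvalues of the Jacobian
        penalty = penalty + torch.relu(eigvals.real + margin).sum()
    return penalty / len(z_samples)

z_samples = torch.randn(16, 2)
data_loss = torch.tensor(0.0)                     # placeholder for the usual fitting loss
loss = data_loss + 0.1 * eigen_penalty(f, z_samples)  # eigen-informed objective
loss.backward()
```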
- A memory-efficient neural ODE framework based on high-level adjoint differentiation [4.063868707697316]
We present a new neural ODE framework, PNODE, based on high-level discrete algorithmic differentiation.
We show that PNODE achieves the highest memory efficiency when compared with other reverse-accurate methods.
arXiv Detail & Related papers (2022-06-02T20:46:26Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions to describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)
- Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how variability in the solver space can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect the robustness of neural ODE models to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
- ResNet After All? Neural ODEs and Their Numerical Solution [28.954378025052925]
We show that trained Neural Ordinary Differential Equation models actually depend on the specific numerical method used during training.
We propose a method that monitors the behavior of the ODE solver during training to adapt its step size.
arXiv Detail & Related papers (2020-07-30T11:24:05Z)
- Time Dependence in Non-Autonomous Neural ODEs [74.78386661760662]
We propose a novel family of Neural ODEs with time-varying weights.
We outperform previous Neural ODE variants in both speed and representational capacity.
arXiv Detail & Related papers (2020-05-05T01:41:46Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and the stochastic versions of the same model.
However, the improvements obtained by data augmentation completely eliminate the empirical gains from stochastic regularization, making the performance difference between the neural ODE and the neural SDE negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)