Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs
- URL: http://arxiv.org/abs/2003.05271v2
- Date: Fri, 30 Oct 2020 21:44:17 GMT
- Title: Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs
- Authors: Talgat Daulbaev, Alexandr Katrutsa, Larisa Markeeva, Julia Gusak, Andrzej Cichocki, and Ivan Oseledets
- Abstract summary: We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
- Score: 71.26657499537366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a simple interpolation-based method for the efficient
approximation of gradients in neural ODE models. We compare it with the reverse
dynamic method (known in the literature as "adjoint method") to train neural
ODEs on classification, density estimation, and inference approximation tasks.
We also propose a theoretical justification of our approach using logarithmic
norm formalism. As a result, our method allows faster model training than the
reverse dynamic method, as confirmed and validated by extensive numerical
experiments on several standard benchmarks.
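
A minimal, self-contained sketch of the idea behind the abstract is given below. It is not the authors' implementation: it uses a toy linear ODE dz/dt = A z, a terminal loss L = 0.5 * ||z(T)||^2, analytic Jacobians, and SciPy's dense solver output as the stored interpolant of the forward trajectory (the paper's interpolation scheme and neural-ODE setting are more elaborate). The point it illustrates is that, once z(t) is available through a cheap interpolant, the adjoint ODE can be integrated backward without re-solving the state, which is where the reverse dynamic method spends extra work.

```python
# Sketch of interpolation-based adjoint gradients (hypothetical toy example,
# not the authors' code): a forward solve that keeps a dense interpolant,
# then a backward adjoint solve that queries the interpolant instead of
# re-integrating the state.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
d, T = 4, 1.0
A = rng.normal(scale=0.3, size=(d, d))   # parameters of the toy ODE dz/dt = A z
z0 = rng.normal(size=d)

# Forward pass: keep a dense interpolant of z(t).
fwd = solve_ivp(lambda t, z: A @ z, (0.0, T), z0,
                dense_output=True, rtol=1e-8, atol=1e-10)
z_of_t = fwd.sol                          # callable: z_of_t(t) -> z(t)
zT = fwd.y[:, -1]

# Backward pass: integrate the adjoint a(t) = dL/dz(t) and the parameter
# gradient together, using the interpolant for z(t):
#   da/dt = -(df/dz)^T a = -A^T a
#   dL/dA = integral_0^T a(t) z(t)^T dt   (minus sign below because we
#                                           integrate from T down to 0)
def backward_rhs(t, state):
    a = state[:d]
    return np.concatenate([-A.T @ a, -np.outer(a, z_of_t(t)).ravel()])

aT = zT.copy()                            # dL/dz(T) for L = 0.5 * ||z(T)||^2
bwd = solve_ivp(backward_rhs, (T, 0.0), np.concatenate([aT, np.zeros(d * d)]),
                rtol=1e-8, atol=1e-10)
grad_A = bwd.y[d:, -1].reshape(d, d)      # approximate dL/dA

# Finite-difference check of a single gradient entry.
eps = 1e-6
A_pert = A.copy()
A_pert[0, 1] += eps
zT_pert = solve_ivp(lambda t, z: A_pert @ z, (0.0, T), z0,
                    rtol=1e-10, atol=1e-12).y[:, -1]
print(grad_A[0, 1], (0.5 * zT_pert @ zT_pert - 0.5 * zT @ zT) / eps)
```

The finite-difference check at the end confirms that the interpolated-adjoint gradient matches a direct perturbation of the parameters.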
Related papers
- Adaptive Feedforward Gradient Estimation in Neural ODEs [0.0]
We propose a novel approach that leverages adaptive feedforward gradient estimation to improve the efficiency, consistency, and interpretability of Neural ODEs.
Our method eliminates the need for backpropagation and the adjoint method, reducing computational overhead and memory usage while maintaining accuracy.
arXiv Detail & Related papers (2024-09-22T18:21:01Z) - Correcting auto-differentiation in neural-ODE training [19.472357078065194]
We find that when a neural network employs high-order forms to approximate the underlying ODE flows, brute-force computation using auto-differentiation often produces non-converging artificial oscillations.
We propose a straightforward post-processing technique that effectively eliminates these oscillations, rectifies the computation and thus respects the updates of the underlying flow.
arXiv Detail & Related papers (2023-06-03T20:34:14Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected stochastic differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Implementation and (Inverse Modified) Error Analysis for
implicitly-templated ODE-nets [0.0]
We focus on learning unknown dynamics from data using ODE-nets templated on implicit numerical initial value problem solvers.
We perform Inverse Modified error analysis of the ODE-nets using unrolled implicit schemes for ease of interpretation.
We formulate an adaptive algorithm which monitors the level of error and adapts the number of (unrolled) implicit solution iterations.
arXiv Detail & Related papers (2023-03-31T06:47:02Z) - Extended dynamic mode decomposition with dictionary learning using
neural ordinary differential equations [0.8701566919381223]
We propose an algorithm to perform extended dynamic mode decomposition using NODEs.
Numerical experiments demonstrate the superior parameter efficiency of the proposed method.
arXiv Detail & Related papers (2021-10-01T06:56:14Z) - Distributional Gradient Matching for Learning Uncertain Neural Dynamics
Models [38.17499046781131]
We propose a novel approach towards estimating uncertain neural ODEs, avoiding the numerical integration bottleneck.
Our algorithm - distributional gradient matching (DGM) - jointly trains a smoother and a dynamics model and matches their gradients via minimizing a Wasserstein loss.
Our experiments show that, compared to traditional approximate inference methods based on numerical integration, our approach is faster to train, faster at predicting previously unseen trajectories, and in the context of neural ODEs, significantly more accurate.
arXiv Detail & Related papers (2021-06-22T08:40:51Z) - Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how variability in the space of solvers can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z) - Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.