Learning by solving differential equations
- URL: http://arxiv.org/abs/2505.13397v1
- Date: Mon, 19 May 2025 17:34:32 GMT
- Title: Learning by solving differential equations
- Authors: Benoit Dherin, Michael Munn, Hanna Mazzawi, Michael Wunder, Sourabh Medapati, Javier Gonzalvo
- Abstract summary: Runge-Kutta (RK) methods provide a family of very powerful explicit and implicit high-order ODE solvers. We evaluate the performance of RK solvers when applied in deep learning, study their limitations, and propose ways to overcome their drawbacks.
- Score: 5.999724026544112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern deep learning algorithms use variations of gradient descent as their main learning methods. Gradient descent can be understood as the simplest Ordinary Differential Equation (ODE) solver; namely, the Euler method applied to the gradient flow differential equation. Since Euler, many ODE solvers have been devised that follow the gradient flow equation more precisely and more stably. Runge-Kutta (RK) methods provide a family of very powerful explicit and implicit high-order ODE solvers. However, these higher-order solvers have not found wide application in deep learning so far. In this work, we evaluate the performance of higher-order RK solvers when applied in deep learning, study their limitations, and propose ways to overcome these drawbacks. In particular, we explore how to improve their performance by naturally incorporating key ingredients of modern neural network optimizers such as preconditioning, adaptive learning rates, and momentum.
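To make the abstract's framing concrete, here is a minimal sketch (not the authors' code) contrasting gradient descent, i.e., an Euler step on the gradient flow dw/dt = -grad L(w), with a classical fourth-order Runge-Kutta step on the same flow. The quadratic loss and step size are illustrative assumptions.

```python
import numpy as np

# Gradient flow: dw/dt = -grad L(w). Gradient descent is the Euler
# discretization w_{k+1} = w_k - h * grad L(w_k); classical RK4 follows
# the same flow to fourth order. Toy quadratic loss for illustration.
A = np.array([[3.0, 0.0], [0.0, 1.0]])

def grad(w):
    return A @ w  # gradient of L(w) = 0.5 * w^T A w

def euler_step(w, h):
    return w - h * grad(w)  # plain gradient descent

def rk4_step(w, h):
    # Four gradient evaluations per step, matching the flow to O(h^4).
    k1 = -grad(w)
    k2 = -grad(w + 0.5 * h * k1)
    k3 = -grad(w + 0.5 * h * k2)
    k4 = -grad(w + h * k3)
    return w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

w_euler = np.array([1.0, 1.0])
w_rk4 = np.array([1.0, 1.0])
for _ in range(20):
    w_euler, w_rk4 = euler_step(w_euler, 0.1), rk4_step(w_rk4, 0.1)
print(w_euler, w_rk4)  # both decay toward the minimum at the origin
```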
Related papers
- Training Stiff Neural Ordinary Differential Equations with Explicit Exponential Integration Methods [3.941173292703699]
Stiff ordinary differential equations (ODEs) are common in many science and engineering fields. Standard neural ODE approaches struggle to accurately learn stiff systems. This paper expands on our earlier work by exploring explicit exponential integration methods.
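For intuition, here is a minimal sketch of a generic exponential integrator on a stiff scalar test problem; the test equation and step size are illustrative choices, not taken from the paper.

```python
import numpy as np

# Stiff scalar test problem: y' = lam * y + g(t) with lam = -1000.
# Explicit Euler is unstable unless h < 2/|lam|; an exponential
# integrator treats the stiff linear part exactly and stays stable.
lam = -1000.0
g = lambda t: 1000.0 * np.cos(t)  # forcing term

def explicit_euler(y, t, h):
    return y + h * (lam * y + g(t))

def exponential_euler(y, t, h):
    # Variation of constants with g frozen at t_n:
    # y_{n+1} = e^{lam h} y_n + (e^{lam h} - 1)/lam * g(t_n)
    phi = (np.exp(lam * h) - 1.0) / lam
    return np.exp(lam * h) * y + phi * g(t)

h, y_e, y_x = 0.01, 1.0, 1.0  # h is far beyond explicit Euler's limit
for k in range(100):
    t = k * h
    y_e, y_x = explicit_euler(y_e, t, h), exponential_euler(y_x, t, h)
print(y_e, y_x)  # explicit Euler blows up; exponential Euler tracks cos(t)
```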
arXiv Detail & Related papers (2024-12-02T06:40:08Z)
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner, and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
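A rough sketch of the dual view: the GP posterior mean coefficients solve (K + noise * I) alpha = y, and stochastic coordinate updates on the dual objective converge to them. The kernel, step size, and coordinate scheme below are simplifying assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy GP regression: solve (K + noise * I) alpha = y in the dual
# parameterization by stochastic coordinate descent, loosely in the
# spirit of the paper (details here are heavily simplified).
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
K = np.exp(-0.5 * (X - X.T) ** 2)  # RBF kernel matrix
noise = 0.1

alpha = np.zeros(200)
lr = 0.5
for _ in range(2000):
    i = rng.integers(200)  # random training point (stochastic step)
    # Coordinate-i gradient of the dual objective
    # 0.5 * a^T (K + noise * I) a - y^T a.
    g_i = K[i] @ alpha + noise * alpha[i] - y[i]
    alpha[i] -= lr * g_i / (K[i, i] + noise)  # diagonally preconditioned step
residual = K @ alpha + noise * alpha - y
print(np.abs(residual).max())  # shrinks as updates accumulate
```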
arXiv Detail & Related papers (2023-10-31T16:15:13Z)
- Locally Regularized Neural Differential Equations: Some Black Boxes Were Meant to Remain Closed! [3.222802562733787]
Implicit layer deep learning techniques, like Neural Differential Equations, have become an important modeling framework.
We develop two sampling strategies to trade off between performance and training time.
Our method reduces the number of function evaluations to 0.556-0.733x of the baseline and accelerates predictions by 1.3-2x.
arXiv Detail & Related papers (2023-03-03T23:31:15Z)
- Learning Subgrid-scale Models with Neural Ordinary Differential Equations [0.39160947065896795]
We propose a new approach to learning the subgrid-scale model when simulating partial differential equations (PDEs).
In this approach, neural networks are used to learn the coarse- to fine-grid map, which can be viewed as a subgrid-scale parameterization.
Our method inherits the advantages of neural ODEs (NODEs) and can be used to parameterize subgrid scales, approximate coupling operators, and improve the efficiency of low-order solvers.
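A minimal structural sketch, under assumed dynamics: a known coarse-grid operator plus a (here untrained) network term standing in for the subgrid closure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Structure of a NODE subgrid model: the right-hand side is a known
# coarse-grid operator plus a learned correction standing in for the
# unresolved scales. The "network" here is a tiny random-weight MLP;
# a real implementation would train it through the ODE solve.
n = 32
u = np.sin(2 * np.pi * np.arange(n) / n)  # coarse-grid state

def coarse_rhs(u):
    # Second-order diffusion stencil on a periodic grid.
    return np.roll(u, -1) - 2 * u + np.roll(u, 1)

W1 = rng.standard_normal((16, n)) * 0.1
W2 = rng.standard_normal((n, 16)) * 0.1

def subgrid_nn(u):
    return W2 @ np.tanh(W1 @ u)  # learned closure term (untrained here)

def rhs(u):
    return coarse_rhs(u) + subgrid_nn(u)

# One explicit Euler step of the hybrid (known + learned) dynamics:
u = u + 0.1 * rhs(u)
print(u[:4])
```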
arXiv Detail & Related papers (2022-12-20T02:45:09Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
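For reference, the basic forward-gradient estimator: sample a random direction v, compute the directional derivative as a Jacobian-vector product (no backprop needed), and use (grad . v) v as an unbiased gradient estimate. The sketch below uses weight-space perturbations on a toy quadratic; the paper's lower-variance activation-space variant is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

# Forward gradient: E[(grad . v) v] = grad when v has identity
# covariance, so the update below is unbiased stochastic descent.
A = np.diag([4.0, 2.0, 1.0])
grad = lambda w: A @ w        # true gradient of L(w) = 0.5 * w^T A w

w = np.ones(3)
for _ in range(500):
    v = rng.standard_normal(3)
    jvp = grad(w) @ v         # directional derivative along v (a JVP)
    w -= 0.01 * jvp * v       # forward-gradient step, no backward pass
print(w)  # close to the minimum at the origin, up to estimator noise
```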
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Neural Basis Functions for Accelerating Solutions to High Mach Euler Equations [63.8376359764052]
We propose an approach to solving partial differential equations (PDEs) using a set of neural networks.
We regress a set of neural networks onto a reduced-order Proper Orthogonal Decomposition (POD) basis.
These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced-order approximation to the PDE.
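A sketch of the POD ingredient on synthetic snapshot data; the neural regression and branch network are omitted, and the snapshot set is an assumption made for illustration.

```python
import numpy as np

# POD in miniature: collect solution snapshots, take the SVD, and keep
# the leading left singular vectors as a reduced basis. In the paper,
# neural networks are regressed onto these basis functions and a branch
# network maps PDE parameters to the reduced coefficients.
x = np.linspace(0, 1, 200)
snapshots = np.stack(
    [np.sin(np.pi * k * x) * np.exp(-k) for k in range(1, 11)], axis=1
)

U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :3]  # leading 3 POD modes carry most of the energy

# Reduced-order reconstruction of a new field: project, then lift.
field = snapshots @ np.ones(10)
coeffs = basis.T @ field      # what a branch network would predict
recon = basis @ coeffs
print(np.linalg.norm(field - recon) / np.linalg.norm(field))  # small
```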
arXiv Detail & Related papers (2022-08-02T18:27:13Z)
- Efficiently Solving High-Order and Nonlinear ODEs with Rational Fraction Polynomial: the Ratio Net [3.155317790896023]
This study takes a different approach by introducing a neural network architecture for constructing trial functions, known as the ratio net.
Empirical trials demonstrate that the proposed method is more efficient than existing approaches.
The ratio net holds promise for advancing the efficiency and effectiveness of solving differential equations.
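The underlying trial-function principle in miniature, with an assumed [2/2] rational ansatz and a textbook test ODE rather than the ratio net architecture itself.

```python
import numpy as np
from scipy.optimize import minimize

# Represent the solution of u' = -u, u(0) = 1 by a rational fraction
# u(x) = (1 + a1*x + a2*x^2) / (1 + b1*x + b2*x^2) (u(0) = 1 built in)
# and fit the parameters by minimizing the ODE residual at collocation
# points. The actual ratio net is a neural architecture; this shows
# only the underlying rational-trial-function idea.
xs = np.linspace(0.0, 1.0, 50)

def u_and_residual(p):
    a1, a2, b1, b2 = p
    P = 1 + a1 * xs + a2 * xs**2
    Q = 1 + b1 * xs + b2 * xs**2
    dP = a1 + 2 * a2 * xs
    dQ = b1 + 2 * b2 * xs
    u = P / Q
    du = (dP * Q - P * dQ) / Q**2  # quotient rule
    return u, du + u               # residual of u' = -u

loss = lambda p: np.mean(u_and_residual(p)[1] ** 2)
p = minimize(loss, np.zeros(4)).x
u, _ = u_and_residual(p)
print(np.abs(u - np.exp(-xs)).max())  # small if the fit succeeds
```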
arXiv Detail & Related papers (2021-05-18T16:59:52Z)
- Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics [0.0]
We describe a novel regularization method that uses the internal cost of adaptive differential equation solvers combined with discrete sensitivities to guide the training process.
This approach opens up the blackbox numerical analysis behind the differential equation solver's algorithm and uses its local error estimates and stiffness heuristics as cheap and accurate cost estimates.
We demonstrate how our approach can halve the prediction time, and we showcase how this can increase the training time by an order of magnitude.
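A sketch of the quantity being regularized: an embedded low-order/high-order solver pair yields a local error estimate essentially for free. The fixed dynamics and Heun/Euler pair below are illustrative stand-ins for a neural ODE and a production adaptive solver.

```python
import numpy as np

# The regularizer penalizes the solver's local (embedded) error
# estimate so the learned dynamics become cheap to integrate. In the
# paper, f would be a neural network and the accumulated estimate
# would enter the training loss through discrete sensitivities.
def f(t, y):
    return -y + np.sin(5.0 * t)

def heun_step_with_error(t, y, h):
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_high = y + 0.5 * h * (k1 + k2)  # second-order (Heun) solution
    y_low = y + h * k1                # first-order (Euler) solution
    err = np.abs(y_high - y_low)      # local error estimate, "for free"
    return y_high, err

y, reg = 1.0, 0.0
for k in range(100):
    y, err = heun_step_with_error(0.05 * k, y, 0.05)
    reg += err  # a training loss could add c * reg as the regularizer
print(y, reg)
```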
arXiv Detail & Related papers (2021-05-09T12:03:03Z)
- Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how variability in the solver space can improve the performance of neural ODEs.
We show that the right choice of solver parameterization can significantly affect neural ODE models in terms of robustness to adversarial attacks.
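One concrete instance of a parameterized solver family is the second-order Runge-Kutta family indexed by alpha (alpha = 0.5 is the midpoint method, alpha = 1 is Heun's method); searching over alpha is a minimal version of choosing a solver parameterization. This sketch is illustrative, not the paper's meta-solver.

```python
import numpy as np

# RK2 family: every alpha in (0, 1] gives a valid second-order method,
# with the weights fixed by the order conditions.
def rk2_step(f, t, y, h, alpha):
    k1 = f(t, y)
    k2 = f(t + alpha * h, y + alpha * h * k1)
    b2 = 1.0 / (2.0 * alpha)  # order conditions determine the weights
    return y + h * ((1.0 - b2) * k1 + b2 * k2)

f = lambda t, y: -y
for alpha in (0.5, 0.75, 1.0):
    y = 1.0
    for k in range(10):
        y = rk2_step(f, 0.1 * k, y, 0.1, alpha)
    print(alpha, abs(y - np.exp(-1.0)))  # all second-order accurate
```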
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
- Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
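A sketch of the core trick as described: the backward pass needs the forward trajectory at arbitrary times, so store checkpoints and interpolate instead of re-solving the ODE in reverse. The dynamics and the piecewise-linear interpolant below are simplifying assumptions.

```python
import numpy as np

# The adjoint/backward pass needs the forward trajectory z(t) at
# arbitrary times. Instead of re-solving the ODE backwards (the
# reverse dynamic method), store a few checkpoints and reconstruct
# z(t) by interpolation. Sketch with a flow whose solution is known.
ts = np.linspace(0.0, 2.0, 9)  # coarse checkpoint grid
zs = np.exp(-ts)               # forward solution of dz/dt = -z, z(0) = 1

def z_interp(t):
    # A smoother (e.g. barycentric or cubic) interpolant would be used
    # in practice; np.interp keeps the sketch minimal.
    return np.interp(t, ts, zs)

# The backward pass can now query the trajectory at any t it needs:
query = np.array([0.13, 0.77, 1.55])
print(z_interp(query), np.exp(-query))  # interpolant vs. true trajectory
```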
arXiv Detail & Related papers (2020-03-11T13:15:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.