Related papers: Training Neural ODEs Using Fully Discretized Simultaneous Optimization

URL: http://arxiv.org/abs/2502.15642v1
Date: Fri, 21 Feb 2025 18:10:26 GMT
Title: Training Neural ODEs Using Fully Discretized Simultaneous Optimization
Authors: Mariia Shapovalova, Calvin Tsay
Abstract summary: Training Neural Ordinary Differential Equations (Neural ODEs) requires solving differential equations at each epoch, leading to high computational costs. In particular, we employ a collocation-based, fully discretized formulation and use IPOPT, a solver for large-scale nonlinear optimization. Our results show significant potential for (collocation-based) simultaneous Neural ODE training pipelines.
Score: 2.290491821371513
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural Ordinary Differential Equations (Neural ODEs) represent continuous-time dynamics with neural networks, offering advancements for modeling and control tasks. However, training Neural ODEs requires solving differential equations at each epoch, leading to high computational costs. This work investigates simultaneous optimization methods as a faster training alternative. In particular, we employ a collocation-based, fully discretized formulation and use IPOPT--a solver for large-scale nonlinear optimization--to simultaneously optimize collocation coefficients and neural network parameters. Using the Van der Pol Oscillator as a case study, we demonstrate faster convergence compared to traditional training methods. Furthermore, we introduce a decomposition framework utilizing Alternating Direction Method of Multipliers (ADMM) to effectively coordinate sub-models among data batches. Our results show significant potential for (collocation-based) simultaneous Neural ODE training pipelines.
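As a rough illustration of what "fully discretized" means in the abstract above, the sketch below writes the Van der Pol dynamics as algebraic defect constraints on a time grid, the kind of equality constraints a nonlinear programming solver would enforce. It is a minimal sketch under assumptions not taken from the paper: trapezoidal collocation stands in for the paper's collocation scheme, the known Van der Pol right-hand side stands in for the neural network, and all names (`vdp_rhs`, `rk4_trajectory`, `trapezoidal_defects`, `max_abs_defect`) and the values `mu = 1.0` and `h = 0.01` are hypothetical.

```python
def vdp_rhs(state, mu=1.0):
    """Van der Pol oscillator: x' = v, v' = mu * (1 - x^2) * v - x."""
    x, v = state
    return (v, mu * (1.0 - x * x) * v - x)


def rk4_trajectory(f, state0, h, n_steps):
    """Reference trajectory from classical RK4 (the sequential, solver-based route)."""
    traj = [tuple(state0)]
    for _ in range(n_steps):
        s = traj[-1]
        k1 = f(s)
        k2 = f(tuple(si + 0.5 * h * ki for si, ki in zip(s, k1)))
        k3 = f(tuple(si + 0.5 * h * ki for si, ki in zip(s, k2)))
        k4 = f(tuple(si + h * ki for si, ki in zip(s, k3)))
        traj.append(tuple(
            si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)
        ))
    return traj


def trapezoidal_defects(f, traj, h):
    """Defect on each interval: z[k+1] - z[k] - h/2 * (f(z[k]) + f(z[k+1])).

    In a simultaneous formulation the states z[k] are decision variables and
    these defects are equality constraints driven to zero by the optimizer.
    """
    defects = []
    for z0, z1 in zip(traj[:-1], traj[1:]):
        f0, f1 = f(z0), f(z1)
        defects.append(tuple(
            b - a - 0.5 * h * (fa + fb)
            for a, b, fa, fb in zip(z0, z1, f0, f1)
        ))
    return defects


def max_abs_defect(defects):
    """Largest constraint violation over all intervals and state components."""
    return max(abs(d) for row in defects for d in row)


h = 0.01
true_traj = rk4_trajectory(vdp_rhs, (2.0, 0.0), h, 200)

# States that follow the dynamics nearly satisfy the constraints ...
good = max_abs_defect(trapezoidal_defects(vdp_rhs, true_traj, h))

# ... while an arbitrary guess (a constant trajectory) violates them.
guess = [(2.0, 0.0)] * len(true_traj)
bad = max_abs_defect(trapezoidal_defects(vdp_rhs, guess, h))

print(good, bad)
```

In a training setting, the right-hand side would be a neural network, and the grid states and network weights would be optimized together against a data-fit objective subject to these constraints. The abstract names IPOPT as the solver used for that step; the sketch only evaluates the constraints and does not solve the optimization problem.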

Related papers

Layerwise goal-oriented adaptivity for neural ODEs: an optimal control perspective [0.0]
We propose a novel layerwise adaptive construction method for neural network architectures. We present results for a selection of well-known examples from the literature.
arXiv Detail & Related papers (2026-01-12T10:32:37Z)
Fractional Spike Differential Equations Neural Network with Efficient Adjoint Parameters Training [63.3991315762955]
Spiking Neural Networks (SNNs) draw inspiration from biological neurons to create realistic models for brain-like computation. Most existing SNNs assume a single time constant for neuronal membrane voltage dynamics, modeled by first-order ordinary differential equations (ODEs) with Markovian characteristics. We propose the Fractional SPIKE Differential Equation neural network (fspikeDE), which captures long-term dependencies in membrane voltage and spike trains through fractional-order dynamics.
arXiv Detail & Related papers (2025-07-22T18:20:56Z)
Efficient Training of Physics-enhanced Neural ODEs via Direct Collocation and Nonlinear Programming [0.0]
We propose a novel approach for training Physics-enhanced Neural ODEs (PeN-ODEs) by expressing the training process as a dynamic optimization problem. The full model, including neural components, is discretized using a high-order implicit Runge-Kutta method with flipped Legendre-Gauss-Radau points. This formulation enables simultaneous optimization of network parameters and state trajectories, addressing key limitations of ODE solver-based training in terms of stability, runtime, and accuracy.
arXiv Detail & Related papers (2025-05-06T14:04:46Z)
A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations [0.4935512063616847]
We study neural differential-algebraic systems of equations (DAEs), where some unknown relationships are learned from data. We apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem. We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings.
arXiv Detail & Related papers (2025-04-07T01:26:55Z)
PMNN: Physical Model-driven Neural Network for solving time-fractional differential equations [17.66402435033991]
An innovative Physical Model-driven Neural Network (PMNN) method is proposed to solve time-fractional differential equations. It effectively combines deep neural networks (DNNs) with approximation of fractional derivatives.
arXiv Detail & Related papers (2023-10-07T12:43:32Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Homotopy-based training of NeuralODEs for accurate dynamics discovery [0.0]
We develop a new training method for NeuralODEs, based on synchronization and homotopy optimization. We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape. Our method achieves competitive or better training loss while often requiring less than half the number of training epochs.
arXiv Detail & Related papers (2022-10-04T06:32:45Z)
TO-FLOW: Efficient Continuous Normalizing Flows with Temporal Optimization adjoint with Moving Speed [12.168241245313164]
Continuous normalizing flows (CNFs) construct invertible mappings between an arbitrary complex distribution and an isotropic Gaussian distribution. CNFs have not been tractable on large datasets due to the incremental complexity of neural ODE training. In this paper, a temporal optimization is proposed that optimizes the evolutionary time for forward propagation in neural ODE training.
arXiv Detail & Related papers (2022-03-19T14:56:41Z)
Influence Estimation and Maximization via Neural Mean-Field Dynamics [60.91291234832546]
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems. Our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities.
arXiv Detail & Related papers (2021-06-03T00:02:05Z)
Accelerating Neural ODEs Using Model Order Reduction [0.0]
We show that mathematical model order reduction methods can be used for compressing and accelerating Neural ODEs. We implement our novel compression method by developing Neural ODEs that integrate the necessary subspace-projection operations as layers of the neural network.
arXiv Detail & Related papers (2021-05-28T19:27:09Z)
Meta-Solver for Neural Ordinary Differential Equations [77.8918415523446]
We investigate how the variability in solvers' space can improve neural ODEs performance. We show that the right choice of solver parameterization can significantly affect neural ODEs models in terms of robustness to adversarial attacks.
arXiv Detail & Related papers (2021-03-15T17:26:34Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs). We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model. In theory, our algorithm requires far fewer communication rounds. Our experiments on several datasets show the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models. We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.