Delay Differential Neural Networks
- URL: http://arxiv.org/abs/2012.06800v1
- Date: Sat, 12 Dec 2020 12:20:54 GMT
- Title: Delay Differential Neural Networks
- Authors: Srinivas Anumasa, P.K. Srijith
- Abstract summary: We propose a novel model, delay differential neural networks (DDNN), inspired by delay differential equations (DDEs).
For training DDNNs, we provide a memory-efficient adjoint method for computing gradients and backpropagating through the network.
Experiments conducted on synthetic and real-world image classification datasets such as CIFAR-10 and CIFAR-100 show the effectiveness of the proposed models.
- Score: 0.2538209532048866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural ordinary differential equations (NODEs) treat the computation of intermediate feature vectors as trajectories of an ordinary differential equation parameterized by a neural network. In this paper, we propose a novel model, delay differential neural networks (DDNN), inspired by delay differential equations (DDEs). The proposed model considers the derivative of the hidden feature vector as a function of the current feature vector and past feature vectors (the history). The function is modelled as a neural network and consequently leads to continuous-depth alternatives to many recent ResNet variants. We propose two different DDNN architectures, depending on how the current and past feature vectors are combined. For training DDNNs, we provide a memory-efficient adjoint method for computing gradients and backpropagating through the network. DDNN improves the data efficiency of NODE by further reducing the number of parameters without affecting generalization performance. Experiments conducted on synthetic and real-world image classification datasets such as CIFAR-10 and CIFAR-100 show the effectiveness of the proposed models.
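
The abstract specifies the model's dynamics, the derivative of the hidden state as a function of the current state and a delayed state, but no code is reproduced on this page. Below is a minimal sketch of that idea under stated assumptions: a single fixed delay tau, a constant initial history h(t) = h(0) for t <= 0, and forward Euler integration. The class name `DelayBlock`, the concatenation of current and delayed states, and the fixed-step solver are illustrative choices, not the authors' implementation, and gradients here flow through the unrolled solver rather than through the paper's memory-efficient adjoint.

```python
import torch
import torch.nn as nn

class DelayBlock(nn.Module):
    """Sketch of a delay-differential block: dh/dt = f(h(t), h(t - tau)).

    Illustrative assumptions: one fixed delay, constant history
    h(t) = h(0) for t <= 0, forward Euler integration. Gradients flow
    through the unrolled solver, not a memory-efficient adjoint.
    """

    def __init__(self, dim, tau=0.1, t1=1.0, dt=0.02):
        super().__init__()
        assert tau >= dt, "delay must span at least one solver step"
        self.tau, self.t1, self.dt = tau, t1, dt
        # f maps the concatenated [h(t), h(t - tau)] to dh/dt
        self.f = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.Tanh(), nn.Linear(dim, dim)
        )

    def forward(self, h0):
        lag = int(round(self.tau / self.dt))   # delay measured in solver steps
        history = [h0]                         # h(t) on the solver grid
        h = h0
        for k in range(int(round(self.t1 / self.dt))):
            # h(t - tau); before t = tau, reuse the initial state h(0)
            h_delayed = history[max(0, k - lag)]
            dh = self.f(torch.cat([h, h_delayed], dim=-1))
            h = h + self.dt * dh               # forward Euler step
            history.append(h)
        return h

# usage: a DDE "layer" acting on 64-dimensional feature vectors
block = DelayBlock(dim=64)
h0 = torch.randn(8, 64)
out = block(h0)        # shape (8, 64)
out.sum().backward()   # autograd through the unrolled solver
```

The two DDNN architectures mentioned in the abstract differ in how the current and past feature vectors enter f; concatenation, as above, is just one plausible reading.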
Related papers
- PMNN: Physical Model-driven Neural Network for solving time-fractional differential equations [17.66402435033991]
An innovative Physical Model-driven Neural Network (PMNN) method is proposed to solve time-fractional differential equations.
It effectively combines deep neural networks (DNNs) with an approximation of fractional derivatives.
arXiv Detail & Related papers (2023-10-07T12:43:32Z) - A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z) - Semantic Segmentation using Neural Ordinary Differential Equations [3.7588109000722727]
In residual networks, instead of having a discrete sequence of hidden layers, the continuous dynamics of the hidden state can be parameterized by an ODE (see the sketch after this list).
We show that our neural ODE achieves state-of-the-art results using 57% less memory for training, 42% less memory for testing, and 68% fewer parameters.
arXiv Detail & Related papers (2022-09-18T22:13:55Z) - Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient AI models when implemented on neuromorphic hardware.
However, efficiently training SNNs is a challenge due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures of artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - Scaling Properties of Deep Residual Networks [2.6763498831034043]
We investigate the properties of weights trained by gradient descent and their scaling with network depth through numerical experiments.
We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature.
These findings cast doubt on the validity of the neural ODE model as an adequate description of deep ResNets.
arXiv Detail & Related papers (2021-05-25T22:31:30Z) - A Novel Neural Network Training Framework with Data Assimilation [2.948167339160823]
A gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients.
The results show that the proposed training framework performed better than the gradient descent method.
arXiv Detail & Related papers (2020-10-06T11:12:23Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in a structure suitable for neural networks.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z) - Fractional Deep Neural Network via Constrained Optimization [0.0]
This paper introduces a novel algorithmic framework for a deep neural network (DNN).
Fractional-DNN can be viewed as a time-discretization of a fractional-in-time nonlinear ordinary differential equation (ODE).
arXiv Detail & Related papers (2020-04-01T21:58:21Z)