Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and
(gradient) stable architecture for learning long time dependencies
- URL: http://arxiv.org/abs/2010.00951v2
- Date: Sun, 14 Mar 2021 19:12:57 GMT
- Title: Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and
(gradient) stable architecture for learning long time dependencies
- Authors: T. Konstantin Rusch, Siddhartha Mishra
- Abstract summary: We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
- Score: 15.2292571922932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Circuits of biological neurons, such as those in the functional parts of the brain, can be modeled as networks of coupled oscillators. Inspired by the ability of
these systems to express a rich set of outputs while keeping (gradients of)
state variables bounded, we propose a novel architecture for recurrent neural
networks. Our proposed RNN is based on a time-discretization of a system of
second-order ordinary differential equations, modeling networks of controlled
nonlinear oscillators. We prove precise bounds on the gradients of the hidden
states, leading to the mitigation of the exploding and vanishing gradient
problem for this RNN. Experiments show that the proposed RNN is comparable in
performance to the state of the art on a variety of benchmarks, demonstrating
the potential of this architecture to provide stable and accurate RNNs for
processing complex sequential data.
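For orientation, below is a minimal NumPy sketch of a coupled-oscillator recurrent update in the spirit of the abstract: the hidden state and its first derivative are evolved by a time-discretized, damped, controlled second-order ODE. The function name `cornn_step`, the explicit treatment of the damping terms, and all hyperparameter values are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def cornn_step(y, z, u, W, Wz, V, b, dt=0.01, gamma=1.0, eps=1.0):
    """One step of a coupled-oscillator recurrent cell (illustrative sketch).

    Discretizes y'' = tanh(W y + Wz y' + V u + b) - gamma*y - eps*y'
    with a simple explicit update; the paper's discretization may differ
    (e.g. in how the damping terms are handled), so this is only a sketch.
    """
    force = np.tanh(W @ y + Wz @ z + V @ u + b)     # controlled nonlinear oscillator forcing
    z_new = z + dt * (force - gamma * y - eps * z)  # velocity (first derivative of hidden state)
    y_new = y + dt * z_new                          # hidden state
    return y_new, z_new

# Toy usage: run a random input sequence through the cell (all sizes are arbitrary).
rng = np.random.default_rng(0)
d_hidden, d_in, T = 8, 3, 50
W  = rng.standard_normal((d_hidden, d_hidden)) / np.sqrt(d_hidden)
Wz = rng.standard_normal((d_hidden, d_hidden)) / np.sqrt(d_hidden)
V  = rng.standard_normal((d_hidden, d_in)) / np.sqrt(d_in)
b  = np.zeros(d_hidden)
y = z = np.zeros(d_hidden)
for t in range(T):
    y, z = cornn_step(y, z, rng.standard_normal(d_in), W, Wz, V, b)
print("final hidden state norm:", np.linalg.norm(y))
```

The damping terms (gamma, eps) are what keep the states, and hence their gradients, bounded in the paper's analysis; this sketch only illustrates the shape of the update, not the proved bounds.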
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the computational time and space complexities from cubic and quadratic in the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - Graph Neural Reaction Diffusion Models [14.164952387868341]
We propose a novel family of Reaction GNNs based on neural RD systems.
We discuss the theoretical properties of our RDGNN and its implementation, and show that it improves upon or is competitive with state-of-the-art methods.
arXiv Detail & Related papers (2024-06-16T09:46:58Z) - Synchronized Stepwise Control of Firing and Learning Thresholds in a Spiking Randomly Connected Neural Network toward Hardware Implementation [0.0]
We propose hardware-oriented models of intrinsic plasticity (IP) and synaptic plasticity (SP) for a spiking randomly connected neural network (RNN).
We demonstrate the effectiveness of our model through simulations of temporal data learning and anomaly detection with a spiking RNN using publicly available electrocardiograms.
arXiv Detail & Related papers (2024-04-26T08:26:10Z) - A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z) - Lyapunov-Guided Representation of Recurrent Neural Network Performance [9.449520199858952]
Recurrent Neural Networks (RNNs) are ubiquitous computing systems for sequence and time-series data.
We propose to treat RNNs as dynamical systems and to correlate hyperparameters with accuracy through Lyapunov spectral analysis.
Our studies of various RNN architectures show that AeLLE successfully correlates the RNN Lyapunov spectrum with accuracy.
arXiv Detail & Related papers (2022-04-11T05:38:38Z) - A novel Deep Neural Network architecture for non-linear system
identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification.
Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function).
This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z) - A unified framework for Hamiltonian deep neural networks [3.0934684265555052]
Training deep neural networks (DNNs) can be difficult due to vanishing/exploding gradients during weight optimization.
We propose a class of DNNs stemming from the time discretization of Hamiltonian systems.
The proposed Hamiltonian framework, besides encompassing existing networks inspired by marginally stable ODEs, allows one to derive new and more expressive architectures.
arXiv Detail & Related papers (2021-04-27T13:20:24Z) - UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations.
The resulting RNN is fast, invertible (in time), and memory efficient, and we derive rigorous bounds on the hidden state gradients to prove mitigation of the exploding and vanishing gradient problem.
arXiv Detail & Related papers (2021-03-09T15:19:59Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Lipschitz Recurrent Neural Networks [100.72827570987992]
We show that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks.
arXiv Detail & Related papers (2020-06-22T08:44:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.