Learning Various Length Dependence by Dual Recurrent Neural Networks
- URL: http://arxiv.org/abs/2005.13867v1
- Date: Thu, 28 May 2020 09:30:01 GMT
- Title: Learning Various Length Dependence by Dual Recurrent Neural Networks
- Authors: Chenpeng Zhang (1), Shuai Li (2), Mao Ye (1), Ce Zhu (2), Xue Li (3)
((1) School of Computer Science and Engineering, University of Electronic
Science and Technology of China, (2) School of Information and Communication
Engineering, University of Electronic Science and Technology of China, (3)
School of Information Technology and Electronic Engineering, The University
of Queensland)
- Abstract summary: We propose a new model named Dual Recurrent Neural Networks (DuRNN).
The DuRNN consists of two parts: one learns the short-term dependence and the other progressively learns the long-term dependence.
Our contributions are: 1) a new recurrent model developed based on the divide-and-conquer strategy to learn long- and short-term dependence separately, and 2) a selection mechanism to enhance the separation and learning of dependence at different temporal scales.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks (RNNs) are widely used as a memory model for sequence-related problems. Many RNN variants have been proposed to solve the gradient problems that arise in training RNNs and to process long sequences. Despite these classical models, capturing long-term dependence while responding to short-term changes remains a challenge. To address this problem, we propose a new model named Dual Recurrent Neural Networks (DuRNN). The DuRNN consists of two parts: one learns the short-term dependence and the other progressively learns the long-term dependence. The first part is a recurrent neural network with constrained full recurrent connections that handles the short-term dependence in the sequence and generates short-term memory. The other part is a recurrent neural network with independent recurrent connections, which helps to learn the long-term dependence and generate long-term memory. A selection mechanism is added between the two parts to help transfer the needed long-term information to the independent neurons. Multiple modules can be stacked to form a multi-layer model for better performance. Our contributions are: 1) a new recurrent model developed based on the divide-and-conquer strategy to learn long- and short-term dependence separately, and 2) a selection mechanism to enhance the separation and learning of dependence at different temporal scales. Both theoretical analysis and extensive experiments validate the performance of our model, and we also conduct simple visualization experiments and ablation analyses for model interpretability. Experimental results indicate that the proposed DuRNN model handles not only very long sequences (over 5000 time steps) but also short sequences very well. Compared with many state-of-the-art RNN models, our model demonstrates better performance with higher efficiency.
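A minimal PyTorch sketch of the dual-module idea described in the abstract, included purely for illustration. The class names, the weight-clipping used here to realise the "constrained full recurrent connections", and the sigmoid selection gate are assumptions made for this sketch, not details taken from the paper's implementation.

# Illustrative sketch only (not the authors' code): a DuRNN-style dual module.
# The clipping constraint, the gating form, and all names are assumptions.
import torch
import torch.nn as nn


class ShortTermCell(nn.Module):
    # Fully recurrent cell with constrained recurrent weights (short-term memory).
    def __init__(self, input_size, hidden_size, clip=1.0):
        super().__init__()
        self.inp = nn.Linear(input_size, hidden_size)
        self.rec = nn.Linear(hidden_size, hidden_size, bias=False)
        self.clip = clip

    def forward(self, x_t, h_prev):
        # One possible reading of "constrained full recurrent connections":
        # keep every recurrent weight inside [-clip, clip].
        with torch.no_grad():
            self.rec.weight.clamp_(-self.clip, self.clip)
        return torch.tanh(self.inp(x_t) + self.rec(h_prev))


class LongTermCell(nn.Module):
    # IndRNN-style cell: each neuron has an independent recurrent connection.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.inp = nn.Linear(input_size, hidden_size)
        self.rec = nn.Parameter(torch.ones(hidden_size))

    def forward(self, x_t, c_prev):
        return torch.relu(self.inp(x_t) + self.rec * c_prev)


class DuRNNModule(nn.Module):
    # One dual module: short-term part -> selection gate -> long-term part.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.short = ShortTermCell(input_size, hidden_size)
        self.long = LongTermCell(hidden_size, hidden_size)
        self.select = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        # x: (seq_len, batch, input_size); returns (seq_len, batch, hidden_size).
        seq_len, batch, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)  # short-term memory
        c = x.new_zeros(batch, self.hidden_size)  # long-term memory
        outputs = []
        for t in range(seq_len):
            h = self.short(x[t], h)
            gated = torch.sigmoid(self.select(h)) * h  # selection mechanism
            c = self.long(gated, c)
            outputs.append(c)
        return torch.stack(outputs)

Since each module maps a sequence to a sequence of the same length, several modules can be stacked, e.g. nn.Sequential(DuRNNModule(8, 128), DuRNNModule(128, 128)), mirroring the multi-layer variant mentioned in the abstract.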
Related papers
- TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential
Modelling [54.97005925277638]
The identification of sensory cues associated with potential opportunities and dangers is frequently complicated by unrelated events that separate useful cues by long delays.
It remains a challenging task for state-of-the-art spiking neural networks (SNNs) to establish long-term temporal dependency between distant cues.
We propose a novel biologically inspired Two-Compartment Leaky Integrate-and-Fire spiking neuron model, dubbed TC-LIF.
arXiv Detail & Related papers (2023-08-25T08:54:41Z)
- Long Short-term Memory with Two-Compartment Spiking Neuron [64.02161577259426]
We propose a novel biologically inspired Long Short-Term Memory Leaky Integrate-and-Fire spiking neuron model, dubbed LSTM-LIF.
Our experimental results, on a diverse range of temporal classification tasks, demonstrate superior temporal classification capability, rapid training convergence, strong network generalizability, and high energy efficiency of the proposed LSTM-LIF model.
This work, therefore, opens up a myriad of opportunities for resolving challenging temporal processing tasks on emerging neuromorphic computing machines.
arXiv Detail & Related papers (2023-07-14T08:51:03Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- An Improved Time Feedforward Connections Recurrent Neural Networks [3.0965505512285967]
Recurrent Neural Networks (RNNs) have been widely applied to deal with temporal problems, such as flood forecasting and financial data processing.
Traditional RNN models amplify the gradient issue due to their strict serial dependency in time.
An improved Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs) model was first proposed to address the gradient issue.
A novel cell structure named Single Gate Recurrent Unit (SGRU) was presented to reduce the number of parameters for RNNs cell.
arXiv Detail & Related papers (2022-11-03T09:32:39Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing [16.69710555668727]
We propose a novel neuron model that has a cosine activation with a time-varying component for sequential processing.
The proposed neuron provides an efficient building block for projecting sequential inputs into the spectral domain.
Applying the proposed model to sentiment analysis on the IMDB dataset reaches 89.4% test accuracy within 5 epochs.
arXiv Detail & Related papers (2021-09-14T19:08:07Z)
- CARRNN: A Continuous Autoregressive Recurrent Neural Network for Deep Representation Learning from Sporadic Temporal Data [1.8352113484137622]
In this paper, a novel deep learning-based model is developed for modeling multiple temporal features in sporadic data.
The proposed model, called CARRNN, uses a generalized discrete-time autoregressive model that is trainable end-to-end using neural networks modulated by time lags.
It is applied to multivariate time-series regression tasks using data provided for Alzheimer's disease progression modeling and intensive care unit (ICU) mortality rate prediction.
arXiv Detail & Related papers (2021-04-08T12:43:44Z)
- UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations.
The resulting RNN is fast, invertible (in time), memory efficient and we derive rigorous bounds on the hidden state gradients to prove the mitigation of the exploding and vanishing gradient problem.
arXiv Detail & Related papers (2021-03-09T15:19:59Z)
- On the Memory Mechanism of Tensor-Power Recurrent Models [25.83531612758211]
We investigate the memory mechanism of TP recurrent models.
We show that a large degree p is an essential condition to achieve the long memory effect.
The new model is expected to benefit from the long memory effect in a stable manner.
arXiv Detail & Related papers (2021-03-02T07:07:47Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
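The additive structure summarised in the entry above (one small network per input feature, outputs summed) can be sketched in a few lines. This is a hedged illustration only; the class name, layer sizes, and activation are chosen here and are not taken from the NAM paper.

# Hedged illustration of the Neural Additive Model idea: one sub-network per
# input feature, with the prediction being the sum of per-feature contributions
# plus a bias. Sizes and activations are assumptions made for this sketch.
import torch
import torch.nn as nn


class NeuralAdditiveSketch(nn.Module):
    def __init__(self, num_features, hidden_size=32):
        super().__init__()
        self.feature_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden_size), nn.ReLU(), nn.Linear(hidden_size, 1))
            for _ in range(num_features)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # x: (batch, num_features); each column is handled by its own network,
        # so each feature's contribution can be inspected in isolation.
        contributions = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        return torch.sum(torch.cat(contributions, dim=1), dim=1, keepdim=True) + self.bias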
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.