Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate
- URL: http://arxiv.org/abs/2310.14982v2
- Date: Sun, 10 Nov 2024 12:03:37 GMT
- Title: Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate
- Authors: Pengfei Sun, Jibin Wu, Malu Zhang, Paul Devos, Dick Botteldooren
- Abstract summary: Recurrent Neural Networks (RNNs) are widely recognized for their proficiency in modeling temporal dependencies.
This paper proposes a novel Delayed Memory Unit (DMU) for gated RNNs.
The DMU incorporates a delay line structure along with delay gates into the vanilla RNN, thereby enhancing temporal interaction and facilitating temporal credit assignment.
- Score: 16.4160685571157
- License:
- Abstract: Recurrent Neural Networks (RNNs) are widely recognized for their proficiency in modeling temporal dependencies, making them highly prevalent in sequential data processing applications. Nevertheless, vanilla RNNs are confronted with the well-known issue of gradient vanishing and exploding, posing a significant challenge for learning and establishing long-range dependencies. Additionally, gated RNNs tend to be over-parameterized, resulting in poor computational efficiency and network generalization. To address these challenges, this paper proposes a novel Delayed Memory Unit (DMU). The DMU incorporates a delay line structure along with delay gates into vanilla RNN, thereby enhancing temporal interaction and facilitating temporal credit assignment. Specifically, the DMU is designed to directly distribute the input information to the optimal time instant in the future, rather than aggregating and redistributing it over time through intricate network dynamics. Our proposed DMU demonstrates superior temporal modeling capabilities across a broad range of sequential modeling tasks, utilizing considerably fewer parameters than other state-of-the-art gated RNN models in applications such as speech recognition, radar gesture recognition, ECG waveform segmentation, and permuted sequential image classification.
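For intuition, below is a minimal sketch of the delay-gate idea in PyTorch. It is written purely for illustration under assumptions about the cell's exact form: the names DMUCell and max_delay, the softmax delay gate, and the additive buffer are hypothetical choices, not the authors' code. Each input is projected once, a learned gate spreads it over a short buffer of future time slots, and a vanilla-RNN update consumes whichever contribution falls due at the current step.
```python
import torch
import torch.nn as nn


class DMUCell(nn.Module):
    """Hypothetical sketch of a delay-gated RNN cell (not the authors' implementation)."""

    def __init__(self, input_size: int, hidden_size: int, max_delay: int = 8):
        super().__init__()
        self.hidden_size = hidden_size
        self.max_delay = max_delay
        self.in_proj = nn.Linear(input_size, hidden_size)    # input projection
        self.rec_proj = nn.Linear(hidden_size, hidden_size)  # vanilla recurrence
        # Delay gate: a distribution over the next `max_delay` time slots.
        self.delay_gate = nn.Linear(input_size + hidden_size, max_delay)

    def forward(self, x, h, buffer):
        # x: (B, input_size), h: (B, hidden), buffer: (B, max_delay, hidden)
        gate = torch.softmax(self.delay_gate(torch.cat([x, h], dim=-1)), dim=-1)
        # Route the projected input to future time slots instead of mixing it
        # into the hidden state immediately.
        buffer = buffer + gate.unsqueeze(-1) * self.in_proj(x).unsqueeze(1)
        arriving = buffer[:, 0]                               # slot due now
        # Shift the delay line by one step and append an empty far slot.
        buffer = torch.cat([buffer[:, 1:], torch.zeros_like(buffer[:, :1])], dim=1)
        h = torch.tanh(arriving + self.rec_proj(h))
        return h, buffer


# Usage over a sequence: start with an empty delay line and step through time.
cell = DMUCell(input_size=40, hidden_size=64, max_delay=8)
x = torch.randn(2, 100, 40)                                   # (batch, time, features)
h = torch.zeros(2, 64)
buffer = torch.zeros(2, 8, 64)
for t in range(x.size(1)):
    h, buffer = cell(x[:, t], h, buffer)
```
The intended takeaway is the routing behaviour the abstract describes: information reaches the time instant where it is needed through the delay line rather than through many repeated recurrent transformations, which is consistent with the abstract's claim of using considerably fewer parameters than heavily gated cells.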
Related papers
- Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement [16.701596804113553]
We explore self-attention in the DPCRN module and design a model called Multi-Loss Convolutional Network with Time-Frequency Attention (MNTFA) for speech enhancement.
Compared to DPRNN, axial self-attention greatly reduces the need for memory and computation.
We propose a joint training method of a multi-resolution STFT loss and a WavLM loss using a pre-trained WavLM network.
arXiv Detail & Related papers (2023-06-15T08:48:19Z) - Gated Recurrent Neural Networks with Weighted Time-Delay Feedback [59.125047512495456]
We introduce a novel gated recurrent unit (GRU) with a weighted time-delay feedback mechanism.
We show that $\tau$-GRU can converge faster and generalize better than state-of-the-art recurrent units and gated recurrent architectures (a minimal sketch of this kind of weighted delayed-feedback update appears after this list).
arXiv Detail & Related papers (2022-12-01T02:26:34Z) - An Improved Time Feedforward Connections Recurrent Neural Networks [3.0965505512285967]
Recurrent Neural Networks (RNNs) have been widely applied to deal with temporal problems, such as flood forecasting and financial data processing.
Traditional RNN models amplify the gradient issue due to their strict time-serial dependency.
An improved Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs) model is first proposed to address the gradient issue.
A novel cell structure named Single Gate Recurrent Unit (SGRU) is presented to reduce the number of parameters of the RNN cell.
arXiv Detail & Related papers (2022-11-03T09:32:39Z) - Learning Sequence Representations by Non-local Recurrent Neural Memory [61.65105481899744]
We propose a Non-local Recurrent Neural Memory (NRNM) for supervised sequence representation learning.
Our model is able to capture long-range dependencies and distill latent high-level features.
Our model compares favorably against other state-of-the-art methods specifically designed for each of these sequence applications.
arXiv Detail & Related papers (2022-07-20T07:26:15Z) - TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding [60.292702363839716]
Current state-of-the-art backbone networks for speaker embedding aggregate multi-scale features from an utterance using multi-branch network architectures for speaker representation.
We propose an effective temporal multi-scale (TMS) model in which multi-scale branches can be efficiently designed into a speaker embedding network with almost no increase in computational cost.
arXiv Detail & Related papers (2022-03-17T05:49:35Z) - Recurrence-in-Recurrence Networks for Video Deblurring [58.49075799159015]
State-of-the-art video deblurring methods often adopt recurrent neural networks to model the temporal dependency between the frames.
In this paper, we propose a recurrence-in-recurrence network architecture to cope with the limitations of short-range memory.
arXiv Detail & Related papers (2022-03-12T11:58:13Z) - CARRNN: A Continuous Autoregressive Recurrent Neural Network for Deep Representation Learning from Sporadic Temporal Data [1.8352113484137622]
In this paper, a novel deep learning-based model is developed for modeling multiple temporal features in sporadic data.
The proposed model, called CARRNN, uses a generalized discrete-time autoregressive model that is trainable end-to-end using neural networks modulated by time lags.
It is applied to multivariate time-series regression tasks using data provided for Alzheimer's disease progression modeling and intensive care unit (ICU) mortality rate prediction.
arXiv Detail & Related papers (2021-04-08T12:43:44Z) - UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations.
The resulting RNN is fast, invertible (in time), and memory-efficient, and we derive rigorous bounds on the hidden-state gradients to prove mitigation of the exploding and vanishing gradient problem.
arXiv Detail & Related papers (2021-03-09T15:19:59Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while utilizing substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyperparameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
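To contrast the two delay mechanisms referenced above, the following minimal sketch (written purely for illustration; the class name DelayFeedbackGRU, the scalar feedback_weight, and the fixed delay are assumptions, not the cited $\tau$-GRU formulation) feeds a weighted copy of the hidden state from a few steps back into a standard GRU update. Where the DMU routes each input forward to the time step at which it will be needed, this family of models instead feeds delayed state information back into the recurrence.
```python
import torch
import torch.nn as nn
from collections import deque


class DelayFeedbackGRU(nn.Module):
    """Illustrative sketch only: a GRU whose input is augmented with a weighted
    copy of the hidden state from `delay` steps ago."""

    def __init__(self, input_size: int, hidden_size: int, delay: int = 4):
        super().__init__()
        self.delay = delay
        self.cell = nn.GRUCell(input_size + hidden_size, hidden_size)
        self.feedback_weight = nn.Parameter(torch.tensor(0.5))  # learned scalar weight

    def forward(self, x):
        # x: (batch, time, input_size)
        B, T, _ = x.shape
        h = x.new_zeros(B, self.cell.hidden_size)
        history = deque([h] * self.delay, maxlen=self.delay)     # delayed hidden states
        outputs = []
        for t in range(T):
            delayed = self.feedback_weight * history[0]          # h_{t - delay}
            h = self.cell(torch.cat([x[:, t], delayed], dim=-1), h)
            history.append(h)
            outputs.append(h)
        return torch.stack(outputs, dim=1)                       # (batch, time, hidden)
```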