Gates Are Not What You Need in RNNs
- URL: http://arxiv.org/abs/2108.00527v3
- Date: Wed, 22 Nov 2023 01:11:46 GMT
- Title: Gates Are Not What You Need in RNNs
- Authors: Ronalds Zakovskis, Andis Draguns, Eliza Gaile, Emils Ozolins, Karlis Freivalds
- Abstract summary: We propose a new recurrent cell called Residual Recurrent Unit (RRU) which beats traditional cells and does not employ a single gate.
It is based on the residual shortcut connection, linear transformations, ReLU, and normalization.
Our experiments show that RRU outperforms the traditional gated units on most of these tasks.
- Score: 2.6199029802346754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks have flourished in many areas. Consequently, we can
see new RNN cells being developed continuously, usually by creating or using
gates in a new, original way. But what if we told you that gates in RNNs are
redundant? In this paper, we propose a new recurrent cell called Residual
Recurrent Unit (RRU) which beats traditional cells and does not employ a single
gate. It is based on the residual shortcut connection, linear transformations,
ReLU, and normalization. To evaluate our cell's effectiveness, we compare its
performance against the widely-used GRU and LSTM cells and the recently
proposed Mogrifier LSTM on several tasks, including polyphonic music modeling,
language modeling, and sentiment analysis. Our experiments show that RRU
outperforms the traditional gated units on most of these tasks. Also, it has
better robustness to parameter selection, allowing immediate application in new
tasks without much tuning. We have implemented the RRU in TensorFlow, and the
code is made available at https://github.com/LUMII-Syslab/RRU .
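To make the description above more concrete, here is a minimal, hypothetical sketch of a gate-free recurrent cell built only from the ingredients the abstract names (a residual shortcut connection, linear transformations, ReLU, and normalization), written for TensorFlow since that is what the authors used. The class name RRUCellSketch, the weight shapes, and the exact ordering of normalization and nonlinearity are illustrative assumptions, not the authors' reference implementation; see the linked repository for the real RRU.

```python
import tensorflow as tf

class RRUCellSketch(tf.keras.layers.Layer):
    """Hypothetical gate-free recurrent cell: linear transformations,
    normalization, ReLU, and a residual shortcut from the previous state.
    A sketch of the ingredients named in the abstract, not the official RRU."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.state_size = units      # required by tf.keras.layers.RNN
        self.output_size = units

    def build(self, input_shape):
        input_dim = input_shape[-1]
        # Linear transformations over the concatenated [input, previous state].
        self.w_in = self.add_weight(shape=(input_dim + self.units, self.units),
                                    name="w_in")
        self.w_out = self.add_weight(shape=(self.units, self.units),
                                     name="w_out")
        self.bias = self.add_weight(shape=(self.units,), initializer="zeros",
                                    name="bias")
        self.norm = tf.keras.layers.LayerNormalization()

    def call(self, inputs, states):
        prev_state = states[0]
        x = tf.concat([inputs, prev_state], axis=-1)
        # Linear map -> normalization -> ReLU -> linear map; no gates anywhere.
        update = tf.nn.relu(self.norm(tf.matmul(x, self.w_in) + self.bias))
        update = tf.matmul(update, self.w_out)
        # Residual shortcut: the new state is the old state plus an update.
        new_state = prev_state + update
        return new_state, [new_state]

# Usage: the cell drops into tf.keras.layers.RNN like GRUCell or LSTMCell.
layer = tf.keras.layers.RNN(RRUCellSketch(64), return_sequences=True)
outputs = layer(tf.random.normal([8, 20, 32]))  # (batch, time, features)
print(outputs.shape)                            # (8, 20, 64)
```

In this reading, the residual addition takes over the role a gate usually plays: the previous state passes through unchanged while the ReLU branch learns an additive correction.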
Related papers
- Were RNNs All We Needed? [53.393497486332]
We revisit traditional recurrent neural networks (RNNs) from over a decade ago.
We show that by removing the hidden-state dependencies from their input, forget, and update gates, LSTMs and GRUs no longer require backpropagation through time (BPTT) and can be trained efficiently in parallel (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-10-02T03:06:49Z) - Hierarchically Gated Recurrent Neural Network for Sequence Modeling [36.14544998133578]
We propose a gated linear RNN model dubbed Hierarchically Gated Recurrent Neural Network (HGRN)
Experiments on language modeling, image classification, and long-range arena benchmarks showcase the efficiency and effectiveness of our proposed model.
arXiv Detail & Related papers (2023-11-08T16:50:05Z) - RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence
Learning [75.61681328968714]
We propose recurrent independent Grid LSTM (RigLSTM) to exploit the underlying modular structure of the target task.
Our model adopts cell selection, input feature selection, hidden state selection, and soft state updating to achieve better generalization.
arXiv Detail & Related papers (2023-11-03T07:40:06Z) - DartsReNet: Exploring new RNN cells in ReNet architectures [4.266320191208303]
We present new Recurrent Neural Network (RNN) cells for image classification using a Neural Architecture Search (NAS) approach called DARTS.
We are interested in the ReNet architecture, which is an RNN-based approach presented as an alternative to convolutional and pooling steps.
arXiv Detail & Related papers (2023-04-11T09:42:10Z) - An Improved Time Feedforward Connections Recurrent Neural Networks [3.0965505512285967]
Recurrent Neural Networks (RNNs) have been widely applied to deal with temporal problems, such as flood forecasting and financial data processing.
Traditional RNN models amplify the gradient issue due to their strict serial time dependency.
An improved Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs) model was first proposed to address the gradient issue.
A novel cell structure named Single Gate Recurrent Unit (SGRU) was presented to reduce the number of parameters of the RNN cell.
arXiv Detail & Related papers (2022-11-03T09:32:39Z) - Working Memory Connections for LSTM [51.742526187978726]
We show that Working Memory Connections consistently improve the performance of LSTMs on a variety of tasks.
Numerical results suggest that the cell state contains useful information that is worth including in the gate structure.
arXiv Detail & Related papers (2021-08-31T18:01:30Z) - Towards Evaluating and Training Verifiably Robust Neural Networks [81.39994285743555]
We study the relationship between IBP and CROWN, and prove that CROWN is always tighter than IBP when choosing appropriate bounding lines.
We propose a relaxed version of CROWN, linear bound propagation (LBP), which can be used to verify large networks and obtain lower verified errors.
arXiv Detail & Related papers (2021-04-01T13:03:48Z) - Fusion Recurrent Neural Network [88.5550074808201]
We propose a novel, succinct, and promising RNN: the Fusion Recurrent Neural Network (Fusion RNN).
Fusion RNN is composed of a Fusion module and a Transport module at every time step.
In order to evaluate Fusion RNN's sequence feature extraction capability, we choose a representative data mining task for sequence data, estimated time of arrival (ETA), and present a novel model based on Fusion RNN.
arXiv Detail & Related papers (2020-06-07T07:39:49Z) - SiTGRU: Single-Tunnelled Gated Recurrent Unit for Abnormality Detection [29.500392184282518]
We propose a novel version of Gated Recurrent Unit (GRU) called Single Tunnelled GRU for abnormality detection.
Our proposed optimized GRU model outperforms standard GRU and Long Short Term Memory (LSTM) networks on most metrics for detection and generalization tasks.
arXiv Detail & Related papers (2020-03-30T14:58:13Z) - Refined Gate: A Simple and Effective Gating Mechanism for Recurrent
Units [68.30422112784355]
We propose a new gating mechanism within general gated recurrent neural networks to handle this issue.
The proposed gates directly short connect the extracted input features to the outputs of vanilla gates.
We verify the proposed gating mechanism on three popular types of gated RNNs including LSTM, GRU and MGU.
arXiv Detail & Related papers (2020-02-26T07:51:38Z)