Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey
- URL: http://arxiv.org/abs/2304.11461v1
- Date: Sat, 22 Apr 2023 18:22:10 GMT
- Title: Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey
- Authors: Benyamin Ghojogh, Ali Ghodsi
- Abstract summary: This tutorial paper is on the Recurrent Neural Network (RNN), the Long Short-Term Memory Network (LSTM), and their variants.
We start with a dynamical system and backpropagation through time for RNN.
We discuss the problems of gradient vanishing and explosion in long-term dependencies.
Then, we introduce LSTM gates and cells, the history and variants of LSTM, and Gated Recurrent Units (GRU).
- Score: 9.092591746522483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This is a tutorial paper on the Recurrent Neural Network (RNN), the
Long Short-Term Memory Network (LSTM), and their variants. We start with a
dynamical system and backpropagation through time for RNN. Then, we discuss the
problems of gradient vanishing and explosion in long-term dependencies. We
explain the close-to-identity weight matrix, long delays, leaky units, and echo
state networks for addressing this problem. Then, we introduce LSTM gates and
cells, the history and variants of LSTM, and Gated Recurrent Units (GRU).
Finally, we introduce the bidirectional RNN, bidirectional LSTM, and the
Embeddings from Language Model (ELMo) network for processing a sequence in both
directions.
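To make these recurrences concrete, the sketch below writes out the standard RNN, LSTM, and GRU updates the tutorial covers, plus a simple bidirectional pass. It is a minimal NumPy illustration under common textbook conventions; the function and parameter names are our own assumptions rather than code from the paper, and gate orderings and GRU interpolation conventions vary slightly across the literature.

```python
# Minimal NumPy sketch of the recurrent updates surveyed in the paper.
# Shapes, names, and conventions are illustrative assumptions, not the
# authors' code; gate formulations differ slightly across references.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """Vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h).
    Backpropagation through time multiplies repeatedly by W_hh^T and tanh
    derivatives, which is where vanishing/exploding gradients come from."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """LSTM cell with stacked parameters for the input (i), forget (f),
    and output (o) gates and the cell candidate (g)."""
    z = W @ x_t + U @ h_prev + b               # shape (4 * hidden,)
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_t = f * c_prev + i * g                   # additive path eases gradient flow
    h_t = o * np.tanh(c_t)
    return h_t, c_t

def gru_step(x_t, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    """GRU with update gate z and reset gate r."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)
    r = sigmoid(Wr @ x_t + Ur @ h_prev + br)
    h_cand = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)
    return (1.0 - z) * h_prev + z * h_cand     # interpolate old state and candidate

def bidirectional_rnn(xs, h0, fwd_params, bwd_params):
    """Bidirectional processing (BiRNN/BiLSTM/ELMo-style encoders): run one
    pass left-to-right and one right-to-left, then concatenate per step."""
    fwd, h = [], h0
    for x in xs:
        h = rnn_step(x, h, *fwd_params)
        fwd.append(h)
    bwd, h = [], h0
    for x in reversed(xs):
        h = rnn_step(x, h, *bwd_params)
        bwd.append(h)
    bwd.reverse()
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_h = 3, 4
    x_t = rng.standard_normal(d_in)
    h, c = np.zeros(d_h), np.zeros(d_h)
    W = 0.1 * rng.standard_normal((4 * d_h, d_in))
    U = 0.1 * rng.standard_normal((4 * d_h, d_h))
    b = np.zeros(4 * d_h)
    h, c = lstm_step(x_t, h, c, W, U, b)
    print(h.shape, c.shape)                    # prints: (4,) (4,)
```

The additive cell update `c_t = f * c_prev + i * g` is the structural difference from the vanilla RNN: it gives gradients a path that is not repeatedly squashed through `tanh`, which is the intuition behind LSTM's better handling of long-term dependencies discussed in the paper.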
Related papers
- An Improved Time Feedforward Connections Recurrent Neural Networks [3.0965505512285967]
Recurrent Neural Networks (RNNs) have been widely applied to deal with temporal problems, such as flood forecasting and financial data processing.
Traditional RNN models amplify the gradient issue due to their strict serial time dependency.
An improved model, Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs), was first proposed to address the gradient issue.
A novel cell structure named Single Gate Recurrent Unit (SGRU) was presented to reduce the number of parameters of RNN cells.
arXiv Detail & Related papers (2022-11-03T09:32:39Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Music Generation Using an LSTM [52.77024349608834]
Long Short-Term Memory (LSTM) network structures have proven to be very useful for making predictions for the next output in a series.
We demonstrate an approach to music generation using Recurrent Neural Networks (RNNs).
We provide a brief synopsis of the intuition, theory, and application of LSTMs in music generation, develop and present the network we found to best achieve this goal, identify and address issues and challenges faced, and include potential future improvements for our network.
arXiv Detail & Related papers (2022-03-23T00:13:41Z)
- Working Memory Connections for LSTM [51.742526187978726]
We show that Working Memory Connections consistently improve the performance of LSTMs on a variety of tasks.
Numerical results suggest that the cell state contains useful information that is worth including in the gate structure.
arXiv Detail & Related papers (2021-08-31T18:01:30Z)
- Online learning of windmill time series using Long Short-term Cognitive Networks [58.675240242609064]
The amount of data generated on windmill farms makes online learning the most viable strategy to follow.
We use Long Short-term Cognitive Networks (LSTCNs) to forecast windmill time series in online settings.
Our approach reported the lowest forecasting errors compared with a simple RNN, a Long Short-term Memory network, a Gated Recurrent Unit, and a Hidden Markov Model.
arXiv Detail & Related papers (2021-07-01T13:13:24Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module, termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through a theoretical analysis and practical experiments on an NLP task.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- Learning Long-Term Dependencies in Irregularly-Sampled Time Series [16.762335749650717]
Recurrent neural networks (RNNs) with continuous-time hidden states are a natural fit for modeling irregularly-sampled time series.
We prove that, similar to standard RNNs, the underlying reason these models struggle with long-term dependencies is the vanishing or exploding of the gradient during training.
We provide a solution by designing a new algorithm based on the long short-term memory (LSTM) that separates its memory from its time-continuous state.
arXiv Detail & Related papers (2020-06-08T08:46:58Z)
- Do RNN and LSTM have Long Memory? [15.072891084847647]
We prove that RNN and LSTM do not have long memory from a statistical perspective.
A new definition for long memory networks is introduced, and it requires the model weights to decay at a certain rate.
To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.
arXiv Detail & Related papers (2020-06-06T13:30:03Z)
- Achieving Online Regression Performance of LSTMs with Simple RNNs [0.0]
We introduce a first-order training algorithm with a linear time complexity in the number of parameters.
We show that when simple RNNs (SRNNs) are trained with our algorithm, they provide regression performance very similar to that of LSTMs in two to three times shorter training time.
arXiv Detail & Related papers (2020-05-16T11:41:13Z)
- Sentiment Analysis Using Simplified Long Short-term Memory Recurrent Neural Networks [1.5146765382501612]
We perform sentiment analysis on a GOP Debate Twitter dataset.
To speed up training and reduce the computational cost and time, six different parameter-reduced slim versions of the LSTM model are proposed (one style of such simplification is sketched after this list).
arXiv Detail & Related papers (2020-05-08T12:50:10Z)
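The last entry above mentions parameter-reduced "slim" LSTM variants without spelling them out. The following is a hedged sketch of one commonly discussed simplification of this kind, in which the three gates are driven by the previous hidden state and a bias only, so the input-to-gate weight matrices disappear; the structure and names here are assumptions for illustration and may not match the six variants actually proposed in that paper.

```python
# Hedged sketch of one possible "slim" LSTM cell: gates i, f, o see only the
# previous hidden state and a bias, so the input-to-gate weight matrices (and
# their parameters) are removed. Illustrative assumption only; it may not
# correspond to the six variants proposed in the paper listed above.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def slim_lstm_step(x_t, h_prev, c_prev, W_g, U_g, b_g, U_gates, b_gates):
    """The cell candidate g still sees the input x_t; the gates do not."""
    i, f, o = np.split(sigmoid(U_gates @ h_prev + b_gates), 3)
    g = np.tanh(W_g @ x_t + U_g @ h_prev + b_g)
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```

Compared with a full LSTM step, this drops the input-to-gate matrices, cutting the parameter count while keeping the gated, additive cell update.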
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.