RotLSTM: Rotating Memories in Recurrent Neural Networks
- URL: http://arxiv.org/abs/2105.00357v1
- Date: Sat, 1 May 2021 23:48:58 GMT
- Title: RotLSTM: Rotating Memories in Recurrent Neural Networks
- Authors: Vlad Velici, Adam Prügel-Bennett
- Abstract summary: We introduce the concept of modifying the cell state (memory) of LSTMs using rotation matrices parametrised by a new set of trainable weights.
This addition shows significant increases in performance on some of the tasks from the bAbI dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Long Short-Term Memory (LSTM) units have the ability to memorise and use
long-term dependencies between inputs to generate predictions on time series
data. We introduce the concept of modifying the cell state (memory) of LSTMs
using rotation matrices parametrised by a new set of trainable weights. This
addition shows significant increases in performance on some of the tasks from
the bAbI dataset.
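For illustration, the sketch below applies trainable rotations to the cell state of a standard LSTM cell. It is a minimal reading of the abstract, not the authors' implementation: the pairing of cell-state units into 2x2 Givens blocks and the angle-producing linear layer are assumptions, and all names are hypothetical.

```python
# Minimal sketch (not the authors' code): rotate the LSTM cell state with
# trainable 2x2 Givens blocks whose angles are predicted from [x_t, h_t].
import torch
import torch.nn as nn

class RotLSTMCellSketch(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        assert hidden_size % 2 == 0, "units are paired for 2x2 rotations"
        self.cell = nn.LSTMCell(input_size, hidden_size)
        # one rotation angle per pair of cell-state units (assumed form)
        self.to_angles = nn.Linear(input_size + hidden_size, hidden_size // 2)

    def forward(self, x, state=None):
        h, c = self.cell(x, state)
        theta = self.to_angles(torch.cat([x, h], dim=-1))    # (B, H/2)
        cos, sin = torch.cos(theta), torch.sin(theta)
        c1, c2 = c[:, 0::2], c[:, 1::2]                      # paired coordinates
        c_rot = torch.stack((cos * c1 - sin * c2,            # apply each Givens
                             sin * c1 + cos * c2), dim=-1)   # rotation
        return h, c_rot.flatten(start_dim=1)                 # back to (B, H)
```

In this sketch the rotation only affects what is carried to the next time step; the paper's cell may place the rotation elsewhere in the update.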
Related papers
- HOPE for a Robust Parameterization of Long-memory State Space Models [51.66430224089725]
State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences.
We develop a new parameterization scheme, called HOPE, for LTI systems that utilizes Markov parameters within Hankel operators.
Our new parameterization endows the SSM with non-decaying memory within a fixed time window, which is empirically corroborated by a sequential CIFAR-10 task with padded noise.
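As background, the sketch below shows the standard linear-systems objects this summary refers to: the Markov parameters of a discrete LTI system and the Hankel matrix assembled from them. This is textbook material rather than the HOPE parameterization itself, and the function names are illustrative.

```python
# Background sketch only: Markov parameters h_k = C A^(k-1) B of a discrete
# LTI system and the Hankel matrix built from them.
import numpy as np

def markov_parameters(A, B, C, n):
    """First n Markov parameters of x_{t+1} = A x_t + B u_t, y_t = C x_t."""
    params, A_power = [], np.eye(A.shape[0])
    for _ in range(n):
        params.append(C @ A_power @ B)   # h_k = C A^(k-1) B
        A_power = A @ A_power
    return params

def hankel_from_markov(params):
    """Hankel matrix with block (i, j) equal to h_{i+j+1} (0-indexed i, j)."""
    m = (len(params) + 1) // 2
    return np.block([[params[i + j] for j in range(m)] for i in range(m)])
```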
arXiv Detail & Related papers (2024-05-22T20:20:14Z)
- B-LSTM-MIONet: Bayesian LSTM-based Neural Operators for Learning the Response of Complex Dynamical Systems to Length-Variant Multiple Input Functions [6.75867828529733]
Multiple-input deep neural operators (MIONet) extended DeepONet to allow multiple input functions in different Banach spaces.
MIONet offers flexibility in training dataset grid spacing, without constraints on output location.
This work redesigns MIONet, integrating Long Short Term Memory (LSTM) to learn neural operators from time-dependent data.
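The sketch below illustrates the general pattern suggested by this summary: an LSTM branch encodes a time-sampled input function, a trunk network encodes the query coordinate, and the two are combined by a dot product as in DeepONet. Layer sizes, wiring, and names are assumptions, not the paper's architecture.

```python
# Hedged sketch of an LSTM-branch operator network (illustrative sizes).
import torch
import torch.nn as nn

class LSTMBranchOperatorSketch(nn.Module):
    def __init__(self, p=64):
        super().__init__()
        self.branch = nn.LSTM(input_size=1, hidden_size=p, batch_first=True)
        self.trunk = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, p))

    def forward(self, u_samples, y_query):
        # u_samples: (B, T, 1) input function sampled on a time grid
        #            (T may vary between batches, hence "length-variant")
        # y_query:   (B, 1) location/time where the output is evaluated
        _, (h_n, _) = self.branch(u_samples)        # (1, B, p) final hidden state
        b = h_n[-1]                                 # (B, p) branch coefficients
        t = self.trunk(y_query)                     # (B, p) trunk basis values
        return (b * t).sum(dim=-1, keepdim=True)    # (B, 1) operator output
```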
arXiv Detail & Related papers (2023-11-28T04:58:17Z)
- Recurrent Neural Networks with more flexible memory: better predictions than rough volatility [0.0]
We compare the ability of vanilla and extended long short-term memory networks to predict asset price volatility.
We show that the model with the smallest validation loss systematically outperforms rough volatility predictions by about 20% when trained and tested on a dataset with multiple time series.
arXiv Detail & Related papers (2023-08-04T14:24:57Z)
- Self-Gated Memory Recurrent Network for Efficient Scalable HDR Deghosting [59.04604001936661]
We propose a novel recurrent network-based HDR deghosting method for fusing arbitrary length dynamic sequences.
We introduce a new recurrent cell architecture, the Self-Gated Memory (SGM) cell, that outperforms the standard LSTM cell.
The proposed approach achieves state-of-the-art quantitative performance compared to existing HDR deghosting methods across three publicly available datasets.
arXiv Detail & Related papers (2021-12-24T12:36:33Z)
- Working Memory Connections for LSTM [51.742526187978726]
We show that Working Memory Connections consistently improve the performance of LSTMs on a variety of tasks.
Numerical results suggest that the cell state contains useful information that is worth including in the gate structure.
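The sketch below shows the general idea of letting the gates read the cell state, in the spirit of classic peephole connections; it is not the paper's exact formulation, and the extra linear path is an assumption.

```python
# Sketch of cell-aware gates: the gates also receive the cell state.
import torch
import torch.nn as nn

class CellAwareGateLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.x2g = nn.Linear(input_size, 4 * hidden_size)
        self.h2g = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        self.c2g = nn.Linear(hidden_size, 3 * hidden_size, bias=False)  # extra path

    def forward(self, x, state):
        h, c = state
        gi, gf, go, gg = (self.x2g(x) + self.h2g(h)).chunk(4, dim=-1)
        pi, pf, po = self.c2g(c).chunk(3, dim=-1)   # cell state enters the gates
        i = torch.sigmoid(gi + pi)                  # input gate
        f = torch.sigmoid(gf + pf)                  # forget gate
        g = torch.tanh(gg)                          # candidate memory
        c_new = f * c + i * g
        o = torch.sigmoid(go + po)                  # output gate (peephole-style
                                                    # variants often use c_new here)
        return o * torch.tanh(c_new), c_new
```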
arXiv Detail & Related papers (2021-08-31T18:01:30Z)
- Slower is Better: Revisiting the Forgetting Mechanism in LSTM for Slower Information Decay [4.414729427965163]
We propose a power law forget gate, which learns to forget information along a slower power law decay function.
We show that LSTM with the proposed forget gate can learn long-term dependencies, outperforming other recurrent networks in multiple domains.
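The snippet below illustrates why a power-law schedule forgets more slowly than a constant forget gate; the schedule f_t = (t / (t + 1))**p is an illustrative choice, not the paper's gate.

```python
# Illustrative numbers only: a constant forget gate shrinks old memory
# exponentially, while the power-law schedule f_t = (t / (t + 1))**p leaves
# (t + 1)**(-p) of the original content after t steps.
def retained_fraction(t, forget=0.9, p=0.5):
    exponential = forget ** t                # constant gate applied t times
    power_law = 1.0                          # product of (k / (k + 1))**p ...
    for k in range(1, t + 1):
        power_law *= (k / (k + 1)) ** p      # ... which telescopes to (t + 1)**(-p)
    return exponential, power_law

for t in (10, 100, 1000):
    exp_frac, pl_frac = retained_fraction(t)
    print(f"t={t:5d}  exponential={exp_frac:.2e}  power-law={pl_frac:.2e}")
```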
arXiv Detail & Related papers (2021-05-12T20:21:16Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
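The sketch below shows one common way to realise such a memory-augmented pattern: a fixed number of memory slots read and written with attention once per segment, so per-segment cost is constant and total cost grows linearly with sequence length. It is a generic illustration, not Memformer's read/write rules; slot count and sizes are made up.

```python
# Generic slot-memory sketch: tokens read a fixed-size memory, and the memory
# is rewritten from the segment before moving on.
import torch
import torch.nn as nn

class SlotMemorySketch(nn.Module):
    def __init__(self, d_model=64, slots=8, heads=4):
        super().__init__()
        self.initial_memory = nn.Parameter(torch.randn(slots, d_model))
        self.read = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.write = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, segment, memory=None):
        # segment: (B, T, d_model) token states for the current segment
        # memory:  (B, S, d_model) carried over from the previous segment
        if memory is None:
            memory = self.initial_memory.unsqueeze(0).expand(segment.size(0), -1, -1)
        read_out, _ = self.read(segment, memory, memory)      # tokens query memory
        new_memory, _ = self.write(memory, segment, segment)  # memory queries tokens
        return segment + read_out, new_memory
```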
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
- Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Deep learning-based trackers built on LSTM (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
Dense LSTMs outperform residual and regular LSTMs, and offer higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
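The sketch below shows the generic residual pattern for stacked LSTMs referenced here; a dense variant would concatenate the outputs of all earlier layers instead of adding them. It is illustrative and not the trackers' exact architecture.

```python
# Residual stacking for LSTMs: each layer adds its input back to its output,
# giving gradients and features a shortcut through the stack.
import torch.nn as nn

class ResidualLSTMStack(nn.Module):
    def __init__(self, size, depth=3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.LSTM(size, size, batch_first=True) for _ in range(depth)
        )

    def forward(self, x):                 # x: (B, T, size)
        for lstm in self.layers:
            out, _ = lstm(x)
            x = x + out                   # residual connection
        return x
```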
arXiv Detail & Related papers (2020-06-22T08:20:17Z)
- Do RNN and LSTM have Long Memory? [15.072891084847647]
We prove that RNN and LSTM do not have long memory from a statistical perspective.
A new definition for long memory networks is introduced; it requires the model weights to decay at a polynomial rate.
To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.
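For context, a common statistical formulation contrasts polynomially decaying (non-summable) dependence with geometrically decaying dependence; the paper's precise definition, stated in terms of model weights, may differ in detail.

```latex
% Generic long- vs short-memory contrast (hedged; not the paper's exact definition).
\[
  \text{long memory: } \gamma(k) \sim C\,k^{-\alpha},\ 0<\alpha<1,
  \quad\text{so}\quad \sum_{k}\lvert\gamma(k)\rvert = \infty;
  \qquad
  \text{short memory: } \gamma(k) \sim C\,\rho^{k},\ 0<\rho<1 .
\]
```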
arXiv Detail & Related papers (2020-06-06T13:30:03Z)
- Sentiment Analysis Using Simplified Long Short-term Memory Recurrent Neural Networks [1.5146765382501612]
We perform sentiment analysis on a GOP Debate Twitter dataset.
To speed up training and reduce computational cost and time, six different parameter-reduced slim versions of the LSTM model are proposed.
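As an illustration of this kind of parameter reduction, the sketch below removes the input-to-gate weights so the gates depend only on the previous hidden state and a bias; this is one plausible slimming, not necessarily any of the paper's six variants.

```python
# One possible "slim" LSTM cell: gates use only h and a bias; the candidate
# memory still sees the input.
import torch
import torch.nn as nn

class SlimLSTMCellSketch(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.x2cand = nn.Linear(input_size, hidden_size)
        self.h2cand = nn.Linear(hidden_size, hidden_size, bias=False)
        self.h2gates = nn.Linear(hidden_size, 3 * hidden_size)  # i, f, o from h only

    def forward(self, x, state):
        h, c = state
        i, f, o = torch.sigmoid(self.h2gates(h)).chunk(3, dim=-1)
        g = torch.tanh(self.x2cand(x) + self.h2cand(h))
        c_new = f * c + i * g
        return o * torch.tanh(c_new), c_new
```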
arXiv Detail & Related papers (2020-05-08T12:50:10Z)
- Encoding-based Memory Modules for Recurrent Neural Networks [79.42778415729475]
We study the memorization subtask from the point of view of the design and training of recurrent neural networks.
We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with a linear autoencoder for sequences.
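The sketch below shows a linear autoencoder for sequences, the kind of component this summary describes: a linear encoder folds the history into a memory state, and a linear decoder reconstructs the most recent input and the previous state. The matrices and function names are placeholders, not the paper's parameterization.

```python
# Hedged sketch of a linear autoencoder for sequences.
import numpy as np

def encode(xs, A, B):
    """Linear encoder: m_t = A @ m_{t-1} + B @ x_t, starting from m_0 = 0."""
    m = np.zeros(A.shape[0])
    for x in xs:
        m = A @ m + B @ x
    return m

def decode_step(m, C, D):
    """Linear decoder: reconstruct the latest input and the previous memory."""
    x_hat = C @ m          # estimate of x_t
    m_prev_hat = D @ m     # estimate of m_{t-1}; decoding can be unrolled further
    return x_hat, m_prev_hat
```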
arXiv Detail & Related papers (2020-01-31T11:14:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.