Empirical Analysis of Limits for Memory Distance in Recurrent Neural
Networks
- URL: http://arxiv.org/abs/2212.11085v1
- Date: Tue, 20 Dec 2022 08:20:48 GMT
- Title: Empirical Analysis of Limits for Memory Distance in Recurrent Neural
Networks
- Authors: Steffen Illium, Thore Schillman, Robert Müller, Thomas Gabor and
Claudia Linnhoff-Popien
- Abstract summary: We show that RNNs are still able to remember a few data points back into the sequence by memorizing them by heart using standard backpropagation.
For classical RNNs, LSTM, and GRU networks, the distance between recurrent calls across which data points can be reproduced this way is highly limited.
- Score: 10.09712608508383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common to all different kinds of recurrent neural networks (RNNs) is
the intention to model relations between data points through time. When there is
no immediate relationship between subsequent data points (e.g., when the data
points are generated at random), we show that RNNs are still able to remember a
few data points back into the sequence by memorizing them by heart using
standard backpropagation. However, we also show that for classical RNNs, LSTM,
and GRU networks, the distance between recurrent calls across which data points
can be reproduced this way is highly limited (compared to even a loose
connection between data points) and subject to various constraints imposed by
the type and size of the RNN in question. This implies the existence of a hard
limit (well below the information-theoretic one) on the distance between
related data points within which RNNs are still able to recognize said
relation.
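The memorization setup described in the abstract can be illustrated with a small sketch of a delayed-recall task (the function name, shapes, and the use of Gaussian random inputs are our own assumptions for illustration, not details taken from the paper):

```python
import numpy as np

def delayed_recall_batch(batch, length, dim, delay, rng):
    """Random input sequences with no relation between subsequent steps.

    The target at step t is the input from step t - delay, so a model can
    only solve the task by carrying each input `delay` recurrent calls
    forward in its hidden state, i.e. by memorizing it "by heart".
    """
    x = rng.standard_normal((batch, length, dim))
    y = np.zeros_like(x)
    y[:, delay:] = x[:, : length - delay]  # targets are inputs shifted by `delay`
    return x, y

rng = np.random.default_rng(0)
x, y = delayed_recall_batch(batch=8, length=20, dim=4, delay=5, rng=rng)

# Sanity check: the target sequence is the input shifted forward by `delay`.
assert np.allclose(y[:, 5:], x[:, :15])
```

Sweeping `delay` upward while tracking reconstruction error on held-out sequences is one way to probe the kind of hard limit on memory distance the abstract reports: beyond that limit, a trained RNN's error should degrade toward that of a memoryless predictor.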
Related papers
- Echoes of the past: A unified perspective on fading memory and echo states [4.595000276111106]
Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data.
Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory.
This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results.
arXiv Detail & Related papers (2025-08-26T15:55:14Z)
- Delay Neural Networks (DeNN) for exploiting temporal information in event-based datasets [49.1574468325115]
Delay Neural Networks (DeNN) are designed to explicitly use exact continuous temporal information of spikes in both forward and backward passes.
Good performance is obtained, especially on datasets where temporal information is important.
arXiv Detail & Related papers (2025-01-10T14:58:15Z) - Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z) - Time Series Imputation with Multivariate Radial Basis Function Neural Network [1.6804613362826175]
We propose a time series imputation model based on the Radial Basis Function Neural Network (RBFNN).
Our imputation model learns local information from timestamps to create a continuous function.
We propose an extension called the Missing Value Imputation Recurrent Neural Network with Continuous Function (MIRNN-CF) using the continuous function generated by MIM-RBFNN.
arXiv Detail & Related papers (2024-07-24T07:02:16Z) - Kernel Limit of Recurrent Neural Networks Trained on Ergodic Data Sequences [0.0]
We characterize the tangent kernels of recurrent neural networks (RNNs) as the number of hidden units, data samples in the sequence, hidden state updates, and training steps simultaneously grow to infinity.
These methods give rise to neural tangent kernel (NTK) limits for RNNs trained on data sequences as the number of data samples and the size of the neural network grow to infinity.
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Convolutional Neural Networks on Manifolds: From Graphs and Back [122.06927400759021]
We propose a manifold neural network (MNN) composed of a bank of manifold convolutional filters and point-wise nonlinearities.
To sum up, we focus on the manifold model as the limit of large graphs and construct MNNs, while we can still bring back graph neural networks by the discretization of MNNs.
arXiv Detail & Related papers (2022-10-01T21:17:39Z)
- Task-Synchronized Recurrent Neural Networks [0.0]
Approaches for applying Recurrent Neural Networks (RNNs) to non-uniformly sampled data traditionally involve ignoring the non-uniformity, feeding the time differences as additional inputs, or resampling the data.
We propose an elegant straightforward alternative approach where instead the RNN is in effect resampled in time to match the time of the data or the task at hand.
We confirm empirically that our models can effectively compensate for the time-non-uniformity of the data and demonstrate that they compare favorably to data resampling, classical RNN methods, and alternative RNN models.
arXiv Detail & Related papers (2022-04-11T15:27:40Z)
- Recurrence-in-Recurrence Networks for Video Deblurring [58.49075799159015]
State-of-the-art video deblurring methods often adopt recurrent neural networks to model the temporal dependency between the frames.
In this paper, we propose recurrence-in-recurrence network architecture to cope with the limitations of short-ranged memory.
arXiv Detail & Related papers (2022-03-12T11:58:13Z)
- Recurrent Neural Networks for Learning Long-term Temporal Dependencies with Reanalysis of Time Scale Representation [16.32068729107421]
We argue that the interpretation of a forget gate as a temporal representation is valid when the gradient of loss with respect to the state decreases exponentially as time goes back.
We propose an approach to construct new RNNs that can represent a longer time scale than conventional models.
arXiv Detail & Related papers (2021-11-05T06:22:58Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions [121.10450359856242]
Recurrent neural networks (RNNs) are instrumental in modelling sequential and time-series data.
Existing approaches for uncertainty quantification in RNNs are based predominantly on Bayesian methods.
We develop a frequentist alternative that: (a) does not interfere with model training or compromise its accuracy, (b) applies to any RNN architecture, and (c) provides theoretical coverage guarantees on the estimated uncertainty intervals.
arXiv Detail & Related papers (2020-06-20T22:45:32Z)
- Assessing the Memory Ability of Recurrent Neural Networks [21.88086102298848]
Recurrent Neural Networks (RNNs) can remember, in their hidden layers, part of the semantic information expressed by a sequence.
Different types of recurrent units have been designed to enable RNNs to remember information over longer time spans.
The memory abilities of different recurrent units are still theoretically and empirically unclear.
arXiv Detail & Related papers (2020-02-18T08:07:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.