Assessing the Memory Ability of Recurrent Neural Networks
- URL: http://arxiv.org/abs/2002.07422v1
- Date: Tue, 18 Feb 2020 08:07:23 GMT
- Title: Assessing the Memory Ability of Recurrent Neural Networks
- Authors: Cheng Zhang, Qiuchi Li, Lingyu Hua and Dawei Song
- Abstract summary: Recurrent Neural Networks (RNNs) can remember, in their hidden layers, part of the semantic information expressed by a sequence.
Different types of recurrent units have been designed to enable RNNs to remember information over longer time spans.
The memory abilities of different recurrent units are still theoretically and empirically unclear.
- Score: 21.88086102298848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is known that Recurrent Neural Networks (RNNs) can remember, in their
hidden layers, part of the semantic information expressed by a sequence (e.g.,
a sentence) that is being processed. Different types of recurrent units have
been designed to enable RNNs to remember information over longer time spans.
However, the memory abilities of different recurrent units are still
theoretically and empirically unclear, thus limiting the development of more
effective and explainable RNNs. To tackle the problem, in this paper, we
identify and analyze the internal and external factors that affect the memory
ability of RNNs, and propose a Semantic Euclidean Space to represent the
semantics expressed by a sequence. Based on the Semantic Euclidean Space, a
series of evaluation indicators are defined to measure the memory abilities of
different recurrent units and analyze their limitations. These evaluation
indicators also provide useful guidance for selecting suitable sequence lengths
for different RNNs during training.
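The abstract does not spell out the evaluation indicators, but the underlying idea of checking how much of an earlier token's trace survives in the hidden state can be illustrated with a small probe. The sketch below is only an illustration of that idea, not the paper's Semantic Euclidean Space or its indicators: it runs untrained RNN, GRU, and LSTM units over a random token sequence and tracks the cosine similarity between the hidden state right after the first token and the hidden states at later positions.

```python
# Minimal sketch of a hidden-state memory probe (illustrative only; the
# paper's Semantic Euclidean Space and indicators are not reproduced here).
# The units are untrained: in practice one would probe trained models.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim, seq_len = 100, 32, 40
emb = torch.nn.Embedding(vocab, dim)
units = {
    "RNN": torch.nn.RNN(dim, dim, batch_first=True),
    "GRU": torch.nn.GRU(dim, dim, batch_first=True),
    "LSTM": torch.nn.LSTM(dim, dim, batch_first=True),
}

tokens = torch.randint(0, vocab, (1, seq_len))
x = emb(tokens)                                   # (1, seq_len, dim)

with torch.no_grad():
    for name, rnn in units.items():
        out, _ = rnn(x)                           # hidden state at every position
        h0 = out[0, 0]                            # state right after the first token
        # Cosine similarity to later states: a crude proxy for how much of
        # the first token's trace is still present k steps later.
        drift = [F.cosine_similarity(h0, out[0, t], dim=0).item()
                 for t in range(seq_len)]
        print(name, [round(d, 2) for d in drift[::10]])
```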
Related papers
- Echoes of the past: A unified perspective on fading memory and echo states [4.595000276111106]
Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data.
Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory.
This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results.
arXiv Detail & Related papers (2025-08-26T15:55:14Z) - Geometry of naturalistic object representations in recurrent neural network models of working memory [2.028720028008411]
We show how naturalistic object information is maintained in working memory in neural networks.
Our findings indicate that goal-driven RNNs employ chronological memory subspaces to track information over short time spans.
arXiv Detail & Related papers (2024-11-04T23:57:46Z) - Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH)
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z) - Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks [3.683202928838613]
We propose the Episodic Memory Theory (EMT), illustrating that RNNs can be conceptualized as discrete-time analogs of the recently proposed General Sequential Episodic Memory Model.
We introduce a novel set of algorithmic tasks tailored to probe the variable binding behavior in RNNs.
Our empirical investigations reveal that trained RNNs consistently converge to the variable binding circuit, thus indicating universality in the dynamics of RNNs.
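The summary does not list the tasks; one standard algorithmic task used to probe this kind of binding behavior is repeat-copy, where the network must reproduce an input sequence after a delimiter. The generator below is a sketch under that assumption, not necessarily one of the paper's tasks.

```python
# Hypothetical variable-binding probe task (an assumption for illustration,
# not the paper's task set): repeat-copy, where the model sees a sequence,
# a delimiter, and blanks, and must emit the sequence after the delimiter.
import numpy as np

def make_repeat_copy(n_examples, seq_len, n_symbols, seed=0):
    rng = np.random.default_rng(seed)
    delim, blank = n_symbols, n_symbols + 1       # two extra special symbols
    xs, ys = [], []
    for _ in range(n_examples):
        seq = rng.integers(0, n_symbols, size=seq_len)
        x = np.concatenate([seq, [delim], np.full(seq_len, blank)])
        y = np.concatenate([np.full(seq_len + 1, blank), seq])
        xs.append(x)
        ys.append(y)
    return np.stack(xs), np.stack(ys)

X, Y = make_repeat_copy(n_examples=4, seq_len=5, n_symbols=8)
print(X[0])   # symbols, then the delimiter (8), then blanks (9)
print(Y[0])   # blanks up to the delimiter, then the original symbols
```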
arXiv Detail & Related papers (2023-10-03T20:52:37Z) - Empirical Analysis of Limits for Memory Distance in Recurrent Neural Networks [10.09712608508383]
We show that RNNs trained with standard backpropagation are still able to reproduce data points from a few steps back in the sequence by memorizing them outright.
For classical RNNs, LSTM, and GRU networks, the distance between recurrent calls over which data points can be reproduced this way is highly limited.
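A direct way to make this kind of limit concrete is to train a unit to output the token it saw k steps earlier and observe at which k accuracy collapses. The sketch below follows that idea; the task, hyperparameters, and training budget are illustrative assumptions, not the paper's protocol.

```python
# Minimal memory-distance probe: train a recurrent unit to emit the token it
# saw k steps earlier and measure accuracy for growing k. Illustrative only;
# not the experimental setup of the paper summarized above.
import torch

def probe_memory_distance(cell_cls, k, steps=300, vocab=10, dim=64, seq_len=30):
    emb = torch.nn.Embedding(vocab, dim)
    rnn = cell_cls(dim, dim, batch_first=True)
    head = torch.nn.Linear(dim, vocab)
    opt = torch.optim.Adam(
        [*emb.parameters(), *rnn.parameters(), *head.parameters()], lr=1e-3)
    for _ in range(steps):
        x = torch.randint(0, vocab, (32, seq_len))
        out, _ = rnn(emb(x))
        logits = head(out[:, k:])                 # predict the token k steps back
        loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, vocab), x[:, :-k].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        x = torch.randint(0, vocab, (256, seq_len))
        pred = head(rnn(emb(x))[0][:, k:]).argmax(-1)
        return (pred == x[:, :-k]).float().mean().item()

for k in (1, 2, 5, 10):
    acc = probe_memory_distance(torch.nn.LSTM, k)
    print("LSTM, distance", k, "accuracy", round(acc, 2))
```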
arXiv Detail & Related papers (2022-12-20T08:20:48Z) - Selective Memory Recursive Least Squares: Recast Forgetting into Memory in RBF Neural Network Based Real-Time Learning [2.31120983784623]
In radial basis function neural network (RBFNN) based real-time learning tasks, forgetting mechanisms are widely used.
This paper proposes a real-time training method named selective memory recursive least squares (SMRLS) in which the classical forgetting mechanisms are recast into a memory mechanism.
With SMRLS, the input space of the RBFNN is evenly divided into a finite number of partitions and a synthesized objective function is developed using synthesized samples from each partition.
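A stripped-down illustration of that partition-and-synthesize idea is given below: the input space is split into a fixed grid, each partition keeps one representative sample (here simply the most recent one), and the RBF output weights are refit by regularized least squares over those samples. This is a simplification for illustration, not the actual SMRLS recursive update.

```python
# Simplified sketch of partitioned memory for an RBF network (illustration
# only; not the actual SMRLS algorithm). Each partition of the input space
# keeps one representative sample, and output weights are fit from the
# stored samples so that previously covered regions are not forgotten.
import numpy as np

rng = np.random.default_rng(0)
centers = np.linspace(-1, 1, 20)[:, None]            # Gaussian RBF centers
def phi(x):                                          # x: (n, 1) -> (n, 20)
    return np.exp(-((x - centers.T) ** 2) / 0.02)

n_parts = 10
edges = np.linspace(-1, 1, n_parts + 1)
memory = {}                                          # partition id -> (x, y)

def observe(x, y):
    """Keep the newest sample of the partition containing x."""
    pid = int(np.clip(np.searchsorted(edges, x) - 1, 0, n_parts - 1))
    memory[pid] = (x, y)

def fit_weights(reg=1e-3):
    """Regularized least-squares fit on the stored samples."""
    X = phi(np.array([[v[0]] for v in memory.values()]))
    Y = np.array([v[1] for v in memory.values()])
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)

# Stream samples of a target function; earlier regions are not overwritten.
for x in rng.uniform(-1, 1, 500):
    observe(x, np.sin(3 * x))
w = fit_weights()
print("prediction at 0.3:", (phi(np.array([[0.3]])) @ w).item(),
      "target:", float(np.sin(0.9)))
```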
arXiv Detail & Related papers (2022-11-15T05:29:58Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
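One simple way to check such a claim empirically is to run a recurrent net over data and inspect the singular value spectrum of its hidden-state trajectories. The snippet below shows only that measurement, on an untrained GRU with random inputs; which trained model and dataset to probe is left open.

```python
# Illustrative measurement of hidden-state dimensionality (the GRU here is
# untrained and the inputs are random; only the measurement itself matters).
import torch

rnn = torch.nn.GRU(16, 128, batch_first=True)
x = torch.randn(64, 50, 16)                     # 64 random sequences of length 50
with torch.no_grad():
    states, _ = rnn(x)                          # (64, 50, 128) hidden states
H = states.reshape(-1, 128)
H = H - H.mean(dim=0)                           # center the state cloud
s = torch.linalg.svdvals(H)                     # singular value spectrum
explained = torch.cumsum(s ** 2, dim=0) / (s ** 2).sum()
print("dimensions needed for 95% of the variance:",
      int((explained < 0.95).sum().item()) + 1)
```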
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Measures of Information Reflect Memorization Patterns [53.71420125627608]
We show that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization.
Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples.
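As a rough illustration of what a "diversity of activation patterns" measure might look like, the snippet below computes the mean binary entropy of per-neuron firing rates over a set of unlabelled examples; this is a plausible stand-in, not necessarily the information measure used in the paper.

```python
# One possible diversity measure over neural activations (an assumption for
# illustration; the paper's exact information measures are not reproduced):
# the mean binary entropy of how often each hidden unit is active.
import numpy as np

def activation_diversity(acts, thresh=0.0):
    """acts: (n_examples, n_neurons) activations on unlabelled examples."""
    on = (acts > thresh).mean(axis=0)                 # firing rate per neuron
    p = np.clip(on, 1e-8, 1 - 1e-8)
    entropy = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return entropy.mean()

acts = np.random.default_rng(0).normal(size=(1000, 256))   # placeholder activations
print("mean per-neuron entropy (bits):", round(float(activation_diversity(acts)), 3))
```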
arXiv Detail & Related papers (2022-10-17T20:15:24Z) - Recurrence-in-Recurrence Networks for Video Deblurring [58.49075799159015]
State-of-the-art video deblurring methods often adopt recurrent neural networks to model the temporal dependency between the frames.
In this paper, we propose a recurrence-in-recurrence network architecture to cope with the limitations of short-ranged memory.
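The architectural idea can be sketched as an outer per-frame recurrence whose cell contains an inner recurrent refinement loop. The toy cell below is only a schematic of that nesting under assumed layer types and sizes, not the paper's deblurring architecture.

```python
# Schematic sketch of recurrence-in-recurrence: an outer frame-to-frame
# recurrence whose cell runs a few inner recurrent refinement steps per frame.
# Layer choices and sizes are illustrative assumptions.
import torch

class RecurrenceInRecurrenceCell(torch.nn.Module):
    def __init__(self, feat_dim, inner_steps=3):
        super().__init__()
        self.outer = torch.nn.GRUCell(feat_dim, feat_dim)
        self.inner = torch.nn.GRUCell(feat_dim, feat_dim)
        self.inner_steps = inner_steps

    def forward(self, frame_feat, h):
        h = self.outer(frame_feat, h)           # outer (frame-to-frame) update
        for _ in range(self.inner_steps):       # inner refinement recurrence
            h = self.inner(h, h)
        return h

cell = RecurrenceInRecurrenceCell(feat_dim=64)
h = torch.zeros(2, 64)
for frame_feat in torch.randn(10, 2, 64):       # 10 frames, batch of 2
    h = cell(frame_feat, h)
print(h.shape)                                  # torch.Size([2, 64])
```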
arXiv Detail & Related papers (2022-03-12T11:58:13Z) - Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z) - Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
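The modular, multi-scale idea described above can be sketched as a hidden state split into modules that update at different rates, with a method for appending a new (slower) module during training. The sketch below makes several assumptions (GRU cells, power-of-two update periods) and is not the paper's exact architecture or training algorithm.

```python
# Illustrative sketch of a modular, multi-scale hidden state: module i is
# updated every 2^i steps, and slower modules can be added incrementally.
# Details are assumptions, not the summarized paper's architecture.
import torch

class MultiScaleRNN(torch.nn.Module):
    def __init__(self, in_dim, module_dim):
        super().__init__()
        self.in_dim, self.module_dim = in_dim, module_dim
        self.cells = torch.nn.ModuleList([torch.nn.GRUCell(in_dim, module_dim)])

    def add_memory_module(self):
        """Incrementally grow the memory with one slower module."""
        self.cells.append(torch.nn.GRUCell(self.in_dim, self.module_dim))

    def forward(self, x):                        # x: (batch, time, in_dim)
        b, T, _ = x.shape
        hs = [x.new_zeros(b, self.module_dim) for _ in self.cells]
        for t in range(T):
            for i, cell in enumerate(self.cells):
                if t % (2 ** i) == 0:            # module i updates every 2^i steps
                    hs[i] = cell(x[:, t], hs[i])
        return torch.cat(hs, dim=-1)             # concatenated multi-scale state

model = MultiScaleRNN(in_dim=8, module_dim=16)
model.add_memory_module()                        # now two timescales: 1 and 2
print(model(torch.randn(4, 20, 8)).shape)        # torch.Size([4, 32])
```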
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.