Learning to Rehearse in Long Sequence Memorization
- URL: http://arxiv.org/abs/2106.01096v1
- Date: Wed, 2 Jun 2021 11:58:30 GMT
- Title: Learning to Rehearse in Long Sequence Memorization
- Authors: Zhu Zhang, Chang Zhou, Jianxin Ma, Zhijie Lin, Jingren Zhou, Hongxia Yang and Zhou Zhao
- Abstract summary: Existing reasoning tasks often assume that the input contents can always be accessed while reasoning.
Memory augmented neural networks introduce a human-like write-read memory to compress and memorize the long input sequence in one pass.
But they have two serious drawbacks: 1) they continually update the memory from current information and inevitably forget the early contents; 2) they do not distinguish what information is important and treat all contents equally.
We propose the Rehearsal Memory to enhance long-sequence memorization by self-supervised rehearsal with a history sampler.
- Score: 107.14601197043308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing reasoning tasks often have an important assumption that the input
contents can always be accessed while reasoning, requiring unlimited storage
resources and suffering from severe time delay on long sequences. To achieve
efficient reasoning on long sequences with limited storage resources, memory
augmented neural networks introduce a human-like write-read memory to compress
and memorize the long input sequence in one pass, trying to answer subsequent
queries based only on the memory. But they have two serious drawbacks: 1) they
continually update the memory from current information and inevitably forget
the early contents; 2) they do not distinguish what information is important
and treat all contents equally. In this paper, we propose the Rehearsal Memory
(RM) to enhance long-sequence memorization by self-supervised rehearsal with a
history sampler. To alleviate the gradual forgetting of early information, we
design self-supervised rehearsal training with recollection and familiarity
tasks. Further, we design a history sampler to select informative fragments for
rehearsal training, making the memory focus on the crucial information. We
evaluate the performance of our rehearsal memory on the synthetic bAbI task and
several downstream tasks, including text/video question answering and
recommendation on long sequences.
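To make the described pipeline concrete, the sketch below shows one way the pieces named in the abstract (a write-read memory updated in one pass, a history sampler supplying fragments, and recollection/familiarity rehearsal objectives) could fit together in PyTorch. It is only a minimal illustration: the module names, slot count, gating scheme, and loss forms are assumptions made for this sketch, not the authors' released implementation.

```python
# Minimal, illustrative sketch of a rehearsal-style memory in PyTorch.
# All names, sizes, and design details are assumptions for illustration;
# they are not taken from the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RehearsalMemorySketch(nn.Module):
    def __init__(self, dim=256, num_slots=32, num_heads=4):
        super().__init__()
        self.memory_init = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        # Write step: memory slots attend over the incoming chunk.
        self.write_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.write_gate = nn.Linear(2 * dim, dim)
        # Read step used by the recollection task.
        self.read_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.recollect = nn.Linear(dim, dim)   # reconstructs a sampled fragment
        self.familiar = nn.Linear(2 * dim, 1)  # scores "was this fragment seen?"

    def write(self, memory, chunk):
        """Update the fixed-size memory from one chunk of the input sequence."""
        update, _ = self.write_attn(memory, chunk, chunk)
        gate = torch.sigmoid(self.write_gate(torch.cat([memory, update], dim=-1)))
        return gate * update + (1 - gate) * memory  # gated overwrite of each slot

    def forward(self, chunks):
        """chunks: (batch, num_chunks, chunk_len, dim) -> memory (batch, slots, dim)."""
        memory = self.memory_init.unsqueeze(0).expand(chunks.size(0), -1, -1)
        for t in range(chunks.size(1)):
            memory = self.write(memory, chunks[:, t])
        return memory

    def rehearsal_losses(self, memory, fragment_cue, fragment_target, is_seen):
        """Self-supervised rehearsal on a fragment chosen by a history sampler.

        fragment_cue:    (batch, frag_len, dim) partially masked fragment used as query
        fragment_target: (batch, frag_len, dim) original fragment to recollect
        is_seen:         (batch,) 1.0 if the fragment occurred in the input, else 0.0
        """
        # Recollection: recover the fragment by reading from the memory.
        recalled, _ = self.read_attn(fragment_cue, memory, memory)
        recollection = F.mse_loss(self.recollect(recalled), fragment_target)
        # Familiarity: judge whether the fragment was seen before.
        score = self.familiar(torch.cat([memory.mean(1), fragment_target.mean(1)], dim=-1))
        familiarity = F.binary_cross_entropy_with_logits(score.squeeze(-1), is_seen)
        return recollection + familiarity
```

A training step might then split a long input into fixed-length chunks, call `forward` to obtain the compressed memory, and add `rehearsal_losses` on fragments drawn by the history sampler alongside the downstream question-answering loss.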
Related papers
- LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory [68.97819665784442]
This paper introduces LongMemEval, a benchmark designed to evaluate five core long-term memory abilities of chat assistants.
LongMemEval presents a significant challenge to existing long-term memory systems.
We present a unified framework that breaks down the long-term memory design into four design choices.
arXiv Detail & Related papers (2024-10-14T17:59:44Z)
- Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks [42.22616978679253]
We introduce Sequence Order Recall Tasks (SORT), which we adapt from tasks used to study episodic memory in cognitive psychology.
SORT requires LLMs to recall the correct order of text segments, and provides a general framework that is both easily extendable and does not require any additional annotations.
Based on a human experiment with 155 participants, we show that humans can recall sequence order based on long-term memory of a book.
arXiv Detail & Related papers (2024-10-10T17:17:38Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-source and our own fine-tuned LMs across various tasks indicate that memorization varies strongly across different fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z)
- Evaluating Long-Term Memory in 3D Mazes [10.224858246626171]
Memory Maze is a 3D domain of randomized mazes designed for evaluating long-term memory in agents.
Unlike existing benchmarks, Memory Maze measures long-term memory separately from confounding agent abilities.
We find that current algorithms benefit from training with truncated backpropagation through time and succeed on small mazes, but fall short of human performance on the large mazes.
arXiv Detail & Related papers (2022-10-24T16:32:28Z)
- Keep Me Updated! Memory Management in Long-term Conversations [14.587940208778843]
We present a novel task and a dataset of memory management in long-term conversations.
We propose a new mechanism of memory management that eliminates invalidated or redundant information.
Experimental results show that our approach outperforms the baselines in terms of engagingness and humanness.
arXiv Detail & Related papers (2022-10-17T05:06:38Z)
- Not All Memories are Created Equal: Learning to Forget by Expiring [49.053569908417636]
We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information.
This forgetting of memories enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently.
We show that Expire-Span can scale to memories that are tens of thousands in size, setting a new state of the art on incredibly long context tasks.
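A minimal sketch of this expiration idea is given after this list.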
arXiv Detail & Related papers (2021-05-13T20:50:13Z)
- Sequential Recommender via Time-aware Attentive Memory Network [67.26862011527986]
We propose a temporal gating methodology to improve the attention mechanism and recurrent units.
We also propose a Multi-hop Time-aware Attentive Memory network to integrate long-term and short-term preferences.
Our approach is scalable for candidate retrieval tasks and can be viewed as a non-linear generalization of latent factorization for dot-product based Top-K recommendation.
arXiv Detail & Related papers (2020-05-18T11:29:38Z)
- Encoding-based Memory Modules for Recurrent Neural Networks [79.42778415729475]
We study the memorization subtask from the point of view of the design and training of recurrent neural networks.
We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with a linear autoencoder for sequences.
arXiv Detail & Related papers (2020-01-31T11:14:27Z)
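As referenced in the Expire-Span entry above, the following is a minimal sketch of a learned-expiration attention mask in that spirit: each memory receives a predicted lifespan, and memories older than their lifespan fade out of attention. The shapes, ramp length, and span penalty here are assumptions made for illustration, not the Expire-Span implementation.

```python
# Illustrative sketch of a learned-expiration mask in the spirit of Expire-Span.
# The ramp length, maximum span, and penalty form are assumptions for illustration.
import torch
import torch.nn as nn


class ExpireSpanMaskSketch(nn.Module):
    def __init__(self, dim, max_span=16384, ramp=128):
        super().__init__()
        self.span_head = nn.Linear(dim, 1)  # predicts how long each memory should live
        self.max_span = max_span
        self.ramp = ramp

    def forward(self, hidden, query_time, key_times):
        """hidden: (batch, seq, dim) memory states; query_time: current timestep (scalar);
        key_times: (seq,) timesteps at which each memory was written.
        Returns a soft [0, 1] mask over memories and a penalty that keeps spans short."""
        spans = self.max_span * torch.sigmoid(self.span_head(hidden)).squeeze(-1)
        age = query_time - key_times          # how long ago each memory was written
        # Memories older than their predicted span fade out linearly over `ramp` steps.
        mask = torch.clamp((spans - age) / self.ramp, min=0.0, max=1.0)
        return mask, spans.mean()
```

The mask would multiply the attention weights over those memories (followed by renormalization), and fully expired entries can be dropped from the cache, which is what allows efficient attention over tens of thousands of past timesteps.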