Information-Theoretic Dual Memory System for Continual Learning
- URL: http://arxiv.org/abs/2501.07382v1
- Date: Mon, 13 Jan 2025 15:01:12 GMT
- Title: Information-Theoretic Dual Memory System for Continual Learning
- Authors: RunQing Wu, KaiHui Huang, HanYi Zhang, QiHe Liu, GuoJin Yu, JingSong Deng, Fei Ye
- Abstract summary: We propose an innovative dual memory system called the Information-Theoretic Dual Memory System (ITDMS).
This system comprises a fast memory buffer designed to retain temporary and novel samples, alongside a slow memory buffer dedicated to preserving critical and informative samples.
Our methodology is rigorously assessed through a series of continual learning experiments, with empirical results underscoring the effectiveness of the proposed system.
- Score: 8.803516528821161
- Abstract: Continuously acquiring new knowledge from a dynamic environment is a fundamental capability for animals, facilitating their survival and ability to address various challenges. This capability is referred to as continual learning, which focuses on the ability to learn a sequence of tasks without the detriment of previous knowledge. A prevalent strategy to tackle continual learning involves selecting and storing numerous essential data samples from prior tasks within a fixed-size memory buffer. However, the majority of current memory-based techniques typically utilize a single memory buffer, which poses challenges in concurrently managing newly acquired and previously learned samples. Drawing inspiration from the Complementary Learning Systems (CLS) theory, which defines rapid and gradual learning mechanisms for processing information, we propose an innovative dual memory system called the Information-Theoretic Dual Memory System (ITDMS). This system comprises a fast memory buffer designed to retain temporary and novel samples, alongside a slow memory buffer dedicated to preserving critical and informative samples. The fast memory buffer is optimized employing an efficient reservoir sampling process. Furthermore, we introduce a novel information-theoretic memory optimization strategy that selectively identifies and retains diverse and informative data samples for the slow memory buffer. Additionally, we propose a novel balanced sample selection procedure that automatically identifies and eliminates redundant memorized samples, thus freeing up memory capacity for new data acquisitions, which can deal with a growing array of tasks. Our methodology is rigorously assessed through a series of continual learning experiments, with empirical results underscoring the effectiveness of the proposed system.
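The abstract states that the fast memory buffer is maintained with reservoir sampling but gives no implementation details. As a rough illustration only, the following sketch shows the standard reservoir sampling procedure (Algorithm R) applied to a fixed-size buffer. The class name `ReservoirBuffer` and its methods `add` and `sample` are hypothetical names chosen for this example, not taken from the paper, and the slow buffer's information-theoretic selection is not modeled here.

```python
import random
from typing import Any, List, Optional


class ReservoirBuffer:
    """Fixed-size buffer filled by standard reservoir sampling (Algorithm R).

    Hypothetical illustration; not the authors' implementation.
    """

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[Any] = []
        self.num_seen = 0  # total number of samples offered so far
        self._rng = random.Random(seed)

    def add(self, sample: Any) -> None:
        """Offer one stream sample to the buffer."""
        self.num_seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always store the sample.
            self.items.append(sample)
            return
        # Draw a slot uniformly from [0, num_seen). The new sample replaces
        # a stored one only if the slot falls inside the buffer, which
        # happens with probability capacity / num_seen.
        slot = self._rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.items[slot] = sample

    def sample(self, batch_size: int) -> List[Any]:
        """Draw a replay batch without replacement."""
        k = min(batch_size, len(self.items))
        return self._rng.sample(self.items, k)


if __name__ == "__main__":
    buffer = ReservoirBuffer(capacity=5, seed=0)
    for x in range(100):
        buffer.add(x)
    print(len(buffer.items), buffer.num_seen)
```

With this update rule, every sample seen so far has the same probability of being in the buffer at any point in the stream, which is why reservoir sampling is a common default for replay buffers when task boundaries are unknown.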
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Holistic Memory Diversification for Incremental Learning in Growing Graphs [16.483780704430405]
The goal is to continually train a graph model to handle new tasks while retaining its inference ability on previous tasks.
Existing methods usually neglect the importance of memory diversity, which limits their ability to select high-quality memory from previous tasks.
We introduce a novel holistic Diversified Memory Selection and Generation framework for incremental learning in graphs.
arXiv Detail & Related papers (2024-06-11T16:18:15Z)
- Lifelong Event Detection with Embedding Space Separation and Compaction [30.05158209938146]
Existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task.
The simple combination of memory data and new-task samples can still result in substantial forgetting of previously acquired knowledge.
We propose a novel method based on embedding space separation and compaction.
arXiv Detail & Related papers (2024-04-03T06:51:49Z)
- Summarizing Stream Data for Memory-Constrained Online Continual Learning [17.40956484727636]
We propose to Summarize the knowledge from the Stream Data (SSD) into more informative samples by distilling the training characteristics of real images.
We demonstrate that with limited extra computational overhead, SSD provides more than 3% accuracy boost for sequential CIFAR-100 under extremely restricted memory buffer.
arXiv Detail & Related papers (2023-05-26T05:31:51Z)
- Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z)
- A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z)
- Memory-Based Label-Text Tuning for Few-Shot Class-Incremental Learning [20.87638654650383]
We propose leveraging the label-text information by adopting the memory prompt.
The memory prompt can learn new data sequentially, and meanwhile store the previous knowledge.
Experiments show that our proposed method outperforms all prior state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-03T13:15:45Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Memory Replay with Data Compression for Continual Learning [80.95444077825852]
We propose memory replay with data compression to reduce the storage cost of old training samples.
We extensively validate this across several benchmarks of class-incremental learning and in a realistic scenario of object detection for autonomous driving.
arXiv Detail & Related papers (2022-02-14T10:26:23Z)
- Learning to Rehearse in Long Sequence Memorization [107.14601197043308]
Existing reasoning tasks often have an important assumption that the input contents can be always accessed while reasoning.
Memory augmented neural networks introduce a human-like write-read memory to compress and memorize the long input sequence in one pass.
But they have two serious drawbacks: 1) they continually update the memory from current information and inevitably forget the early contents; 2) they do not distinguish what information is important and treat all contents equally.
We propose the Rehearsal Memory to enhance long-sequence memorization by self-supervised rehearsal with a history sampler.
arXiv Detail & Related papers (2021-06-02T11:58:30Z)
- Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.