Carousel Memory: Rethinking the Design of Episodic Memory for Continual
Learning
- URL: http://arxiv.org/abs/2110.07276v2
- Date: Fri, 15 Oct 2021 03:49:25 GMT
- Title: Carousel Memory: Rethinking the Design of Episodic Memory for Continual
Learning
- Authors: Soobee Lee, Minindu Weerakoon, Jonghyun Choi, Minjia Zhang, Di Wang,
Myeongjae Jeon
- Abstract summary: Continual Learning (CL) aims to learn from a continuous stream of tasks without forgetting knowledge learned from the previous tasks.
Previous studies exploit episodic memory (EM), which stores a subset of the past observed samples while learning from new non-i.i.d. data.
We propose to exploit the abundant storage to preserve past experiences and alleviate forgetting by allowing CL to efficiently migrate samples between memory and storage.
- Score: 19.260402028696916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual Learning (CL) is an emerging machine learning paradigm that aims to
learn from a continuous stream of tasks without forgetting knowledge learned
from the previous tasks. To avoid the performance degradation caused by
forgetting, prior studies exploit episodic memory (EM), which stores a subset
of the past observed samples while learning from new non-i.i.d. data. Despite
the promising results, since CL is often assumed to execute on mobile or IoT
devices, the EM size is bounded by the small hardware memory capacity, making
it infeasible to meet the accuracy requirements of real-world applications.
Specifically, all prior CL methods discard samples that overflow the EM and
can never retrieve them for subsequent training steps, incurring a loss of
information
that would exacerbate catastrophic forgetting. We explore a novel hierarchical
EM management strategy to address the forgetting issue. In particular, in
mobile and IoT devices, real-time data can be stored not just in high-speed
RAMs but in internal storage devices as well, which offer significantly larger
capacity than the RAMs. Based on this insight, we propose to exploit the
abundant storage to preserve past experiences and alleviate forgetting by
allowing CL to efficiently migrate samples between memory and storage without
being hindered by the slow access speed of the storage. We call it Carousel
Memory (CarM). As CarM is complementary to existing CL methods, we conduct
extensive evaluations of our method with seven popular CL methods and show that
CarM significantly improves the accuracy of the methods across different
settings by large margins in final average accuracy (up to 28.4%) while
retaining the same training efficiency.
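The hierarchical memory/storage design described above can be pictured as a two-level episodic memory: a small, fast buffer in RAM that the trainer draws rehearsal batches from, backed by a much larger pool on storage that keeps everything evicted from RAM. The Python sketch below is only an illustration of that idea under assumed details (the `HierarchicalEM` name, random eviction/swap policy, and pickle files on disk are all assumptions for exposition); it is not the actual CarM implementation.

```python
import os
import pickle
import random


class HierarchicalEM:
    """Illustrative two-level episodic memory: a small in-RAM buffer backed by
    a much larger on-storage pool. Samples evicted from RAM are persisted to
    storage instead of being discarded, and stored samples can later be swapped
    back into RAM so subsequent training steps may revisit them."""

    def __init__(self, ram_capacity, storage_dir):
        self.ram = []                      # fast, small rehearsal buffer
        self.ram_capacity = ram_capacity
        self.storage_dir = storage_dir     # large but slow pool (e.g. flash)
        os.makedirs(storage_dir, exist_ok=True)
        self._next_id = 0

    def add(self, sample):
        """Insert a new stream sample; overflow goes to storage, not the bin."""
        if len(self.ram) < self.ram_capacity:
            self.ram.append(sample)
            return
        idx = random.randrange(self.ram_capacity)
        self._persist(self.ram[idx])       # keep the evicted sample on storage
        self.ram[idx] = sample

    def swap_in(self, k=1):
        """Migrate up to k stored samples back into RAM."""
        if not self.ram:
            return
        files = os.listdir(self.storage_dir)
        for name in random.sample(files, min(k, len(files))):
            path = os.path.join(self.storage_dir, name)
            with open(path, "rb") as f:
                incoming = pickle.load(f)
            os.remove(path)
            idx = random.randrange(len(self.ram))
            self._persist(self.ram[idx])   # displaced RAM sample stays on storage
            self.ram[idx] = incoming

    def sample_batch(self, batch_size):
        """Draw a rehearsal batch from the fast RAM buffer only."""
        return random.sample(self.ram, min(batch_size, len(self.ram)))

    def _persist(self, sample):
        path = os.path.join(self.storage_dir, f"{self._next_id}.pkl")
        with open(path, "wb") as f:
            pickle.dump(sample, f)
        self._next_id += 1
```

In a real system the `swap_in` step would run off the training critical path (e.g. on a background thread or I/O queue), which is how the slow storage access could be kept from stalling training, in line with the abstract's claim of retaining training efficiency.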
Related papers
- Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications.
We propose MEMO, a novel framework for fine-grained activation memory management.
We show that MEMO achieves an average of 2.42x and 2.26x MFU compared to Megatron-LM and DeepSpeed, respectively.
arXiv Detail & Related papers (2024-07-16T18:59:49Z) - Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning [0.0]
Continual learning focuses on learning from non-stationary data distributions without forgetting previous knowledge.
Rehearsal-based approaches are commonly used to combat catastrophic forgetting.
We introduce the Adversarially Diversified Rehearsal Memory to address the memory overfitting challenge.
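For context on the rehearsal mechanism mentioned in this entry: a typical rehearsal-based method keeps a fixed-size buffer of past samples and mixes them into each new training batch. The sketch below is a generic reservoir-sampling replay buffer; it only illustrates plain rehearsal and does not implement ADRM's adversarial diversification.

```python
import random


class ReservoirReplayBuffer:
    """Generic fixed-size rehearsal buffer using reservoir sampling, so every
    sample observed so far has an equal probability of being retained."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0          # total number of stream samples observed

    def add(self, x, y):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((x, y))
        else:
            # Replace an existing slot with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = (x, y)

    def sample(self, batch_size):
        """Rehearsal batch to mix into training alongside new stream data."""
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```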
arXiv Detail & Related papers (2024-05-20T06:56:43Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability in language modeling in general and in knowledge-intensive tasks in particular.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Beyond Memorization: The Challenge of Random Memory Access in Language Models [56.525691003233554]
We investigate whether a generative Language Model (LM) is able to access its memory sequentially or randomly.
We find that techniques including recitation and permutation improve the random memory access capability of LMs.
arXiv Detail & Related papers (2024-03-12T16:42:44Z) - Cost-effective On-device Continual Learning over Memory Hierarchy with Miro [32.93163587457259]
Miro is a novel system runtime that dynamically configures the CL system based on resource states for the best cost-effectiveness.
Miro significantly outperforms baseline systems we build for comparison, consistently achieving higher cost-effectiveness.
arXiv Detail & Related papers (2023-08-11T10:05:53Z) - RMM: Reinforced Memory Management for Class-Incremental Learning [102.20140790771265]
Class-Incremental Learning (CIL) trains classifiers under a strict memory budget.
Existing methods use a static and ad hoc strategy for memory allocation, which is often sub-optimal.
We propose a dynamic memory management strategy that is optimized for the incremental phases and different object classes.
arXiv Detail & Related papers (2023-01-14T00:07:47Z) - Improving information retention in large scale online continual learning [99.73847522194549]
Online continual learning aims to adapt efficiently to new data while retaining existing knowledge.
Recent work suggests that information retention remains a problem in large scale OCL even when the replay buffer is unlimited.
We propose using a moving average family of methods to improve optimization for non-stationary objectives.
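As a rough illustration of the moving-average idea mentioned in this entry, the snippet below keeps an exponential moving average of model parameters, a common way to smooth optimization under a non-stationary objective. It is a generic sketch under assumed names (`ema_update`, dict-of-arrays parameters) and does not reproduce the paper's specific family of methods.

```python
def ema_update(avg_params, live_params, decay=0.999):
    """One step of an exponential moving average over model parameters.
    avg_params / live_params map parameter names to arrays (or floats).
    The averaged copy drifts slowly toward the continually trained weights,
    which can stabilize behavior when the objective keeps shifting."""
    return {
        name: decay * avg_params[name] + (1.0 - decay) * live_params[name]
        for name in live_params
    }

# Usage sketch: after every online update, refresh the averaged copy with
# ema_update and evaluate with the averaged parameters rather than the raw
# live weights.
```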
arXiv Detail & Related papers (2022-10-12T16:59:43Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Improving Task-free Continual Learning by Distributionally Robust Memory Evolution [9.345559196495746]
Task-free continual learning aims to learn a non-stationary data stream without explicit task definitions and not forget previous knowledge.
Existing methods overlook the high uncertainty in the memory data distribution.
We propose a principled memory evolution framework to dynamically evolve the memory data distribution.
arXiv Detail & Related papers (2022-07-15T02:16:09Z) - Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z) - Prototypes-Guided Memory Replay for Continual Learning [13.459792148030717]
Continual learning (CL) refers to a machine learning paradigm that uses only a small amount of training samples and previously learned knowledge to enhance learning performance.
The major difficulty in CL is catastrophic forgetting of previously learned tasks, caused by shifts in data distributions.
We propose a memory-efficient CL method, incorporating it into an online meta-learning model.
arXiv Detail & Related papers (2021-08-28T13:00:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.