Carousel Memory: Rethinking the Design of Episodic Memory for Continual
Learning
- URL: http://arxiv.org/abs/2110.07276v2
- Date: Fri, 15 Oct 2021 03:49:25 GMT
- Title: Carousel Memory: Rethinking the Design of Episodic Memory for Continual
Learning
- Authors: Soobee Lee, Minindu Weerakoon, Jonghyun Choi, Minjia Zhang, Di Wang,
Myeongjae Jeon
- Abstract summary: Continual Learning (CL) aims to learn from a continuous stream of tasks without forgetting knowledge learned from the previous tasks.
Previous studies exploit episodic memory (EM), which stores a subset of the past observed samples while learning from new non-i.i.d. data.
We propose to exploit the abundant storage to preserve past experiences and alleviate forgetting by allowing CL to efficiently migrate samples between memory and storage.
- Score: 19.260402028696916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual Learning (CL) is an emerging machine learning paradigm that aims to
learn from a continuous stream of tasks without forgetting knowledge learned
from the previous tasks. To avoid the performance degradation caused by
forgetting, prior studies exploit episodic memory (EM), which stores a subset
of the past observed samples while learning from new non-i.i.d. data. Despite
the promising results, since CL is often assumed to execute on mobile or IoT
devices, the EM size is bounded by the small hardware memory capacity, making
it infeasible to meet the accuracy requirements of real-world applications.
Specifically, all prior CL methods discard samples that overflow the EM and
can never retrieve them for subsequent training steps, incurring a loss of
information
that would exacerbate catastrophic forgetting. We explore a novel hierarchical
EM management strategy to address the forgetting issue. In particular, in
mobile and IoT devices, real-time data can be stored not just in high-speed
RAMs but in internal storage devices as well, which offer significantly larger
capacity than the RAMs. Based on this insight, we propose to exploit the
abundant storage to preserve past experiences and alleviate forgetting by
allowing CL to efficiently migrate samples between memory and storage without
being hindered by the slow access speed of the storage. We call it Carousel
Memory (CarM). As CarM is complementary to existing CL methods, we conduct
extensive evaluations of our method with seven popular CL methods and show that
CarM significantly improves the accuracy of the methods across different
settings by large margins in final average accuracy (up to 28.4%) while
retaining the same training efficiency.
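The hierarchical memory/storage design described above can be pictured as a two-level episodic memory: a small, fast buffer in RAM that the trainer draws rehearsal batches from, backed by a much larger pool on storage that keeps everything evicted from RAM. The Python sketch below is only an illustration of that idea under assumed details (the `HierarchicalEM` name, random eviction/swap policy, and pickle files on disk are all assumptions for exposition); it is not the actual CarM implementation.

```python
import os
import pickle
import random


class HierarchicalEM:
    """Illustrative two-level episodic memory: a small in-RAM buffer backed by
    a much larger on-storage pool. Samples evicted from RAM are persisted to
    storage instead of being discarded, and stored samples can later be swapped
    back into RAM so subsequent training steps may revisit them."""

    def __init__(self, ram_capacity, storage_dir):
        self.ram = []                      # fast, small rehearsal buffer
        self.ram_capacity = ram_capacity
        self.storage_dir = storage_dir     # large but slow pool (e.g. flash)
        os.makedirs(storage_dir, exist_ok=True)
        self._next_id = 0

    def add(self, sample):
        """Insert a new stream sample; overflow goes to storage, not the bin."""
        if len(self.ram) < self.ram_capacity:
            self.ram.append(sample)
            return
        idx = random.randrange(self.ram_capacity)
        self._persist(self.ram[idx])       # keep the evicted sample on storage
        self.ram[idx] = sample

    def swap_in(self, k=1):
        """Migrate up to k stored samples back into RAM."""
        if not self.ram:
            return
        files = os.listdir(self.storage_dir)
        for name in random.sample(files, min(k, len(files))):
            path = os.path.join(self.storage_dir, name)
            with open(path, "rb") as f:
                incoming = pickle.load(f)
            os.remove(path)
            idx = random.randrange(len(self.ram))
            self._persist(self.ram[idx])   # displaced RAM sample stays on storage
            self.ram[idx] = incoming

    def sample_batch(self, batch_size):
        """Draw a rehearsal batch from the fast RAM buffer only."""
        return random.sample(self.ram, min(batch_size, len(self.ram)))

    def _persist(self, sample):
        path = os.path.join(self.storage_dir, f"{self._next_id}.pkl")
        with open(path, "wb") as f:
            pickle.dump(sample, f)
        self._next_id += 1
```

In a real system the `swap_in` step would run off the training critical path (e.g. on a background thread or I/O queue), which is how the slow storage access could be kept from stalling training, in line with the abstract's claim of retaining training efficiency.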
Related papers
- Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications.
We propose MEMO, a novel framework for fine-grained activation memory management.
We show that MEMO achieves an average of 2.42x and 2.26x MFU compared to Megatron-LM and DeepSpeed, respectively.
arXiv Detail & Related papers (2024-07-16T18:59:49Z) - Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning [0.0]
Continual learning focuses on learning from non-stationary data distributions without forgetting previous knowledge.
Rehearsal-based approaches are commonly used to combat catastrophic forgetting.
We introduce the Adversarially Diversified Rehearsal Memory to address the memory overfitting challenge.
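For context on the rehearsal mechanism mentioned in this entry: a typical rehearsal-based method keeps a fixed-size buffer of past samples and mixes them into each new training batch. The sketch below is a generic reservoir-sampling replay buffer; it only illustrates plain rehearsal and does not implement ADRM's adversarial diversification.

```python
import random


class ReservoirReplayBuffer:
    """Generic fixed-size rehearsal buffer using reservoir sampling, so every
    sample observed so far has an equal probability of being retained."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0          # total number of stream samples observed

    def add(self, x, y):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((x, y))
        else:
            # Replace an existing slot with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = (x, y)

    def sample(self, batch_size):
        """Rehearsal batch to mix into training alongside new stream data."""
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```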
arXiv Detail & Related papers (2024-05-20T06:56:43Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability in language modeling in general and in knowledge-intensive tasks in particular.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Beyond Memorization: The Challenge of Random Memory Access in Language Models [56.525691003233554]
We investigate whether a generative Language Model (LM) is able to access its memory sequentially or randomly.
We find that techniques including recitation and permutation improve the random memory access capability of LMs.
arXiv Detail & Related papers (2024-03-12T16:42:44Z) - Cost-effective On-device Continual Learning over Memory Hierarchy with Miro [32.93163587457259]
Miro is a novel system runtime that dynamically configures the CL system based on resource states for the best cost-effectiveness.
Miro significantly outperforms baseline systems we build for comparison, consistently achieving higher cost-effectiveness.
arXiv Detail & Related papers (2023-08-11T10:05:53Z) - RMM: Reinforced Memory Management for Class-Incremental Learning [102.20140790771265]
Class-Incremental Learning (CIL) trains classifiers under a strict memory budget.
Existing methods use a static and ad hoc strategy for memory allocation, which is often sub-optimal.
We propose a dynamic memory management strategy that is optimized for the incremental phases and different object classes.
arXiv Detail & Related papers (2023-01-14T00:07:47Z) - Improving information retention in large scale online continual learning [99.73847522194549]
Online continual learning aims to adapt efficiently to new data while retaining existing knowledge.
Recent work suggests that information retention remains a problem in large scale OCL even when the replay buffer is unlimited.
We propose using a moving average family of methods to improve optimization for non-stationary objectives.
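As a rough illustration of the moving-average idea mentioned in this entry, the snippet below keeps an exponential moving average of model parameters, a common way to smooth optimization under a non-stationary objective. It is a generic sketch under assumed names (`ema_update`, dict-of-arrays parameters) and does not reproduce the paper's specific family of methods.

```python
def ema_update(avg_params, live_params, decay=0.999):
    """One step of an exponential moving average over model parameters.
    avg_params / live_params map parameter names to arrays (or floats).
    The averaged copy drifts slowly toward the continually trained weights,
    which can stabilize behavior when the objective keeps shifting."""
    return {
        name: decay * avg_params[name] + (1.0 - decay) * live_params[name]
        for name in live_params
    }

# Usage sketch: after every online update, refresh the averaged copy with
# ema_update and evaluate with the averaged parameters rather than the raw
# live weights.
```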
arXiv Detail & Related papers (2022-10-12T16:59:43Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Improving Task-free Continual Learning by Distributionally Robust Memory Evolution [9.345559196495746]
Task-free continual learning aims to learn a non-stationary data stream without explicit task definitions and not forget previous knowledge.
Existing methods overlook the high uncertainty in the memory data distribution.
We propose a principled memory evolution framework to dynamically evolve the memory data distribution.
arXiv Detail & Related papers (2022-07-15T02:16:09Z) - Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z) - Prototypes-Guided Memory Replay for Continual Learning [13.459792148030717]
Continual learning (CL) refers to a machine learning paradigm that uses only a small amount of training samples and previously learned knowledge to enhance learning performance.
The major difficulty in CL is catastrophic forgetting of previously learned tasks, caused by shifts in data distributions.
We propose a memory-efficient CL method, incorporating it into an online meta-learning model.
arXiv Detail & Related papers (2021-08-28T13:00:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.