Leveraging Lightweight Generators for Memory Efficient Continual Learning
- URL: http://arxiv.org/abs/2506.19692v1
- Date: Tue, 24 Jun 2025 14:59:52 GMT
- Title: Leveraging Lightweight Generators for Memory Efficient Continual Learning
- Authors: Christiaan Lamers, Ahmed Nabil Belbachir, Thomas Bäck, Niki van Stein
- Abstract summary: Catastrophic forgetting can be trivially alleviated by keeping all data from previous tasks in memory. This paper aims to decrease the memory required by memory-based continual learning algorithms.
- Score: 0.01874930567916036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting can be trivially alleviated by keeping all data from previous tasks in memory. Therefore, minimizing the memory footprint while maximizing the amount of relevant information is crucial to the challenge of continual learning. This paper aims to decrease the memory required by memory-based continual learning algorithms. We explore options for extracting a minimal amount of information while maximally alleviating forgetting. We propose the use of lightweight generators based on Singular Value Decomposition to enhance existing continual learning methods, such as A-GEM and Experience Replay. These generators need a minimal amount of memory while being maximally effective. They require no training time, just a single linear-time fitting step, and can capture a distribution effectively from a small number of data samples. Depending on the dataset and network architecture, our results show a significant increase in average accuracy compared to the original methods. Our method shows great potential for minimizing the memory footprint of memory-based continual learning algorithms.
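The abstract describes the generator only at a high level; the sketch below illustrates the general idea of fitting a truncated SVD to a small buffer of task samples and drawing pseudo-rehearsal samples from the resulting low-rank subspace. It is a minimal illustration rather than the authors' implementation: the class name, the rank parameter, and the diagonal-Gaussian coefficient model are assumptions.

```python
import numpy as np

class SVDGenerator:
    """Lightweight generator: fit a truncated SVD to a few samples of a task and
    draw new samples from the spanned low-rank subspace. Illustrative sketch only;
    the Gaussian coefficient model and the rank are assumptions."""

    def __init__(self, rank=10):
        self.rank = rank

    def fit(self, X):
        # X: (n_samples, n_features) flattened task data.
        # A single linear-algebra step, no training loop.
        self.mean_ = X.mean(axis=0)
        U, S, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        k = min(self.rank, Vt.shape[0])
        self.components_ = Vt[:k]                       # (k, n_features) basis
        coeffs = (X - self.mean_) @ self.components_.T  # project buffer onto basis
        self.coeff_mean_ = coeffs.mean(axis=0)          # per-direction statistics
        self.coeff_std_ = coeffs.std(axis=0) + 1e-8
        return self

    def sample(self, n):
        # Draw coefficients from a diagonal Gaussian and map back to input space.
        z = np.random.randn(n, self.components_.shape[0]) * self.coeff_std_ + self.coeff_mean_
        return z @ self.components_ + self.mean_

# Usage: fit on a small buffer of a past task, then replay generated samples
# alongside the current batch (e.g. inside Experience Replay or A-GEM).
buffer = np.random.rand(64, 28 * 28)      # stand-in for a few stored images
gen = SVDGenerator(rank=12).fit(buffer)
replay_batch = gen.sample(32)             # pseudo-samples for rehearsal
```

Only the mean, the k basis vectors, and the per-direction coefficient statistics need to be kept per task, which is typically far smaller than the raw sample buffer they stand in for.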
Related papers
- Information-Theoretic Dual Memory System for Continual Learning [8.803516528821161]
We propose an innovative dual memory system called the Information-Theoretic Dual Memory System (ITDMS). This system comprises a fast memory buffer designed to retain temporary and novel samples, alongside a slow memory buffer dedicated to preserving critical and informative samples. Our methodology is rigorously assessed through a series of continual learning experiments, with empirical results underscoring the effectiveness of the proposed system.
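A minimal sketch of such a dual-buffer scheme follows; the FIFO fast buffer and the importance-score promotion rule are assumptions for illustration, not the information-theoretic criteria defined in the paper.

```python
import random
from collections import deque

class DualMemory:
    """Illustrative fast/slow dual buffer. The promotion rule (a generic importance
    score) is an assumption; ITDMS defines its own information-theoretic criteria."""

    def __init__(self, fast_size=256, slow_size=1024):
        self.fast = deque(maxlen=fast_size)   # recent / novel samples, FIFO
        self.slow = []                        # long-term, importance-selected
        self.slow_size = slow_size

    def add(self, sample, importance):
        self.fast.append(sample)
        if len(self.slow) < self.slow_size:
            self.slow.append((importance, sample))
        else:
            # keep only the slow buffer's most informative samples
            i_min = min(range(len(self.slow)), key=lambda i: self.slow[i][0])
            if importance > self.slow[i_min][0]:
                self.slow[i_min] = (importance, sample)

    def replay_batch(self, k):
        pool = list(self.fast) + [s for _, s in self.slow]
        return random.sample(pool, min(k, len(pool)))
```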
arXiv Detail & Related papers (2025-01-13T15:01:12Z)
- An Efficient Procedure for Computing Bayesian Network Structure Learning [0.9208007322096532]
We propose a globally optimal Bayesian network structure discovery algorithm based on a progressively leveled scoring approach.
Experimental results indicate that our method, when run entirely in memory, not only reduces peak memory usage but also improves computational efficiency.
arXiv Detail & Related papers (2024-07-24T07:59:18Z)
- Towards Continuous Reuse of Graph Models via Holistic Memory Diversification [18.66123763295736]
This paper addresses the challenge of incremental learning in growing graphs with increasingly complex tasks. The goal is to continuously train a graph model to handle new tasks while retaining proficiency on previous tasks via memory replay. Existing methods usually overlook the importance of memory diversity, limiting their ability to select high-quality memory from previous tasks.
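As an illustration of diversity-aware memory selection, the sketch below uses a greedy k-center (farthest-point) heuristic over sample embeddings; this is a generic stand-in, not necessarily the selection rule used in the paper.

```python
import numpy as np

def k_center_greedy(embeddings, k):
    """Select k indices that cover the embedding space (farthest-point heuristic).
    Illustrative diversity-based memory selection, not the paper's exact method."""
    n = embeddings.shape[0]
    selected = [int(np.random.randint(n))]
    dists = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())              # farthest point from the current memory
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return selected

# e.g. node embeddings produced by the current task's graph model
emb = np.random.rand(500, 64)
memory_ids = k_center_greedy(emb, k=50)
```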
arXiv Detail & Related papers (2024-06-11T16:18:15Z)
- AdaLomo: Low-memory Optimization with Adaptive Learning Rate [59.64965955386855]
We introduce low-memory optimization with adaptive learning rate (AdaLomo) for large language models.
AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
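The summary does not spell out the mechanism; the sketch below shows one way adaptive updates can be made memory-light, using an Adafactor-style factored second-moment estimate so the optimizer state scales with rows plus columns rather than with the full parameter count. It illustrates the low-memory idea only and is not AdaLomo's exact algorithm.

```python
import numpy as np

def factored_adaptive_update(W, grad, row_acc, col_acc, lr=1e-3, beta2=0.99, eps=1e-8):
    """One adaptive step with a factored second-moment estimate (Adafactor-style).
    Keeps O(rows + cols) optimizer state instead of O(rows * cols).
    Illustrative sketch; bias correction and update clipping are omitted."""
    g2 = grad ** 2
    row_acc *= beta2; row_acc += (1 - beta2) * g2.mean(axis=1)   # per-row statistics
    col_acc *= beta2; col_acc += (1 - beta2) * g2.mean(axis=0)   # per-column statistics
    # rank-1 reconstruction of the per-parameter second moment
    v = np.outer(row_acc, col_acc) / (row_acc.mean() + eps)
    W -= lr * grad / (np.sqrt(v) + eps)
    return W, row_acc, col_acc

# toy usage on a single 2-D weight matrix
W = np.random.randn(512, 256)
row_acc, col_acc = np.zeros(512), np.zeros(256)
grad = np.random.randn(512, 256)
W, row_acc, col_acc = factored_adaptive_update(W, grad, row_acc, col_acc)
```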
arXiv Detail & Related papers (2023-10-16T09:04:28Z)
- Vocabulary-level Memory Efficiency for Language Model Fine-tuning [36.1039389951318]
We show that a significant proportion of the vocabulary remains unused during fine-tuning. We propose a simple yet effective approach that leverages this finding to minimize memory usage. Our approach does not impact downstream task performance, while allowing more efficient use of computational resources.
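A hedged sketch of the underlying idea, pruning embedding rows to the tokens that actually occur in the fine-tuning corpus, is shown below; the function name and remapping scheme are assumptions, and the paper's actual mechanism may differ.

```python
import numpy as np

def prune_embedding(embedding, corpus_token_ids, pad_id=0):
    """Keep only embedding rows for tokens that occur in the fine-tuning data.
    Returns the reduced matrix and an old-id -> new-id map. Illustrative sketch."""
    used = sorted(set(corpus_token_ids) | {pad_id})
    remap = {old: new for new, old in enumerate(used)}
    return embedding[used], remap

# toy example: a 50k-token vocabulary, but the fine-tuning corpus touches only a few ids
vocab_size, dim = 50_000, 768
embedding = np.random.randn(vocab_size, dim).astype(np.float32)
corpus_ids = [101, 2009, 2003, 1037, 3231, 102]
small_emb, remap = prune_embedding(embedding, corpus_ids)
print(small_emb.shape)   # (7, 768) instead of (50000, 768)
```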
arXiv Detail & Related papers (2023-09-15T19:00:00Z)
- Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z)
- A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
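A minimal exemplar memory of this kind can be sketched as follows; class-balanced reservoir sampling is used here for brevity, whereas methods such as iCaRL select exemplars by herding.

```python
import random
from collections import defaultdict

class ExemplarMemory:
    """Class-balanced exemplar memory bank for class-incremental learning.
    Per-class reservoir sampling keeps a uniform subset of all samples seen so far."""

    def __init__(self, per_class=20):
        self.per_class = per_class
        self.store = defaultdict(list)   # class label -> stored exemplars
        self.seen = defaultdict(int)     # class label -> samples observed so far

    def update(self, samples, labels):
        for x, y in zip(samples, labels):
            self.seen[y] += 1
            if len(self.store[y]) < self.per_class:
                self.store[y].append(x)
            else:
                j = random.randrange(self.seen[y])
                if j < self.per_class:
                    self.store[y][j] = x   # replace a stored exemplar uniformly at random

    def replay(self, k):
        pool = [(x, y) for y, xs in self.store.items() for x in xs]
        return random.sample(pool, min(k, len(pool)))
```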
arXiv Detail & Related papers (2022-10-10T08:27:28Z)
- Memory-Based Label-Text Tuning for Few-Shot Class-Incremental Learning [20.87638654650383]
We propose leveraging the label-text information by adopting the memory prompt.
The memory prompt can learn new data sequentially, and meanwhile store the previous knowledge.
Experiments show that our proposed method outperforms all prior state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-03T13:15:45Z)
- Memory Replay with Data Compression for Continual Learning [80.95444077825852]
We propose memory replay with data compression to reduce the storage cost of old training samples.
We extensively validate this across several benchmarks of class-incremental learning and in a realistic scenario of object detection for autonomous driving.
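The sketch below illustrates the storage-side idea: keep old samples as compressed byte strings and decompress them only when a replay batch is drawn. zlib on uint8 arrays is used here to stay dependency-free; an image codec such as JPEG would be the more natural choice for image exemplars.

```python
import zlib
import numpy as np

class CompressedReplayBuffer:
    """Store old samples in compressed form and decompress on replay.
    Illustrative sketch of compressed memory replay."""

    def __init__(self):
        self.items = []   # (compressed bytes, array shape, label)

    def add(self, image_uint8, label):
        blob = zlib.compress(image_uint8.tobytes(), level=6)
        self.items.append((blob, image_uint8.shape, label))

    def sample(self, k):
        idx = np.random.choice(len(self.items), size=min(k, len(self.items)), replace=False)
        batch = []
        for i in idx:
            blob, shape, label = self.items[i]
            img = np.frombuffer(zlib.decompress(blob), dtype=np.uint8).reshape(shape)
            batch.append((img, label))
        return batch

# usage with a toy 32x32 RGB exemplar
buf = CompressedReplayBuffer()
buf.add(np.zeros((32, 32, 3), dtype=np.uint8), label=3)
replay = buf.sample(1)
```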
arXiv Detail & Related papers (2022-02-14T10:26:23Z)
- StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
- Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP) that preserves the information gain on model parameters.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z)
- TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning [78.80707950262214]
On-device learning enables edge devices to continually adapt AI models to new data, but the memory footprint of training is the main bottleneck on such devices.
Existing work addresses this by reducing the number of trainable parameters, which does little for the activations that dominate training memory.
We present Tiny-Transfer-Learning (TinyTL) for memory-efficient on-device learning.
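TinyTL's premise is that activations, not parameters, dominate training memory, so freezing the weights and updating only the biases (plus small lite-residual modules, omitted here) avoids storing the layer inputs needed for weight gradients. Below is a PyTorch-style sketch of the bias-only part, with an assumed toy backbone standing in for a real pretrained model.

```python
import torch
from torch import nn

def freeze_weights_train_biases(model: nn.Module):
    """Freeze all weights and train only bias terms, as in the bias-only variant of
    TinyTL. Bias gradients do not require storing layer inputs, which is where most
    training memory goes. Lite residual modules are omitted in this sketch."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
    return [p for p in model.parameters() if p.requires_grad]

# usage with an assumed (toy) pretrained backbone
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
trainable = freeze_weights_train_biases(model)
optimizer = torch.optim.SGD(trainable, lr=0.01)
```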
arXiv Detail & Related papers (2020-07-22T18:39:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.