Cost-Efficient Continual Learning with Sufficient Exemplar Memory
- URL: http://arxiv.org/abs/2502.07274v1
- Date: Tue, 11 Feb 2025 05:40:52 GMT
- Title: Cost-Efficient Continual Learning with Sufficient Exemplar Memory
- Authors: Dongkyu Cho, Taesup Moon, Rumi Chunara, Kyunghyun Cho, Sungmin Cha
- Abstract summary: Continual learning (CL) research typically assumes highly constrained exemplar memory resources.
In this work, we investigate CL in a novel setting where exemplar memory is ample.
Our method achieves state-of-the-art performance while reducing the computational cost to a quarter or a third of that of existing methods.
- Score: 55.77835198580209
- Abstract: Continual learning (CL) research typically assumes highly constrained exemplar memory resources. However, in many real-world scenarios, especially in the era of large foundation models, memory is abundant, while GPU computational costs are the primary bottleneck. In this work, we investigate CL in a novel setting where exemplar memory is ample (i.e., sufficient exemplar memory). Unlike prior methods designed for strict exemplar memory constraints, we propose a simple yet effective approach that directly operates in the model's weight space through a combination of weight resetting and averaging techniques. Our method achieves state-of-the-art performance while reducing the computational cost to a quarter or a third of that of existing methods. These findings challenge conventional CL assumptions and provide a practical baseline for computationally efficient CL applications.
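The abstract does not spell out the exact weight-space procedure, so the following is only a minimal sketch of how weight resetting and averaging could be combined after each task. It is written in PyTorch style; the function name, `reset_fraction`, `avg_coeff`, and the per-task schedule are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (assumed details, not the paper's exact algorithm): after
# training on a new task, revert a random fraction of each weight tensor to
# its initialization and interpolate with the weights held before the task.
import torch


@torch.no_grad()
def weight_reset_and_average(model, pre_task_state, init_state,
                             reset_fraction=0.1, avg_coeff=0.5):
    for name, param in model.named_parameters():
        # Weight resetting: send a random subset of entries back to init.
        mask = torch.rand_like(param) < reset_fraction
        param[mask] = init_state[name][mask]
        # Weight averaging: interpolate with the pre-task weights.
        param.mul_(avg_coeff).add_(pre_task_state[name], alpha=1.0 - avg_coeff)


# Usage: snapshot the weights, train on the new task (exemplar memory is
# ample, so old-task data can simply be mixed into the batches), then merge.
# pre_task_state = {k: v.clone() for k, v in model.state_dict().items()}
# ... train on task t ...
# weight_reset_and_average(model, pre_task_state, init_state)
```

Because these operations touch only the weights, their cost is negligible next to additional training passes, which is consistent with the reported compute savings.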
Related papers
- Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling [19.447914903112366]
We propose to use floating-point operations (FLOPs) and total memory size in bytes as metrics for the computational and memory budgets.
To improve a CL method within a limited total budget, we propose adaptive layer freezing, which skips updating layers on less informative batches (sketched below).
In addition, we propose a memory retrieval method that allows the model to learn the same amount of knowledge as using random retrieval in fewer iterations.
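As a rough illustration of the adaptive layer freezing idea above (the paper's actual informativeness criterion and FLOPs accounting are not reproduced), one can decide before each step whether to skip updates for the earliest layers. The loss threshold, layer count, and use of the previous step's loss below are assumptions made for the sketch.

```python
# Hedged sketch of adaptive layer freezing: skip updates for the earliest
# layers when the incoming batch looks uninformative. The previous step's
# loss is used as a cheap stand-in for the paper's informativeness measure.
import torch.nn as nn


def train_step(model: nn.Sequential, batch, targets, optimizer, criterion,
               prev_loss, freeze_up_to=2, loss_threshold=0.5):
    # Decide what to freeze *before* the forward pass so that backward does
    # not compute gradients for the frozen parameters.
    freeze = prev_loss < loss_threshold
    for i, layer in enumerate(model):
        for p in layer.parameters():
            p.requires_grad_(not (freeze and i < freeze_up_to))
    loss = criterion(model(batch), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```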
arXiv Detail & Related papers (2024-10-19T16:00:00Z)
- Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation [123.4883806344334]
We study a realistic Continual Learning setting where learning algorithms are granted a restricted computational budget per time step while training.
We apply this setting to large-scale semi-supervised Continual Learning scenarios with sparse label rates.
Our extensive analysis and ablations demonstrate that DietCL is stable across the full spectrum of label sparsity, computational budgets, and other settings.
arXiv Detail & Related papers (2024-04-19T10:10:39Z)
- An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning [55.467047686093025]
A common approach to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning.
We propose a sampling scheme, mix-cd, that prioritizes rehearsal of "collateral damage" samples.
Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings.
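For intuition, "collateral damage" samples are examples from prior tasks that the model handled correctly before fine-tuning but misclassifies afterwards. The brute-force check below only illustrates that notion; the actual mix-cd scheme estimates this set without scoring the whole rehearsal pool, so the function and its arguments are assumptions.

```python
# Illustrative only (not the paper's estimator): find rehearsal candidates
# that were correct before fine-tuning but are wrong under the current model.
import torch


@torch.no_grad()
def collateral_damage_indices(model, exemplars, labels, was_correct):
    model.eval()
    preds = model(exemplars).argmax(dim=1)
    now_wrong = preds != labels
    return torch.nonzero(was_correct & now_wrong, as_tuple=True)[0]


# Rehearsal batches can then mix fresh task data with samples drawn from
# these indices instead of sampling uniformly from the whole buffer.
```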
arXiv Detail & Related papers (2024-02-12T22:32:12Z)
- Computationally Budgeted Continual Learning: What Does Matter? [128.0827987414154]
Continual Learning (CL) aims to sequentially train models on streams of incoming data that vary in distribution, preserving previous knowledge while adapting to new data.
Current CL literature focuses on restricted access to previously seen data, while imposing no constraints on the computational budget for training.
We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting.
arXiv Detail & Related papers (2023-03-20T14:50:27Z)
- Real-Time Evaluation in Online Continual Learning: A New Hope [104.53052316526546]
We evaluate current Continual Learning (CL) methods with respect to their computational costs.
A simple baseline outperforms state-of-the-art CL methods under this evaluation.
This surprisingly suggests that the majority of existing CL literature is tailored to a specific class of streams that is not practical.
arXiv Detail & Related papers (2023-02-02T12:21:10Z)
- Neural SDEs for Conditional Time Series Generation and the Signature-Wasserstein-1 metric [0.0]
(Conditional) Generative Adversarial Networks (GANs) have found great success in recent years, due to their ability to approximate (conditional) distributions over extremely high dimensional spaces.
They are highly unstable and computationally expensive to train, especially in the time series setting.
Recently, the use of a key object from rough path theory, called the signature of a path, has been proposed; it converts the min-max formulation of the (conditional) GAN framework into a classical minimization problem.
This method is extremely expensive in terms of memory cost, sometimes becoming prohibitive.
arXiv Detail & Related papers (2023-01-03T19:08:01Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model that incorporates new classes over time under a limited memory budget.
We show that when the model size is counted toward the total budget and methods are compared under aligned memory sizes, saving models does not consistently help.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
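The aligned-memory comparison can be made concrete by charging the model's parameter storage and the exemplar buffer to the same byte budget; the accounting below is a small sketch, and the 3x32x32 exemplar size is an assumption (CIFAR-style uint8 images), not the benchmark's protocol.

```python
# Sketch of an aligned total-memory budget: model parameters and exemplars
# are counted in the same byte budget.
def total_memory_bytes(model, num_exemplars, exemplar_bytes=3 * 32 * 32):
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    return param_bytes + num_exemplars * exemplar_bytes
```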
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory-shared embeddings, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
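One way to picture a shared-memory embedding is to let every token read a few rows of a single common table, so that rows (and hence memory) are shared across tokens. In SCMA the row assignment follows semantic overlap; the id-based toy hash below is only a placeholder for that mechanism, and all sizes are arbitrary.

```python
# Toy shared-memory embedding: each token reads `num_hashes` rows of one
# shared table and sums them, so memory is shared across tokens. SCMA shares
# rows according to semantic similarity; the id hash here is a placeholder.
import torch.nn as nn


class SharedEmbedding(nn.Module):
    def __init__(self, shared_rows=10_000, dim=64, num_hashes=2):
        super().__init__()
        self.table = nn.Embedding(shared_rows, dim)  # one shared memory pool
        self.shared_rows = shared_rows
        self.num_hashes = num_hashes

    def forward(self, token_ids):
        out = 0
        for h in range(self.num_hashes):
            rows = (token_ids * (2 * h + 1) + h) % self.shared_rows  # toy hash
            out = out + self.table(rows)
        return out
```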
arXiv Detail & Related papers (2021-02-24T19:55:49Z)