Related papers: Unlocking the Power of Rehearsal in Continual Learning: A Theoretical Perspective

URL: http://arxiv.org/abs/2506.00205v1
Date: Fri, 30 May 2025 20:23:15 GMT
Title: Unlocking the Power of Rehearsal in Continual Learning: A Theoretical Perspective
Authors: Junze Deng, Qinhang Wu, Peizhong Ju, Sen Lin, Yingbin Liang, Ness Shroff
Abstract summary: We study whether sequential rehearsal can offer greater benefits for continual learning compared to standard concurrent rehearsal. By explicitly characterizing forgetting and generalization error, we show that sequential rehearsal performs better when tasks are less similar. We further motivate a novel Hybrid Rehearsal method, which trains similar tasks concurrently and revisits dissimilar tasks sequentially.
Score: 43.91046454332906
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Rehearsal-based methods have shown superior performance in addressing catastrophic forgetting in continual learning (CL) by storing and training on a subset of past data alongside new data in the current task. While such a concurrent rehearsal strategy is widely used, it remains unclear if this approach is always optimal. Inspired by human learning, where sequentially revisiting tasks helps mitigate forgetting, we explore whether sequential rehearsal can offer greater benefits for CL compared to standard concurrent rehearsal. To address this question, we conduct a theoretical analysis of rehearsal-based CL in overparameterized linear models, comparing two strategies: 1) Concurrent Rehearsal, where past and new data are trained together, and 2) Sequential Rehearsal, where new data is trained first, followed by revisiting past data sequentially. By explicitly characterizing forgetting and generalization error, we show that sequential rehearsal performs better when tasks are less similar. These insights further motivate a novel Hybrid Rehearsal method, which trains similar tasks concurrently and revisits dissimilar tasks sequentially. We characterize its forgetting and generalization performance, and our experiments with deep neural networks further confirm that the hybrid approach outperforms standard concurrent rehearsal. This work provides the first comprehensive theoretical analysis of rehearsal-based CL.
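
The two strategies named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code or their exact setup: it assumes noiseless linear regression tasks with more parameters than samples, models "training to convergence" as a minimum-norm interpolating update, and uses squared distance to each task's true weights as a stand-in for error. All function names, dimensions, and the memory size are hypothetical choices made for illustration.

```python
import numpy as np

def min_norm_update(w, X, y):
    """Move w to the nearest point (in L2 norm) that interpolates (X, y).

    Assumption: this stands in for training to convergence on (X, y)
    in an overparameterized linear model (more features than samples).
    """
    return w + np.linalg.pinv(X) @ (y - X @ w)

def make_task(rng, w_true, n):
    """Draw n noiseless samples from a linear task with weights w_true."""
    X = rng.standard_normal((n, w_true.size))
    return X, X @ w_true

def concurrent_rehearsal(w, new, memory):
    """Train on the new task's data and all stored past data together."""
    X = np.vstack([new[0]] + [m[0] for m in memory])
    y = np.concatenate([new[1]] + [m[1] for m in memory])
    return min_norm_update(w, X, y)

def sequential_rehearsal(w, new, memory):
    """Train on the new task first, then revisit stored past data in turn."""
    w = min_norm_update(w, *new)
    for X_m, y_m in memory:
        w = min_norm_update(w, X_m, y_m)
    return w

def model_error(w, w_true):
    """Squared distance to a task's true weights (illustrative metric)."""
    return float(np.sum((w - w_true) ** 2))

rng = np.random.default_rng(0)
p, n, m = 200, 20, 5                 # hypothetical sizes with p much larger than n
w1 = rng.standard_normal(p)          # true weights of task 1
w2 = rng.standard_normal(p)          # independent draw, i.e. a dissimilar task 2
task1 = make_task(rng, w1, n)
task2 = make_task(rng, w2, n)
memory = [(task1[0][:m], task1[1][:m])]   # small rehearsal buffer from task 1

w_after_1 = min_norm_update(np.zeros(p), *task1)
w_conc = concurrent_rehearsal(w_after_1, task2, memory)
w_seq = sequential_rehearsal(w_after_1, task2, memory)

for name, w in [("concurrent", w_conc), ("sequential", w_seq)]:
    print(name, "task-1 error:", model_error(w, w1),
          "task-2 error:", model_error(w, w2))
```

Changing how `w2` is generated (for example, `w1` plus a small perturbation) gives a way to explore how task similarity affects the two strategies in this toy setting; a single random draw like the one above should not be read as confirming or refuting the paper's results.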

Related papers

Sample Compression for Self Certified Continual Learning [4.354838732412981]
Continual learning algorithms aim to learn from a sequence of tasks, making the training distribution non-stationary. We present a new method called Continual Pick-to-Learn (CoP2L), which is able to retain the most representative samples for each task in an efficient way.
arXiv Detail & Related papers (2025-03-13T16:05:56Z)
Incremental Learning with Repetition via Pseudo-Feature Projection [3.4734633097581815]
We investigate how exemplar-free incremental learning strategies are affected by data repetition. Our proposed exemplar-free method achieves competitive results in the classic scenario without repetition, and state-of-the-art performance in the one with repetition.
arXiv Detail & Related papers (2025-02-27T09:43:35Z)
IMEX-Reg: Implicit-Explicit Regularization in the Function Space for Continual Learning [17.236861687708096]
Continual learning (CL) remains one of the long-standing challenges for deep neural networks due to catastrophic forgetting of previously acquired knowledge. Inspired by how humans learn using strong inductive biases, we propose IMEX-Reg to improve the generalization performance of experience rehearsal in CL under low buffer regimes.
arXiv Detail & Related papers (2024-04-28T12:25:09Z)
Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice. HiDe-Prompt is an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics. Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z)
History Repeats: Overcoming Catastrophic Forgetting For Event-Centric Temporal Knowledge Graph Completion [33.38304336898247]
Temporal knowledge graph (TKG) completion models rely on having access to the entire graph during training. TKG data is often received incrementally as events unfold, leading to a dynamic non-stationary data distribution over time. We propose a general continual training framework that is applicable to any TKG completion method.
arXiv Detail & Related papers (2023-05-30T01:21:36Z)
Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events. In practice, the next-event prediction models are trained with sequential data collected at one time. We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training. We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
A Closer Look at Rehearsal-Free Continual Learning [26.09061715039747]
We show how to achieve strong continual learning performance without rehearsal. We first disprove the common assumption that parameter regularization techniques fail for rehearsal-free continual learning of a single, expanding task. Next, we explore how to leverage knowledge from a pre-trained model in rehearsal-free continual learning and find that vanilla L2 parameter regularization outperforms EWC parameter regularization and feature distillation.
arXiv Detail & Related papers (2022-03-31T17:59:00Z)
An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF). Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness. We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z)
Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs). We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs. We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.