Continual evaluation for lifelong learning: Identifying the stability
gap
- URL: http://arxiv.org/abs/2205.13452v2
- Date: Thu, 30 Mar 2023 19:44:21 GMT
- Title: Continual evaluation for lifelong learning: Identifying the stability
gap
- Authors: Matthias De Lange, Gido van de Ven, Tinne Tuytelaars
- Abstract summary: We show that a set of common state-of-the-art methods still suffers from substantial forgetting upon starting to learn new tasks.
We refer to this intriguing but potentially problematic phenomenon as the stability gap.
We establish a framework for continual evaluation that uses per-iteration evaluation and we define a new set of metrics to quantify worst-case performance.
- Score: 35.99653845083381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time-dependent data-generating distributions have proven to be difficult for
gradient-based training of neural networks, as the greedy updates result in
catastrophic forgetting of previously learned knowledge. Despite the progress
in the field of continual learning to overcome this forgetting, we show that a
set of common state-of-the-art methods still suffers from substantial
forgetting upon starting to learn new tasks, except that this forgetting is
temporary and followed by a phase of performance recovery. We refer to this
intriguing but potentially problematic phenomenon as the stability gap. The
stability gap had likely remained under the radar due to standard practice in
the field of evaluating continual learning models only after each task.
Instead, we establish a framework for continual evaluation that uses
per-iteration evaluation and we define a new set of metrics to quantify
worst-case performance. Empirically we show that experience replay,
constraint-based replay, knowledge-distillation, and parameter regularization
methods are all prone to the stability gap; and that the stability gap can be
observed in class-, task-, and domain-incremental learning benchmarks.
Additionally, a controlled experiment shows that the stability gap increases
when tasks are more dissimilar. Finally, by disentangling gradients into
plasticity and stability components, we propose a conceptual explanation for
the stability gap.
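To make the evaluation protocol concrete, below is a minimal Python sketch of per-iteration continual evaluation. The `train_step` and `evaluate` functions are placeholders rather than the authors' code, and the minimum-accuracy bookkeeping only mirrors the spirit of the paper's worst-case metrics; it is not their exact definition.

```python
# Minimal sketch of per-iteration continual evaluation (illustrative only).
# Assumes `evaluate(model, task_id)` returns accuracy in [0, 1] and
# `train_step(model, batch)` performs one gradient update; both are
# placeholders supplied by the caller, not the authors' implementation.

from collections import defaultdict

def continual_train_with_eval(model, task_streams, train_step, evaluate):
    """Train on a sequence of tasks, evaluating previously learned tasks
    after every iteration instead of only at task boundaries."""
    history = defaultdict(list)   # task_id -> accuracy recorded at each iteration
    min_acc = {}                  # task_id -> worst accuracy observed after learning it
    seen_tasks = []

    for task_id, batches in enumerate(task_streams):
        for batch in batches:
            train_step(model, batch)
            # Per-iteration evaluation on all previously learned tasks:
            # this is what exposes the transient performance drop
            # (the stability gap) that end-of-task evaluation misses.
            for prev_id in seen_tasks:
                acc = evaluate(model, prev_id)
                history[prev_id].append(acc)
                min_acc[prev_id] = min(min_acc.get(prev_id, 1.0), acc)
        seen_tasks.append(task_id)

    return history, min_acc
```

Evaluating previously learned tasks after every update, rather than only at task boundaries, is what reveals the temporary forgetting-and-recovery pattern that the paper names the stability gap.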
Related papers
- Exploring the Stability Gap in Continual Learning: The Role of the Classification Head [0.6749750044497732]
The stability gap is a phenomenon where models initially lose performance on previously learned tasks before partially recovering during training.
We introduce the nearest-mean classifier (NMC) as a tool to attribute the influence of the backbone and the classification head on the stability gap (an illustrative NMC sketch follows this list).
Our experiments demonstrate that NMC not only improves final performance, but also significantly enhances training stability across various continual learning benchmarks.
arXiv Detail & Related papers (2024-11-06T15:45:01Z) - Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - The Expanding Scope of the Stability Gap: Unveiling its Presence in Joint Incremental Learning of Homogeneous Tasks [14.325370691984345]
Recent research identified a temporary performance drop on previously learned tasks when transitioning to a new one.
We show that the stability gap also occurs when applying joint incremental training of homogeneous tasks.
arXiv Detail & Related papers (2024-06-07T17:44:48Z) - Stability Evaluation via Distributional Perturbation Analysis [28.379994938809133]
We propose a stability evaluation criterion based on distributional perturbations.
Our stability evaluation criterion can address both data corruptions and sub-population shifts.
Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications.
arXiv Detail & Related papers (2024-05-06T06:47:14Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - New metrics for analyzing continual learners [27.868967961503962]
Continual Learning (CL) poses challenges to standard learning algorithms.
This stability-plasticity dilemma remains central to CL and multiple metrics have been proposed to adequately measure stability and plasticity separately.
We propose new metrics that account for the task's increasing difficulty.
arXiv Detail & Related papers (2023-09-01T13:53:33Z) - Balancing Stability and Plasticity through Advanced Null Space in
Continual Learning [77.94570903726856]
We propose a new continual learning approach, Advanced Null Space (AdNS), to balance the stability and plasticity without storing any old data of previous tasks.
We also present a simple but effective method, intra-task distillation, to improve the performance of the current task.
Experimental results show that the proposed method can achieve better performance compared to state-of-the-art continual learning approaches.
arXiv Detail & Related papers (2022-07-25T11:04:22Z) - Bayesian Algorithms Learn to Stabilize Unknown Continuous-Time Systems [0.0]
Linear dynamical systems are canonical models for learning-based control of plants with uncertain dynamics.
A reliable stabilization procedure that can effectively learn from unstable data to stabilize such a system in finite time is not currently available.
In this work, we propose a novel learning algorithm that stabilizes unknown continuous-time linear systems.
arXiv Detail & Related papers (2021-12-30T15:31:35Z) - Training Generative Adversarial Networks by Solving Ordinary
Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
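As referenced in the first related paper above, a nearest-mean classifier replaces the learned classification head with class prototypes computed from backbone features. The sketch below is illustrative only (NumPy, Euclidean distance, placeholder features); the exact protocol used in that paper may differ.

```python
# Illustrative nearest-mean classifier (NMC) on top of backbone features.
# Feature extraction is assumed to happen elsewhere; inputs here are plain arrays.

import numpy as np

class NearestMeanClassifier:
    """Classify features by their nearest class-mean prototype."""

    def __init__(self):
        self.class_means = {}   # class label -> mean feature vector

    def fit(self, features, labels):
        # features: (N, D) array of backbone outputs; labels: (N,) array of class ids
        for c in np.unique(labels):
            self.class_means[c] = features[labels == c].mean(axis=0)

    def predict(self, features):
        classes = list(self.class_means)
        prototypes = np.stack([self.class_means[c] for c in classes])   # (C, D)
        # Squared Euclidean distance from each feature to each class prototype
        dists = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
        return np.asarray(classes)[dists.argmin(axis=1)]
```

Comparing NMC accuracy against the accuracy of the learned classification head during continual training is one way to attribute the stability gap to backbone drift versus head drift, in the spirit of that related paper.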
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.