Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last
- URL: http://arxiv.org/abs/2406.09935v1
- Date: Fri, 14 Jun 2024 11:31:12 GMT
- Title: Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last
- Authors: Guy Hacohen, Tinne Tuytelaars,
- Abstract summary: Catastrophic forgetting poses a significant challenge in continual learning.
examples learned early are rarely forgotten, while those learned later are more susceptible to forgetting.
We introduce Goldilocks, a novel replay buffer sampling method that filters out examples learned too quickly or too slowly, keeping those learned at an intermediate speed.
- Score: 44.31831689984837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting poses a significant challenge in continual learning, where models often forget previous tasks when trained on new data. Our empirical analysis reveals a strong correlation between catastrophic forgetting and the learning speed of examples: examples learned early are rarely forgotten, while those learned later are more susceptible to forgetting. We demonstrate that replay-based continual learning methods can leverage this phenomenon by focusing on mid-learned examples for rehearsal. We introduce Goldilocks, a novel replay buffer sampling method that filters out examples learned too quickly or too slowly, keeping those learned at an intermediate speed. Goldilocks improves existing continual learning algorithms, leading to state-of-the-art performance across several image classification tasks.
Related papers
- Accelerated Online Reinforcement Learning using Auxiliary Start State Distributions [50.44719434877687]
Expert demonstrations and simulators can reset to arbitrary states.<n>We find that using a notion of safety to inform the choice of this auxiliary distribution significantly accelerates learning.
arXiv Detail & Related papers (2025-07-07T01:54:05Z) - CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning [62.69917996026769]
A class-incremental learning task requires learning and preserving both spatial appearance and temporal action involvement.<n>We propose a framework that equips separate adapters to learn new class patterns, accommodating the incremental information requirements unique to each class.<n>A causal compensation mechanism is proposed to reduce the conflicts during increment and memorization for between different types of information.
arXiv Detail & Related papers (2025-01-13T11:34:55Z) - Reducing Catastrophic Forgetting in Online Class Incremental Learning Using Self-Distillation [3.8506666685467343]
In continual learning, previous knowledge is forgotten when a model learns new tasks.
In this paper, we tried to solve this problem by acquiring transferable knowledge through self-distillation.
Our proposed method outperformed conventional methods by experiments on CIFAR10, CIFAR100, and MiniimageNet datasets.
arXiv Detail & Related papers (2024-09-17T16:26:33Z) - Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple re- parameterization which we call Normalize-and-Project.
arXiv Detail & Related papers (2024-07-01T20:58:01Z) - Contrastive Continual Learning with Importance Sampling and
Prototype-Instance Relation Distillation [14.25441464051506]
We propose Contrastive Continual Learning via Importance Sampling (CCLIS) to preserve knowledge by recovering previous data distributions.
We also present the Prototype-instance Relation Distillation (PRD) loss, a technique designed to maintain the relationship between prototypes and sample representations.
arXiv Detail & Related papers (2024-03-07T15:47:52Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, it is not expected in practice considering the memory constraint or data privacy issue.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - AdaER: An Adaptive Experience Replay Approach for Continual Lifelong
Learning [16.457330925212606]
We present adaptive-experience replay (AdaER) to address the challenge of continual lifelong learning.
AdaER consists of two stages: memory replay and memory update.
Results: AdaER outperforms existing continual lifelong learning baselines.
arXiv Detail & Related papers (2023-08-07T01:25:45Z) - Detachedly Learn a Classifier for Class-Incremental Learning [11.865788374587734]
We present an analysis that the failure of vanilla experience replay (ER) comes from unnecessary re-learning of previous tasks and incompetence to distinguish current task from the previous ones.
We propose a novel replay strategy task-aware experience replay.
Experimental results show our method outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2023-02-23T01:35:44Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Measuring Forgetting of Memorized Training Examples [80.9188503645436]
We show machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of memorization.
In specific examples, models overfit specific training and become susceptible to privacy attacks by the end.
We identify deterministically forgetting examples as a potential explanation, showing that models empirically do not forget trained examples over time.
arXiv Detail & Related papers (2022-06-30T20:48:26Z) - Decoupling Knowledge from Memorization: Retrieval-augmented Prompt
Learning [113.58691755215663]
We develop RetroPrompt to help a model strike a balance between generalization and memorization.
In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances.
Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
arXiv Detail & Related papers (2022-05-29T16:07:30Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework to adaptively tune task-wise to achieve a better stability plasticity' tradeoff.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Representation Memorization for Fast Learning New Knowledge without
Forgetting [36.55736909586313]
The ability to quickly learn new knowledge is a big step towards human-level intelligence.
We consider scenarios that require learning new classes or data distributions quickly and incrementally over time.
We propose "Memory-based Hebbian Adaptation" to tackle the two major challenges.
arXiv Detail & Related papers (2021-08-28T07:54:53Z) - An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF)
Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness.
We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z) - Rehearsal revealed: The limits and merits of revisiting samples in
continual learning [43.40531878205344]
We provide insight into the limits and merits of rehearsal, one of continual learning's most established methods.
We show that models trained sequentially with rehearsal tend to stay in the same low-loss region after a task has finished, but are at risk of overfitting on its sample memory.
arXiv Detail & Related papers (2021-04-15T13:28:14Z) - A Theoretical Analysis of Learning with Noisily Labeled Data [62.946840431501855]
We first show that in the first epoch training, the examples with clean labels will be learned first.
We then show that after the learning from clean data stage, continuously training model can achieve further improvement in testing error.
arXiv Detail & Related papers (2021-04-08T23:40:02Z) - Learning to Continually Learn Rapidly from Few and Noisy Data [19.09933805011466]
Continual learning could be achieved via replay -- by concurrently training externally stored old data while learning a new task.
By employing a meta-learner, which textitlearns a learning rate per parameter per past task, we found that base learners produced strong results when less memory was available.
arXiv Detail & Related papers (2021-03-06T08:29:47Z) - Using Hindsight to Anchor Past Knowledge in Continual Learning [36.271906785418864]
In continual learning, the learner faces a stream of data whose distribution changes over time.
Modern neural networks are known to suffer under this setting, as they quickly forget previously acquired knowledge.
In this work, we call anchoring, where the learner uses bilevel optimization to update its knowledge on the current task, while keeping intact the predictions on past tasks.
arXiv Detail & Related papers (2020-02-19T13:21:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.