Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last
- URL: http://arxiv.org/abs/2406.09935v1
- Date: Fri, 14 Jun 2024 11:31:12 GMT
- Title: Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last
- Authors: Guy Hacohen, Tinne Tuytelaars
- Abstract summary: Catastrophic forgetting poses a significant challenge in continual learning.
Examples learned early are rarely forgotten, while those learned later are more susceptible to forgetting.
We introduce Goldilocks, a novel replay buffer sampling method that filters out examples learned too quickly or too slowly, keeping those learned at an intermediate speed.
- Score: 44.31831689984837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting poses a significant challenge in continual learning, where models often forget previous tasks when trained on new data. Our empirical analysis reveals a strong correlation between catastrophic forgetting and the learning speed of examples: examples learned early are rarely forgotten, while those learned later are more susceptible to forgetting. We demonstrate that replay-based continual learning methods can leverage this phenomenon by focusing on mid-learned examples for rehearsal. We introduce Goldilocks, a novel replay buffer sampling method that filters out examples learned too quickly or too slowly, keeping those learned at an intermediate speed. Goldilocks improves existing continual learning algorithms, leading to state-of-the-art performance across several image classification tasks.
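The filtering idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes learning speed is approximated by the fraction of epochs in which an example was classified correctly. The function names and the parameters `drop_fast` and `drop_slow` are hypothetical.

```python
import numpy as np


def learning_speed(correct_per_epoch):
    """Fraction of epochs in which each example was classified correctly.

    correct_per_epoch: boolean array of shape (n_epochs, n_examples).
    A higher score means the example was learned earlier. This proxy is
    an assumption made for illustration.
    """
    return np.asarray(correct_per_epoch, dtype=float).mean(axis=0)


def goldilocks_sample(speed, buffer_size, drop_fast=0.2, drop_slow=0.2, seed=0):
    """Sample replay-buffer indices uniformly from mid-speed examples.

    drop_fast / drop_slow are hypothetical names for the fractions of the
    quickest-learned and slowest-learned examples to filter out.
    """
    speed = np.asarray(speed)
    n = len(speed)
    order = np.argsort(speed, kind="stable")  # slowest-learned first
    lo = int(drop_slow * n)                   # skip the slowest examples
    hi = n - int(drop_fast * n)               # skip the quickest examples
    candidates = order[lo:hi]
    if len(candidates) == 0:
        raise ValueError("filter fractions remove every example")
    rng = np.random.default_rng(seed)
    k = min(buffer_size, len(candidates))
    return rng.choice(candidates, size=k, replace=False)


# Toy usage: 10 examples tracked over 10 epochs; example i is first
# classified correctly at epoch 9 - i and stays correct afterwards.
correct = np.array([[epoch >= 9 - i for i in range(10)] for epoch in range(10)])
speed = learning_speed(correct)
buffer_idx = goldilocks_sample(speed, buffer_size=4)
```

In this toy setup, the two slowest and two quickest examples are excluded, and the buffer is drawn uniformly from the remaining six.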
Related papers
- Reducing Catastrophic Forgetting in Online Class Incremental Learning Using Self-Distillation [3.8506666685467343]
In continual learning, previous knowledge is forgotten when a model learns new tasks.
In this paper, we try to solve this problem by acquiring transferable knowledge through self-distillation.
Our proposed method outperformed conventional methods in experiments on the CIFAR10, CIFAR100, and MiniImageNet datasets.
arXiv Detail & Related papers (2024-09-17T16:26:33Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes replaying the data of previously experienced tasks when learning new tasks.
However, storing such data is often impractical given memory constraints or data privacy concerns.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Detachedly Learn a Classifier for Class-Incremental Learning [11.865788374587734]
Our analysis shows that the failure of vanilla experience replay (ER) stems from unnecessary re-learning of previous tasks and an inability to distinguish the current task from previous ones.
We propose a novel replay strategy, task-aware experience replay.
Experimental results show our method outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2023-02-23T01:35:44Z)
- Measuring Forgetting of Memorized Training Examples [80.9188503645436]
We show machine learning models exhibit two seemingly contradictory phenomena: training data memorization and various forms of forgetting.
In memorization, models overfit specific training examples and become susceptible to privacy attacks; in forgetting, examples seen early in training are forgotten by the end.
We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget memorized examples over time.
arXiv Detail & Related papers (2022-06-30T20:48:26Z)
- Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning [113.58691755215663]
We develop RetroPrompt to help a model strike a balance between generalization and memorization.
In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances.
Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
arXiv Detail & Related papers (2022-05-29T16:07:30Z)
- Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity trade-off.
RER can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z)
- Rehearsal revealed: The limits and merits of revisiting samples in continual learning [43.40531878205344]
We provide insight into the limits and merits of rehearsal, one of continual learning's most established methods.
We show that models trained sequentially with rehearsal tend to stay in the same low-loss region after a task has finished, but are at risk of overfitting on its sample memory.
arXiv Detail & Related papers (2021-04-15T13:28:14Z)
- A Theoretical Analysis of Learning with Noisily Labeled Data [62.946840431501855]
We first show that in the first epoch of training, the examples with clean labels are learned first.
We then show that after this stage of learning from clean data, continuing to train the model can further improve test error.
arXiv Detail & Related papers (2021-04-08T23:40:02Z)
- Learning to Continually Learn Rapidly from Few and Noisy Data [19.09933805011466]
Continual learning could be achieved via replay, by concurrently training on externally stored old data while learning a new task.
By employing a meta-learner, which learns a learning rate per parameter per past task, we found that base learners produced strong results when less memory was available.
arXiv Detail & Related papers (2021-03-06T08:29:47Z)
- Using Hindsight to Anchor Past Knowledge in Continual Learning [36.271906785418864]
In continual learning, the learner faces a stream of data whose distribution changes over time.
Modern neural networks are known to suffer under this setting, as they quickly forget previously acquired knowledge.
In this work, we complement experience replay with a new objective that we call anchoring, where the learner uses bilevel optimization to update its knowledge on the current task, while keeping intact the predictions on past tasks.
arXiv Detail & Related papers (2020-02-19T13:21:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.