Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token
- URL: http://arxiv.org/abs/2411.05846v1
- Date: Wed, 06 Nov 2024 16:13:50 GMT
- Title: Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token
- Authors: Young Jo Choi, Min Kyoon Yoo, Yu Rang Park
- Abstract summary: Deep learning models display catastrophic forgetting when learning new data continuously.
We present a novel method that preserves previous knowledge without storing previous data.
This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task.
- Score: 0.6144680854063939
- Abstract: Deep learning models generally display catastrophic forgetting when learning new data continuously. Many incremental learning approaches address this problem by reusing data from previous tasks while learning new tasks. However, the direct access to past data generates privacy and security concerns. To address these issues, we present a novel method that preserves previous knowledge without storing previous data. This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task. This approach generates task-specific embeddings by directing attention differently based on the task associated with the data, thereby effectively mimicking the impact of having multiple models through tokens. Our method incorporates a distillation process that ensures efficient interactions even after multiple additional learning steps, thereby optimizing the model against forgetting. We measured the performance of our model in terms of accuracy and backward transfer using a benchmark dataset for different task-incremental learning scenarios. Our results demonstrate the superiority of our approach, which achieved the highest accuracy and lowest backward transfer among the compared methods. In addition to presenting a new model, our approach lays the foundation for various extensions within the spectrum of vision-transformer architectures.
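The mechanism described in the abstract can be pictured with a short sketch: a learnable token per task is prepended to the patch sequence of a standard vision-transformer encoder, so attention (and hence the resulting embedding) is conditioned on the task, while a distillation term keeps the embeddings produced with old task tokens close to those of a frozen copy of the model. The code below is a minimal illustration of that idea, not the authors' implementation; all names (TaskTokenViT, distillation_loss) and dimensions are assumptions.

```python
# Minimal, illustrative sketch (not the authors' code) of a task-token ViT
# with a distillation term over previously learned task tokens.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskTokenViT(nn.Module):
    def __init__(self, embed_dim=192, depth=4, num_heads=3, num_patches=196):
        super().__init__()
        self.patch_proj = nn.Linear(768, embed_dim)            # stand-in patch embedding
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 2, embed_dim))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.task_tokens = nn.ParameterList()                  # one learnable token per task
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    def add_task(self):
        """Register a new learnable task token when a new task arrives."""
        dim = self.cls_token.shape[-1]
        self.task_tokens.append(nn.Parameter(0.02 * torch.randn(1, 1, dim)))

    def forward(self, patches, task_id):
        # patches: (B, num_patches, 768) pre-extracted patch features
        x = self.patch_proj(patches)
        b = x.shape[0]
        cls = self.cls_token.expand(b, -1, -1)
        task = self.task_tokens[task_id].expand(b, -1, -1)
        x = torch.cat([cls, task, x], dim=1) + self.pos_embed  # attention now "sees" the task
        x = self.encoder(x)
        return x[:, 0], x[:, 1]                                # CLS embedding, task embedding


def distillation_loss(student, teacher, patches, old_task_ids):
    """Keep embeddings produced with previous task tokens close to a frozen copy."""
    loss = 0.0
    for t in old_task_ids:
        with torch.no_grad():
            _, ref = teacher(patches, t)
        _, cur = student(patches, t)
        loss = loss + F.mse_loss(cur, ref)
    return loss / max(len(old_task_ids), 1)


# Usage sketch: freeze a copy of the model before learning the new task, then
# train with the new task's classification loss plus the distillation term.
model = TaskTokenViT()
model.add_task()                      # task 0
model.add_task()                      # task 1 (current)
teacher = copy.deepcopy(model).eval()
patches = torch.randn(8, 196, 768)
cls_emb, task_emb = model(patches, task_id=1)
kd_loss = distillation_loss(model, teacher, patches, old_task_ids=[0])
```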
Related papers
- Low-Rank Mixture-of-Experts for Continual Medical Image Segmentation [18.984447545932706]
"catastrophic forgetting" problem occurs when model forgets previously learned features when it is extended to new categories or tasks.
We propose a network by introducing the data-specific Mixture of Experts structure to handle the new tasks or categories.
We validate our method on both class-level and task-level continual learning challenges.
arXiv Detail & Related papers (2024-06-19T14:19:50Z) - Class incremental learning with probability dampening and cascaded gated classifier [4.285597067389559]
We propose a novel incremental regularisation approach called Margin Dampening and Cascaded Scaling.
The first combines a soft constraint and a knowledge distillation approach to preserve past knowledge while still allowing the model to learn new patterns.
We empirically show that our approach performs well on multiple benchmarks against well-established baselines.
arXiv Detail & Related papers (2024-02-02T09:33:07Z) - Unlearn What You Want to Forget: Efficient Unlearning for LLMs [92.51670143929056]
Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data.
This process might suffer from privacy issues and violations of data protection regulations.
We propose an unlearning framework that can efficiently update LLMs without retraining the whole model after data removal.
arXiv Detail & Related papers (2023-10-31T03:35:59Z) - Prototype-Sample Relation Distillation: Towards Replay-Free Continual
Learning [14.462797749666992]
We propose a holistic approach to jointly learn the representation and class prototypes.
We propose a novel distillation loss that constrains class prototypes to maintain relative similarities as compared to new task data.
This method yields state-of-the-art performance in the task-incremental setting.
arXiv Detail & Related papers (2023-03-26T16:35:45Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes catastrophic forgetting by saving a subset of past data into a memory bank and using it to prevent forgetting when training on future tasks (a minimal sketch of this rehearsal mechanism appears after this list).
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better 'stability-plasticity' trade-off.
RER can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - DIODE: Dilatable Incremental Object Detection [15.59425584971872]
Conventional deep learning models lack the capability of preserving previously learned knowledge.
We propose a dilatable incremental object detector (DIODE) for multi-step incremental detection tasks.
Our method achieves up to 6.4% performance improvement by increasing the number of parameters by just 1.2% for each newly learned task.
arXiv Detail & Related papers (2021-08-12T09:45:57Z) - Continual Learning via Bit-Level Information Preserving [88.32450740325005]
We study the continual learning process through the lens of information theory.
We propose Bit-Level Information Preserving (BLIP) that preserves the information gain on model parameters.
BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.
arXiv Detail & Related papers (2021-05-10T15:09:01Z) - Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared.
In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.