Enhancing knowledge retention for continual learning with domain-specific adapters and features gating
- URL: http://arxiv.org/abs/2504.08613v1
- Date: Fri, 11 Apr 2025 15:20:08 GMT
- Title: Enhancing knowledge retention for continual learning with domain-specific adapters and features gating
- Authors: Mohamed Abbas Hedjazi, Oussama Hadjerci, Adel Hafiane
- Abstract summary: Continual learning empowers models to learn from a continuous stream of data while preserving previously acquired knowledge. We propose a new approach that integrates adapters within the self-attention mechanisms of Vision Transformers to enhance knowledge retention when sequentially adding datasets from different domains.
- Score: 4.637185817866919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning empowers models to learn from a continuous stream of data while preserving previously acquired knowledge, effectively addressing the challenge of catastrophic forgetting. In this study, we propose a new approach that integrates adapters within the self-attention mechanisms of Vision Transformers to enhance knowledge retention when sequentially adding datasets from different domains. Unlike previous methods that continue learning with only one dataset, our approach introduces domain-specific output heads and feature gating, allowing the model to maintain high accuracy on previously learned tasks while incorporating only the essential information from multiple domains. The proposed method is compared to prominent parameter-efficient fine-tuning methods in the current state of the art. The results provide evidence that our method effectively alleviates the limitations of previous works. Furthermore, we conduct a comparative analysis using three datasets, CIFAR-100, Flowers102, and DTD, each representing a distinct domain, to investigate the impact of task order on model performance. Our findings underscore the critical role of dataset sequencing in shaping learning outcomes, demonstrating that strategic ordering can significantly improve the model's ability to adapt to evolving data distributions over time while preserving the integrity of previously learned knowledge.
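The listing does not include an implementation, so the following is a minimal PyTorch-style sketch of the kind of architecture the abstract describes: bottleneck adapters on the self-attention output of a Vision Transformer block, a learned per-domain feature gate that keeps only the essential adapted features, and one output head per domain. All module names, the bottleneck width, and the gating formulation are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (assumption, not the authors' implementation): adapters inside
# a ViT attention block, per-domain feature gating, and domain-specific heads.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class AdaptedAttentionBlock(nn.Module):
    """Transformer block whose self-attention output passes through a
    domain-specific adapter and a sigmoid feature gate before the residual."""
    def __init__(self, dim, num_heads, num_domains):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        # One adapter and one gate vector per domain (illustrative choice).
        self.adapters = nn.ModuleList([Adapter(dim) for _ in range(num_domains)])
        self.gates = nn.ParameterList([nn.Parameter(torch.zeros(dim))
                                       for _ in range(num_domains)])

    def forward(self, x, domain_id):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        adapted = self.adapters[domain_id](attn_out)
        gate = torch.sigmoid(self.gates[domain_id])      # feature gating in [0, 1]
        x = x + gate * adapted + (1 - gate) * attn_out   # keep only essential new features
        x = x + self.mlp(self.norm2(x))
        return x

class DomainHeads(nn.Module):
    """Domain-specific output heads: each domain gets its own classifier."""
    def __init__(self, dim, classes_per_domain):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(dim, c) for c in classes_per_domain])

    def forward(self, cls_token, domain_id):
        return self.heads[domain_id](cls_token)
```

One plausible training loop for domain t would freeze the pre-trained attention and MLP weights and update only adapters[t], gates[t], and the corresponding head, which is consistent with the parameter-efficient fine-tuning framing of the abstract.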
Related papers
- Incrementally Learning Multiple Diverse Data Domains via Multi-Source Dynamic Expansion Model [16.035374682124846]
Continual Learning seeks to develop a model capable of incrementally assimilating new information while retaining prior knowledge. This paper shifts focus to a more complex and realistic learning environment, characterized by data samples sourced from multiple distinct domains.
arXiv Detail & Related papers (2025-01-15T15:49:46Z) - Research on the Online Update Method for Retrieval-Augmented Generation (RAG) Model with Incremental Learning [13.076087281398813]
Experimental results show that the proposed method is better than the existing mainstream comparison models in terms of knowledge retention and inference accuracy.
arXiv Detail & Related papers (2025-01-13T05:16:14Z) - Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA [19.982853959240497]
We investigate whether pre-trained knowledge in vision-language models (VLMs) can be retained -- or even enhanced -- in continual learning (CL). We propose a universal and efficient continual learning approach for VLMs based on Dynamic Rank-Selective LoRA (CoDyRA); a generic rank-gated LoRA sketch appears after this list.
arXiv Detail & Related papers (2024-12-01T23:41:42Z) - Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages: Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z) - Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token [0.6144680854063939]
Deep learning models display catastrophic forgetting when learning new data continuously.
We present a novel method that preserves previous knowledge without storing previous data.
This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task.
arXiv Detail & Related papers (2024-11-06T16:13:50Z) - Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence [60.37934652213881]
Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain.
This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation.
We present learn from the learnt (LFTL), a novel paradigm for SFADA to leverage the learnt knowledge from the source pretrained model and actively iterated models without extra overhead.
arXiv Detail & Related papers (2024-07-26T17:51:58Z) - Combining inherent knowledge of vision-language models with unsupervised domain adaptation through strong-weak guidance [44.1830188215271]
Unsupervised domain adaptation (UDA) tries to overcome the tedious work of labeling data by leveraging a labeled source dataset. Current vision-language models exhibit remarkable zero-shot prediction capabilities. We introduce a strong-weak guidance learning scheme that employs zero-shot predictions to help align the source and target dataset.
arXiv Detail & Related papers (2023-12-07T06:16:39Z) - Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach [95.74102207187545]
We show that a prior-free autonomous data augmentation's objective can be derived from a representation learning principle.
We then propose a practical surrogate to the objective that can be efficiently optimized and integrated seamlessly into existing methods.
arXiv Detail & Related papers (2022-11-02T02:02:51Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There always exists severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z) - Contrastive Neural Processes for Self-Supervised Learning [1.8059331230167266]
We propose a novel self-supervised learning framework that combines contrastive learning with neural processes.
It relies on recent advances in neural processes to perform time series forecasting.
Unlike previous self-supervised methods, our augmentation pipeline is task-agnostic, enabling our method to perform well across various applications.
arXiv Detail & Related papers (2021-10-24T21:01:27Z) - S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization [104.87483578308526]
We propose the model S^3-Rec, which stands for Self-Supervised learning for Sequential Recommendation.
For our task, we devise four auxiliary self-supervised objectives to learn the correlations among attribute, item, subsequence, and sequence.
Extensive experiments conducted on six real-world datasets demonstrate the superiority of our proposed method over existing state-of-the-art methods.
arXiv Detail & Related papers (2020-08-18T11:44:10Z)
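Several entries above rely on low-rank adaptation; in particular, the Dynamic Rank-Selective LoRA summary suggests softly selecting which ranks of each LoRA update to keep. Since no code accompanies this listing, the block below is a generic, hedged sketch of a rank-gated LoRA linear layer; the class name RankGatedLoRALinear and the sigmoid gate are assumptions for illustration, not the CoDyRA implementation.

```python
# Generic sketch of rank-gated LoRA (an assumption inspired by the
# "Dynamic Rank-Selective LoRA" summary above, not the CoDyRA code).
import torch
import torch.nn as nn

class RankGatedLoRALinear(nn.Module):
    """Frozen base linear layer plus a low-rank update x @ A @ diag(gate) @ B,
    where a learnable per-rank gate softly selects which ranks to keep."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # pre-trained weights stay frozen
            p.requires_grad_(False)
        in_f, out_f = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(in_f, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, out_f))
        self.rank_gate = nn.Parameter(torch.zeros(rank))   # one gate per rank
        self.scale = alpha / rank

    def forward(self, x):
        gate = torch.sigmoid(self.rank_gate)               # soft rank selection in [0, 1]
        update = (x @ self.A) * gate @ self.B              # rank-weighted low-rank update
        return self.base(x) + self.scale * update

# Usage: wrap a projection of a pre-trained model and fine-tune only the LoRA parameters.
layer = RankGatedLoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 16, 768))                       # (batch, tokens, dim)
```

Adding a sparsity penalty (e.g., an L1 term) on the gate values during training would push uninformative ranks toward zero, giving the adaptive-rank behaviour that the summary above alludes to.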
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.