Continual Learners are Incremental Model Generalizers
- URL: http://arxiv.org/abs/2306.12026v1
- Date: Wed, 21 Jun 2023 05:26:28 GMT
- Title: Continual Learners are Incremental Model Generalizers
- Authors: Jaehong Yoon, Sung Ju Hwang, Yue Cao
- Abstract summary: This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers.
We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance.
We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representation while solving downstream tasks.
- Score: 70.34479702177988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the efficiency and rapid convergence of pre-trained models for
solving downstream tasks, this paper extensively studies the impact of
Continual Learning (CL) models as pre-trainers. In both supervised and
unsupervised CL, we find that the transfer quality of the representation often
increases gradually without noticeable degradation in fine-tuning performance.
This is because CL models can learn improved task-general features while readily
forgetting task-specific knowledge. Based on this observation, we suggest a new
unsupervised CL framework with masked modeling, which aims to capture fluent
task-generic representation during training. Furthermore, we propose a new
fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich
task-generic representation while solving downstream tasks. The model
fine-tuned with GLAD achieves competitive performance and can also be used as a
good pre-trained model itself. We believe this paper breaks the barriers
between pre-training and fine-tuning steps and leads to a sustainable learning
framework in which the continual learner incrementally improves model
generalization, yielding better transfer to unseen tasks.
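To make the evaluation protocol described above concrete, here is a minimal, self-contained PyTorch-style sketch (not the authors' code): a toy encoder is continually pre-trained with a masked-reconstruction objective over a sequence of synthetic tasks, and after each stage its transfer quality is probed by fitting a fresh linear head on a fixed downstream task. The data, model sizes, objective, and helper names are illustrative assumptions only.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 32  # input/feature dimensionality (arbitrary choice for this sketch)


def make_task(n=256, classes=4):
    """Synthetic stand-in for one task: random inputs and labels."""
    return torch.randn(n, DIM), torch.randint(0, classes, (n,))


def masked_recon_loss(encoder, decoder, x, mask_ratio=0.5):
    """Toy masked-modeling objective: reconstruct randomly masked input entries."""
    mask = (torch.rand_like(x) < mask_ratio).float()
    recon = decoder(encoder(x * (1 - mask)))
    return ((recon - x) ** 2 * mask).mean()


def linear_probe(encoder, x, y, classes=4, steps=200):
    """Fit only a linear head on frozen features; return accuracy as a transfer proxy."""
    head = nn.Linear(DIM, classes)
    opt = torch.optim.Adam(head.parameters(), lr=1e-2)
    with torch.no_grad():
        feats = encoder(x)  # encoder stays frozen during probing
    for _ in range(steps):
        loss = nn.functional.cross_entropy(head(feats), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (head(feats).argmax(dim=1) == y).float().mean().item()


encoder = nn.Sequential(nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, DIM))
decoder = nn.Linear(DIM, DIM)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
downstream_x, downstream_y = make_task()

# Unsupervised continual pre-training over a task sequence, probing transfer after each task.
for task_id in range(3):
    x, _ = make_task()  # labels are discarded: pre-training is unsupervised
    for _ in range(100):
        loss = masked_recon_loss(encoder, decoder, x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    acc = linear_probe(copy.deepcopy(encoder), downstream_x, downstream_y)
    print(f"after task {task_id}: downstream linear-probe accuracy = {acc:.3f}")
```

In this sketch, tracking the probe accuracy across stages mirrors the kind of measurement behind the paper's observation that transfer quality can improve gradually as the continually pre-trained encoder sees more tasks.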
Related papers
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z)
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of the zero-shot generalization of VLMs; the resulting approach is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in few-shot image classification scenarios.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning [46.80199921638615]
Large pre-trained, zero-shot capable models have shown considerable success both for standard transfer and adaptation tasks.
However, through naive fine-tuning, these zero-shot models lose their generalizability and robustness towards distribution shifts.
In this work, we show that where fine-tuning falls short in adapting such zero-shot capable models, simple momentum-based weight interpolation can provide consistent improvements.
arXiv Detail & Related papers (2022-11-06T17:41:39Z)
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach [14.674494335647841]
We propose a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning.
We show that our training paradigm can be easily added to memory- or regularization-based approaches.
arXiv Detail & Related papers (2022-07-13T15:16:51Z)
- Self-Supervised Models are Continual Learners [79.70541692930108]
We show that self-supervised loss functions can be seamlessly converted into distillation mechanisms for Continual Learning.
We devise a framework for Continual self-supervised visual representation Learning that significantly improves the quality of the learned representations.
arXiv Detail & Related papers (2021-12-08T10:39:13Z)
- When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? [99.4914671654374]
We propose AdvCL, a novel adversarial contrastive pretraining framework.
We show that AdvCL is able to enhance cross-task robustness transferability without loss of model accuracy and finetuning efficiency.
arXiv Detail & Related papers (2021-11-01T17:59:43Z)
- Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally, using less computation and memory than retraining the model from scratch whenever a new task arrives.
We introduce a new framework that makes continual learning feasible in the unsupervised setting by using pseudo-labels obtained from cluster assignments to update the model.
arXiv Detail & Related papers (2021-04-14T23:46:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.