Continual Learners are Incremental Model Generalizers
- URL: http://arxiv.org/abs/2306.12026v1
- Date: Wed, 21 Jun 2023 05:26:28 GMT
- Title: Continual Learners are Incremental Model Generalizers
- Authors: Jaehong Yoon, Sung Ju Hwang, Yue Cao
- Abstract summary: This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers.
We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance.
We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representations while solving downstream tasks.
- Score: 70.34479702177988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the efficiency and rapid convergence of pre-trained models for
solving downstream tasks, this paper extensively studies the impact of
Continual Learning (CL) models as pre-trainers. In both supervised and
unsupervised CL, we find that the transfer quality of the representation often
increases gradually without noticeable degradation in fine-tuning performance.
This is because CL models learn improved task-general features while readily
forgetting task-specific knowledge. Based on this observation, we suggest a new
unsupervised CL framework with masked modeling, which aims to capture rich
task-generic representations during training. Furthermore, we propose a new
fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich
task-generic representations while solving downstream tasks. The model
fine-tuned with GLAD achieves competitive performance and can also be used as a
good pre-trained model itself. We believe this paper breaks the barriers
between pre-training and fine-tuning steps and leads to a sustainable learning
framework in which the continual learner incrementally improves model
generalization, yielding better transfer to unseen tasks.
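The abstract does not spell out the masked-modeling objective, but the general recipe is standard: hide a random fraction of the input patches and train the model to reconstruct only the hidden ones. A minimal sketch of that masking-and-loss step (plain Python, hypothetical names; a stand-in for the paper's actual framework):

```python
import random

def mask_patches(patches, mask_ratio=0.6, seed=0):
    """Randomly hide a fraction of input patches, as in masked modeling.

    Returns the visible patches, the masked indices, and the original
    values at those indices (the reconstruction targets).
    """
    rng = random.Random(seed)
    n = len(patches)
    n_masked = int(n * mask_ratio)
    masked_idx = sorted(rng.sample(range(n), n_masked))
    masked_set = set(masked_idx)
    visible = [p for i, p in enumerate(patches) if i not in masked_set]
    targets = [patches[i] for i in masked_idx]
    return visible, masked_idx, targets

def reconstruction_loss(predictions, targets):
    """Mean squared error computed over the masked positions only."""
    total, count = 0.0, 0
    for pred, tgt in zip(predictions, targets):
        for a, b in zip(pred, tgt):
            total += (a - b) ** 2
            count += 1
    return total / count

patches = [[float(i), float(i) + 1.0] for i in range(10)]
visible, masked_idx, targets = mask_patches(patches)
# A real encoder would predict the masked patches from the visible ones;
# a trivial "predict zeros" stand-in just illustrates the loss.
predictions = [[0.0, 0.0] for _ in masked_idx]
loss = reconstruction_loss(predictions, targets)
```

Because the loss is taken only over masked positions, the encoder is pushed toward features that generalize across the input rather than memorizing task-specific detail.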
Related papers
- CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning [17.614980614656407]
We propose Continual Generative training for Incremental prompt-Learning, a novel approach to mitigate forgetting while adapting a VLM.
We demonstrate the effectiveness of our framework in adapting to new tasks while improving zero-shot capabilities.
arXiv Detail & Related papers (2024-07-22T16:51:28Z)
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization [78.61621802973262]
We introduce an Orthogonal finetuning method for efficiently updating pretrained weights.
A cross-regularization strategy is also exploited to maintain the stability in terms of zero-shot generalization.
We conduct extensive experiments to demonstrate that our method explicitly steers pretrained weight space to represent the task-specific knowledge.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning [46.80199921638615]
Large pre-trained, zero-shot capable models have shown considerable success both for standard transfer and adaptation tasks.
However, through naive fine-tuning, these zero-shot models lose their generalizability and robustness towards distribution shifts.
In this work, we show that where fine-tuning falls short in adapting such zero-shot capable models, simple momentum-based weight interpolation can provide consistent improvements.
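The interpolation itself is simple: maintain an exponential moving average of the weights, anchored at the zero-shot initialization, so the merged model stays close to the robust pre-trained solution while absorbing fine-tuned knowledge. A minimal sketch under those assumptions (flat weight lists and names are illustrative, not the paper's API):

```python
def momentum_interpolate(ema_weights, current_weights, momentum=0.99):
    """One EMA update step: blend the running average with the current
    fine-tuned weights. A high momentum keeps the result near the earlier
    (more zero-shot-like) weights, damping drift under fine-tuning."""
    return [momentum * e + (1.0 - momentum) * w
            for e, w in zip(ema_weights, current_weights)]

# Start the average at the zero-shot weights, then fold in the weights
# produced by each fine-tuning step.
zero_shot = [1.0, -2.0, 0.5]
ema = list(zero_shot)
for step_weights in ([1.2, -1.8, 0.7], [1.4, -1.6, 0.9]):
    ema = momentum_interpolate(ema, step_weights)
```

After two steps the averaged weights have moved only slightly from the zero-shot values, which is exactly the stabilizing behavior the summary describes.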
arXiv Detail & Related papers (2022-11-06T17:41:39Z)
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach [14.674494335647841]
We propose a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning.
We show that our training paradigm can be easily added to memory- or regularization-based approaches.
arXiv Detail & Related papers (2022-07-13T15:16:51Z)
- Self-Supervised Models are Continual Learners [79.70541692930108]
We show that self-supervised loss functions can be seamlessly converted into distillation mechanisms for Continual Learning.
We devise a framework for Continual self-supervised visual representation Learning that significantly improves the quality of the learned representations.
arXiv Detail & Related papers (2021-12-08T10:39:13Z)
- When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? [99.4914671654374]
We propose AdvCL, a novel adversarial contrastive pretraining framework.
We show that AdvCL is able to enhance cross-task robustness transferability without loss of model accuracy and finetuning efficiency.
arXiv Detail & Related papers (2021-11-01T17:59:43Z)
- Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally, using less computation and memory than retraining the model from scratch whenever a new task arrives.
We introduce a new framework that makes continual learning feasible in the unsupervised setting by using pseudo-labels obtained from cluster assignments to update the model.
arXiv Detail & Related papers (2021-04-14T23:46:17Z)
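The pseudo-labeling step in the summary above can be sketched as nearest-centroid assignment: each unlabeled feature vector receives the index of its closest cluster centroid as a label for a supervised-style update. This is a hypothetical, minimal stand-in for the paper's deep-clustering pipeline, not its actual implementation:

```python
def assign_pseudo_labels(features, centroids):
    """Assign each feature vector the index of its nearest centroid
    (squared Euclidean distance). The indices then act as pseudo-labels
    for updating the model as if they were ground-truth classes."""
    labels = []
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels

centroids = [[0.0, 0.0], [5.0, 5.0]]
features = [[0.2, -0.1], [4.8, 5.3], [0.1, 0.4]]
pseudo_labels = assign_pseudo_labels(features, centroids)
# pseudo_labels == [0, 1, 0]
```

In a full pipeline the centroids themselves would be re-estimated from the evolving feature space (e.g. by k-means) as new tasks arrive.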
This list is automatically generated from the titles and abstracts of the papers in this site.