Revisiting Class-Incremental Learning with Pre-Trained Models:
Generalizability and Adaptivity are All You Need
- URL: http://arxiv.org/abs/2303.07338v1
- Date: Mon, 13 Mar 2023 17:59:02 GMT
- Title: Revisiting Class-Incremental Learning with Pre-Trained Models:
Generalizability and Adaptivity are All You Need
- Authors: Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu
- Abstract summary: Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer.
- Score: 76.10635571879762
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Class-incremental learning (CIL) aims to adapt to emerging new classes
without forgetting old ones. Traditional CIL models are trained from scratch to
continually acquire knowledge as data evolves. Recently, pre-training has
achieved substantial progress, making vast pre-trained models (PTMs) accessible
for CIL. In contrast to traditionally trained models, PTMs possess generalizable
embeddings, which can be easily transferred. In this work, we revisit CIL with
PTMs and argue that the core factors in CIL are adaptivity for model updating
and generalizability for knowledge transfer. 1) We first reveal that a frozen
PTM can already provide generalizable embeddings for CIL. Surprisingly, a
simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to
prototype features, can beat the state of the art even without training on the
downstream task. 2) Due to the distribution gap between pre-trained and
downstream datasets, the PTM can be further cultivated with adaptivity via model
adaptation. We propose ADapt And Merge (ADAM), which aggregates the embeddings
of the PTM and the adapted model for classifier construction. ADAM is a general
framework that can be combined orthogonally with any parameter-efficient tuning
method, retaining both the PTM's generalizability and the adapted model's
adaptivity. 3) Additionally, we find that previous benchmarks are unsuitable in
the era of PTMs due to data overlap, and we propose four new benchmarks for
assessment: ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive
experiments validate the effectiveness of ADAM as a unified and concise
framework.
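
Point 1 of the abstract rests on a simple mechanism: keep the pre-trained encoder frozen and set each class's classifier weight to the mean embedding (prototype) of that class. The following is a minimal sketch of that prototype-classifier idea in PyTorch; the class and method names are illustrative, and `encoder` stands in for a frozen PTM such as a ViT backbone, so this is not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


class PrototypeClassifier:
    """Nearest-prototype classifier in the spirit of SimpleCIL (illustrative sketch)."""

    def __init__(self, embed_dim: int):
        self.embed_dim = embed_dim
        self.prototypes = {}  # class id -> mean embedding (prototype) of that class

    @torch.no_grad()
    def add_classes(self, encoder, loader):
        """Accumulate embeddings of newly arriving classes and store their means."""
        sums, counts = {}, {}
        for images, labels in loader:
            feats = encoder(images)               # (B, embed_dim); the encoder stays frozen
            for f, y in zip(feats.cpu(), labels.tolist()):
                sums[y] = sums.get(y, torch.zeros(self.embed_dim)) + f
                counts[y] = counts.get(y, 0) + 1
        for y in sums:
            self.prototypes[y] = sums[y] / counts[y]

    @torch.no_grad()
    def predict(self, encoder, images):
        """Classify by cosine similarity to all prototypes seen so far."""
        classes = sorted(self.prototypes)
        weights = F.normalize(torch.stack([self.prototypes[c] for c in classes]), dim=-1)
        feats = F.normalize(encoder(images).cpu(), dim=-1)
        logits = feats @ weights.T                # (B, num_seen_classes)
        return torch.tensor(classes)[logits.argmax(dim=-1)]
```

As each incremental task arrives, `add_classes` is called once on that task's data; since no parameters are updated, previously stored prototypes are untouched and nothing is forgotten.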
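Point 2 (ADAM) adds adaptivity by adapting a copy of the PTM on downstream data with a parameter-efficient tuning method and then aggregating the frozen and adapted embeddings before building the classifier. Below is a rough sketch of the merge step, assuming concatenation as the aggregation and reusing the prototype classifier above with a doubled embedding dimension; the function name and the specific aggregation choice are assumptions for illustration, not the paper's exact formulation.

```python
import torch


@torch.no_grad()
def merged_embedding(frozen_ptm, adapted_model, images):
    """Concatenate generalizable (frozen PTM) and adapted features for one batch."""
    f_general = frozen_ptm(images)     # generalizable embedding from the frozen PTM
    f_adapted = adapted_model(images)  # embedding from the PTM adapted on downstream data
    return torch.cat([f_general, f_adapted], dim=-1)  # (B, 2 * embed_dim)
```

The merged features can then feed the same prototype-based classifier, which is why the framework stays orthogonal to the choice of parameter-efficient tuning method.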
Related papers
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of previously acquired knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- Rethinking Class-incremental Learning in the Era of Large Pre-trained Models via Test-Time Adaptation [20.62749699589017]
Class-incremental learning (CIL) is a challenging task that involves sequentially learning to categorize classes from new tasks.
We propose Test-Time Adaptation for Class-Incremental Learning (TTACIL) that first fine-tunes PTMs using Adapters on the first task.
Our TTACIL does not undergo any forgetting, while each task still benefits from the rich PTM features.
arXiv Detail & Related papers (2023-10-17T13:06:39Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormaliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP to downstream tasks undesirably degrades OOD performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting Model Hubs [136.4492678691406]
We propose a new paradigm of exploiting model hubs by ranking and tuning pre-trained models.
The best ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture.
The tuning part introduces a novel method for tuning multiple PTMs, which surpasses dedicated methods.
arXiv Detail & Related papers (2021-10-20T12:59:23Z)