Related papers: Sculpting [CLS] Features for Pre-Trained Model-Based Class-Incremental Learning

Sculpting [CLS] Features for Pre-Trained Model-Based Class-Incremental Learning

URL: http://arxiv.org/abs/2502.14762v1
Date: Thu, 20 Feb 2025 17:37:08 GMT
Title: Sculpting [CLS] Features for Pre-Trained Model-Based Class-Incremental Learning
Authors: Murat Onur Yildirim, Elif Ceren Gok Yildirim, Joaquin Vanschoren,
Abstract summary: Class-incremental learning requires models to continually acquire knowledge of new classes without forgetting old ones.<n>Although pre-trained models have demonstrated strong performance in class-incremental learning, they remain susceptible to catastrophic forgetting when learning new concepts.<n>We introduce a new parameter-efficient fine-tuning module 'Learn and Calibrate', or LuCA, designed to acquire knowledge through an adapter-calibrator couple.<n>For each learning session, we deploy a sparse LuCA module on top of the last token, which we refer to as 'Token-level Sparse and Adaptation', or TO
Score: 3.73232466691291
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Class-incremental learning requires models to continually acquire knowledge of new classes without forgetting old ones. Although pre-trained models have demonstrated strong performance in class-incremental learning, they remain susceptible to catastrophic forgetting when learning new concepts. Excessive plasticity in the models breaks generalizability and causes forgetting, while strong stability results in insufficient adaptation to new classes. This necessitates effective adaptation with minimal modifications to preserve the general knowledge of pre-trained models. To address this challenge, we first introduce a new parameter-efficient fine-tuning module 'Learn and Calibrate', or LuCA, designed to acquire knowledge through an adapter-calibrator couple, enabling effective adaptation with well-refined feature representations. Second, for each learning session, we deploy a sparse LuCA module on top of the last token just before the classifier, which we refer to as 'Token-level Sparse Calibration and Adaptation', or TOSCA. This strategic design improves the orthogonality between the modules and significantly reduces both training and inference complexity. By leaving the generalization capabilities of the pre-trained models intact and adapting exclusively via the last token, our approach achieves a harmonious balance between stability and plasticity. Extensive experiments demonstrate TOSCA's state-of-the-art performance while introducing ~8 times fewer parameters compared to prior methods.

Related papers

Bayesian Principles Improve Prompt Learning In Vision-Language Models [10.593234723172767]
We propose a new training objective function based on a Bayesian learning principle to balance adaptability and generalizability.<n>This objective establishes a balance by allowing the fine-tuned model to adapt to downstream tasks while remaining close to the pre-trained model.
arXiv Detail & Related papers (2025-04-19T00:48:09Z)
Feature Calibration enhanced Parameter Synthesis for CLIP-based Class-incremental Learning [10.253058594622017]
Class-Incremental Learning (CIL) enables models to continuously learn new class knowledge while retaining previous classes. Traditional CIL methods rely primarily on visual features, which limits their effectiveness in complex, multimodal scenarios. We propose a Feature Enhanced Synthesis (FCPS) framework that mitigates catastrophic generalization while preserving the model's intrinsic generalization capability.
arXiv Detail & Related papers (2025-03-24T13:44:12Z)
Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion [10.322832012497722]
Class-incremental learning is a challenging problem, where the goal is to train a model that can classify data from an increasing number of classes over time. With the advancement of vision-language pre-trained models such as CLIP, they demonstrate good generalization ability. However, further adaptation to downstream tasks by simply fine-tuning the model leads to severe forgetting. Most existing works with pre-trained models assume that the forgetting of old classes is uniform when the model acquires new knowledge.
arXiv Detail & Related papers (2024-07-19T09:20:33Z)
Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples. Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially. We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer [44.10678347943115]
Class-incremental learning (CIL) aims to enable models to continuously learn new classes while overcoming catastrophic forgetting. In this paper, we revisit different parameter-efficient tuning (PET) methods within the context of continual learning. We observe that adapter tuning demonstrates superiority over prompt-based methods, even without parameter expansion in each learning session.
arXiv Detail & Related papers (2024-03-29T05:23:12Z)
Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning [46.6282595346048]
Zero-shot learning (ZSL) is a promising approach to generalizing a model to unseen categories during training. We propose a Meta-learned Attribute self-Interaction Network (MAIN) for continual ZSL. By pairing attribute self-interaction trained using meta-learning with inverse regularization of the attribute encoder, we are able to outperform state-of-the-art results without leveraging the unseen class attributes.
arXiv Detail & Related papers (2023-12-02T16:23:01Z)
Class Incremental Learning with Pre-trained Vision-Language Models [59.15538370859431]
We propose an approach to exploiting pre-trained vision-language models (e.g. CLIP) that enables further adaptation. Experiments on several conventional benchmarks consistently show a significant margin of improvement over the current state-of-the-art.
arXiv Detail & Related papers (2023-10-31T10:45:03Z)
Continual Learners are Incremental Model Generalizers [70.34479702177988]
This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers. We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance. We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representation during solving downstream tasks.
arXiv Detail & Related papers (2023-06-21T05:26:28Z)
SRIL: Selective Regularization for Class-Incremental Learning [5.810252620242912]
Class-Incremental Learning aims to create an integrated model that balances plasticity and stability to overcome this challenge. We propose a selective regularization method that accepts new knowledge while maintaining previous knowledge. We validate the effectiveness of the proposed method through extensive experimental protocols using CIFAR-100, ImageNet-Subset, and ImageNet-Full.
arXiv Detail & Related papers (2023-05-09T05:04:35Z)
On the Stability-Plasticity Dilemma of Class-Incremental Learning [50.863180812727244]
A primary goal of class-incremental learning is to strike a balance between stability and plasticity. This paper aims to shed light on how effectively recent class-incremental learning algorithms address the stability-plasticity trade-off.
arXiv Detail & Related papers (2023-04-04T09:34:14Z)
Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need [84.3507610522086]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z)
Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one. Main challenge comes from the "catastrophic forgetting" issue -- the inability to well remember the learnt knowledge while learning the new ones.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
Towards Sustainable Self-supervised Learning [193.78876000005366]
We propose a Target-Enhanced Conditional (TEC) scheme which introduces two components to the existing mask-reconstruction based SSL. First, we propose patch-relation enhanced targets which enhances the target given by base model and encourages the new model to learn semantic-relation knowledge from the base model. Secondly, we introduce a conditional adapter that adaptively adjusts new model prediction to align with the target of different base models.
arXiv Detail & Related papers (2022-10-20T04:49:56Z)
Meta-Learned Attribute Self-Gating for Continual Generalized Zero-Shot Learning [82.07273754143547]
We propose a meta-continual zero-shot learning (MCZSL) approach to generalizing a model to categories unseen during training. By pairing self-gating of attributes and scaled class normalization with meta-learning based training, we are able to outperform state-of-the-art results.
arXiv Detail & Related papers (2021-02-23T18:36:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.