Related papers: CASP: Few-Shot Class-Incremental Learning with CLS Token Attention Steering Prompts

URL: http://arxiv.org/abs/2601.16773v1
Date: Fri, 23 Jan 2026 14:19:04 GMT
Title: CASP: Few-Shot Class-Incremental Learning with CLS Token Attention Steering Prompts
Authors: Shuai Huang, Xuhan Lin, Yuwu Lu
Abstract summary: Few-shot class-incremental learning (FSCIL) presents a core challenge in continual learning. Recent prompt-based methods, which integrate pretrained backbones with task-specific prompts, have made notable progress. We propose the CLS Token Attention Steering Prompts (CASP).
Score: 15.650117316903925
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Few-shot class-incremental learning (FSCIL) presents a core challenge in continual learning, requiring models to rapidly adapt to new classes with very limited samples while mitigating catastrophic forgetting. Recent prompt-based methods, which integrate pretrained backbones with task-specific prompts, have made notable progress. However, under extreme few-shot incremental settings, the model's ability to transfer and generalize becomes critical, and it is thus essential to leverage pretrained knowledge to learn feature representations that can be shared across future categories during the base session. Inspired by the mechanism of the CLS token, which is similar to human attention and progressively filters out task-irrelevant information, we propose the CLS Token Attention Steering Prompts (CASP). This approach introduces class-shared trainable bias parameters into the query, key, and value projections of the CLS token to explicitly modulate the self-attention weights. To further enhance generalization, we also design an attention perturbation strategy and perform Manifold Token Mixup in the shallow feature space, synthesizing potential new class features to improve generalization and reserve the representation capacity for upcoming tasks. Experiments on the CUB200, CIFAR100, and ImageNet-R datasets demonstrate that CASP outperforms state-of-the-art methods in both standard and fine-grained FSCIL settings without requiring fine-tuning during incremental phases and while significantly reducing the parameter overhead.
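To make the attention-steering idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single attention head, square projection matrices, and additive biases applied after the linear projections, none of which the abstract specifies. The names `cls_attention`, `bq`, `bk`, and `bv` are hypothetical, and the attention perturbation strategy and Manifold Token Mixup are not covered.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cls_attention(tokens, Wq, Wk, Wv, bq=None, bk=None, bv=None):
    """Single-head attention output for the CLS token (tokens[0]).

    bq, bk, bv are hypothetical bias vectors (an assumption of this sketch)
    added only to the CLS token's query, key, and value projections.
    Patch tokens are projected without any bias.
    Returns (CLS attention weights over all tokens, CLS output vector).
    """
    d = len(Wq)
    zero = [0.0] * d
    bq, bk, bv = bq or zero, bk or zero, bv or zero

    q_cls = [a + b for a, b in zip(matvec(Wq, tokens[0]), bq)]
    keys = [matvec(Wk, t) for t in tokens]
    values = [matvec(Wv, t) for t in tokens]
    keys[0] = [a + b for a, b in zip(keys[0], bk)]
    values[0] = [a + b for a, b in zip(values[0], bv)]

    # Scaled dot-product scores of the CLS query against every key.
    scores = [sum(q * k for q, k in zip(q_cls, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return weights, out


def random_matrix(rng, d):
    """Random d x d matrix standing in for a pretrained projection."""
    return [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]


rng = random.Random(0)
d, n_tokens = 4, 5  # toy sizes: one CLS token plus four patch tokens
Wq, Wk, Wv = (random_matrix(rng, d) for _ in range(3))
tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]

# Frozen backbone behaviour (no bias) versus a steered CLS query.
base_w, base_out = cls_attention(tokens, Wq, Wk, Wv)
steered_w, steered_out = cls_attention(tokens, Wq, Wk, Wv, bq=[0.5] * d)
```

In this sketch the query and key biases change how the CLS token distributes its attention, while a value bias leaves the weights untouched and only shifts the CLS output. In a real model the biases would be trained in the base session and, per the abstract, shared across classes.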

Related papers

Unlocking Prototype Potential: An Efficient Tuning Framework for Few-Shot Class-Incremental Learning [69.28860905525057]
Few-shot class-incremental learning (FSCIL) seeks to continuously learn new classes from very limited samples. We introduce an efficient prototype fine-tuning framework that evolves static centroids into dynamic, learnable components.
arXiv Detail & Related papers (2026-02-05T03:50:53Z)
EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning [53.88000987041739]
Class-Incremental Learning (CIL) aims to enable AI models to continuously learn from sequentially arriving data of different classes over time. We propose the Elastic Knowledge Preservation and Compensation (EKPC) method, integrating Importance-aware Parameter Regularization (IPR) and Trainable Semantic Drift Compensation (TSDC) for CIL.
arXiv Detail & Related papers (2025-06-14T05:19:58Z)
CalFuse: Multi-Modal Continual Learning via Feature Calibration and Parameter Fusion [17.68751409041168]
Class-Continual Learning (CCL) incrementally incorporates new class knowledge without revisiting historical data. Recent advances in Vision-Language Models (VLMs) such as CLIP demonstrate significant potential for CCL by leveraging pre-trained multi-modal knowledge. We propose CalFuse, a framework that synergizes feature calibration and parameter fusion to enable effective multi-modal knowledge integration.
arXiv Detail & Related papers (2025-03-24T13:44:12Z)
Task Consistent Prototype Learning for Incremental Few-shot Semantic Segmentation [20.49085411104439]
Incremental Few-Shot Semantic Segmentation (iFSS) tackles a task that requires a model to continually expand its segmentation capability on novel classes. This study introduces a meta-learning-based prototype approach that encourages the model to learn how to adapt quickly while preserving previous knowledge. Experiments on iFSS datasets built upon PASCAL and COCO benchmarks show the advanced performance of the proposed approach.
arXiv Detail & Related papers (2024-10-16T23:42:27Z)
SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem through the lens of sequential fine-tuning (Seq FT). Considering that overly fast representation learning and the biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Classifier Alignment (SLCA++) framework. Our approach involves a Slow Learner to selectively reduce the learning rate of backbone parameters, and a Classifier Alignment to align the disjoint classification layers in a post-hoc fashion.
arXiv Detail & Related papers (2024-08-15T17:50:07Z)
Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [115.79349923044663]
Few-shot class-incremental learning (FSCIL) aims to incrementally learn novel classes from limited examples. Existing methods face a critical dilemma: static architectures rely on a fixed parameter space to learn from sequentially arriving data and are prone to overfitting to the current session. In this study, we explore the potential of Selective State Space Models (SSMs) for FSCIL.
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
Continual Learners are Incremental Model Generalizers [70.34479702177988]
This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers. We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance. We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representations while solving downstream tasks.
arXiv Detail & Related papers (2023-06-21T05:26:28Z)
Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks. Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model [73.80068155830708]
We present an extensive analysis for continual learning on a pre-trained model (CLPM). We propose a simple but extremely effective approach named Slow Learner with Classifier Alignment (SLCA). Across a variety of scenarios, our proposal provides substantial improvements for CLPM.
arXiv Detail & Related papers (2023-03-09T08:57:01Z)
Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable a model to learn from a non-stationary data stream, continuously acquiring new knowledge while retaining what it has already learnt. The main challenge comes from the "catastrophic forgetting" issue: the inability to retain learnt knowledge while learning new knowledge.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
Incremental Few-Shot Learning via Implanting and Compressing [13.122771115838523]
Incremental Few-Shot Learning requires a model to continually learn novel classes from only a few examples. We propose a two-step learning strategy referred to as Implanting and Compressing. Specifically, in the Implanting step, we propose to mimic the data distribution of novel classes with the assistance of a data-abundant base set. In the Compressing step, we adapt the feature extractor to precisely represent each novel class for enhancing intra-class compactness.
arXiv Detail & Related papers (2022-03-19T11:04:43Z)
Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels. When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new. In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.