Revisiting Pool-based Prompt Learning for Few-shot Class-incremental Learning
- URL: http://arxiv.org/abs/2507.09183v2
- Date: Sun, 20 Jul 2025 03:01:51 GMT
- Title: Revisiting Pool-based Prompt Learning for Few-shot Class-incremental Learning
- Authors: Yongwei Jiang, Yixiong Zou, Yuhua Li, Ruixuan Li
- Abstract summary: Few-Shot Class-Incremental Learning (FSCIL) faces dual challenges of data scarcity and incremental learning. This paper presents the first study of current prompt pool methods in FSCIL tasks, revealing an unanticipated performance degradation in incremental sessions. We propose LGSP-Prompt, which innovatively shifts pool-based prompt learning from the token dimension to the spatial dimension.
- Score: 9.365590675168589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-Shot Class-Incremental Learning (FSCIL) faces dual challenges of data scarcity and incremental learning in real-world scenarios. While pool-based prompting methods have demonstrated success in traditional incremental learning, their effectiveness in FSCIL settings remains unexplored. This paper presents the first study of current prompt pool methods in FSCIL tasks, revealing an unanticipated performance degradation in incremental sessions. Through comprehensive analysis, we identify that this phenomenon stems from token-dimension saturation: with limited data, excessive prompts compete for task-relevant information, leading to model overfitting. Based on this finding, we propose LGSP-Prompt (Local-Global Spatial Prompting), which innovatively shifts pool-based prompt learning from the token dimension to the spatial dimension. LGSP-Prompt generates spatial prompts by synergistically combining local spatial features and global frequency-domain representations to highlight key patterns in input images. We construct two spatial prompt pools enabling dynamic prompt selection to maintain acquired knowledge while effectively learning novel sessions. Extensive experiments demonstrate that our approach achieves state-of-the-art performance across multiple FSCIL benchmarks, showing significant advantages in both base knowledge preservation and incremental learning. Our implementation is available at https://github.com/Jywsuperman/LGSP.
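To make the spatial-prompting idea easier to picture, here is a minimal PyTorch sketch of the two ingredients the abstract names: a generator that fuses a local convolutional view with a global frequency-domain (FFT) view of the input, and a prompt pool with key-based dynamic selection. Every module name, shape, and design detail below is an illustrative assumption rather than the authors' implementation; the real code lives at the GitHub link above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialPromptGenerator(nn.Module):
    """Fuses local spatial features with a global frequency-domain view.

    Hypothetical sketch: a 3x3 conv supplies the local branch, and a
    learnable per-frequency mask applied in the 2D Fourier domain supplies
    the global branch. Assumes a fixed input resolution.
    """

    def __init__(self, channels: int = 3, size: int = 224):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.freq_mask = nn.Parameter(torch.ones(1, channels, size, size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local = self.local(x)                    # local spatial patterns (B, C, H, W)
        spec = torch.fft.fft2(x, norm="ortho")   # global frequency representation
        global_ = torch.fft.ifft2(spec * self.freq_mask, norm="ortho").real
        return local + global_                   # spatial prompt, same shape as x


class SpatialPromptPool(nn.Module):
    """Pool of spatial prompts with key-based dynamic selection."""

    def __init__(self, pool_size: int, channels: int, size: int, key_dim: int):
        super().__init__()
        self.prompts = nn.Parameter(torch.zeros(pool_size, channels, size, size))
        self.keys = nn.Parameter(torch.randn(pool_size, key_dim))

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        # Select, per image, the prompt whose key best matches the query.
        sims = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        idx = sims.argmax(dim=1)                 # (B,) index into the pool
        return self.prompts[idx]                 # (B, C, H, W)
```

In use, a query could come from a frozen backbone's class token; the selected map is simply added to the input image before patch embedding, so the token sequence never grows. That is the intuition behind moving pool-based prompting from the token dimension to the spatial dimension (the paper maintains two such pools, which this single-pool sketch simplifies).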
Related papers
- An Empirical Study of Federated Prompt Learning for Vision Language Model [50.73746120012352]
This paper systematically investigates behavioral differences between language prompt learning and vision prompt learning. We conduct experiments to evaluate the impact of various federated learning (FL) and prompt configurations, such as client scale, aggregation strategies, and prompt length. We explore strategies for enhancing prompt learning in complex scenarios where label skew and domain shift coexist.
arXiv Detail & Related papers (2025-05-29T03:09:15Z) - FDBPL: Faster Distillation-Based Prompt Learning for Region-Aware Vision-Language Models Adaptation [17.51747913191231]
We propose Faster Distillation-Based Prompt Learning (FDBPL), which addresses these issues by sharing soft supervision contexts across multiple training stages and implementing accelerated I/O. Comprehensive evaluations across 11 datasets demonstrate superior performance in base-to-new generalization, cross-dataset transfer, and robustness tests, achieving $2.2\times$ faster training speed.
arXiv Detail & Related papers (2025-05-23T15:57:16Z) - ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search [53.40810298627443]
ReGUIDE is a framework for web grounding that enables MLLMs to learn data efficiently through self-generated reasoning and spatial-aware criticism. Our experiments demonstrate that ReGUIDE significantly advances web grounding performance across multiple benchmarks.
arXiv Detail & Related papers (2025-05-21T08:36:18Z) - SiameseDuo++: Active Learning from Data Streams with Dual Augmented Siamese Networks [8.762175520727611]
This work proposes the SiameseDuo++ method, which uses active learning to automatically select instances for a human expert to label according to a budget. Specifically, it incrementally trains two siamese neural networks which operate in synergy, augmented by generated examples. Simulation experiments show that the proposed method outperforms strong baselines and state-of-the-art methods in terms of learning speed and/or performance.
arXiv Detail & Related papers (2025-04-06T20:45:25Z) - Underlying Semantic Diffusion for Effective and Efficient In-Context Learning [113.4003355229632]
Underlying Semantic Diffusion (US-Diffusion) is an enhanced diffusion model that boosts underlying semantics learning, computational efficiency, and in-context learning capabilities. We present a Feedback-Aided Learning (FAL) framework, which leverages feedback signals to guide the model in capturing semantic details. We also propose a plug-and-play Efficient Sampling Strategy (ESS) for dense sampling at time steps with high noise levels.
arXiv Detail & Related papers (2025-03-06T03:06:22Z) - USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation [24.90512145836643]
We introduce a Unified Skeleton-based Dense Representation Learning framework based on feature decorrelation. We show that our approach significantly outperforms the current state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2024-12-12T12:20:27Z) - LW2G: Learning Whether to Grow for Prompt-based Continual Learning [55.552510632228326]
Recent prompt-based continual learning (PCL) has achieved remarkable performance with pre-trained models. These approaches expand a prompt pool by adding a new set of prompts while learning and select the correct set during inference. Previous studies have revealed that learning task-wise prompt sets individually and low selection accuracy pose challenges to the performance of PCL.
arXiv Detail & Related papers (2024-09-27T15:55:13Z) - Conditional Prototype Rectification Prompt Learning [32.533844163120875]
We propose a Conditional Prototype Rectification Prompt Learning (CPR) method to correct the bias of base examples and effectively augment limited data.
CPR achieves state-of-the-art performance on both few-shot classification and base-to-new generalization tasks.
arXiv Detail & Related papers (2024-04-15T15:43:52Z) - PL-FSCIL: Harnessing the Power of Prompts for Few-Shot Class-Incremental Learning [9.247718160705512]
Few-Shot Class-Incremental Learning (FSCIL) aims to enable deep neural networks to learn new tasks incrementally from a small number of labeled samples.
We propose a novel approach called Prompt Learning for FSCIL (PL-FSCIL).
PL-FSCIL harnesses the power of prompts in conjunction with a pre-trained Vision Transformer (ViT) model to address the challenges of FSCIL effectively.
arXiv Detail & Related papers (2024-01-26T12:11:04Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)