Convolutional Prompting meets Language Models for Continual Learning
- URL: http://arxiv.org/abs/2403.20317v1
- Date: Fri, 29 Mar 2024 17:40:37 GMT
- Title: Convolutional Prompting meets Language Models for Continual Learning
- Authors: Anurag Roy, Riddhiman Moulick, Vinay K. Verma, Saptarshi Ghosh, Abir Das
- Abstract summary: Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in the absence of data from old tasks.
We propose ConvPrompt, a novel convolutional prompt creation mechanism that maintains layer-wise shared embeddings.
The intelligent use of convolution enables us to maintain a low parameter overhead without compromising performance.
- Score: 4.115213208594654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in the absence of data from old tasks. Recently, pre-trained vision transformers combined with prompt tuning have shown promise for overcoming catastrophic forgetting in CL. These approaches rely on a pool of learnable prompts, which can be inefficient in sharing knowledge across tasks and lead to inferior performance. In addition, the lack of fine-grained, layer-specific prompts prevents these methods from fully exploiting the strength of prompting for CL. We address these limitations by proposing ConvPrompt, a novel convolutional prompt creation mechanism that maintains layer-wise shared embeddings, enabling both layer-specific learning and better concept transfer across tasks. The intelligent use of convolution lets us maintain a low parameter overhead without compromising performance. We further leverage Large Language Models to generate fine-grained text descriptions of each category, which are used to estimate task similarity and dynamically decide the number of prompts to be learned. Extensive experiments demonstrate the superiority of ConvPrompt, which improves the state of the art by ~3% with significantly less parameter overhead. We also perform extensive ablations over various modules to disentangle the importance of different components.
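The abstract describes two mechanisms that a concrete illustration can make clearer: (i) task-specific prompts generated by lightweight convolutions over layer-wise shared embeddings, and (ii) an LLM-derived task-similarity score that decides how many prompts a new task receives. The following PyTorch sketch illustrates (i) under stated assumptions; the class name, tensor shapes, depth-wise convolution choice, and kernel size are illustrative guesses, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ConvPromptGenerator(nn.Module):
    """Hypothetical sketch: per-task, per-layer prompts from layer-wise shared embeddings."""

    def __init__(self, num_layers: int, prompt_len: int, embed_dim: int,
                 kernel_size: int = 3):
        super().__init__()
        # Layer-wise shared embeddings: one learnable seed sequence per transformer
        # layer, shared across tasks so that concepts can transfer between tasks.
        self.shared_embed = nn.Parameter(
            torch.randn(num_layers, prompt_len, embed_dim) * 0.02)
        # One list of depth-wise 1-D convolutions per task; each conv adds only
        # embed_dim * kernel_size weights, keeping the per-task overhead small.
        self.task_convs = nn.ModuleList()
        self.embed_dim = embed_dim
        self.kernel_size = kernel_size

    def add_task(self, num_prompts: int) -> None:
        """Register `num_prompts` convolutional prompt generators for a new task."""
        convs = nn.ModuleList([
            nn.Conv1d(self.embed_dim, self.embed_dim, self.kernel_size,
                      padding=self.kernel_size // 2, groups=self.embed_dim)
            for _ in range(num_prompts)
        ])
        self.task_convs.append(convs)

    def forward(self, task_id: int, layer_id: int) -> torch.Tensor:
        # (prompt_len, embed_dim) -> (1, embed_dim, prompt_len) as expected by Conv1d.
        seed = self.shared_embed[layer_id].t().unsqueeze(0)
        prompts = [conv(seed).squeeze(0).t() for conv in self.task_convs[task_id]]
        # The concatenated prompts would be prepended to this layer's token sequence.
        return torch.cat(prompts, dim=0)  # (num_prompts * prompt_len, embed_dim)
```

A companion sketch of (ii), assuming the per-class text descriptions have already been produced by an LLM and that an `encode(texts)` function returning NumPy arrays with unit-norm rows is available; the linear mapping from similarity to prompt count is a simplification of the paper's dynamic prompt budgeting, not its exact rule.

```python
import numpy as np


def decide_num_prompts(new_class_descs, old_class_descs, encode,
                       min_prompts: int = 1, max_prompts: int = 5) -> int:
    """Give a dissimilar new task more prompts, a familiar one fewer (illustrative)."""
    if not old_class_descs:                      # first task: use the full budget
        return max_prompts
    new_emb = encode(new_class_descs)            # (N_new, d), unit-norm rows assumed
    old_emb = encode(old_class_descs)            # (N_old, d), unit-norm rows assumed
    sims = new_emb @ old_emb.T                   # cosine similarities
    task_sim = float(sims.max(axis=1).mean())    # average best match per new class
    # Linear interpolation: similarity 1 -> min_prompts, similarity 0 -> max_prompts.
    n = round(max_prompts - task_sim * (max_prompts - min_prompts))
    return int(np.clip(n, min_prompts, max_prompts))
```

Under these assumptions, a typical flow would call `decide_num_prompts` with the new task's class descriptions and then `gen.add_task(num_prompts=n)` before training, so a task similar to earlier ones adds only a few cheap depth-wise convolutions.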
Related papers
- LW2G: Learning Whether to Grow for Prompt-based Continual Learning [15.766350352592331]
Recent Prompt-based Continual Learning (PCL) has achieved remarkable performance with Pre-Trained Models (PTMs).
We propose a plug-in module in the former stage to Learn Whether to Grow (LW2G) based on the disparities between tasks.
Inspired by Gradient Projection Continual Learning, our LW2G develops a metric called Hinder Forward Capability (HFC) to measure the hindrance imposed on learning new tasks.
arXiv Detail & Related papers (2024-09-27T15:55:13Z) - PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer [76.39111896665585]
Incremental Learning (IL) aims to learn deep models on sequential tasks continually.
Recent large pre-trained models (PTMs) have achieved outstanding performance via prompt techniques in practical IL without access to old samples.
arXiv Detail & Related papers (2024-07-04T10:37:58Z) - Few-Shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt [58.880105981772324]
We propose a novel framework named Attention-aware Self-adaptive Prompt (ASP).
ASP encourages task-invariant prompts to capture shared knowledge by reducing task-specific information from the attention perspective.
In summary, ASP prevents overfitting on the base task and does not require enormous data in few-shot incremental tasks.
arXiv Detail & Related papers (2024-03-14T20:34:53Z) - SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise a learning module to acquire task-specific knowledge with a parameter-efficient tuning (PET) block, and a selection module to pick out the corresponding block for the test input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
arXiv Detail & Related papers (2024-01-16T11:45:03Z) - Introducing Language Guidance in Prompt-based Continual Learning [95.03110230754423]
We propose Language Guidance for Prompt-based Continual Learning (LGCL) as a plug-in for prompt-based methods.
LGCL consistently improves the performance of prompt-based continual learning methods, setting a new state of the art.
arXiv Detail & Related papers (2023-08-30T08:03:49Z) - Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z) - Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation [18.345183818638475]
Continual learning (CL) can serve as a remedy by enabling knowledge transfer across sequentially arriving tasks.
We develop a transformer-based CL architecture for learning bimodal vision-and-language tasks.
Our approach scales to a large number of tasks because it requires little memory and time overhead.
arXiv Detail & Related papers (2023-03-25T10:16:53Z) - Multimodal Parameter-Efficient Few-Shot Class Incremental Learning [1.9220716793379256]
Few-Shot Class Incremental Learning (FSCIL) is a challenging continual learning task, where limited training examples are available during several learning sessions.
To succeed in this task, it is necessary to avoid overfitting on new classes caused by the biased distributions of the few-shot training sets.
The proposed CPE-CLIP significantly improves FSCIL performance compared to state-of-the-art approaches while drastically reducing the number of learnable parameters and training costs.
arXiv Detail & Related papers (2023-03-08T17:34:15Z) - MaPLe: Multi-modal Prompt Learning [54.96069171726668]
We propose Multi-modal Prompt Learning (MaPLe) for both vision and language branches to improve alignment between the vision and language representations.
Compared with the state-of-the-art method Co-CoOp, MaPLe exhibits favorable performance and achieves an absolute gain of 3.45% on novel classes.
arXiv Detail & Related papers (2022-10-06T17:59:56Z) - HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks [38.43269863509866]
Parameter-efficient fine-tuning has become increasingly important for quick transfer learning and deployment.
We design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks.
Our proposed framework adds fewer trainable parameters in multi-task learning while achieving superior performance and transfer ability compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-03-08T06:51:33Z)