Self-regulating Prompts: Foundational Model Adaptation without
Forgetting
- URL: http://arxiv.org/abs/2307.06948v2
- Date: Thu, 24 Aug 2023 16:56:59 GMT
- Title: Self-regulating Prompts: Foundational Model Adaptation without
Forgetting
- Authors: Muhammad Uzair Khattak, Syed Talal Wasim, Muzammal Naseer, Salman
Khan, Ming-Hsuan Yang and Fahad Shahbaz Khan
- Abstract summary: We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
- Score: 112.66832145320434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt learning has emerged as an efficient alternative for fine-tuning
foundational models, such as CLIP, for various downstream tasks. Conventionally
trained using the task-specific objective, i.e., cross-entropy loss, prompts
tend to overfit downstream data distributions and find it challenging to
capture task-agnostic general features from the frozen CLIP. This leads to the
loss of the model's original generalization capability. To address this issue,
our work introduces a self-regularization framework for prompting called
PromptSRC (Prompting with Self-regulating Constraints). PromptSRC guides the
prompts to optimize for both task-specific and task-agnostic general
representations using a three-pronged approach by: (a) regulating prompted
representations via mutual agreement maximization with the frozen model, (b)
regulating with self-ensemble of prompts over the training trajectory to encode
their complementary strengths, and (c) regulating with textual diversity to
mitigate sample diversity imbalance with the visual branch. To the best of our
knowledge, this is the first regularization framework for prompt learning that
avoids overfitting by jointly attending to pre-trained model features, the
training trajectory during prompting, and the textual diversity. PromptSRC
explicitly steers the prompts to learn a representation space that maximizes
performance on downstream tasks without compromising CLIP generalization. We
perform extensive experiments on 4 benchmarks, where PromptSRC overall performs
favorably compared to existing methods. Our code and pre-trained
models are publicly available at: https://github.com/muzairkhattak/PromptSRC.
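The three regularizers map naturally onto auxiliary loss terms. Below is a minimal PyTorch sketch of the idea; it is an illustration under assumed shapes, weighting schedules, and hyperparameters, not the released implementation:

```python
import torch
import torch.nn.functional as F

def mutual_agreement_loss(prompted_feat, frozen_feat, prompted_logits, frozen_logits):
    """(a) Keep prompted features and predictions close to the frozen CLIP's."""
    feat_loss = F.l1_loss(prompted_feat, frozen_feat)
    kd_loss = F.kl_div(
        F.log_softmax(prompted_logits, dim=-1),
        F.softmax(frozen_logits, dim=-1),
        reduction="batchmean",
    )
    return feat_loss + kd_loss

def gaussian_weights(num_epochs, mu=None, sigma=None):
    """(b) Per-epoch weights for a self-ensemble of prompt snapshots
    (the mean/std of this schedule are assumed hyperparameters)."""
    t = torch.arange(num_epochs, dtype=torch.float32)
    mu = num_epochs * 0.6 if mu is None else mu
    sigma = num_epochs * 0.2 if sigma is None else sigma
    w = torch.exp(-0.5 * ((t - mu) / sigma) ** 2)
    return w / w.sum()

def ensemble_prompts(prompt_history):
    """Aggregate prompt snapshots collected over the training trajectory."""
    w = gaussian_weights(len(prompt_history))
    stacked = torch.stack(prompt_history)                    # (epochs, ...)
    return (w.view(-1, *([1] * (stacked.dim() - 1))) * stacked).sum(0)

# (c) Textual diversity: average frozen text features over several templates so
# the text branch sees sample diversity comparable to the image branch.
TEMPLATES = ["a photo of a {}.", "a drawing of a {}.", "a close-up photo of a {}."]

def diverse_text_features(encode_text, classnames):
    """`encode_text` is a stand-in for the frozen CLIP text encoder."""
    feats = [
        F.normalize(
            torch.stack([encode_text(t.format(c)) for t in TEMPLATES]).mean(0),
            dim=-1,
        )
        for c in classnames
    ]
    return torch.stack(feats)
```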
Related papers
- A Similarity Paradigm Through Textual Regularization Without Forgetting [17.251684463032433]
We propose a novel method called Similarity Paradigm with Textual Regularization (SPTR) for prompt learning without forgetting.
SPTR is a two-pronged design built on hand-crafted prompts, with the two prongs forming an inseparable framework.
Four representative tasks across 11 datasets demonstrate that SPTR outperforms existing prompt learning methods.
arXiv Detail & Related papers (2025-02-20T09:06:44Z)
- Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions.
We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
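The adaptive prompt tuning in (3) follows the usual recipe of freezing the main architecture and optimizing only a handful of prompt vectors; a minimal PyTorch sketch with hypothetical module names and shapes:

```python
import torch
import torch.nn as nn

class PromptedPredictor(nn.Module):
    """Hypothetical wrapper: a frozen trajectory backbone plus learnable prompts."""
    def __init__(self, backbone: nn.Module, num_prompts: int = 8, dim: int = 256):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # freeze the main architecture
            p.requires_grad_(False)
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)

    def forward(self, tokens):                    # tokens: (B, T, dim)
        prompts = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, tokens], dim=1))

# During fine-tuning, only the prompt parameters receive gradients:
# optimizer = torch.optim.AdamW([model.prompts], lr=1e-3)
```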
arXiv Detail & Related papers (2025-01-08T20:11:09Z)
- A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts.
With these synthetic prompts, we apply two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS).
Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements.
High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
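For reference, rejection-sampling curation of a preference pair usually looks like the following minimal sketch, where `generate` and `reward` are stand-ins for a policy model and a reward model (assumed here, not the paper's exact pipeline):

```python
from typing import Callable, Tuple

def rs_preference_pair(
    prompt: str,
    generate: Callable[[str], str],       # stand-in for the policy model
    reward: Callable[[str, str], float],  # stand-in for the reward model
    n_samples: int = 8,
) -> Tuple[str, str]:
    """Sample n completions; keep the best as 'chosen', the worst as 'rejected'."""
    completions = [generate(prompt) for _ in range(n_samples)]
    ranked = sorted(completions, key=lambda c: reward(prompt, c))
    return ranked[-1], ranked[0]          # (chosen, rejected)
```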
arXiv Detail & Related papers (2024-12-18T15:38:39Z)
- CAPrompt: Cyclic Prompt Aggregation for Pre-Trained Model Based Class Incremental Learning [12.249938312431993]
We propose a novel Cyclic Prompt Aggregation (CAPrompt) method to eliminate the dependency on task ID prediction.
Under concave conditions, the aggregated prompt achieves lower error compared to selecting a single task-specific prompt.
Our proposed CAPrompt outperforms state-of-the-art methods by 2%-3%.
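The concavity claim rests on a Jensen-type argument; in rough form (the notation is assumed here, not taken from the paper):

```latex
% Jensen's inequality for a concave utility f over prompts p_i with
% weights w_i >= 0, \sum_i w_i = 1:
\[
  f\!\Big(\sum_i w_i\, p_i\Big) \;\ge\; \sum_i w_i\, f(p_i),
\]
% i.e. the aggregated prompt is at least as good as the expected utility of
% selecting a single task-specific prompt according to the same weights,
% which hedges against a wrong selection when the task ID is unknown.
```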
arXiv Detail & Related papers (2024-12-12T04:34:28Z)
- Revisiting Prompt Pretraining of Vision-Language Models [13.888505919946578]
We propose a general framework termed Revisiting Prompt Pretraining (RPP).
RPP targets at improving the fitting and generalization ability from two aspects: prompt structure and prompt supervision.
We additionally utilize soft labels derived from zero-shot probability predictions provided by a pretrained Contrastive Language Image Pretraining (CLIP) teacher model.
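Using zero-shot CLIP predictions as soft labels amounts to a distillation loss; a minimal sketch follows, in which the temperature and reduction are assumed details:

```python
import torch.nn.functional as F

def soft_label_loss(student_logits, teacher_logits, tau: float = 2.0):
    """KL divergence to the frozen CLIP teacher's softened zero-shot
    probabilities (the temperature tau is an assumed hyperparameter)."""
    return F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
```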
arXiv Detail & Related papers (2024-09-10T02:36:13Z)
- RESTORE: Towards Feature Shift for Vision-Language Prompt Learning [33.13407089704543]
We show that prompt tuning along only one branch of CLIP is the reason why the misalignment occurs.
Without proper regularization across the learnable parameters in different modalities, prompt learning violates the original pre-training constraints.
We propose RESTORE, a multi-modal prompt learning method that exerts explicit constraints on cross-modal consistency.
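A cross-modal consistency constraint of this kind can be written as a penalty on the mismatch between the feature shifts that prompting induces in each branch; an illustrative sketch, not the paper's exact formulation:

```python
import torch

def feature_shift_penalty(img_prompted, img_frozen, txt_prompted, txt_frozen):
    """Penalize mismatch between the shifts prompting induces in each branch
    (features are assumed to live in CLIP's shared embedding space)."""
    img_shift = img_prompted - img_frozen   # shift caused by visual prompts
    txt_shift = txt_prompted - txt_frozen   # shift caused by textual prompts
    return (img_shift - txt_shift).pow(2).mean()
```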
arXiv Detail & Related papers (2024-03-10T08:52:48Z)
- Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data.
However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z)
- Any-Shift Prompting for Generalization over Distributions [66.29237565901734]
We propose any-shift prompting: a general probabilistic inference framework that considers the relationship between training and test distributions during prompt learning.
Within this framework, the test prompt exploits the distribution relationships to guide the generalization of the CLIP image-language model from training to any test distribution.
The network generates the tailored test prompt with both training and test information in a feedforward pass, avoiding extra training costs at test time.
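One way to realize such a feedforward test-prompt generator is a small network conditioned on a summary of the training distribution and the current test features; a minimal sketch with hypothetical architecture and shapes:

```python
import torch
import torch.nn as nn

class TestPromptGenerator(nn.Module):
    """Hypothetical generator: maps training and test information to a prompt
    in one feedforward pass (shapes and architecture are assumptions)."""
    def __init__(self, dim: int = 512, prompt_len: int = 4):
        super().__init__()
        self.prompt_len, self.dim = prompt_len, dim
        self.net = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, prompt_len * dim),
        )

    def forward(self, train_summary, test_feat):  # each: (dim,)
        h = torch.cat([train_summary, test_feat], dim=-1)
        return self.net(h).view(self.prompt_len, self.dim)
```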
arXiv Detail & Related papers (2024-02-15T16:53:42Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
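Framing prompt learning as variational inference typically means sampling prompts from a learned posterior and regularizing it toward a prior; a minimal sketch, where the Gaussian posterior and standard-normal prior are assumptions rather than the paper's exact choices:

```python
import torch
import torch.nn as nn

class VariationalPrompt(nn.Module):
    """Gaussian posterior over prompt tokens with a standard-normal prior."""
    def __init__(self, prompt_len: int = 4, dim: int = 512):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(prompt_len, dim))
        self.log_var = nn.Parameter(torch.zeros(prompt_len, dim))

    def sample(self):
        """Reparameterized sample: differentiable w.r.t. mu and log_var."""
        eps = torch.randn_like(self.mu)
        return self.mu + torch.exp(0.5 * self.log_var) * eps

    def kl_to_prior(self):
        """KL( N(mu, sigma^2) || N(0, I) ), the regularizer on the prompt space."""
        return 0.5 * (self.mu.pow(2) + self.log_var.exp() - 1.0 - self.log_var).sum()
```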
arXiv Detail & Related papers (2022-10-05T17:05:56Z)