A Similarity Paradigm Through Textual Regularization Without Forgetting
- URL: http://arxiv.org/abs/2502.14376v1
- Date: Thu, 20 Feb 2025 09:06:44 GMT
- Title: A Similarity Paradigm Through Textual Regularization Without Forgetting
- Authors: Fangming Cui, Jan Fong, Rongfei Zeng, Xinmei Tian, Jun Yu,
- Abstract summary: We propose a novel method called Similarity Paradigm with Textual Regularization (SPTR) for prompt learning without forgetting.
SPTR is a two-pronged design based on hand-crafted prompts that is an inseparable framework.
Four representative tasks across 11 datasets demonstrate that SPTR outperforms existing prompt learning methods.
- Score: 17.251684463032433
- License:
- Abstract: Prompt learning has emerged as a promising method for adapting pre-trained visual-language models (VLMs) to a range of downstream tasks. While optimizing the context can be effective for improving performance on specific tasks, it can often lead to poor generalization performance on unseen classes or datasets sampled from different distributions. It may be attributed to the fact that textual prompts tend to overfit downstream data distributions, leading to the forgetting of generalized knowledge derived from hand-crafted prompts. In this paper, we propose a novel method called Similarity Paradigm with Textual Regularization (SPTR) for prompt learning without forgetting. SPTR is a two-pronged design based on hand-crafted prompts that is an inseparable framework. 1) To avoid forgetting general textual knowledge, we introduce the optimal transport as a textual regularization to finely ensure approximation with hand-crafted features and tuning textual features. 2) In order to continuously unleash the general ability of multiple hand-crafted prompts, we propose a similarity paradigm for natural alignment score and adversarial alignment score to improve model robustness for generalization. Both modules share a common objective in addressing generalization issues, aiming to maximize the generalization capability derived from multiple hand-crafted prompts. Four representative tasks (i.e., non-generalization few-shot learning, base-to-novel generalization, cross-dataset generalization, domain generalization) across 11 datasets demonstrate that SPTR outperforms existing prompt learning methods.
Related papers
- Generalizable Prompt Tuning for Vision-Language Models [3.1008306011364644]
Learnable soft prompts often perform well in downstream tasks but lack generalizability.
The study shows that by treating soft and hand-crafted prompts as dual views of the textual modality, we can better ensemble task-specific and general semantic information.
To generate more expressive prompts, the study introduces a class-wise augmentation from the visual modality, resulting in significant robustness to a wider range of unseen classes.
arXiv Detail & Related papers (2024-10-04T07:02:13Z) - IntCoOp: Interpretability-Aware Vision-Language Prompt Tuning [94.52149969720712]
IntCoOp learns to jointly align attribute-level inductive biases and class embeddings during prompt-tuning.
IntCoOp improves CoOp by 7.35% in average performance across 10 diverse datasets.
arXiv Detail & Related papers (2024-06-19T16:37:31Z) - Text-driven Prompt Generation for Vision-Language Models in Federated
Learning [24.005620820818756]
Our work proposes Federated Text-driven Prompt Generation (FedTPG)
FedTPG learns a unified prompt generation network across multiple remote clients in a scalable manner.
Our comprehensive empirical evaluations on nine diverse image classification datasets show that our method is superior to existing federated prompt learning methods.
arXiv Detail & Related papers (2023-10-09T19:57:24Z) - DPL: Decoupled Prompt Learning for Vision-Language Models [41.90997623029582]
We propose a new method, Decoupled Prompt Learning, which reformulates the attention in prompt learning to alleviate this problem.
Our approach is flexible for both visual and textual modalities, making it easily extendable to multi-modal prompt learning.
arXiv Detail & Related papers (2023-08-19T15:48:38Z) - Self-regulating Prompts: Foundational Model Adaptation without
Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z) - Real-World Compositional Generalization with Disentangled
Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z) - Learning Domain Invariant Prompt for Vision-Language Models [31.581652862478965]
We propose a novel prompt learning paradigm that directly generates emphdomain invariant prompt that can be generalized to unseen domains, called MetaPrompt.
Our method consistently and significantly outperforms existing methods.
arXiv Detail & Related papers (2022-12-08T11:23:24Z) - Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z) - LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of
Vision & Language Models [67.19124099815645]
We propose a novel Language-Aware Soft Prompting (LASP) learning method to alleviate base class overfitting.
LASP is inherently amenable to including, during training, virtual classes, i.e. class names for which no visual samples are available.
LASP matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets.
arXiv Detail & Related papers (2022-10-03T17:56:35Z) - Improving Multi-task Generalization Ability for Neural Text Matching via
Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models (PLMs) are hard to generalize to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In generalization stage, text matching model explores the essential matching signals by being trained on diverse multiple matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.