Adaptive Multi-Modality Prompt Learning
- URL: http://arxiv.org/abs/2312.00823v1
- Date: Thu, 30 Nov 2023 12:10:22 GMT
- Title: Adaptive Multi-Modality Prompt Learning
- Authors: Zongqian Wu, Yujing Liu, Mengmeng Zhan, Jialie Shen, Ping Hu, Xiaofeng
Zhu
- Abstract summary: We propose an adaptive multi-modality prompt learning method to address these issues.
The image prompt learning achieves both in-sample and out-of-sample generalization by first masking meaningless patches and then padding them with learnable parameters and information from the text prompts.
Experimental results on real datasets demonstrate that our method outperforms SOTA methods across different downstream tasks.
- Score: 21.86784369327551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although current prompt learning methods have been successfully designed to
reuse large pre-trained models without fine-tuning their large number of
parameters, they still have limitations to be addressed: they do not consider
the adverse impact of meaningless patches in every image, and they do not
simultaneously consider in-sample and out-of-sample generalization. In this
paper, we propose an adaptive multi-modality prompt learning method to address
these issues. To do this, we employ previous text prompt learning and propose a
new image prompt learning. The image prompt learning achieves in-sample and
out-of-sample generalization by first masking meaningless patches and then
padding them with learnable parameters and information from the texts.
Moreover, each prompt provides auxiliary information to the other, further
strengthening these two kinds of generalization. Experimental results on real
datasets demonstrate that our method outperforms SOTA methods across different
downstream tasks.
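As a rough illustration of the image prompt learning described in the abstract, the PyTorch sketch below masks low-scoring patches and pads them with learnable parameters plus projected text information. This is a minimal sketch, not the authors' implementation: the module name AdaptiveImagePrompt, the norm-based patch-scoring rule, the keep_ratio parameter, and all tensor shapes are illustrative assumptions.

```python
# Minimal sketch of the image prompt idea (NOT the authors' code): mask
# low-scoring patches and pad them with learnable parameters plus text
# information. Module name, scoring rule, and shapes are assumptions.
import torch
import torch.nn as nn

class AdaptiveImagePrompt(nn.Module):
    def __init__(self, num_patches: int, dim: int, keep_ratio: float = 0.7):
        super().__init__()
        self.keep_ratio = keep_ratio
        # Learnable parameters that fill the masked patch positions.
        self.patch_prompts = nn.Parameter(0.02 * torch.randn(num_patches, dim))
        # Projects pooled text features into the patch embedding space.
        self.text_proj = nn.Linear(dim, dim)

    def forward(self, patch_tokens: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N, D) patch embeddings; text_feat: (B, D) pooled text features.
        B, N, D = patch_tokens.shape
        # Use the embedding norm as a stand-in "meaningfulness" score per patch.
        scores = patch_tokens.norm(dim=-1)                          # (B, N)
        k = max(1, int(self.keep_ratio * N))
        keep_idx = scores.topk(k, dim=-1).indices                   # (B, k)
        keep = torch.zeros(B, N, device=patch_tokens.device)
        keep = keep.scatter(1, keep_idx, 1.0).bool().unsqueeze(-1)  # (B, N, 1)
        # Pad masked positions with learnable prompts plus projected text info.
        fill = self.patch_prompts.unsqueeze(0) + self.text_proj(text_feat).unsqueeze(1)
        return torch.where(keep, patch_tokens, fill)

# Usage sketch: the prompted patches would then feed a frozen image encoder.
prompter = AdaptiveImagePrompt(num_patches=196, dim=512)
out = prompter(torch.randn(2, 196, 512), torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 196, 512])
```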
Related papers
- MuAP: Multi-step Adaptive Prompt Learning for Vision-Language Model with Missing Modality [11.03329286331929]
We present the first comprehensive investigation into prompt learning behavior when modalities are incomplete.
We propose a novel Multi-step Adaptive Prompt Learning framework, aiming to generate multimodal prompts and perform multi-step prompt tuning.
arXiv Detail & Related papers (2024-09-07T03:33:46Z)
- Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning [50.26965628047682]
Adapting pre-trained models to open classes is a challenging problem in machine learning.
In this paper, we consider combining the advantages of both and propose a test-time prompt tuning approach.
Our proposed method outperforms all comparison methods on average considering both base and new classes.
arXiv Detail & Related papers (2024-08-29T12:34:01Z)
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Conditional Prototype Rectification Prompt Learning [32.533844163120875]
We propose a Conditional Prototype Rectification Prompt Learning (CPR) method to correct the bias of base examples and effectively augment limited data.
CPR achieves state-of-the-art performance on both few-shot classification and base-to-new generalization tasks.
arXiv Detail & Related papers (2024-04-15T15:43:52Z)
- One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets.
We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
- DPL: Decoupled Prompt Learning for Vision-Language Models [41.90997623029582]
We propose a new method, Decoupled Prompt Learning, which reformulates the attention in prompt learning to alleviate this problem.
Our approach is flexible for both visual and textual modalities, making it easily extendable to multi-modal prompt learning.
arXiv Detail & Related papers (2023-08-19T15:48:38Z)
- Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models [137.74524357614285]
We introduce a novel Gradient-RegulAted Meta-prompt learning (GRAM) framework.
It helps pre-trained models adapt to downstream tasks in a parameter- and data-efficient way.
GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way.
arXiv Detail & Related papers (2023-03-12T05:03:37Z)
- Learning Domain Invariant Prompt for Vision-Language Models [31.581652862478965]
We propose a novel prompt learning paradigm, called MetaPrompt, that directly generates a domain invariant prompt that can be generalized to unseen domains.
Our method consistently and significantly outperforms existing methods.
arXiv Detail & Related papers (2022-12-08T11:23:24Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Instance-aware Prompt Learning for Language Understanding and Generation [49.22899822734549]
We propose an instance-aware prompt learning method that learns a different prompt for each instance.
Our method achieves the state-of-the-art on the SuperGLUE few-shot learning benchmark.
arXiv Detail & Related papers (2022-01-18T17:03:25Z)
- Continual Learning for Text Classification with Information Disentanglement Based Regularization [18.258948837964724]
We propose an information disentanglement based regularization method for continual learning on text classification.
Experiments conducted on large-scale benchmarks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-04-12T14:17:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.