InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural
Language Understanding
- URL: http://arxiv.org/abs/2306.04933v1
- Date: Thu, 8 Jun 2023 04:31:48 GMT
- Title: InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural
Language Understanding
- Authors: Junda Wu, Tong Yu, Rui Wang, Zhao Song, Ruiyi Zhang, Handong Zhao,
Chaochao Lu, Shuai Li, Ricardo Henao
- Abstract summary: We develop an information-theoretic framework that formulates soft prompt tuning as maximizing mutual information between prompts and other model parameters.
We show that InfoPrompt can significantly accelerate the convergence of prompt tuning and outperform traditional prompt tuning methods.
- Score: 51.48361798508375
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Soft prompt tuning achieves superior performance across a wide range of
few-shot tasks. However, the performance of prompt tuning can be highly
sensitive to the initialization of the prompts. We also empirically observe
that conventional prompt tuning methods cannot encode and learn sufficient
task-relevant information from prompt tokens. In this work, we develop an
information-theoretic framework that formulates soft prompt tuning as
maximizing mutual information between prompts and other model parameters (or
encoded representations). This novel view helps us to develop a more efficient,
accurate, and robust soft prompt tuning method, InfoPrompt. With this framework,
we develop two novel mutual-information-based loss functions to (i) discover
proper prompt initialization for the downstream tasks and learn sufficient
task-relevant information from prompt tokens and (ii) encourage the output
representation from the pretrained language model to be more aware of the
task-relevant information captured in the learnt prompt. Extensive experiments
validate that InfoPrompt can significantly accelerate the convergence of
prompt tuning and outperform traditional prompt tuning methods. Finally, we
provide a formal theoretical result showing that a gradient descent-type
algorithm can be used to train our mutual information loss.
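To make the mutual-information view concrete, below is a minimal sketch of how an InfoNCE-style lower bound on mutual information could be attached to soft prompt tuning as an auxiliary loss. This is an illustration under stated assumptions, not the paper's actual objectives or code: the specific bound, the pooling choices, the HuggingFace-style encoder interface (outputs.loss, outputs.hidden_states), and all names (InfoNCEHead, info_nce_loss, training_step, num_prompt_tokens, alpha) are hypothetical.

```python
# Sketch only: an InfoNCE-style MI term between the contextualized prompt-token
# outputs and the input-token outputs, added to the task loss. All names and
# interfaces are assumptions, not the paper's released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class InfoNCEHead(nn.Module):
    """Projects prompt-side and input-side representations into a shared space
    and scores every (prompt_i, input_j) pair in the batch."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.proj_p = nn.Linear(dim, hidden)
        self.proj_x = nn.Linear(dim, hidden)

    def forward(self, prompt_repr: torch.Tensor, input_repr: torch.Tensor) -> torch.Tensor:
        p = F.normalize(self.proj_p(prompt_repr), dim=-1)   # (B, H)
        x = F.normalize(self.proj_x(input_repr), dim=-1)    # (B, H)
        return p @ x.t()                                     # (B, B) pairwise scores


def info_nce_loss(scores: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Matched (prompt_i, input_i) pairs are positives, all other pairs in the
    batch are negatives; minimizing this maximizes an MI lower bound."""
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores / temperature, labels)


def training_step(encoder, mi_head, batch, num_prompt_tokens: int, alpha: float = 0.1):
    # Assumed interface: the soft prompt embeddings occupy the first
    # `num_prompt_tokens` positions of inputs_embeds, and the encoder returns
    # a task loss plus hidden states.
    outputs = encoder(inputs_embeds=batch["inputs_embeds"],
                      labels=batch["labels"],
                      output_hidden_states=True)
    hidden = outputs.hidden_states[-1]                        # (B, L, D)

    # Pool prompt-token outputs vs. input-token outputs; both vary per
    # instance, so the InfoNCE positives are well defined.
    prompt_repr = hidden[:, :num_prompt_tokens].mean(dim=1)  # (B, D)
    input_repr = hidden[:, num_prompt_tokens:].mean(dim=1)   # (B, D)

    mi_term = info_nce_loss(mi_head(prompt_repr, input_repr))
    return outputs.loss + alpha * mi_term
```

In this sketch the task loss and the MI term are simply combined with a weight alpha; how the actual InfoPrompt losses weight and estimate mutual information is specified in the paper itself.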
Related papers
- On the Role of Attention in Prompt-tuning [90.97555030446563]
We study prompt-tuning for one-layer attention architectures under contextual mixture models.
We show that softmax-prompt-attention is provably more expressive than softmax-self-attention and linear-prompt-attention.
We also provide experiments that verify our theoretical insights on real datasets and demonstrate how prompt-tuning enables the model to attend to context-relevant information.
arXiv Detail & Related papers (2023-06-06T06:23:38Z) - Gradient-Regulated Meta-Prompt Learning for Generalizable
Vision-Language Models [137.74524357614285]
We introduce a novel Gradient-RegulAted Meta-prompt learning framework.
It helps pre-trained models adapt to downstream tasks in a parameter- and data-efficient way.
GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way.
arXiv Detail & Related papers (2023-03-12T05:03:37Z) - Dynamic Prompting: A Unified Framework for Prompt Tuning [33.175097465669374]
We present a unified dynamic prompt (DP) tuning strategy that dynamically determines different factors of prompts based on specific tasks and instances.
Experimental results underscore the significant performance improvement achieved by dynamic prompt tuning across a wide range of tasks.
We establish the universal applicability of our approach under full-data, few-shot, and multitask scenarios.
arXiv Detail & Related papers (2023-03-06T06:04:46Z) - MetaPrompting: Learning to Learn Better Prompts [52.914694884515534]
We propose a new soft prompting method called MetaPrompting, which adopts the well-recognized model-agnostic meta-learning algorithm.
Extensive experiments show MetaPrompting brings significant improvement on four different datasets.
arXiv Detail & Related papers (2022-09-23T09:01:05Z) - Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model [39.722927180264584]
We propose a novel Dual-modality Prompt Tuning (DPT) paradigm through learning text and visual prompts simultaneously.
To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning scheme is proposed.
arXiv Detail & Related papers (2022-08-17T15:06:36Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization
for Relation Extraction [111.74812895391672]
We propose a Knowledge-aware Prompt-tuning approach with synergistic optimization (KnowPrompt).
We inject latent knowledge contained in relation labels into prompt construction with learnable virtual type words and answer words.
arXiv Detail & Related papers (2021-04-15T17:57:43Z)