Contrastive Demonstration Tuning for Pre-trained Language Models
- URL: http://arxiv.org/abs/2204.04392v4
- Date: Tue, 19 Sep 2023 12:27:36 GMT
- Title: Contrastive Demonstration Tuning for Pre-trained Language Models
- Authors: Xiaozhuan Liang, Ningyu Zhang, Siyuan Cheng, Zhenru Zhang, Chuanqi Tan, Huajun Chen
- Abstract summary: Demonstration examples are crucial for an excellent final performance of prompt-tuning.
The proposed approach can be: (i) Plugged into any previous prompt-tuning approaches; (ii) Extended to widespread classification tasks with a large number of categories.
Experimental results on 16 datasets illustrate that our method integrated with previous approaches LM-BFF and P-tuning can yield better performance.
- Score: 59.90340768724675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained language models can be effectively stimulated by textual prompts
or demonstrations, especially in low-data scenarios. Recent works have focused
on automatically searching discrete or continuous prompts or optimized
verbalizers, yet studies for the demonstration are still limited. Concretely,
the demonstration examples are crucial for an excellent final performance of
prompt-tuning. In this paper, we propose a novel pluggable, extensible, and
efficient approach named contrastive demonstration tuning, which is free of
demonstration sampling. Furthermore, the proposed approach can be: (i) Plugged
into any previous prompt-tuning approaches; (ii) Extended to widespread
classification tasks with a large number of categories. Experimental results on
16 datasets illustrate that our method integrated with previous approaches
LM-BFF and P-tuning can yield better performance. Code is available at
https://github.com/zjunlp/PromptKG/tree/main/research/Demo-Tuning.
Related papers
- DemoRank: Selecting Effective Demonstrations for Large Language Models in Ranking Task [24.780407347867943]
This paper explores how to select appropriate in-context demonstrations for the passage ranking task.
We propose DemoRank, a demonstration selection framework for the ranking task.
arXiv Detail & Related papers (2024-06-24T06:10:13Z)
- Pattern-Aware Chain-of-Thought Prompting in Large Language Models [26.641713417293538]
Chain-of-thought (CoT) prompting can guide language models to engage in complex multi-step reasoning.
We show that the underlying reasoning patterns play a more crucial role in such tasks.
We propose Pattern-Aware CoT, a prompting method that considers the diversity of demonstration patterns.
arXiv Detail & Related papers (2024-04-23T07:50:00Z)
- In-context Learning with Retrieved Demonstrations for Language Models: A Survey [23.24271704145876]
Few-shot in-context learners (ICL) are adept at adapting to new tasks with just a few demonstrations in the input context.
Instead of using a fixed set of demonstrations, one recent development is to retrieve demonstrations tailored to each input query.
We discuss and compare different design choices for retrieval models, retrieval training procedures, and inference algorithms.
arXiv Detail & Related papers (2024-01-21T23:34:42Z)
- Exploring Lottery Prompts for Pre-trained Language Models [46.66885465183664]
We explore instance-level prompts and their generalizability.
We find that for every instance, there is almost always a lottery prompt that induces the correct prediction from the PLM.
Some strong lottery prompts have high performance over the whole training set.
arXiv Detail & Related papers (2023-05-31T02:17:04Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a paradigm should be done in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
- In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks.
We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z)
- Dr.ICL: Demonstration-Retrieved In-context Learning [29.142262267850704]
In-context learning (ICL), which teaches a large language model to perform a task with few-shot demonstrations, has emerged as a strong paradigm for using LLMs.
Recent research suggests that retrieving semantically similar demonstrations to the input from a pool of available demonstrations results in better performance.
This work expands the applicability of retrieval-based ICL approaches by demonstrating that even simple word-overlap similarity measures such as BM25 outperform randomly selected demonstrations.
arXiv Detail & Related papers (2023-05-23T14:55:25Z)
- Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating pretrained language models' ability in limited-data scenarios.
Why such demonstrations are beneficial for the learning process remains unclear since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
- Prompt Tuning for Generative Multimodal Pretrained Models [75.44457974275154]
We implement prompt tuning on the unified sequence-to-sequence pretrained model adaptive to both understanding and generation tasks.
Experimental results demonstrate that the light-weight prompt tuning can achieve comparable performance with finetuning.
In comparison with finetuned models, the prompt-tuned models demonstrate improved robustness against adversarial attacks.
arXiv Detail & Related papers (2022-08-04T08:56:38Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.