Demystifying Prompts in Language Models via Perplexity Estimation
- URL: http://arxiv.org/abs/2212.04037v2
- Date: Thu, 12 Sep 2024 19:54:37 GMT
- Title: Demystifying Prompts in Language Models via Perplexity Estimation
- Authors: Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- Abstract summary: The performance of a prompt is coupled with the extent to which the model is familiar with the language it contains.
We show that the lower the perplexity of a prompt, the better it performs the task.
- Score: 109.59105230163041
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled with the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of a prompt, the better it performs the task. As a result, we devise a method for creating prompts: (1) automatically extend a small seed set of manually written prompts by paraphrasing with GPT3 and backtranslation, and (2) choose the lowest-perplexity prompts to get significant gains in performance.
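The selection step of this recipe is straightforward to sketch. The block below (a minimal sketch, not the authors' released code) scores a set of candidate prompts by language-model perplexity and keeps the lowest; the paper computes perplexity with the prompted model itself, whereas `gpt2` here is a lightweight stand-in, and the candidate prompts are hypothetical paraphrases of the kind step (1) would produce.

```python
# Minimal sketch of step (2): rank candidate prompts by LM perplexity
# and keep the lowest. The model choice (gpt2) and the candidate list
# are illustrative assumptions, not the paper's exact setup.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def perplexity(text: str) -> float:
    """Perplexity of `text`: exp of the mean next-token negative log-likelihood."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # cross-entropy loss over the shifted next-token predictions.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Hypothetical candidates, e.g. paraphrases of one seed prompt from step (1).
candidates = [
    "Translate English to French:",
    "Please translate the following sentence into French:",
    "English to French translation:",
]

# The paper's hypothesis: lower perplexity correlates with better performance.
ranked = sorted(candidates, key=perplexity)
print("lowest-perplexity prompt:", ranked[0])
```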
Related papers
- Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts [7.208567411886273]
Recent studies have demonstrated that natural-language prompts can help to leverage the knowledge learned by pre-trained language models for the binary sentence-level sentiment classification task.
This study aims to find high-quality prompts for the given task in a zero-shot setting.
We empirically demonstrate that the top-ranked prompts are high-quality and significantly outperform the base prompt and the prompts generated using few-shot learning for the binary sentence-level sentiment classification task.
arXiv Detail & Related papers (2023-05-25T03:36:43Z)
- Prompting Large Language Model for Machine Translation: A Case Study [87.88120385000666]
We offer a systematic study on prompting strategies for machine translation.
We examine factors for prompt template and demonstration example selection.
We explore the use of monolingual data and the feasibility of cross-lingual, cross-domain, and sentence-to-document transfer learning.
arXiv Detail & Related papers (2023-01-17T18:32:06Z)
- Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too? [84.91689960190054]
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts.
What factors make prompts effective remains underexplored, especially when the prompts are natural language.
arXiv Detail & Related papers (2022-12-20T18:47:13Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement Learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Do Prompts Solve NLP Tasks Using Natural Language? [18.611748762251494]
In this work, we empirically compare the three types of prompts under both few-shot and fully-supervised settings.
Our experimental results show that schema prompts are the most effective in general.
arXiv Detail & Related papers (2022-03-02T07:20:59Z)
- Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt [98.26682501616024]
We propose a novel model that uses a unified prompt for all languages, called UniPrompt.
The unified prompt is computed by a multilingual PLM to produce a language-independent representation.
Our proposed method significantly outperforms strong baselines across different languages.
arXiv Detail & Related papers (2022-02-23T11:57:52Z)
- Instance-aware Prompt Learning for Language Understanding and Generation [49.22899822734549]
We propose an instance-aware prompt learning method that learns a different prompt for each instance.
Our method achieves the state-of-the-art on the SuperGLUE few-shot learning benchmark.
arXiv Detail & Related papers (2022-01-18T17:03:25Z)
- Do Prompt-Based Models Really Understand the Meaning of their Prompts? [12.857580576554865]
We find that models learn just as fast with many prompts that are intentionally irrelevant or even pathologically misleading.
We find little evidence that suggests existing prompt-based models truly understand the meaning of their given prompts.
arXiv Detail & Related papers (2021-09-02T23:46:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.