Why is prompting hard? Understanding prompts on binary sequence predictors
- URL: http://arxiv.org/abs/2502.10760v1
- Date: Sat, 15 Feb 2025 10:55:47 GMT
- Title: Why is prompting hard? Understanding prompts on binary sequence predictors
- Authors: Li Kevin Wenliang, Anian Ruoss, Jordi Grau-Moya, Marcus Hutter, Tim Genewein
- Abstract summary: Large language models (LLMs) can be prompted to do many tasks.
Finding good prompts is not always easy, nor is understanding some performant prompts.
- Score: 19.855572748273236
- Abstract: Large language models (LLMs) can be prompted to do many tasks, but finding good prompts is not always easy, nor is understanding some performant prompts. We explore these issues by viewing prompting as conditioning a near-optimal sequence predictor (LLM) pretrained on diverse data sources. Through numerous prompt search experiments, we show that the unintuitive patterns in optimal prompts can be better understood given the pretraining distribution, which is often unavailable in practice. Moreover, even using exhaustive search, reliably identifying optimal prompts from practical neural predictors can be difficult. Further, we demonstrate that common prompting methods, such as using intuitive prompts or samples from the targeted task, are in fact suboptimal. Thus, this work takes an initial step towards understanding the difficulties in finding and understanding optimal prompts from a statistical and empirical perspective.
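The abstract's view of prompting as conditioning a sequence predictor, combined with exhaustive prompt search, can be illustrated with a toy sketch. This is not the paper's experimental setup: the mixture of three Bernoulli coins, the names `BIASES`, `PRIOR`, `predict_one`, `expected_log_loss`, and `best_prompt`, and the single-next-bit loss are all assumptions chosen for illustration. The predictor here is the exact Bayes-optimal one for the assumed pretraining distribution, so it stands in for a "near-optimal" pretrained model.

```python
import itertools
import math

# Hypothetical pretraining distribution: a uniform mixture of Bernoulli coins.
BIASES = [0.2, 0.5, 0.8]
PRIOR = [1 / 3, 1 / 3, 1 / 3]


def posterior(prompt):
    """Posterior over the coins after conditioning on a binary prompt."""
    ones = sum(prompt)
    zeros = len(prompt) - ones
    weights = [p * (b ** ones) * ((1 - b) ** zeros) for p, b in zip(PRIOR, BIASES)]
    total = sum(weights)
    return [w / total for w in weights]


def predict_one(prompt):
    """Probability that the next bit is 1 under the Bayes-optimal predictor."""
    return sum(w * b for w, b in zip(posterior(prompt), BIASES))


def expected_log_loss(prompt, target_bias):
    """Expected next-bit log loss (nats) when the next bit ~ Bernoulli(target_bias)."""
    q = predict_one(prompt)
    return -(target_bias * math.log(q) + (1 - target_bias) * math.log(1 - q))


def best_prompt(length, target_bias):
    """Exhaustively search all binary prompts of the given length."""
    candidates = itertools.product([0, 1], repeat=length)
    return min(candidates, key=lambda p: expected_log_loss(p, target_bias))


if __name__ == "__main__":
    print(best_prompt(4, 0.8))
```

In this toy setting the search for a target coin with bias 0.8 selects the all-ones prompt, even though a typical length-4 sample from that coin would contain a zero. That mirrors, in miniature, the abstract's point that samples from the targeted task need not be optimal prompts, and that the optimum depends on the pretraining distribution.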
Related papers
- Exploring Task-Level Optimal Prompts for Visual In-Context Learning [20.34945396590862]
We propose task-level prompting to reduce the cost of searching for prompts during the inference stage.
We show that our proposed method can identify near-optimal prompts and reach the best VICL performance with a minimal cost.
arXiv Detail & Related papers (2025-01-15T14:52:20Z)
- CAPrompt: Cyclic Prompt Aggregation for Pre-Trained Model Based Class Incremental Learning [12.249938312431993]
We propose a novel Cyclic Prompt Aggregation (CAPrompt) method to eliminate the dependency on task ID prediction.
Under concave conditions, the aggregated prompt achieves lower error compared to selecting a single task-specific prompt.
Our proposed CAPrompt outperforms state-of-the-art methods by 2%-3%.
arXiv Detail & Related papers (2024-12-12T04:34:28Z)
- Generalizable Prompt Tuning for Vision-Language Models [3.1008306011364644]
Learnable soft prompts often perform well in downstream tasks but lack generalizability.
The study shows that by treating soft and hand-crafted prompts as dual views of the textual modality, we can better ensemble task-specific and general semantic information.
To generate more expressive prompts, the study introduces a class-wise augmentation from the visual modality, resulting in significant robustness to a wider range of unseen classes.
arXiv Detail & Related papers (2024-10-04T07:02:13Z)
- Task Facet Learning: A Structured Approach to Prompt Optimization [14.223730629357178]
We propose an algorithm that learns multiple facets of a task from a set of training examples.
The resulting algorithm, UniPrompt, consists of a generative model to generate initial candidates for each prompt section.
Empirical evaluation on multiple datasets and a real-world task shows that prompts generated using UniPrompt obtain higher accuracy than human-tuned prompts.
arXiv Detail & Related papers (2024-06-15T04:54:26Z)
- Plum: Prompt Learning using Metaheuristic [28.024094195968672]
We introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options.
Within our paradigm, we test six typical methods, demonstrating their effectiveness in white-box and black-box prompt learning.
We show that these methods can be used to discover more human-understandable prompts, opening the door to a cornucopia of possibilities in prompt optimization.
arXiv Detail & Related papers (2023-11-14T18:14:56Z)
- InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding [51.48361798508375]
We develop an information-theoretic framework that formulates soft prompt tuning as maximizing mutual information between prompts and other model parameters.
We show that InfoPrompt can significantly accelerate the convergence of the prompt tuning and outperform traditional prompt tuning methods.
arXiv Detail & Related papers (2023-06-08T04:31:48Z)
- Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too? [84.91689960190054]
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts.
It is underexplored what factors make the prompts effective, especially when the prompts are natural language.
arXiv Detail & Related papers (2022-12-20T18:47:13Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
- MetaPrompting: Learning to Learn Better Prompts [52.914694884515534]
We propose a new soft prompting method called MetaPrompting, which adopts the well-recognized model-agnostic meta-learning algorithm.
Extensive experiments show MetaPrompting brings significant improvement on four different datasets.
arXiv Detail & Related papers (2022-09-23T09:01:05Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
- Least-to-Most Prompting Enables Complex Reasoning in Large Language Models [52.59923418570378]
We propose a novel prompting strategy, least-to-most prompting, to overcome the challenge of easy-to-hard generalization.
We show that least-to-most prompting is capable of generalizing to more difficult problems than those seen in prompts.
Neural-symbolic models in the literature that specialize in solving SCAN are trained on the entire training set containing over 15,000 examples.
arXiv Detail & Related papers (2022-05-21T15:34:53Z)