Prompt Sketching for Large Language Models
- URL: http://arxiv.org/abs/2311.04954v1
- Date: Wed, 8 Nov 2023 18:57:23 GMT
- Title: Prompt Sketching for Large Language Models
- Authors: Luca Beurer-Kellner, Mark Niklas Müller, Marc Fischer, Martin Vechev
- Abstract summary: Recent prompting strategies for large language models (LLMs) query the model multiple times sequentially.
This leads to disconnected and undesirably wordy intermediate responses.
We propose prompt sketching, a new prompting paradigm in which an LLM responds not only by completing a prompt but also by predicting values for multiple variables in a template.
- Score: 7.687678490751105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many recent prompting strategies for large language models (LLMs) query the
model multiple times sequentially -- first to produce intermediate results and
then the final answer. However, using these methods, both decoder and model are
unaware of potential follow-up prompts, leading to disconnected and undesirably
wordy intermediate responses. In this work, we address this issue by proposing
prompt sketching, a new prompting paradigm in which an LLM responds not only
by completing a prompt but also by predicting values for multiple variables
in a template. This way, sketching grants users more control over the
generation process, e.g., by providing a reasoning framework via intermediate
instructions, leading to better overall results. The key idea enabling
sketching with existing, autoregressive models is to adapt the decoding
procedure to also score follow-up instructions during text generation, thus
optimizing overall template likelihood in inference. Our experiments show that
in a zero-shot setting, prompt sketching outperforms existing, sequential
prompting schemes such as direct asking or chain-of-thought on 7 out of 8 LLM
benchmarking tasks, including state tracking, arithmetic reasoning, and general
question answering. To facilitate future use, we release a number of generic,
yet effective sketches applicable to many tasks, and an open source library
called dclib, powering our sketch-aware decoders.
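
The key idea in the abstract, scoring follow-up instructions while a variable is being filled, can be illustrated with a small toy. The sketch below is a minimal illustration under stated assumptions: `toy_logprob` is a hand-written stand-in for an LLM's log-probabilities, the sketch representation, the candidate lists, and the `fill` function are hypothetical, and the one-step lookahead over a fixed candidate set is a simplification. It is not the paper's decoders and not the dclib API.

```python
from typing import Callable, Dict, List, Tuple

# A sketch alternates fixed template text and named variables (holes).
# Hypothetical representation for illustration only.
Sketch = List[Tuple[str, str]]  # ("text", literal) or ("var", name)

SKETCH: Sketch = [
    ("text", "Q: What is 2 + 3? Reasoning:"),
    ("var", "REASONING"),
    ("text", " Therefore, the answer is"),
    ("var", "ANSWER"),
]

# Hypothetical candidate values per variable (a real decoder would generate these).
CANDIDATES: Dict[str, List[str]] = {
    "REASONING": [" 2 plus 3 is 5, and note that", " 2 plus 3 is 5."],
    "ANSWER": [" 5", " 6"],
}

# Toy local preferences: the wordy, unfinished reasoning looks best in isolation.
LOCAL = {
    " 2 plus 3 is 5, and note that": -1.0,
    " 2 plus 3 is 5.": -1.5,
    " 5": -0.5,
    " 6": -2.0,
}


def toy_logprob(context: str, continuation: str) -> float:
    """Stand-in for log p(continuation | context); not a real model."""
    if continuation in LOCAL:
        return LOCAL[continuation]
    if not context:
        return 0.0
    # Fixed template text is likely after a finished sentence, unlikely mid-sentence.
    return -0.5 if context.rstrip().endswith((".", ":", "?")) else -3.0


def fill(
    sketch: Sketch,
    candidates: Dict[str, List[str]],
    logprob: Callable[[str, str], float],
    lookahead: bool,
) -> Tuple[Dict[str, str], float]:
    """Fill each variable in order and return (values, total template log-likelihood).

    With lookahead=False each variable is chosen by its own score only
    (sequential prompting). With lookahead=True the score also includes the
    fixed template text that follows the variable.
    """
    context, values, total = "", {}, 0.0
    for i, (kind, payload) in enumerate(sketch):
        if kind == "text":
            total += logprob(context, payload)
            context += payload
            continue
        has_follow = lookahead and i + 1 < len(sketch) and sketch[i + 1][0] == "text"
        follow = sketch[i + 1][1] if has_follow else ""

        def score(value: str) -> float:
            s = logprob(context, value)
            if follow:
                s += logprob(context + value, follow)
            return s

        best = max(candidates[payload], key=score)
        total += logprob(context, best)
        values[payload] = best
        context += best
    return values, total


if __name__ == "__main__":
    for mode in (False, True):
        chosen, total = fill(SKETCH, CANDIDATES, toy_logprob, lookahead=mode)
        print("lookahead" if mode else "sequential", chosen, total)
```

With these toy numbers, the sequential variant picks the wordy reasoning that leaves the follow-up instruction dangling mid-sentence, while the lookahead variant picks the concise reasoning and reaches a higher overall template likelihood.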
Related papers
- Graph-Structured Speculative Decoding [52.94367724136063]
Speculative decoding has emerged as a promising technique to accelerate the inference of Large Language Models.
We introduce an innovative approach utilizing a directed acyclic graph (DAG) to manage the drafted hypotheses.
We observe a remarkable speedup of 1.73$\times$ to 1.96$\times$, significantly surpassing standard speculative decoding.
arXiv Detail & Related papers (2024-07-23T06:21:24Z)
- Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- Monotonic Paraphrasing Improves Generalization of Language Model Prompting [42.74429247000797]
MonoPara is an end-to-end decoding strategy that paraphrases given prompts or instructions into their lower perplexity counterparts.
It does not require any training and can monotonically lower the perplexity of the paraphrased prompt or instruction.
It is also shown to effectively improve LMs' generalization on perturbed and unseen task instructions.
arXiv Detail & Related papers (2024-03-24T06:49:07Z)
- Meta-Task Prompting Elicits Embeddings from Large Language Models [54.757445048329735]
We introduce a new unsupervised text embedding method, Meta-Task Prompting with Explicit One-Word Limitation.
We generate high-quality sentence embeddings from Large Language Models without the need for model fine-tuning.
Our findings suggest a new scaling law, offering a versatile and resource-efficient approach for embedding generation across diverse scenarios.
arXiv Detail & Related papers (2024-02-28T16:35:52Z)
- Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens [15.566726645722657]
We propose a novel framework specifically designed for speculative sampling.
Within this framework, we introduce a lightweight draft model that effectively utilizes previously generated tokens to predict subsequent words.
We demonstrate impressive results, achieving an average latency speedup ratio of 2.7x compared to the vanilla auto-regressive decoding approach.
arXiv Detail & Related papers (2024-02-24T08:10:39Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- AutoHint: Automatic Prompt Optimization with Hint Generation [11.737818328656735]
This paper presents AutoHint, a novel framework for automatic prompt engineering and optimization for Large Language Models (LLMs).
We propose a framework to inherit the merits of both in-context learning and zero-shot learning by incorporating enriched instructions derived from input-output demonstrations to optimize the original prompt.
We refer to the enrichment as the hint and propose a framework to automatically generate the hint from labeled data.
arXiv Detail & Related papers (2023-07-13T00:49:27Z)
- Boosted Prompt Ensembles for Large Language Models [38.402161594793775]
Methods such as chain-of-thought prompting and self-consistency have pushed the frontier of language model reasoning performance with no additional training.
We propose a prompt ensembling method for large language models, which uses a small dataset to construct a set of few-shot prompts that together comprise a "boosted prompt ensemble".
We show that this outperforms single-prompt output-space ensembles and bagged prompt-space ensembles on the GSM8k and AQuA datasets.
arXiv Detail & Related papers (2023-04-12T16:47:15Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models [116.25562358482962]
State-of-the-art neural language models can be used to solve ad-hoc language tasks without the need for supervised training.
PromptIDE allows users to experiment with prompt variations, visualize prompt performance, and iteratively optimize prompts.
arXiv Detail & Related papers (2022-08-16T17:17:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.