Guiding Large Language Models via Directional Stimulus Prompting
- URL: http://arxiv.org/abs/2302.11520v4
- Date: Mon, 9 Oct 2023 21:01:22 GMT
- Title: Guiding Large Language Models via Directional Stimulus Prompting
- Authors: Zekun Li, Baolin Peng, Pengcheng He, Michel Galley, Jianfeng Gao,
Xifeng Yan
- Abstract summary: We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Directional Stimulus Prompting, a novel framework for guiding
black-box large language models (LLMs) toward specific desired outputs. Instead
of directly adjusting LLMs, our method employs a small tunable policy model
(e.g., T5) to generate an auxiliary directional stimulus prompt for each input
instance. These directional stimulus prompts act as nuanced, instance-specific
hints and clues to guide LLMs in generating desired outcomes, such as including
specific keywords in the generated summary. Our approach sidesteps the
challenges of direct LLM tuning by optimizing the policy model to explore
directional stimulus prompts that align LLMs with desired behaviors. The policy
model can be optimized through 1) supervised fine-tuning using labeled data and
2) reinforcement learning from offline or online rewards based on the LLM's
output. We assess our method across summarization, dialogue response
generation, and chain-of-thought reasoning tasks. Our experiments demonstrate
that the framework consistently improves LLMs' (e.g., ChatGPT, Codex,
InstructGPT) performance on these supervised tasks using minimal labeled data.
Notably, using just 80 dialogues on the MultiWOZ dataset, our approach enhances
ChatGPT's performance by an impressive 41.4%, matching or surpassing some fully
supervised state-of-the-art models. Additionally, the instance-specific
chain-of-thought prompt generated by our approach improves InstructGPT's
reasoning accuracy compared to human-crafted or automatically generated
prompts. The code and data are publicly available at
https://github.com/Leezekun/Directional-Stimulus-Prompting.
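
To make the pipeline concrete, here is a minimal inference-time sketch, assuming a small T5 policy checkpoint fine-tuned to emit hint keywords and a frozen black-box LLM behind an API; the checkpoint name, task prefix, and prompt template below are illustrative, not the paper's exact ones.

```python
# Sketch of Directional Stimulus Prompting at inference time.
# Assumptions: a T5 policy checkpoint fine-tuned to emit hint keywords
# ("t5-small" is a placeholder) and a frozen black-box LLM behind an API.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

POLICY = "t5-small"  # placeholder for the fine-tuned policy model
tok = AutoTokenizer.from_pretrained(POLICY)
policy = AutoModelForSeq2SeqLM.from_pretrained(POLICY)

def directional_stimulus(article: str) -> str:
    """Small policy model proposes instance-specific hint keywords."""
    ids = tok("extract keywords: " + article, return_tensors="pt",
              truncation=True, max_length=512)
    out = policy.generate(**ids, max_new_tokens=32)
    return tok.decode(out[0], skip_special_tokens=True)

def build_prompt(article: str, hint: str) -> str:
    """Compose the LLM prompt; the hint steers output toward the keywords."""
    return (f"Article: {article}\nHint (keywords): {hint}\n"
            "Write a short summary that covers the hinted keywords.")

article = "..."  # an input instance
prompt = build_prompt(article, directional_stimulus(article))
# `prompt` goes to the frozen LLM (e.g., ChatGPT). Only the policy model is
# trained: supervised fine-tuning on labeled hints, then RL with a reward
# computed from the LLM's output (e.g., ROUGE against a reference summary).
```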
Related papers
- Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization
Large language models (LLMs) can generate fluent summaries across domains using prompting techniques.
We show that adding keyphrases to prompts can improve ROUGE F1 and recall.
We introduce Keyphrase Signal Extractor (SigExt), a lightweight model that can be finetuned to extract salient keyphrases.
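
To make the evaluation concrete, here is a minimal scoring sketch using the real `rouge-score` package; the two candidate summaries are made-up placeholders standing in for outputs of a plain prompt versus a keyphrase-augmented prompt.

```python
# Minimal ROUGE comparison; candidate summaries are placeholders for the
# outputs of a plain prompt versus a keyphrase-augmented prompt.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
reference = "the central bank raised interest rates to curb inflation"
plain = "the bank changed its policy this week"
keyed = "the central bank raised rates to fight inflation"  # keyphrase-guided

for name, cand in [("plain", plain), ("with keyphrases", keyed)]:
    s = scorer.score(reference, cand)
    print(name, {k: round(v.fmeasure, 3) for k, v in s.items()})
```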
arXiv Detail & Related papers (2024-10-03T17:54:56Z)
- zsLLMCode: An Effective Approach for Functional Code Embedding via LLM with Zero-Shot Learning
Large language models (LLMs) have the capability of zero-shot learning, which does not require training or fine-tuning.
We propose zsLLMCode, a novel approach that generates functional code embeddings using LLMs.
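
One plausible instantiation, assuming the pipeline first asks an LLM for a zero-shot functional description of the code and then embeds that description: `describe_with_llm` is a hypothetical stub, and the sentence-transformers encoder is an assumed choice, not necessarily the paper's.

```python
# Sketch: zero-shot functional code embedding via an LLM-generated description.
# `describe_with_llm` is a hypothetical stand-in for an LLM call; the
# sentence-transformers encoder is an assumed choice.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def describe_with_llm(code: str) -> str:
    """Zero-shot: ask an LLM 'What does this function do?' (stubbed here)."""
    return "computes the factorial of n recursively"  # placeholder response

def code_embedding(code: str):
    return encoder.encode(describe_with_llm(code))

vec = code_embedding("def f(n): return 1 if n < 2 else n * f(n - 1)")
print(vec.shape)  # a fixed-size functional embedding, e.g. (384,)
```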
arXiv Detail & Related papers (2024-09-23T01:03:15Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
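
A rough sketch of the self-synthesis step, with `student_llm` as a hypothetical stand-in for sampling the student model; the JSON prompt format and the crude validity filter are illustrative, not the paper's exact mechanism.

```python
# Sketch of self-synthetic training-data generation (SELF-GUIDE-style).
# `student_llm` is a hypothetical stand-in for sampling the student model.
import json

def student_llm(prompt: str) -> str:
    return '{"input": "a sample input", "output": "a sample output"}'  # stub

def synthesize(task_instruction: str, n: int = 3) -> list:
    pairs = []
    for _ in range(n):
        raw = student_llm(
            f"Task: {task_instruction}\n"
            "Produce one JSON object with keys 'input' and 'output' "
            "that is a valid example of this task.")
        try:
            ex = json.loads(raw)
            if ex.get("input") and ex.get("output"):  # crude quality filter
                pairs.append(ex)
        except json.JSONDecodeError:
            continue  # discard malformed generations
    return pairs  # the same student model is then finetuned on these pairs

print(synthesize("Classify a movie review as positive or negative."))
```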
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
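
In spirit, DITTO treats each user demonstration as preferred over the model's own output for the same prompt and optimizes on those comparisons; the sketch below builds such preference pairs, with `model_sample` as a hypothetical stand-in and the optimizer step elided.

```python
# Sketch: turn user demonstrations into preference pairs (DITTO-style).
# `model_sample` is a hypothetical stand-in for sampling the current model.

def model_sample(prompt: str) -> str:
    return "a generic draft in the model's default style"  # placeholder

demos = [  # a handful of (prompt, user-written demonstration) pairs
    ("Write a two-sentence product update email.",
     "Quick heads-up: v2.1 ships tonight. Expect faster sync and fewer crashes."),
]

preference_pairs = [
    {"prompt": p, "chosen": demo, "rejected": model_sample(p)}
    for p, demo in demos
]
# These pairs can feed a DPO-style preference optimizer; iterating (resample,
# rebuild pairs, update) nudges outputs toward the demonstrated style.
print(preference_pairs[0]["chosen"])
```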
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
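
A minimal PyTorch sketch of the pluggable-virtual-token idea, assuming the usual prompt-tuning recipe (a few trainable embeddings prepended to a frozen backbone's input embeddings); all sizes are illustrative.

```python
# Sketch: trainable virtual tokens prepended to frozen input embeddings.
import torch
import torch.nn as nn

d_model, n_virtual, vocab = 768, 1, 1000  # illustrative ("one token can help")

frozen_embed = nn.Embedding(vocab, d_model)
frozen_embed.weight.requires_grad_(False)       # backbone stays frozen

virtual = nn.Parameter(torch.randn(n_virtual, d_model) * 0.02)  # only this trains

def embed_with_virtual(input_ids: torch.Tensor) -> torch.Tensor:
    """Prepend the virtual token(s) to each sequence's embeddings."""
    tok = frozen_embed(input_ids)                          # (B, T, d)
    v = virtual.unsqueeze(0).expand(tok.size(0), -1, -1)   # (B, n, d)
    return torch.cat([v, tok], dim=1)                      # (B, n+T, d)

x = torch.randint(0, vocab, (2, 5))
print(embed_with_virtual(x).shape)  # torch.Size([2, 6, 768])
```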
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
- CodecLM: Aligning Language Models with Tailored Synthetic Data
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data to improve LLMs' instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
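
The encode-decode idea can be sketched as two LLM calls: compress a seed instruction into keyword metadata, then expand that metadata into a new, tailored instruction. Here `llm` is a hypothetical stand-in and both prompts are illustrative.

```python
# Sketch of CodecLM-style encode/decode over instructions.
# `llm` is a hypothetical stand-in for a chat-model call.

def llm(prompt: str) -> str:
    return "use case: data analysis; skills: pandas, plotting"  # placeholder

def encode(seed_instruction: str) -> str:
    """Encode a seed instruction into concise keyword metadata."""
    return llm(f"Summarize as 'use case' and 'skills' keywords:\n{seed_instruction}")

def decode(metadata: str) -> str:
    """Decode metadata into a fresh, tailored instruction."""
    return llm(f"Write one challenging instruction matching:\n{metadata}")

seed = "Plot monthly sales from a CSV and highlight the best quarter."
synthetic = decode(encode(seed))
# CodecLM further refines candidates with Self-Rubrics and keeps only samples
# that pass Contrastive Filtering; both steps are omitted in this sketch.
print(synthetic)
```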
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- Prompt Highlighter: Interactive Control for Multi-Modal LLMs
This study targets a critical aspect of multi-modal LLM (LLM & VLM) inference: explicit, controllable text generation.
We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation.
We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
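
The core trick can be sketched as biasing pre-softmax attention scores toward highlighted key positions; the additive bias and masking scheme below are toy assumptions, not the paper's exact formulation.

```python
# Toy sketch: bias attention toward user-highlighted prompt tokens.
import torch

def highlighted_attention(scores: torch.Tensor,
                          highlight: torch.Tensor,
                          bias: float = 1.0) -> torch.Tensor:
    """scores: (..., q_len, k_len) pre-softmax attention logits.
    highlight: (k_len,) boolean mask of highlighted key positions."""
    return torch.softmax(scores + bias * highlight.float(), dim=-1)

scores = torch.zeros(1, 1, 4)                      # uniform logits over 4 keys
highlight = torch.tensor([False, True, True, False])
print(highlighted_attention(scores, highlight))    # mass shifts to keys 1 and 2
```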
arXiv Detail & Related papers (2023-12-07T13:53:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.