InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
- URL: http://arxiv.org/abs/2306.03082v2
- Date: Tue, 8 Aug 2023 17:33:54 GMT
- Authors: Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou
- Abstract summary: Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations.
We optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM.
Our results show that InstructZero outperforms SOTA auto-instruction methods across a variety of downstream tasks.
- Score: 117.92988284226765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are instruction followers, but it can be
challenging to find the best instruction for different situations, especially
for black-box LLMs on which backpropagation is forbidden. Instead of directly
optimizing the discrete instruction, we optimize a low-dimensional soft prompt
applied to an open-source LLM to generate the instruction for the black-box
LLM. On each iteration of the proposed method, which we call InstructZero, a
soft prompt is converted into an instruction using the open-source LLM, which
is then submitted to the black-box LLM for zero-shot evaluation, and the
performance is sent to Bayesian optimization to produce new soft prompts
improving the zero-shot performance. We evaluate InstructZero on different
combinations of open-source LLMs and APIs including Vicuna and ChatGPT. Our
results show that InstructZero outperforms SOTA auto-instruction methods across
a variety of downstream tasks. Our code and data are publicly available at
https://github.com/Lichang-Chen/InstructZero.
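
A minimal sketch of the iteration described above, assuming hypothetical `open_llm_generate`, `blackbox_llm_score`, and Bayesian-optimization (`bo`) interfaces; the paper's actual implementation lives in the linked repository:

```python
# Illustrative assumptions, not the paper's actual API:
#   open_llm_generate(soft_prompt) -> instruction text (e.g., via Vicuna)
#   blackbox_llm_score(instruction) -> zero-shot task score (e.g., via the ChatGPT API)
#   bo.propose() / bo.update(x, y) -> a generic Bayesian-optimization interface

def instructzero_loop(open_llm_generate, blackbox_llm_score, bo, n_iters=20):
    """Optimize a low-dimensional soft prompt whose decoded instruction
    maximizes the black-box LLM's zero-shot score."""
    best_instruction, best_score = None, float("-inf")
    for _ in range(n_iters):
        soft_prompt = bo.propose()                    # candidate low-dim soft prompt
        instruction = open_llm_generate(soft_prompt)  # decode to text with the open-source LLM
        score = blackbox_llm_score(instruction)       # zero-shot evaluation on the black-box LLM
        bo.update(soft_prompt, score)                 # feed the score back to Bayesian optimization
        if score > best_score:
            best_instruction, best_score = instruction, score
    return best_instruction, best_score
```

Note that only the soft prompt is updated; neither model's weights are touched, which is why the method remains viable for API-only black-box LLMs.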
Related papers
- LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints [86.59857711385833]
We introduce RealInstruct, the first benchmark designed to evaluate LLMs' ability to follow real-world multi-constrained instructions.
To address the performance gap between open-source and proprietary models, we propose the Decompose, Critique and Refine (DeCRIM) self-correction pipeline.
Our results show that DeCRIM improves Mistral's performance by 7.3% on RealInstruct and 8.0% on IFEval even with weak feedback; a sketch of such a loop appears after this list.
arXiv Detail & Related papers (2024-10-09T01:25:10Z)
- Optimising Hard Prompts with Few-Shot Meta-Prompting [0.0]
Contextual prompts include context, in the form of a document or dialogue, along with the natural language instructions to the Large Language Model (LLM).
With the context masked, such a prompt acts as a template.
In this paper, we present an iterative method to generate better templates using an LLM from an existing set of prompt templates without revealing the context to the LLM.
arXiv Detail & Related papers (2024-07-09T07:02:57Z)
- InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct [43.7550233177368]
We propose INVERSE-INSTRUCT, which summarizes instructions from code snippets instead of the reverse.
We present a series of code LLMs named InverseCoder, which surpasses the performance of the original code LLMs on a wide range of benchmarks.
arXiv Detail & Related papers (2024-07-08T08:00:05Z)
- CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions [43.19966425619236]
We utilize instructions in code style, which are more structured and less ambiguous, to replace typical natural language instructions.
Under few-shot scenarios, we propose a novel method to compose in-context demonstrations using both clean and adversarial samples.
Experiments on eight robustness datasets show that our method consistently outperforms prompting LLMs with natural language instructions.
arXiv Detail & Related papers (2024-02-26T09:30:55Z)
- Tuna: Instruction Tuning using Feedback from Large Language Models [74.04950416204551]
We propose finetuning an instruction-tuned large language model using our novel probabilistic ranking and contextual ranking approaches.
Probabilistic ranking enables the instruction-tuned model to inherit the relative rankings of high-quality and low-quality responses from the teacher LLM.
On the other hand, learning with contextual ranking allows the model to refine its own response distribution using the contextual understanding ability of stronger LLMs.
arXiv Detail & Related papers (2023-10-20T09:55:06Z)
- LLMRec: Benchmarking Large Language Models on Recommendation Task [54.48899723591296]
The application of Large Language Models (LLMs) in the recommendation domain has not been thoroughly investigated.
We benchmark several popular off-the-shelf LLMs on five recommendation tasks, including rating prediction, sequential recommendation, direct recommendation, explanation generation, and review summarization.
The benchmark results indicate that LLMs displayed only moderate proficiency in accuracy-based tasks such as sequential and direct recommendation.
arXiv Detail & Related papers (2023-08-23T16:32:54Z)
- Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following [44.701091969256055]
We present our finding that prepending a Task-Agnostic Prefix Prompt (TAPP) to the input improves the instruction-following ability of various Large Language Models (LLMs) during inference.
We observe that both base LLMs (i.e., not fine-tuned to follow instructions) and instruction-tuned models benefit from TAPP, with average improvements of 34.58% and 12.26%, respectively.
arXiv Detail & Related papers (2023-02-28T16:06:35Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
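
As referenced in the DeCRIM entry above, here is a minimal sketch of how a decompose/critique/refine loop might look. The prompts and the single text-in/text-out `llm` callable are assumptions based only on the one-line summary, not the paper's actual pipeline:

```python
from typing import Callable

# Assumed interface: a text-in/text-out wrapper around an LLM API call.
LLM = Callable[[str], str]

def decrim(llm: LLM, instruction: str, max_rounds: int = 3) -> str:
    """Sketch of a Decompose, Critique, Refine self-correction loop for
    instructions that carry multiple constraints."""
    # Decompose: extract the individual constraints from the instruction.
    constraints = llm(
        "List each constraint in this instruction, one per line:\n" + instruction
    )
    response = llm(instruction)  # initial attempt
    for _ in range(max_rounds):
        # Critique: check the response against every constraint.
        critique = llm(
            "Which of these constraints does the response violate? "
            "Answer 'none' if all are satisfied.\n"
            f"Constraints:\n{constraints}\nResponse:\n{response}"
        )
        if critique.strip().lower().startswith("none"):
            break  # all constraints satisfied
        # Refine: rewrite the response using the critique as feedback.
        response = llm(
            f"Instruction:\n{instruction}\nResponse:\n{response}\n"
            f"Critique:\n{critique}\nRewrite the response so it satisfies all constraints."
        )
    return response
```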
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.