Auto-Instruct: Automatic Instruction Generation and Ranking for
Black-Box Language Models
- URL: http://arxiv.org/abs/2310.13127v1
- Date: Thu, 19 Oct 2023 19:52:55 GMT
- Title: Auto-Instruct: Automatic Instruction Generation and Ranking for
Black-Box Language Models
- Authors: Zhihan Zhang, Shuohang Wang, Wenhao Yu, Yichong Xu, Dan Iter, Qingkai
Zeng, Yang Liu, Chenguang Zhu, Meng Jiang
- Abstract summary: Large language models (LLMs) can perform a wide range of tasks by following natural language instructions.
We introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs.
In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) can perform a wide range of tasks by
following natural language instructions, without the need for task-specific
fine-tuning. Unfortunately, the performance of LLMs is heavily influenced by
the quality of these instructions, and manually writing effective instructions
for each task is a laborious and subjective process. In this paper, we
introduce Auto-Instruct, a novel method to automatically improve the quality of
instructions provided to LLMs. Our method leverages the inherent generative
ability of LLMs to produce diverse candidate instructions for a given task, and
then ranks them with a scoring model trained on 575 existing NLP tasks. In
experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both
human-written instructions and existing baselines of LLM-generated
instructions. Furthermore, the method generalizes well even to other LLMs that
were not involved in its training.
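The two-stage recipe the abstract describes (sample candidate instructions, then keep the top-ranked one) fits in a short sketch. This is a minimal illustration under stated assumptions, not the authors' released code: `generate` and `score` are hypothetical placeholders for an LLM completion call and the trained ranking model.

```python
from typing import Callable, List

def auto_instruct(
    task_examples: List[dict],
    generate: Callable[[str], str],              # hypothetical LLM completion call
    score: Callable[[str, List[dict]], float],   # hypothetical trained ranking model
    n_candidates: int = 8,
) -> str:
    """Sample diverse candidate instructions for a task, then return the
    candidate ranked highest by the scoring model."""
    # Show the model a few demonstrations and ask it to write an instruction.
    demos = "\n".join(
        f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in task_examples[:3]
    )
    meta_prompt = (
        "Here are examples of a task:\n" + demos +
        "\nWrite a clear instruction telling a model how to perform this task."
    )
    # Stage 1: generate diverse candidates (sampling with temperature > 0).
    candidates = [generate(meta_prompt) for _ in range(n_candidates)]
    # Stage 2: rank candidates with the scorer trained on existing NLP tasks.
    return max(candidates, key=lambda c: score(c, task_examples))
```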
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
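A minimal sketch of the self-synthesis loop described above, assuming hypothetical `student_generate` and `finetune` callables in place of the real student LLM and trainer:

```python
from typing import Callable, List, Tuple

def self_guide(
    task_instruction: str,
    seed_inputs: List[str],
    student_generate: Callable[[str], str],             # hypothetical student-LLM call
    finetune: Callable[[List[Tuple[str, str]]], None],  # hypothetical trainer
) -> None:
    """Synthesize input-output pairs from the student itself, then fine-tune
    the same student on that synthetic data (heavily condensed)."""
    synthetic: List[Tuple[str, str]] = []
    for seed in seed_inputs:
        # Step 1: ask the student to invent a new task input like the seed.
        new_input = student_generate(
            f"{task_instruction}\nWrite one more input similar to: {seed}"
        )
        # Step 2: ask the student to answer its own synthesized input.
        output = student_generate(f"{task_instruction}\nInput: {new_input}")
        synthetic.append((new_input, output))
    # Step 3: fine-tune the student on the self-synthesized pairs.
    finetune(synthetic)
```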
- MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs [47.94710556156627]
MIA-Bench is a benchmark designed to evaluate multimodal large language models (MLLMs) on their ability to strictly adhere to complex instructions.
Our benchmark comprises a diverse set of 400 image-prompt pairs, each crafted to challenge the models' compliance with layered instructions.
arXiv Detail & Related papers (2024-07-01T17:53:35Z)
- Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning [12.651588927599441]
Instruction tuning aims to align large language models with open-domain instructions and human-preferred responses.
We introduce Task-Aware Curriculum Planning for Instruction Refinement (TAPIR) to select instructions that are difficult for a student LLM to follow.
To balance against the student's capabilities, the task distribution of the training set is adjusted, and responses are automatically refined according to their corresponding tasks.
arXiv Detail & Related papers (2024-05-22T08:38:26Z)
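One plausible reading of the difficulty-based selection above, sketched with hypothetical `student_answer` and `judge` placeholders; the per-task quota keeps the training distribution balanced:

```python
from collections import defaultdict
from typing import Callable, Dict, List

def select_hard_instructions(
    samples: List[dict],                   # each: {"task", "instruction", "reference"}
    student_answer: Callable[[str], str],  # hypothetical student-LLM call
    judge: Callable[[str, str], float],    # hypothetical 0-1 response-quality judge
    per_task_quota: int = 100,
    difficulty_cutoff: float = 0.5,
) -> List[dict]:
    """Keep instructions the student follows poorly, capped per task so the
    training distribution stays balanced."""
    buckets: Dict[str, List[dict]] = defaultdict(list)
    for s in samples:
        quality = judge(student_answer(s["instruction"]), s["reference"])
        if quality < difficulty_cutoff:  # the student struggles with this one
            buckets[s["task"]].append({**s, "difficulty": 1.0 - quality})
    selected: List[dict] = []
    for items in buckets.values():
        items.sort(key=lambda x: x["difficulty"], reverse=True)
        selected.extend(items[:per_task_quota])  # balance across tasks
    return selected
```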
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of large language models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
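A rough sketch of the search-and-criticize loop: a greedy beam search stands in here for the MCTS component, and `propose` and `critic` are hypothetical LLM-backed callables, so this illustrates the idea rather than the paper's algorithm:

```python
from typing import Callable, List

def search_solution(
    question: str,
    propose: Callable[[str, str], List[str]],  # hypothetical: LLM proposes next steps
    critic: Callable[[str, str], float],       # hypothetical: scores a partial solution
    max_depth: int = 4,
    beam: int = 3,
) -> str:
    """Search over step-by-step solutions, keeping the critic's favorites at
    each depth; the best trajectories can then be reused as training data."""
    partials = [""]
    for _ in range(max_depth):
        # Expand every kept partial solution by one proposed step.
        expanded = [p + "\n" + step for p in partials for step in propose(question, p)]
        if not expanded:
            break
        # Keep only the branches the critic scores highest.
        expanded.sort(key=lambda sol: critic(question, sol), reverse=True)
        partials = expanded[:beam]
    return partials[0]
```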
- RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions [43.19966425619236]
We utilize code-style instructions, which are more structured and less ambiguous, in place of the typical natural language instructions.
Under few-shot scenarios, we propose a novel method to compose in-context demonstrations using both clean and adversarial samples.
Experiments on eight robustness datasets show that our method consistently outperforms prompting LLMs with natural language instructions.
arXiv Detail & Related papers (2024-02-26T09:30:55Z)
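For a sense of what a code-style instruction looks like, the snippet below recasts a natural-language classification instruction as a Python-style prompt; this rendering is illustrative, not taken from the paper:

```python
# A natural-language instruction such as
#   "Classify the sentiment of the sentence as positive or negative."
# becomes a more structured, less ambiguous code-style prompt:

def solve(sentence: str) -> str:
    """Return exactly one label for `sentence`."""
    labels = ["positive", "negative"]
    # 1. Read the sentence.
    # 2. Judge its overall sentiment.
    # 3. Output one item from `labels` and nothing else.
    ...
```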
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning [32.54921739100195]
We propose CoachLM, a novel approach that enhances the quality of instruction datasets by automatically revising the samples they contain.
CoachLM is trained from the samples revised by human experts and significantly increases the proportion of high-quality samples in the dataset from 17.7% to 78.9%.
Results show that CoachLM improves the instruction-following capabilities of the instruction-tuned LLM by an average of 29.9%.
arXiv Detail & Related papers (2023-11-22T09:04:57Z)
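A minimal sketch of the revise-rather-than-filter idea, assuming hypothetical `reviser` and `quality` models; the point is that weak samples are rewritten in place rather than dropped:

```python
from typing import Callable, List

def coach_revise(
    dataset: List[dict],                  # each: {"instruction", "response"}
    reviser: Callable[[str, str], dict],  # hypothetical expert-trained revision model
    quality: Callable[[dict], float],     # hypothetical 0-1 sample-quality scorer
    threshold: float = 0.5,
) -> List[dict]:
    """Rewrite low-quality samples instead of discarding them, so the dataset
    keeps its size while its quality rises."""
    cleaned: List[dict] = []
    for sample in dataset:
        if quality(sample) < threshold:
            # Let the reviser rewrite the weak instruction-response pair.
            sample = reviser(sample["instruction"], sample["response"])
        cleaned.append(sample)
    return cleaned
```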
- Learning to Plan with Natural Language [111.76828049344839]
Large Language Models (LLMs) have shown remarkable performance on various basic natural language tasks.
To complete complex tasks, however, a task plan is still needed to guide LLMs in generating specific solutions step by step.
We propose the Learning to Plan method, which involves two phases; in the first, plan-learning phase, the task plan is iteratively updated with new step-by-step solutions and behavioral instructions, obtained by prompting LLMs to derive them from training error feedback.
arXiv Detail & Related papers (2023-04-20T17:09:12Z)
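A condensed sketch of that first, plan-learning phase, with hypothetical `solve` and `revise_plan` callables standing in for the prompted LLM:

```python
from typing import Callable, List, Tuple

def learn_task_plan(
    train_set: List[Tuple[str, str]],              # (input, expected output) pairs
    solve: Callable[[str, str], str],              # hypothetical: LLM follows plan
    revise_plan: Callable[[str, List[str]], str],  # hypothetical: LLM edits the plan
    initial_plan: str = "Solve the task step by step.",
    rounds: int = 3,
) -> str:
    """Iteratively refine a natural-language task plan from training errors."""
    plan = initial_plan
    for _ in range(rounds):
        errors: List[str] = []
        for x, expected in train_set:
            prediction = solve(plan, x)
            if prediction != expected:  # collect training error feedback
                errors.append(f"input={x!r} got={prediction!r} expected={expected!r}")
        if not errors:
            break
        # Prompt the LLM to rewrite the plan in light of the collected errors.
        plan = revise_plan(plan, errors)
    return plan
```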
- Large Language Models Are Human-Level Prompt Engineers [31.98042013940282]
We propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection.
We show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness.
arXiv Detail & Related papers (2022-11-03T15:43:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.