WizardLM: Empowering Large Language Models to Follow Complex Instructions
- URL: http://arxiv.org/abs/2304.12244v2
- Date: Sat, 10 Jun 2023 13:18:25 GMT
- Title: WizardLM: Empowering Large Language Models to Follow Complex Instructions
- Authors: Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng,
Chongyang Tao, Daxin Jiang
- Abstract summary: We show an avenue for creating large amounts of instruction data with varying levels of complexity using LLM instead of humans.
We use our proposed Evol-Instruct to rewrite instructions step by step into more complex instructions.
Then, we mix all generated instruction data to fine-tune LLaMA.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training large language models (LLMs) with open-domain instruction following
data brings colossal success. However, manually creating such instruction data
is very time-consuming and labor-intensive. Moreover, humans may struggle to
produce high-complexity instructions. In this paper, we show an avenue for
creating large amounts of instruction data with varying levels of complexity
using LLM instead of humans. Starting with an initial set of instructions, we
use our proposed Evol-Instruct to rewrite them step by step into more complex
instructions. Then, we mix all generated instruction data to fine-tune LLaMA.
We call the resulting model WizardLM. Human evaluations on a
complexity-balanced test bed and Vicuna's test set show that instructions from
Evol-Instruct are superior to human-created ones. By analyzing the human
evaluation results of the high complexity part, we demonstrate that outputs
from our WizardLM are preferred to outputs from OpenAI ChatGPT. In GPT-4
automatic evaluation, WizardLM achieves more than 90% capacity of ChatGPT on
17 out of 29 skills. Even though WizardLM still lags behind ChatGPT in some
aspects, our findings suggest that fine-tuning with AI-evolved instructions is
a promising direction for enhancing LLMs. Our code and data are public at
https://github.com/nlpxucan/WizardLM
Related papers
- Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning
We introduce Task-Aware Curriculum Planning for Instruction Refinement (TAPIR)
TAPIR is a multi-round distillation framework with balanced task distributions and dynamic difficulty adjustment.
We rigorously evaluate TAPIR using two widely recognized benchmarks, including AlpacaEval 2.0 and MT-Bench.
arXiv Detail & Related papers (2024-05-22T08:38:26Z)
- CodecLM: Aligning Language Models with Tailored Synthetic Data
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z)
- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
In-context learning (ICL) techniques can train strong conversational agents with only a small amount of human supervision.
Here we explore the application of such techniques to language models that are much smaller (around 10B--40B parameters) and have permissive licenses.
We find the Self-Instruct approach to be less effective at these sizes and propose new ICL methods that draw on two main ideas.
arXiv Detail & Related papers (2023-10-21T10:21:17Z)
- Tuna: Instruction Tuning using Feedback from Large Language Models
We propose finetuning an instruction-tuned large language model using our novel probabilistic ranking and contextual ranking approaches.
Probabilistic ranking enables the instruction-tuned model to inherit the relative rankings of high-quality and low-quality responses from the teacher LLM.
On the other hand, learning with contextual ranking allows the model to refine its own response distribution using the contextual understanding ability of stronger LLMs.
arXiv Detail & Related papers (2023-10-20T09:55:06Z)
- Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions.
We introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs.
In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions.
arXiv Detail & Related papers (2023-10-19T19:52:55Z)
- Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
We introduce Ada-Instruct, an adaptive instruction generator developed by fine-tuning open-source LLMs.
We empirically validated Ada-Instruct's efficacy across different applications, including code completion, mathematical reasoning, and commonsense reasoning.
arXiv Detail & Related papers (2023-10-06T13:28:04Z)
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
We introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning.
Our model surpasses all other open-source Code LLMs by a substantial margin.
arXiv Detail & Related papers (2023-06-14T15:18:48Z)
- InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations.
We optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM.
Our results show that InstructZero outperforms SOTA auto-instruction methods across a variety of downstream tasks.
arXiv Detail & Related papers (2023-06-05T17:55:22Z)
- LongForm: Effective Instruction Tuning with Reverse Instructions
We introduce the LongForm-C dataset, which is created by reverse instructions.
First we select a diverse set of human-written documents from corpora such as C4 and Wikipedia.
We generate instructions via LLMs for human-written corpus examples using reverse instructions.
arXiv Detail & Related papers (2023-04-17T17:36:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.