LongForm: Effective Instruction Tuning with Reverse Instructions
- URL: http://arxiv.org/abs/2304.08460v2
- Date: Wed, 14 Feb 2024 18:00:33 GMT
- Title: LongForm: Effective Instruction Tuning with Reverse Instructions
- Authors: Abdullatif Köksal, Timo Schick, Anna Korhonen, Hinrich Schütze
- Abstract summary: We introduce the LongForm-C dataset, created via reverse instructions:
we select a diverse set of human-written documents from corpora such as C4 and Wikipedia,
then generate instructions for those documents via LLMs.
- Score: 43.7029933201002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instruction tuning enables language models to more effectively generalize and
better follow user intent. However, obtaining instruction data is costly and
challenging. Prior work employs methods such as expensive human annotation,
crowd-sourced datasets with alignment issues, and generating noisy examples via
LLMs. We introduce the LongForm-C dataset, created via reverse instructions: we
first select a diverse set of human-written documents from corpora such as C4
and Wikipedia, then use LLMs to generate instructions for these documents. This
approach yields a cheaper and cleaner instruction-tuning dataset with natural
outputs, well suited to long text generation. Our models outperform 10x larger
language models without
instruction tuning on tasks such as story/recipe generation and long-form
question answering. Moreover, LongForm models outperform prior
instruction-tuned models such as FLAN-T5 and Alpaca by a large margin, and
improve language understanding capabilities further. Finally, our models can
effectively follow and answer multilingual instructions; we demonstrate this
for news generation. We publicly release our data and models:
https://github.com/akoksal/LongForm.
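As a rough illustration of the reverse-instructions idea (not the authors' exact pipeline; the prompt wording and the `llm` callable below are hypothetical stand-ins), one asks an LLM to write the instruction that a given human-written document would answer, then stores the (instruction, document) pair as a training example:

```python
# Minimal sketch of reverse instructions. `llm` stands in for any
# text-generation backend (API client or local model); its interface
# here is an assumption, not the paper's released code.
from typing import Callable, Dict, List

REVERSE_PROMPT = (
    "Below is a text written by a human. Write the instruction that this "
    "text would be a good answer to.\n\nText:\n{document}\n\nInstruction:"
)

def reverse_instruction(document: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Turn one corpus document into an (instruction, output) training pair."""
    instruction = llm(REVERSE_PROMPT.format(document=document)).strip()
    # The human-written document becomes the *target*, so outputs stay natural.
    return {"instruction": instruction, "output": document}

def build_dataset(documents: List[str], llm: Callable[[str], str]) -> List[Dict[str, str]]:
    # `documents` would be sampled from corpora such as C4 and Wikipedia.
    return [reverse_instruction(doc, llm) for doc in documents]
```

The key design choice is that only the short instruction is model-generated, while the long human-written document serves as the output, which is what keeps the dataset comparatively clean and suitable for long-form generation.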
Related papers
- Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates [57.29125360837203]
Cookbook is a framework that generates training data consisting of simple patterns over random tokens.
We find that finetuning on Cookbook-generated data is able to improve performance on its corresponding task by up to 52.7 accuracy points.
arXiv Detail & Related papers (2024-10-07T17:29:40Z)
- Self-Alignment with Instruction Backtranslation [162.02529653768096]
We present a method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.
Our approach, named instruction backtranslation, starts with a language model finetuned on a small amount of seed data, and a given web corpus.
arXiv Detail & Related papers (2023-08-11T17:47:54Z)
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct [67.24653703564492]
We introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning.
Our model surpasses all other open-source Code LLMs by a substantial margin.
arXiv Detail & Related papers (2023-06-14T15:18:48Z)
- Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation [92.2167864437497]
We propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data.
Based on the metadata of existing datasets, we use LLMs to automatically construct instruction-tuning data by identifying relevant data fields and generating appropriate instructions.
By leveraging the existing annotated datasets, Dynosaur offers several advantages: 1) it reduces the API cost for generating instructions; 2) it provides high-quality data for instruction tuning; and 3) it supports the continuous improvement of models by generating instruction-tuning data when a new annotated dataset becomes available.
arXiv Detail & Related papers (2023-05-23T17:56:26Z)
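To make the metadata-driven idea in the Dynosaur entry above concrete, here is a toy sketch (the field names, prompt, and JSON contract are illustrative assumptions, not Dynosaur's actual implementation): the LLM is queried once per dataset to pick input/output fields and write an instruction, which is then applied to every record without further API calls:

```python
# Toy sketch of metadata-driven instruction curation. `llm` is any
# prompt-in/text-out callable; the JSON reply format is an assumption.
import json
from typing import Callable, Dict, List

META_PROMPT = (
    "Dataset description: {description}\nFields: {fields}\n"
    "Choose an input field and an output field, and write a task "
    "instruction. Reply as JSON with keys 'input', 'output', 'instruction'."
)

def curate(metadata: Dict, records: List[Dict], llm: Callable[[str], str]) -> List[Dict]:
    # One LLM call per *dataset*, not per example, which keeps API cost low.
    plan = json.loads(llm(META_PROMPT.format(
        description=metadata["description"],
        fields=", ".join(metadata["fields"]),
    )))
    return [
        {
            "instruction": plan["instruction"],
            "input": rec[plan["input"]],
            "output": rec[plan["output"]],
        }
        for rec in records
    ]
```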
- Self-Instruct: Aligning Language Models with Self-Generated Instructions [76.42871502364697]
Self-Instruct is a framework for improving the instruction-following capabilities of pretrained language models.
Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model.
For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin.
arXiv Detail & Related papers (2022-12-20T18:59:19Z)
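The "filters invalid or similar ones" step in the Self-Instruct entry above lends itself to a short sketch. Self-Instruct-style pipelines typically compare each candidate against the existing pool with ROUGE-L; the version below uses difflib's ratio only as a dependency-free stand-in, and the 0.7 cutoff is illustrative:

```python
# Sketch of the dedup/validity filter in a Self-Instruct-style pipeline.
# Real implementations compare ROUGE-L against the instruction pool;
# SequenceMatcher.ratio() is a stand-in and 0.7 is an illustrative cutoff.
from difflib import SequenceMatcher
from typing import List

def keep_novel(candidates: List[str], pool: List[str], cutoff: float = 0.7) -> List[str]:
    kept = list(pool)
    for cand in candidates:
        text = cand.strip()
        if not text:           # drop empty/invalid generations
            continue
        if all(SequenceMatcher(None, text, seen).ratio() < cutoff for seen in kept):
            kept.append(text)  # novel enough: add to the growing pool
    return kept[len(pool):]    # return only the newly accepted instructions
```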