LongForm: Effective Instruction Tuning with Reverse Instructions
- URL: http://arxiv.org/abs/2304.08460v2
- Date: Wed, 14 Feb 2024 18:00:33 GMT
- Title: LongForm: Effective Instruction Tuning with Reverse Instructions
- Authors: Abdullatif Köksal, Timo Schick, Anna Korhonen, Hinrich Schütze
- Abstract summary: We introduce the LongForm-C dataset, created via reverse instructions:
we select a diverse set of human-written documents from corpora such as C4 and Wikipedia,
then generate instructions for those documents via LLMs.
- Score: 43.7029933201002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instruction tuning enables language models to more effectively generalize and
better follow user intent. However, obtaining instruction data is costly and
challenging. Prior work employs methods such as expensive human annotation,
crowd-sourced datasets with alignment issues, and generating noisy examples via
LLMs. We introduce the LongForm-C dataset, created via reverse instructions: we
first select a diverse set of human-written documents from corpora such as C4
and Wikipedia, then use LLMs to generate instructions for these documents. This
approach yields a cheaper and cleaner instruction-tuning dataset with natural
outputs, well suited to long text generation. Our models outperform 10x larger
language models without
instruction tuning on tasks such as story/recipe generation and long-form
question answering. Moreover, LongForm models outperform prior
instruction-tuned models such as FLAN-T5 and Alpaca by a large margin, and
improve language understanding capabilities further. Finally, our models can
effectively follow and answer multilingual instructions; we demonstrate this
for news generation. We publicly release our data and models:
https://github.com/akoksal/LongForm.
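As a rough illustration of the reverse-instructions idea (not the authors' exact pipeline; the prompt wording and the `llm` callable below are hypothetical stand-ins), one asks an LLM to write the instruction that a given human-written document would answer, then stores the (instruction, document) pair as a training example:

```python
# Minimal sketch of reverse instructions. `llm` stands in for any
# text-generation backend (API client or local model); its interface
# here is an assumption, not the paper's released code.
from typing import Callable, Dict, List

REVERSE_PROMPT = (
    "Below is a text written by a human. Write the instruction that this "
    "text would be a good answer to.\n\nText:\n{document}\n\nInstruction:"
)

def reverse_instruction(document: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Turn one corpus document into an (instruction, output) training pair."""
    instruction = llm(REVERSE_PROMPT.format(document=document)).strip()
    # The human-written document becomes the *target*, so outputs stay natural.
    return {"instruction": instruction, "output": document}

def build_dataset(documents: List[str], llm: Callable[[str], str]) -> List[Dict[str, str]]:
    # `documents` would be sampled from corpora such as C4 and Wikipedia.
    return [reverse_instruction(doc, llm) for doc in documents]
```

The key design choice is that only the short instruction is model-generated, while the long human-written document serves as the output, which is what keeps the dataset comparatively clean and suitable for long-form generation.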
Related papers
- Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates [57.29125360837203]
Cookbook is a framework that generates training data consisting of simple patterns over random tokens.
We find that finetuning on Cookbook-generated data is able to improve performance on its corresponding task by up to 52.7 accuracy points.
arXiv Detail & Related papers (2024-10-07T17:29:40Z)
- Self-Alignment with Instruction Backtranslation [162.02529653768096]
We present a method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.
Our approach, named instruction backtranslation, starts with a language model finetuned on a small amount of seed data, and a given web corpus.
arXiv Detail & Related papers (2023-08-11T17:47:54Z)
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct [67.24653703564492]
We introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning.
Our model surpasses all other open-source Code LLMs by a substantial margin.
arXiv Detail & Related papers (2023-06-14T15:18:48Z)
- Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation [92.2167864437497]
We propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data.
Based on the metadata of existing datasets, we use LLMs to automatically construct instruction-tuning data by identifying relevant data fields and generating appropriate instructions.
By leveraging the existing annotated datasets, Dynosaur offers several advantages: 1) it reduces the API cost for generating instructions; 2) it provides high-quality data for instruction tuning; and 3) it supports the continuous improvement of models by generating instruction-tuning data when a new annotated dataset becomes available.
arXiv Detail & Related papers (2023-05-23T17:56:26Z)
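To make the metadata-driven idea in the Dynosaur entry above concrete, here is a toy sketch (the field names, prompt, and JSON contract are illustrative assumptions, not Dynosaur's actual implementation): the LLM is queried once per dataset to pick input/output fields and write an instruction, which is then applied to every record without further API calls:

```python
# Toy sketch of metadata-driven instruction curation. `llm` is any
# prompt-in/text-out callable; the JSON reply format is an assumption.
import json
from typing import Callable, Dict, List

META_PROMPT = (
    "Dataset description: {description}\nFields: {fields}\n"
    "Choose an input field and an output field, and write a task "
    "instruction. Reply as JSON with keys 'input', 'output', 'instruction'."
)

def curate(metadata: Dict, records: List[Dict], llm: Callable[[str], str]) -> List[Dict]:
    # One LLM call per *dataset*, not per example, which keeps API cost low.
    plan = json.loads(llm(META_PROMPT.format(
        description=metadata["description"],
        fields=", ".join(metadata["fields"]),
    )))
    return [
        {
            "instruction": plan["instruction"],
            "input": rec[plan["input"]],
            "output": rec[plan["output"]],
        }
        for rec in records
    ]
```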
- Self-Instruct: Aligning Language Models with Self-Generated Instructions [76.42871502364697]
Self-Instruct is a framework for improving the instruction-following capabilities of pretrained language models.
Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model.
For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin.
arXiv Detail & Related papers (2022-12-20T18:59:19Z)
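The "filters invalid or similar ones" step in the Self-Instruct entry above lends itself to a short sketch. Self-Instruct-style pipelines typically compare each candidate against the existing pool with ROUGE-L; the version below uses difflib's ratio only as a dependency-free stand-in, and the 0.7 cutoff is illustrative:

```python
# Sketch of the dedup/validity filter in a Self-Instruct-style pipeline.
# Real implementations compare ROUGE-L against the instruction pool;
# SequenceMatcher.ratio() is a stand-in and 0.7 is an illustrative cutoff.
from difflib import SequenceMatcher
from typing import List

def keep_novel(candidates: List[str], pool: List[str], cutoff: float = 0.7) -> List[str]:
    kept = list(pool)
    for cand in candidates:
        text = cand.strip()
        if not text:           # drop empty/invalid generations
            continue
        if all(SequenceMatcher(None, text, seen).ratio() < cutoff for seen in kept):
            kept.append(text)  # novel enough: add to the growing pool
    return kept[len(pool):]    # return only the newly accepted instructions
```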