Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning
- URL: http://arxiv.org/abs/2310.11716v1
- Date: Wed, 18 Oct 2023 05:13:47 GMT
- Title: Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning
- Authors: Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Heng Huang, Jiuxiang Gu,
Tianyi Zhou
- Abstract summary: Low-quality data in the training set are usually detrimental to instruction tuning.
We propose a novel method, termed "reflection-tuning"
This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data.
- Score: 79.32236399694077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in Large Language Models (LLMs) have expanded the
horizons of natural language understanding and generation. Notably, the output
control and alignment with the input of LLMs can be refined through instruction
tuning. However, as highlighted in several studies, low-quality data in the
training set are usually detrimental to instruction tuning, resulting in
inconsistent or even misleading LLM outputs. We propose a novel method, termed
"reflection-tuning," which addresses the problem by self-improvement and
judging capabilities of LLMs. This approach utilizes an oracle LLM to recycle
the original training data by introspecting and enhancing the quality of
instructions and responses in the data. Extensive experiments on widely used
evaluation benchmarks show that LLMs trained with our recycled data outperform
those trained with existing datasets in various benchmarks.
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - Optimizing and Testing Instruction-Following: Analyzing the Impact of Fine-Grained Instruction Variants on instruction-tuned LLMs [27.321629102942754]
We introduce an effective data augmentation technique that decomposes complex instructions into simpler sub-components, modifies these, and reconstructs them into new variants.
Our findings show that LLMs fine-tuned with DeMoRecon will gain significant performance boost on both ours and commonly used instructions-following benchmarks.
arXiv Detail & Related papers (2024-06-17T08:08:11Z) - LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [79.31084387589968]
Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks.
We propose LLM2LLM, a data augmentation strategy that uses a teacher LLM to enhance a small seed dataset.
We achieve improvements up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC and 39.8% on SST-2 over regular fine-tuning in the low-data regime.
arXiv Detail & Related papers (2024-03-22T08:57:07Z) - CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning [32.54921739100195]
We propose CoachLM, a novel approach to enhance the quality of instruction datasets through automatic revisions on samples in the dataset.
CoachLM is trained from the samples revised by human experts and significantly increases the proportion of high-quality samples in the dataset from 17.7% to 78.9%.
Results show that CoachLM improves the instruction-following capabilities of the instruction-tuned LLM by an average of 29.9%.
arXiv Detail & Related papers (2023-11-22T09:04:57Z) - TRACE: A Comprehensive Benchmark for Continual Learning in Large
Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z) - From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z) - ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z) - On Learning to Summarize with Large Language Models as References [101.79795027550959]
Large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved.
arXiv Detail & Related papers (2023-05-23T16:56:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.