Don't Half-listen: Capturing Key-part Information in Continual Instruction Tuning
- URL: http://arxiv.org/abs/2403.10056v1
- Date: Fri, 15 Mar 2024 06:54:20 GMT
- Title: Don't Half-listen: Capturing Key-part Information in Continual Instruction Tuning
- Authors: Yongquan He, Xuancheng Huang, Minghao Tang, Lingxun Meng, Xiang Li, Wei Lin, Wenyuan Zhang, Yifu Gao,
- Abstract summary: We propose a novel continual instruction tuning method based on Key-part Information Gain (KPIG)
Our method computes the information gain on masked parts to dynamically replay data and refine the training objective.
Experiments demonstrate our method achieves superior performance on both seen and held-out tasks.
- Score: 13.535110749767451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instruction tuning for large language models (LLMs) can drive them to produce results consistent with human goals in specific downstream tasks. However, the process of continual instruction tuning (CIT) for LLMs may bring about the catastrophic forgetting (CF) problem, where previously learned abilities are degraded. Recent methods try to alleviate the CF problem by modifying models or replaying data, which may only remember the surface-level pattern of instructions and get confused on held-out tasks. In this paper, we propose a novel continual instruction tuning method based on Key-part Information Gain (KPIG). Our method computes the information gain on masked parts to dynamically replay data and refine the training objective, which enables LLMs to capture task-aware information relevant to the correct response and alleviate overfitting to general descriptions in instructions. In addition, we propose two metrics, P-score and V-score, to measure the generalization and instruction-following abilities of LLMs. Experiments demonstrate our method achieves superior performance on both seen and held-out tasks.
Related papers
- Aligning Instruction Tuning with Pre-training [81.4748965653345]
We propose Aligning Instruction Tuning with Pre-training (AITP) to align instruction tuning with pre-training distributions.
We show consistent performance improvements with AITP on three fully open large language models (LLMs) across eight benchmarks.
arXiv Detail & Related papers (2025-01-16T08:27:40Z) - Instruction Tuning With Loss Over Instructions [42.9106826952674]
Instruction Modelling (IM) trains LMs by applying a loss function to the instruction and prompt part rather than solely to the output part.
We show that, in many scenarios, IM can effectively improve the LM performance on both NLP tasks and open-ended generation benchmarks.
Remarkably, in the most advantageous case, IM boosts model performance on AlpacaEval 1.0 by over 100%.
arXiv Detail & Related papers (2024-05-23T10:12:03Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
In the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
In the instance level, we measure the difficulty of all instances per task, then divide them into the easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning [79.32236399694077]
Low-quality data in the training set are usually detrimental to instruction tuning.
We propose a novel method, termed "reflection-tuning"
This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data.
arXiv Detail & Related papers (2023-10-18T05:13:47Z) - TRACE: A Comprehensive Benchmark for Continual Learning in Large
Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z) - From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning [63.63840740526497]
We investigate how instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
The impact of instruction tuning is then studied by comparing the explanations derived from the pre-trained and instruction-tuned models.
Our findings reveal three significant impacts of instruction tuning.
arXiv Detail & Related papers (2023-09-30T21:16:05Z) - Instruction Position Matters in Sequence Generation with Large Language
Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.