Related papers: Fine-tuning Large Language Models with Sequential Instructions

URL: http://arxiv.org/abs/2403.07794v3
Date: Wed, 3 Jul 2024 08:18:44 GMT
Title: Fine-tuning Large Language Models with Sequential Instructions
Authors: Hanxu Hu, Simon Yu, Pinzhen Chen, Edoardo M. Ponti
Abstract summary: We find that existing instruction-tuned models struggle to respond to queries with multiple instructions. We contend that part of the fine-tuning data mixture should be sequential--containing a chain of interrelated tasks. We automate this process by turning instructions in existing datasets into diverse and complex sequential instructions. Models that underwent our sequential instruction tuning show improved results in coding, maths, and open-ended generation.
Score: 2.546845645875049
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the success of existing instruction-tuned models, we find that they usually struggle to respond to queries with multiple instructions. This impairs their performance in complex problems whose solution consists of multiple intermediate tasks. Thus, we contend that part of the fine-tuning data mixture should be sequential--containing a chain of interrelated tasks. We first approach sequential instruction tuning from a task-driven perspective, manually creating interpretable intermediate tasks for multilingual and visual question answering: namely "translate then predict" and "caption then answer". Next, we automate this process by turning instructions in existing datasets (e.g., Alpaca and FlanCoT) into diverse and complex sequential instructions, making our method general-purpose. Models that underwent our sequential instruction tuning show improved results in coding, maths, and open-ended generation. Moreover, we put forward a new benchmark named SeqEval to evaluate a model's ability to follow all the instructions in a sequence, which further corroborates the benefits of our fine-tuning method. We hope that our endeavours will open new research avenues on instruction tuning for complex tasks.

Related papers

The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models [48.455388608863785]
We introduce a benchmark designed to evaluate models' abilities to follow multiple instructions through sequential instruction following tasks. Our benchmark evaluates instruction following using four tasks (text modification, question answering, mathematics, and security rules). More recent and larger models significantly outperform their older and smaller counterparts on the SIFo tasks, validating the benchmark's effectiveness.
arXiv Detail & Related papers (2024-06-28T15:34:26Z)
From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers [1.6958018695660049]
We show that a more diverse instruction set, extending beyond code-related tasks, improves the performance of code generation. Our observations suggest that a more diverse semantic space for instruction-tuning sets greatly improves the model's ability to follow instructions and perform tasks.
arXiv Detail & Related papers (2024-05-30T07:54:07Z)
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models [15.444719480373001]
We propose a novel concept of compositional instructions called chain-of-instructions (CoI). Unlike the conventional practice of solving single instruction tasks, our proposed method encourages a model to solve each subtask step by step until the final answer is reached. CoI-tuning improves the model's ability to handle instructions composed of multiple subtasks as well as unseen composite tasks such as multilingual summarization.
arXiv Detail & Related papers (2024-02-18T10:10:40Z)
Instruction Diversity Drives Generalization To Unseen Tasks [1.9059113568275998]
Generalization emerges once a diverse enough set of tasks is provided, even though very few examples are provided for each task.
arXiv Detail & Related papers (2024-02-16T18:47:21Z)
Context-dependent Instruction Tuning for Dialogue Response Generation [61.21790201307179]
Recent language models have achieved impressive performance in natural language computation tasks by incorporating instructions with task input during fine-tuning. We introduce a context-based instruction fine-tuning framework for each multi-turn dialogue. During the evaluation, the model generates instructions based on the previous context to self-guide the response.
arXiv Detail & Related papers (2023-11-13T01:25:30Z)
Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization. We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
Discovering Non-monotonic Autoregressive Orderings with Variational Inference [67.27561153666211]
We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data. We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass. Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
arXiv Detail & Related papers (2021-10-27T16:08:09Z)
Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals [48.55362590292391]
We benchmark machine learning models' capability of reasoning over and sequencing unordered multimodal instructions. We find models not only perform significantly worse than humans but also seem incapable of efficiently utilizing the multimodal information. We propose sequentiality-aware pretraining techniques that exploit the sequential alignment properties of both texts and images.
arXiv Detail & Related papers (2021-10-16T06:12:15Z)
Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks. We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.