Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
- URL: http://arxiv.org/abs/2510.03528v1
- Date: Fri, 03 Oct 2025 21:54:33 GMT
- Title: Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
- Authors: Ahmed Alajrami, Xingwei Tan, Nikolaos Aletras
- Abstract summary: We show that perturbations in instruction-tuning data can enhance large language models' resistance against noisy instructions. Surprisingly, our results suggest that instruction-tuning on perturbed instructions can, in some cases, improve downstream performance.
- Score: 29.349544663659938
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction-tuning plays a vital role in enhancing the task-solving abilities of large language models (LLMs), improving their usability in generating helpful responses on various tasks. However, previous work has demonstrated that LLMs are sensitive to minor variations in instruction phrasing. In this paper, we explore whether introducing perturbations in instruction-tuning data can enhance LLMs' resistance against noisy instructions. We focus on how instruction-tuning with perturbations, such as removing stop words or shuffling words, affects LLMs' performance on the original and perturbed versions of widely-used benchmarks (MMLU, BBH, GSM8K). We further assess learning dynamics and potential shifts in model behavior. Surprisingly, our results suggest that instruction-tuning on perturbed instructions can, in some cases, improve downstream performance. These findings highlight the importance of including perturbed instructions in instruction-tuning, which can make LLMs more resilient to noisy user inputs.
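The two perturbation types named in the abstract (stop-word removal and word shuffling) can be illustrated with a minimal sketch. The stop-word list, function names, and example instruction below are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import random

# Illustrative sketch of instruction perturbations: stop-word removal and word shuffling.
# STOP_WORDS is a small hand-picked list, not the list used in the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for"}

def remove_stop_words(instruction: str) -> str:
    """Drop common stop words from an instruction string."""
    return " ".join(w for w in instruction.split() if w.lower() not in STOP_WORDS)

def shuffle_words(instruction: str, seed: int = 0) -> str:
    """Randomly reorder the words of an instruction (seeded for reproducibility)."""
    words = instruction.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

if __name__ == "__main__":
    instr = "Answer the following question and explain your reasoning in one sentence."
    print(remove_stop_words(instr))  # instruction with stop words removed
    print(shuffle_words(instr))      # instruction with shuffled word order
```

In the setting the paper describes, perturbed instructions like these would be mixed into the instruction-tuning data before fine-tuning, and models would then be evaluated on both original and perturbed benchmark prompts.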
Related papers
- Exploring the Impact of Instruction-Tuning on LLM's Susceptibility to Misinformation [3.032542495872679]
We investigate the impact of instruction-tuning on large language models' susceptibility to misinformation. Our analysis reveals that instruction-tuned LLMs are significantly more likely to accept misinformation when it is presented by the user.
arXiv Detail & Related papers (2025-07-24T08:58:47Z) - Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study [8.827173113748701]
We study character- and word-level edits of task-specific instructions, which substantially degrade downstream performance. We find that, on average, self-denoising achieves substantially higher performance gains than alternative strategies.
arXiv Detail & Related papers (2025-04-03T16:17:56Z) - The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities [51.594836904623534]
We investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples. We show that the performance of instruction-tuned models is significantly correlated with the in-context performance of their base counterparts. Specifically, we extend this understanding to instruction-tuned models, suggesting that their pretraining data similarly sets a limiting boundary on the tasks they can solve.
arXiv Detail & Related papers (2025-01-15T10:57:55Z) - Eliciting Causal Abilities in Large Language Models for Reasoning Tasks [14.512834333917414]
We introduce the Self-Causal Instruction Enhancement (SCIE) method, which enables LLMs to generate high-quality, low-quantity observational data. In SCIE, the instructions are treated as the treatment, and textual features are used to process natural language. Our method effectively generates instructions that enhance reasoning performance with reduced training cost of prompts.
arXiv Detail & Related papers (2024-12-19T17:03:02Z) - SwitchCIT: Switching for Continual Instruction Tuning [14.085371250265224]
Large language models (LLMs) and multimodal models (MMs) have exhibited impressive capabilities in various domains. Continual instruction tuning is crucial to adapt a large model to evolving tasks and domains. This work addresses catastrophic forgetting in continual instruction learning through a mechanism for routing computations to parameter-efficient tuned models.
arXiv Detail & Related papers (2024-07-16T14:37:33Z) - Contrastive Instruction Tuning [61.97704869248903]
We propose Contrastive Instruction Tuning (CoIN) to maximize the similarity between semantically equivalent instruction-instance pairs.
Experiments on the PromptBench benchmark show that CoIN consistently improves LLMs' robustness to unseen instructions with variations across character, word, sentence, and semantic levels by an average of +2.5% in accuracy.
arXiv Detail & Related papers (2024-02-17T00:09:32Z) - Demystifying Instruction Mixing for Fine-tuning Large Language Models [29.69436955342966]
This study categorizes instructions into three primary types: NLP downstream tasks, coding, and general chat.
We find that certain instruction types are more advantageous for specific applications but can negatively impact other areas.
arXiv Detail & Related papers (2023-12-17T18:44:26Z) - Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning [79.32236399694077]
Low-quality data in the training set are usually detrimental to instruction tuning.
We propose a novel method termed "reflection-tuning".
This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data.
arXiv Detail & Related papers (2023-10-18T05:13:47Z) - From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning [63.63840740526497]
We investigate how instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
The impact of instruction tuning is then studied by comparing the explanations derived from the pre-trained and instruction-tuned models.
Our findings reveal three significant impacts of instruction tuning.
arXiv Detail & Related papers (2023-09-30T21:16:05Z) - Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z) - Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting [55.15697111170836]
This paper reveals the behaviors of large language models (LLMs) towards inductive instructions and enhances their truthfulness and helpfulness accordingly.
After extensive human and automatic evaluations, we uncovered a universal vulnerability among LLMs in processing inductive instructions.
We identify that different inductive styles affect the models' ability to identify the same underlying errors, and the complexity of the underlying assumptions also influences the model's performance.
arXiv Detail & Related papers (2023-05-23T06:38:20Z)