Instructive Decoding: Instruction-Tuned Large Language Models are
Self-Refiner from Noisy Instructions
- URL: http://arxiv.org/abs/2311.00233v2
- Date: Sat, 17 Feb 2024 09:00:29 GMT
- Title: Instructive Decoding: Instruction-Tuned Large Language Models are
Self-Refiner from Noisy Instructions
- Authors: Taehyeon Kim, Joonkee Kim, Gihun Lee, Se-Young Yun
- Abstract summary: This paper presents Instructive Decoding (ID), a simple yet effective approach that augments the efficacy of instruction-tuned models.
ID adjusts the logits for next-token prediction in a contrastive manner, utilizing predictions generated from a manipulated version of the original instruction.
We conduct experiments across a spectrum of such noisy instructions, ranging from those that insert semantic noise via random words to others like 'opposite' that elicit deviated responses.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While instruction-tuned language models have demonstrated impressive
zero-shot generalization, these models often struggle to generate accurate
responses when faced with instructions that fall outside their training set.
This paper presents Instructive Decoding (ID), a simple yet effective approach
that augments the efficacy of instruction-tuned models. Specifically, ID
adjusts the logits for next-token prediction in a contrastive manner, utilizing
predictions generated from a manipulated version of the original instruction,
referred to as a noisy instruction. This noisy instruction aims to elicit
responses that could diverge from the intended instruction yet remain
plausible. We conduct experiments across a spectrum of such noisy instructions,
ranging from those that insert semantic noise via random words to others like
'opposite' that elicit deviated responses. Our approach achieves
considerable performance gains across various instruction-tuned models and
tasks without necessitating any additional parameter updates. Notably,
utilizing 'opposite' as the noisy instruction in ID, which exhibits the maximum
divergence from the original instruction, consistently produces the most
significant performance gains across multiple models and tasks.
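
The core of ID is a single contrastive adjustment at each decoding step. Below is a minimal sketch of that step, assuming greedy selection; the combination rule follows the abstract's description (original logits minus a scaled copy of the noisy-instruction logits), and the contrast strength epsilon = 0.3 is an assumed value, not necessarily the paper's setting.

```python
import numpy as np

def instructive_decoding_step(logits_original: np.ndarray,
                              logits_noisy: np.ndarray,
                              epsilon: float = 0.3) -> int:
    """One step of Instructive Decoding (sketch).

    logits_original: next-token logits conditioned on the intended instruction.
    logits_noisy:    next-token logits conditioned on a perturbed instruction
                     (e.g., random words inserted, or the 'opposite' instruction).
    Subtracting the noisy logits steers generation away from tokens that stay
    plausible even under the wrong instruction; epsilon (assumed value)
    controls the contrast strength.
    """
    contrasted = logits_original - epsilon * logits_noisy
    return int(np.argmax(contrasted))  # greedy pick of the next token
```

Each generation step therefore costs two forward passes, one per instruction variant, but requires no parameter updates.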
Related papers
- Improving Instruction-Following in Language Models through Activation Steering (arXiv, 2024-10-15)
We derive instruction-specific vector representations from language models and use them to steer models accordingly.
We demonstrate how this method can enhance model adherence to constraints such as output format, length, and word inclusion.
Our findings demonstrate that activation steering offers a practical and scalable approach for fine-grained control in language generation (a toy sketch of the steering step follows this list).
- Instruction Following without Instruction Tuning (arXiv, 2024-09-21)
We find two forms of adaptation (tuning) that are deficient compared to instruction tuning, yet still yield instruction following.
We support this by hand-writing a rule-based language model which yields instruction following in a product-of-experts with a pretrained model.
- From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers (arXiv, 2024-05-30)
We show that a more diverse instruction set, extending beyond code-related tasks, improves the performance of code generation.
Our observations suggest that a more diverse semantic space for instruction-tuning sets greatly improves the model's ability to follow instructions and perform tasks.
- From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning (arXiv, 2023-09-30)
We investigate how instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
The impact of instruction tuning is then studied by comparing the explanations derived from the pre-trained and instruction-tuned models.
Our findings reveal three significant impacts of instruction tuning.
- Instruction Position Matters in Sequence Generation with Large Language Models (arXiv, 2023-08-23)
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences (a prompt-layout sketch follows this list).
- Instruction-following Evaluation through Verbalizer Manipulation (arXiv, 2023-07-20)
We propose a novel instruction-following evaluation protocol called verbalizer manipulation.
It instructs the model to verbalize the task label with words aligning with model priors to different extents.
We observe that the instruction-following abilities of models, across different families and scales, are significantly distinguished by their performance on less natural verbalizers (a verbalizer sketch follows this list).
- Evaluating the Zero-shot Robustness of Instruction-tuned Language Models (arXiv, 2023-06-20)
We find that using novel (unobserved) but appropriate instruction phrasings consistently degrades model performance.
We propose a simple method to mitigate this issue by introducing "soft prompt" embedding parameters.
We show that this method consistently improves the robustness of instruction-tuned models.
- Self-Instruct: Aligning Language Models with Self-Generated Instructions (arXiv, 2022-12-20)
Self-Instruct is a framework for improving the instruction-following capabilities of pretrained language models.
Our pipeline generates instructions, input, and output samples from a language model, then filters invalid or similar ones before using them to finetune the original model.
For further evaluation, we curate a set of expert-written instructions for novel tasks, and show through human evaluation that tuning GPT3 with Self-Instruct outperforms using existing public instruction datasets by a large margin.
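For the activation-steering entry above, here is a toy sketch of the steering step. The mean-difference extraction, the choice of layer, and the scale alpha are all assumptions for illustration; the paper's exact recipe may differ.

```python
import numpy as np

def steering_vector(acts_with: np.ndarray, acts_without: np.ndarray) -> np.ndarray:
    """Instruction-specific direction: mean activation difference between
    prompts that carry the instruction and prompts that omit it (one common
    recipe, assumed here)."""
    return acts_with.mean(axis=0) - acts_without.mean(axis=0)

def steer(hidden: np.ndarray, direction: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Add the steering direction to a layer's hidden states at generation time."""
    return hidden + alpha * direction

rng = np.random.default_rng(0)
acts_with = rng.normal(size=(8, 16))    # toy activations, 8 prompts x 16 dims
acts_without = rng.normal(size=(8, 16))
v = steering_vector(acts_with, acts_without)  # 16-dim direction
h = rng.normal(size=(4, 16))                  # 4 token positions at one layer
steered = steer(h, v, alpha=0.8)              # same shape, nudged toward the instruction
```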
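For the instruction-position entry, a sketch of the two prompt layouts being compared; the template strings are illustrative, not the paper's exact prompts.

```python
def pre_ins(instruction: str, source: str) -> str:
    # conventional layout: instruction first, then the input
    return f"{instruction}\n{source}"

def post_ins(instruction: str, source: str) -> str:
    # the proposed layout: input first, instruction last, keeping the
    # instruction adjacent to where generation begins
    return f"{source}\n{instruction}"

instruction = "Translate the sentence into English."
source = "Der schnelle braune Fuchs springt über den faulen Hund."
print(post_ins(instruction, source))
```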
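For the verbalizer-manipulation entry, a sketch of how one classification task can be posed with verbalizers of varying naturalness; the specific label words are illustrative choices, not the paper's.

```python
# Binary sentiment with three verbalizers, ordered from most to least
# aligned with typical model priors.
VERBALIZERS = {
    "natural": {1: "positive", 0: "negative"},
    "neutral": {1: "foo", 0: "bar"},
    "flipped": {1: "negative", 0: "positive"},  # contradicts model priors
}

def build_prompt(review: str, mapping: dict) -> str:
    # Same task, different label words: a model that truly follows the
    # instruction should succeed even with the flipped verbalizer.
    return (f"Review: {review}\n"
            f"Reply '{mapping[1]}' if the sentiment is positive, "
            f"'{mapping[0]}' otherwise.")

for name, mapping in VERBALIZERS.items():
    print(f"--- {name} ---")
    print(build_prompt("A sharp, funny, beautifully shot film.", mapping))
```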