RoCoIns: Enhancing Robustness of Large Language Models through
Code-Style Instructions
- URL: http://arxiv.org/abs/2402.16431v1
- Date: Mon, 26 Feb 2024 09:30:55 GMT
- Title: RoCoIns: Enhancing Robustness of Large Language Models through
Code-Style Instructions
- Authors: Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang,
Xuanjing Huang
- Abstract summary: We utilize instructions in code style, which are more structured and less ambiguous, to replace typical natural language instructions.
Under few-shot scenarios, we propose a novel method to compose in-context demonstrations using both clean and adversarial samples.
Experiments on eight robustness datasets show that our method consistently outperforms prompting LLMs with natural language instructions.
- Score: 43.19966425619236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have showcased remarkable capabilities in
following human instructions. However, recent studies have raised concerns
about the robustness of LLMs when prompted with instructions combining textual
adversarial samples. In this paper, drawing inspiration from recent findings
that LLMs are sensitive to the design of instructions, we utilize instructions
in code style, which are more structured and less ambiguous, to replace typical
natural language instructions. Through this conversion, we provide LLMs with
more precise instructions and strengthen their robustness. Moreover, under
few-shot scenarios, we propose a novel method to compose in-context
demonstrations using both clean and adversarial samples (adversarial context
method) to further boost the robustness of the LLMs. Experiments on eight
robustness datasets show that our method consistently outperforms prompting
LLMs with natural language instructions. For example, with gpt-3.5-turbo, our
method achieves an improvement of 5.68% in test set accuracy and a reduction of
5.66 points in Attack Success Rate (ASR).
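The code-style conversion and the adversarial context method described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact template: the prompt format, the `solve` function name, and the perturbed demonstration are all assumptions.

```python
# A minimal sketch of a code-style instruction prompt in the spirit of
# RoCoIns. The exact template, the `solve` function name, and the
# perturbed demonstration below are illustrative assumptions.

def build_code_style_prompt(task_input, demos=None):
    """Wrap a sentiment-classification task as a code-style instruction.

    `demos` is an optional list of (text, label) pairs; following the
    adversarial context method, it may mix clean and perturbed samples.
    """
    header = [
        "def solve(input_text: str) -> str:",
        '    """Classify the sentiment of input_text.',
        '    Return exactly one label: "positive" or "negative".',
        '    """',
    ]
    # In-context demonstrations rendered as comments inside the function.
    demo_lines = [
        f"    # solve({text!r}) == {label!r}" for text, label in (demos or [])
    ]
    return "\n".join(header + demo_lines + [f"solve({task_input!r})"])

# One clean demonstration paired with a character-perturbed copy, mirroring
# the clean/adversarial mix of the adversarial context method.
demos = [
    ("A delightful film.", "positive"),   # clean sample
    ("A deligthful fiml.", "positive"),   # adversarial (typo-perturbed) sample
]
prompt = build_code_style_prompt("The plot was a mess.", demos)
print(prompt)
```

The idea is that the resulting prompt, sent to an LLM, states the task as unambiguous code rather than free-form prose; the paired clean/perturbed demonstrations signal that surface perturbations should not change the label.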
Related papers
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Auto-Instruct: Automatic Instruction Generation and Ranking for
Black-Box Language Models [91.02730155418699]
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions.
We introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs.
In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions.
arXiv Detail & Related papers (2023-10-19T19:52:55Z)
- Evaluating Large Language Models at Evaluating Instruction Following [54.49567482594617]
We introduce a challenging meta-evaluation benchmark, LLMBar, designed to test the ability of an LLM evaluator in discerning instruction-following outputs.
We discover that different evaluators exhibit distinct performance on LLMBar and even the highest-scoring ones have substantial room for improvement.
arXiv Detail & Related papers (2023-10-11T16:38:11Z)
- Evaluating the Robustness to Instructions of Large Language Models [6.947956990248856]
Fine-tuning Large Language Models (LLMs) can boost their zero-shot capabilities on novel tasks.
We evaluate six models, including Alpaca, Vicuna, WizardLM, and traditional task-oriented models (Flan-T5-XL/XXL, T0++).
We find that the robustness of different scales of FLAN-T5 models to RE instruction is worse than the robustness to QA instruction.
arXiv Detail & Related papers (2023-08-28T04:57:07Z)
- Improving Translation Faithfulness of Large Language Models via
Augmenting Instructions [89.76691340615848]
We propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset OVERMISS.
SWIE improves the model's instruction understanding by adding a global instruction representation to the subsequent input and response representations.
OVERMISS improves model faithfulness by comparing over-translation and miss-translation results with the correct translation.
arXiv Detail & Related papers (2023-08-24T09:32:29Z)
- Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
By scaling model size, we find that scaling beyond tens of billions of parameters harms performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z)
- Enhancing Large Language Models Against Inductive Instructions with
Dual-critique Prompting [55.15697111170836]
This paper reveals the behavior of large language models (LLMs) toward inductive instructions and enhances their truthfulness and helpfulness accordingly.
After extensive human and automatic evaluations, we uncovered a universal vulnerability among LLMs in processing inductive instructions.
We identify that different inductive styles affect the models' ability to identify the same underlying errors, and the complexity of the underlying assumptions also influences the model's performance.
arXiv Detail & Related papers (2023-05-23T06:38:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.