RoCoIns: Enhancing Robustness of Large Language Models through
Code-Style Instructions
- URL: http://arxiv.org/abs/2402.16431v1
- Date: Mon, 26 Feb 2024 09:30:55 GMT
- Title: RoCoIns: Enhancing Robustness of Large Language Models through
Code-Style Instructions
- Authors: Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang,
Xuanjing Huang
- Abstract summary: We utilize instructions in code style, which are more structured and less ambiguous, to replace typical natural language instructions.
Under few-shot scenarios, we propose a novel method to compose in-context demonstrations using both clean and adversarial samples.
Experiments on eight robustness datasets show that our method consistently outperforms prompting LLMs with natural language instructions.
- Score: 43.19966425619236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have showcased remarkable capabilities in
following human instructions. However, recent studies have raised concerns
about the robustness of LLMs when prompted with instructions combining textual
adversarial samples. In this paper, drawing inspiration from recent findings
that LLMs are sensitive to the design of instructions, we utilize instructions
in code style, which are more structured and less ambiguous, to replace typical
natural language instructions. Through this conversion, we provide LLMs with
more precise instructions and strengthen their robustness. Moreover, under
few-shot scenarios, we propose a novel method to compose in-context
demonstrations using both clean and adversarial samples (adversarial context
method) to further boost the robustness of the LLMs. Experiments on eight
robustness datasets show that our method consistently outperforms prompting
LLMs with natural language instructions. For example, with gpt-3.5-turbo, our
method achieves an improvement of 5.68% in test set accuracy and a reduction of
5.66 points in Attack Success Rate (ASR).
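The code-style conversion and the adversarial context method described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact template: the prompt format, the `solve` function name, and the perturbed demonstration are all assumptions.

```python
# A minimal sketch of a code-style instruction prompt in the spirit of
# RoCoIns. The exact template, the `solve` function name, and the
# perturbed demonstration below are illustrative assumptions.

def build_code_style_prompt(task_input, demos=None):
    """Wrap a sentiment-classification task as a code-style instruction.

    `demos` is an optional list of (text, label) pairs; following the
    adversarial context method, it may mix clean and perturbed samples.
    """
    header = [
        "def solve(input_text: str) -> str:",
        '    """Classify the sentiment of input_text.',
        '    Return exactly one label: "positive" or "negative".',
        '    """',
    ]
    # In-context demonstrations rendered as comments inside the function.
    demo_lines = [
        f"    # solve({text!r}) == {label!r}" for text, label in (demos or [])
    ]
    return "\n".join(header + demo_lines + [f"solve({task_input!r})"])

# One clean demonstration paired with a character-perturbed copy, mirroring
# the clean/adversarial mix of the adversarial context method.
demos = [
    ("A delightful film.", "positive"),   # clean sample
    ("A deligthful fiml.", "positive"),   # adversarial (typo-perturbed) sample
]
prompt = build_code_style_prompt("The plot was a mess.", demos)
print(prompt)
```

The idea is that the resulting prompt, sent to an LLM, states the task as unambiguous code rather than free-form prose; the paired clean/perturbed demonstrations signal that surface perturbations should not change the label.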
Related papers
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Auto-Instruct: Automatic Instruction Generation and Ranking for
Black-Box Language Models [91.02730155418699]
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions.
We introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs.
In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions.
arXiv Detail & Related papers (2023-10-19T19:52:55Z)
- Evaluating Large Language Models at Evaluating Instruction Following [54.49567482594617]
We introduce a challenging meta-evaluation benchmark, LLMBar, designed to test the ability of an LLM evaluator in discerning instruction-following outputs.
We discover that different evaluators exhibit distinct performance on LLMBar and even the highest-scoring ones have substantial room for improvement.
arXiv Detail & Related papers (2023-10-11T16:38:11Z)
- Evaluating the Robustness to Instructions of Large Language Models [6.947956990248856]
Fine-tuning Large Language Models (LLMs) can boost their zero-shot capabilities on novel tasks.
We evaluate six models, including Alpaca, Vicuna, WizardLM, and traditional task-oriented models (Flan-T5-XL/XXL, T0++).
We find that the robustness of different scales of FLAN-T5 models to RE instruction is worse than the robustness to QA instruction.
arXiv Detail & Related papers (2023-08-28T04:57:07Z)
- Improving Translation Faithfulness of Large Language Models via
Augmenting Instructions [89.76691340615848]
We propose SWIE (Segment-Weighted Instruction Embedding) and an instruction-following dataset OVERMISS.
SWIE improves the model's instruction understanding by adding a global instruction representation to the subsequent input and response representations.
OVERMISS improves model faithfulness by comparing over-translation and miss-translation results with the correct translation.
arXiv Detail & Related papers (2023-08-24T09:32:29Z)
- Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
By scaling model size, we find that scaling beyond tens of billions of parameters harms performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z)
- Enhancing Large Language Models Against Inductive Instructions with
Dual-critique Prompting [55.15697111170836]
This paper reveals the behavior of large language models (LLMs) toward inductive instructions and enhances their truthfulness and helpfulness accordingly.
After extensive human and automatic evaluations, we uncovered a universal vulnerability among LLMs in processing inductive instructions.
We identify that different inductive styles affect the models' ability to identify the same underlying errors, and the complexity of the underlying assumptions also influences the model's performance.
arXiv Detail & Related papers (2023-05-23T06:38:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.