On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code
- URL: http://arxiv.org/abs/2411.19508v1
- Date: Fri, 29 Nov 2024 07:00:47 GMT
- Title: On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code
- Authors: Md Imran Hossen, Xiali Hei,
- Abstract summary: We assess the impact of diverse input challenges on the functionality and correctness of generated code using rigorous metrics and established benchmarks.<n>Open-source models demonstrate an increased susceptibility to input perturbations, resulting in declines in functional correctness ranging from 12% to 34%.<n>In contrast, commercial models demonstrate relatively greater resilience, with performance degradation ranging from 3% to 24%.
- Score: 4.286327408435937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of instruction-tuned Large Language Models designed for coding tasks (Code LLMs) has transformed software engineering practices. However, their robustness against various input challenges remains a critical concern. This study introduces DegradePrompter, a novel method designed to systematically evaluate the robustness of instruction-tuned Code LLMs. We assess the impact of diverse input challenges on the functionality and correctness of generated code using rigorous metrics and established benchmarks. Our comprehensive evaluation includes five state-of-the-art open-source models and three production-grade closed-source models, revealing varying degrees of robustness. Open-source models demonstrate an increased susceptibility to input perturbations, resulting in declines in functional correctness ranging from 12% to 34%. In contrast, commercial models demonstrate relatively greater resilience, with performance degradation ranging from 3% to 24%. To enhance the robustness of the models against these vulnerabilities, we investigate a straightforward yet effective mitigation strategy. Our findings highlight the need for robust defense mechanisms and comprehensive evaluations during both the development and deployment phases to ensure the resilience and reliability of automated code generation systems.
Related papers
- Towards Robust LLMs: an Adversarial Robustness Measurement Framework [0.0]
Large Language Models (LLMs) remain vulnerable to adversarial perturbations, undermining their reliability in high-stakes applications.
We adapt the Robustness Measurement and Assessment framework to quantify LLM resilience against adversarial inputs without requiring access to model parameters.
Our work provides a systematic methodology to assess LLM robustness, advancing the development of more reliable language models for real-world deployment.
arXiv Detail & Related papers (2025-04-24T16:36:19Z) - T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models [88.63040835652902]
Text to video models are vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content.
We propose T2VShield, a comprehensive and model agnostic defense framework designed to protect text to video models from jailbreak threats.
Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses.
arXiv Detail & Related papers (2025-04-22T01:18:42Z) - Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge [0.0]
Large Language Models (LLMs) have revolutionized artificial intelligence, driving advancements in machine translation, summarization, and conversational agents.
Recent studies indicate that LLMs remain vulnerable to adversarial attacks designed to elicit biased responses.
This work proposes a scalable benchmarking framework to evaluate LLM robustness against adversarial bias elicitation.
arXiv Detail & Related papers (2025-04-10T16:00:59Z) - Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models.
Our framework incorporates two complementary strategies: internal TTC and external TTC.
We demonstrate our textbf32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z) - AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models [86.83875864328984]
We propose an automated method for synthesizing open-ended logic puzzles, and use it to develop a bilingual benchmark, AutoLogi.
Our approach features program-based verification and controllable difficulty levels, enabling more reliable evaluation that better distinguishes models' reasoning abilities.
arXiv Detail & Related papers (2025-02-24T07:02:31Z) - Breaking Focus: Contextual Distraction Curse in Large Language Models [68.4534308805202]
We investigate a critical vulnerability in Large Language Models (LLMs)
This phenomenon arises when models fail to maintain consistent performance on questions modified with semantically coherent but irrelevant context.
We propose an efficient tree-based search methodology to automatically generate CDV examples.
arXiv Detail & Related papers (2025-02-03T18:43:36Z) - HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment [1.8843687952462742]
This paper aims to address gaps in the current literature on jailbreaking techniques and the evaluation of LLM vulnerabilities.
Our contributions include the creation of a novel dataset designed to assess the harmfulness of model outputs across multiple harm levels.
We provide a comprehensive benchmark of state-of-the-art jailbreaking attacks, specifically targeting the Vicuna 13B v1.5 model.
arXiv Detail & Related papers (2024-11-11T10:02:49Z) - Outside the Comfort Zone: Analysing LLM Capabilities in Software Vulnerability Detection [9.652886240532741]
This paper thoroughly analyses large language models' capabilities in detecting vulnerabilities within source code.
We evaluate the performance of six open-source models that are specifically trained for vulnerability detection against six general-purpose LLMs.
arXiv Detail & Related papers (2024-08-29T10:00:57Z) - SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and
Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - On the Reliability and Explainability of Language Models for Program
Generation [15.569926313298337]
We study the capabilities and limitations of automated program generation approaches.
We employ advanced explainable AI approaches to highlight the tokens that significantly contribute to the code transformation.
Our analysis reveals that, in various experimental scenarios, language models can recognize code grammar and structural information, but they exhibit limited robustness to changes in input sequences.
arXiv Detail & Related papers (2023-02-19T14:59:52Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security
Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - Measure and Improve Robustness in NLP Models: A Survey [23.515869499536237]
robustness has been separately explored in applications like vision and NLP, with various definitions, evaluation and mitigation strategies in multiple lines of research.
We first connect multiple definitions of robustness, then unify various lines of work on identifying robustness failures and evaluating models' robustness.
We present mitigation strategies that are data-driven, model-driven, and inductive-prior-based, with a more systematic view of how to effectively improve robustness in NLP models.
arXiv Detail & Related papers (2021-12-15T18:02:04Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.