Related papers: Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair

Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair

URL: http://arxiv.org/abs/2509.05372v1
Date: Thu, 04 Sep 2025 09:41:57 GMT
Title: Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair
Authors: Piotr Przymus, Andreas Happe, Jürgen Cito,
Abstract summary: Automated Program Repair (APR) systems are increasingly integrated into modern software development.<n>In this paper, we investigate the security risks posed by adversarial bug reports.<n>We develop a comprehensive threat model and conduct an empirical study to evaluate the vulnerability of state-of-the-art APR systems to such attacks.
Score: 1.1677624591989955
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Model (LLM) - based Automated Program Repair (APR) systems are increasingly integrated into modern software development workflows, offering automated patches in response to natural language bug reports. However, this reliance on untrusted user input introduces a novel and underexplored attack surface. In this paper, we investigate the security risks posed by adversarial bug reports -- realistic-looking issue submissions crafted to mislead APR systems into producing insecure or harmful code changes. We develop a comprehensive threat model and conduct an empirical study to evaluate the vulnerability of state-of-the-art APR systems to such attacks. Our demonstration comprises 51 adversarial bug reports generated across a spectrum of strategies, from manual curation to fully automated pipelines. We test these against leading APR model and assess both pre-repair defenses (e.g., LlamaGuard variants, PromptGuard variants, Granite-Guardian, and custom LLM filters) and post-repair detectors (GitHub Copilot, CodeQL). Our findings show that current defenses are insufficient: 90\% of crafted bug reports triggered attacker-aligned patches. The best pre-repair filter blocked only 47\%, while post-repair analysis-often requiring human oversight-was effective in just 58\% of cases. To support scalable security testing, we introduce a prototype framework for automating the generation of adversarial bug reports. Our analysis exposes a structural asymmetry: generating adversarial inputs is inexpensive, while detecting or mitigating them remains costly and error-prone. We conclude with practical recommendations for improving the robustness of APR systems against adversarial misuse and highlight directions for future work on trustworthy automated repair.

Related papers

ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack [52.17935054046577]
We present ReasAlign, a model-level solution to improve safety alignment against indirect prompt injection attacks.<n>ReasAlign incorporates structured reasoning steps to analyze user queries, detect conflicting instructions, and preserve the continuity of the user's intended tasks.
arXiv Detail & Related papers (2026-01-15T08:23:38Z)
Automated Red-Teaming Framework for Large Language Model Security Assessment: A Comprehensive Attack Generation and Detection System [4.864011355064205]
This paper introduces an automated red-teaming framework that generates, executes, and evaluates adversarial prompts to uncover security vulnerabilities in large language models (LLMs)<n>Our framework integrates meta-prompting-based attack synthesis, multi-modal vulnerability detection, and standardized evaluation protocols spanning six major threat categories.<n> Experiments on the GPT-OSS-20B model reveal 47 distinct vulnerabilities, including 21 high-severity and 12 novel attack patterns.
arXiv Detail & Related papers (2025-12-21T19:12:44Z)
Red Teaming Program Repair Agents: When Correct Patches can Hide Vulnerabilities [22.02073334787359]
We propose SWExploit, which generates adversarial issue statements to make APR agents produce patches that are functionally correct yet vulnerable.<n>Based on our evaluation, we are the first to challenge the traditional assumption that a patch passing all tests is inherently reliable and secure.
arXiv Detail & Related papers (2025-09-30T07:38:57Z)
VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z)
Trust Me, I Know This Function: Hijacking LLM Static Analysis using Bias [3.178301843099705]
Large Language Models (LLMs) are increasingly trusted to perform automated code review and static analysis at scale.<n>This paper identifies and exploit a critical vulnerability in LLM-based code analysis.<n>We develop a fully automated, black-box algorithm that discovers and injects Familiar Pattern Attack (FPA) into target code.
arXiv Detail & Related papers (2025-08-24T13:42:48Z)
Benchmarking Misuse Mitigation Against Covert Adversaries [80.74502950627736]
Existing language model safety evaluations focus on overt attacks and low-stakes tasks.<n>We develop Benchmarks for Stateful Defenses (BSD), a data generation pipeline that automates evaluations of covert attacks and corresponding defenses.<n>Our evaluations indicate that decomposition attacks are effective misuse enablers, and highlight stateful defenses as a countermeasure.
arXiv Detail & Related papers (2025-06-06T17:33:33Z)
A Multi-Dataset Evaluation of Models for Automated Vulnerability Repair [2.7674959824386858]
This study investigates pre-trained language models, CodeBERT and CodeT5, for automated vulnerability patching across six datasets and four languages.<n>We evaluate their accuracy and generalization to unknown vulnerabilities.<n>Results show that while both models face challenges with fragmented or sparse context, CodeBERT performs comparatively better in such scenarios, whereas CodeT5 excels in capturing complex vulnerability patterns.
arXiv Detail & Related papers (2025-06-05T13:00:19Z)
AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage.<n>We show that scaling agentic reasoning system at test-time substantially enhances robustness without compromising model utility.<n> Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z)
Evaluating Pre-Trained Models for Multi-Language Vulnerability Patching [3.220818227251765]
This paper investigates the potential of pre-trained language models, CodeBERT and CodeT5, for automated vulnerability patching.<n>We evaluate these models on their accuracy, computational efficiency, and how the length of vulnerable code patches impacts performance.
arXiv Detail & Related papers (2025-01-13T13:51:05Z)
A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback [7.742213291781287]
We present VRpilot, a vulnerability repair technique based on reasoning and patch validation feedback. Our results show that VRpilot generates, on average, 14% and 7.6% more correct patches than the baseline techniques on C and Java.
arXiv Detail & Related papers (2024-05-24T16:29:48Z)
RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen) RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs. We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
A LLM Assisted Exploitation of AI-Guardian [57.572998144258705]
We evaluate the robustness of AI-Guardian, a recent defense to adversarial examples published at IEEE S&P 2023. We write none of the code to attack this model, and instead prompt GPT-4 to implement all attack algorithms following our instructions and guidance. This process was surprisingly effective and efficient, with the language model at times producing code from ambiguous instructions faster than the author of this paper could have done.
arXiv Detail & Related papers (2023-07-20T17:33:25Z)
DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection. Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables. We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
Early Detection of Security-Relevant Bug Reports using Machine Learning: How Far Are We? [6.438136820117887]
In a typical maintenance scenario, security-relevant bug reports are prioritised by the development team when preparing corrective patches. Open security-relevant bug reports can become a critical leak of sensitive information that attackers can leverage to perform zero-day attacks. In recent years, approaches for the detection of security-relevant bug reports based on machine learning have been reported with promising performance.
arXiv Detail & Related papers (2021-12-19T11:30:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.