Related papers: RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation

RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation

URL: http://arxiv.org/abs/2411.17569v1
Date: Tue, 26 Nov 2024 16:31:18 GMT
Title: RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation
Authors: Lakshmi Likhitha Mankali, Jitendra Bhandari, Manaar Alam, Ramesh Karri, Michail Maniatakos, Ozgur Sinanoglu, Johann Knechtel,
Abstract summary: Large language models (LLMs) have demonstrated remarkable potential with code generation/completion tasks for hardware design. LLMs are susceptible to so-called data poisoning or backdoor attacks. Here, attackers inject malicious code for the training data, which can be carried over into the HDL code generated by LLMs.
Score: 17.53405545690049
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have demonstrated remarkable potential with code generation/completion tasks for hardware design. In fact, LLM-based hardware description language (HDL) code generation has enabled the industry to realize complex designs more quickly, reducing the time and effort required in the development cycle. However, the increased reliance on such automation introduces critical security risks. Notably, given that LLMs have to be trained on vast datasets of codes that are typically sourced from publicly available repositories (often without thorough validation), LLMs are susceptible to so-called data poisoning or backdoor attacks. Here, attackers inject malicious code for the training data, which can be carried over into the HDL code generated by LLMs. This threat vector can compromise the security and integrity of entire hardware systems. In this work, we propose RTL-Breaker, a novel backdoor attack framework on LLM-based HDL code generation. RTL-Breaker provides an in-depth analysis for essential aspects of this novel problem: 1) various trigger mechanisms versus their effectiveness for inserting malicious modifications, and 2) side-effects by backdoor attacks on code generation in general, i.e., impact on code quality. RTL-Breaker emphasizes the urgent need for more robust measures to safeguard against such attacks. Toward that end, we open-source our framework and all data.

Related papers

Inducing Vulnerable Code Generation in LLM Coding Assistants [10.067898047221558]
In this paper, we reveal a real-world threat, named HACKODE, where attackers exploit referenced external information to embed attack sequences. We designed a prototype of the attack, which generates effective attack sequences for potential diverse inputs. On a real-world application, HACKODE achieves 75.92% ASR, demonstrating its real-world impact.
arXiv Detail & Related papers (2025-04-22T13:09:20Z)
Smoke and Mirrors: Jailbreaking LLM-based Code Generation via Implicit Malicious Prompts [5.718926328180089]
This paper introduces a jailbreaking approach, CodeJailbreaker, designed to uncover safety concerns in code generation. Experiments on the recently-released RMCBench benchmark demonstrate that CodeJailbreaker markedly surpasses the conventional jailbreaking strategy.
arXiv Detail & Related papers (2025-03-23T06:06:12Z)
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs) In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z)
LLM-Virus: Evolutionary Jailbreak Attack on Large Language Models [59.29840790102413]
Existing jailbreak attacks are primarily based on opaque optimization techniques and gradient search methods. We propose LLM-Virus, a jailbreak attack method based on evolutionary algorithm, termed evolutionary jailbreak. Our results show that LLM-Virus achieves competitive or even superior performance compared to existing attack methods.
arXiv Detail & Related papers (2024-12-28T07:48:57Z)
Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM [53.79753074854936]
Large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks. This vulnerability poses significant risks to real-world applications. We propose a novel defensive paradigm called GuidelineLLM.
arXiv Detail & Related papers (2024-12-10T12:42:33Z)
Aligning LLMs to Be Robust Against Prompt Injection [55.07562650579068]
We show that alignment can be a powerful tool to make LLMs more robust against prompt injection attacks. Our method -- SecAlign -- first builds an alignment dataset by simulating prompt injection attacks. Our experiments show that SecAlign robustifies the LLM substantially with a negligible hurt on model utility.
arXiv Detail & Related papers (2024-10-07T19:34:35Z)
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation. Recent studies highlight that many LLM-generated code contains serious security vulnerabilities. We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure codes.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
MEGen: Generative Backdoor in Large Language Models via Model Editing [56.46183024683885]
Large language models (LLMs) have demonstrated remarkable capabilities. Their powerful generative abilities enable flexible responses based on various queries or instructions. This paper proposes an editing-based generative backdoor, named MEGen, aiming to create a customized backdoor for NLP tasks with the least side effects.
arXiv Detail & Related papers (2024-08-20T10:44:29Z)
An Exploratory Study on Fine-Tuning Large Language Models for Secure Code Generation [17.69409515806874]
We present an exploratory study on whether fine-tuning pre-trained LLMs on datasets of vulnerability-fixing commits can promote secure code generation. We crawled a fine-tuning dataset for secure code generation by collecting code fixes of confirmed vulnerabilities from open-source repositories. Our exploration reveals that fine-tuning LLMs can improve secure code generation by 6.4% in C language and 5.4% in C++ language.
arXiv Detail & Related papers (2024-08-17T02:51:27Z)
Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs [2.7138982369416866]
Large Language Models (LLMs) have revolutionized automated code generation in software engineering. However, concerns have arisen regarding the security and quality of the generated code. Our research aims to tackle these issues by introducing a framework for secure behavioral learning of LLMs.
arXiv Detail & Related papers (2024-06-18T11:29:34Z)
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection [17.948513691133037]
We introduce CodeBreaker, a pioneering LLM-assisted backdoor attack framework on code completion models. By integrating malicious payloads directly into the source code with minimal transformation, CodeBreaker challenges current security measures.
arXiv Detail & Related papers (2024-06-10T22:10:05Z)
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs. Our studies reveal a new and universal safety vulnerability of these models against code input. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z)
DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial Natural Language Instructions [27.489622263456983]
We introduce DeceptPrompt, an algorithm that can generate adversarial natural language instructions that drive the Code LLMs to generate functionality correct code with vulnerabilities. When applying the optimized prefix/suffix, the attack success rate (ASR) will improve by average 50% compared with no prefix/suffix applying.
arXiv Detail & Related papers (2023-12-07T22:19:06Z)
Can LLMs Patch Security Issues? [1.3299507495084417]
Large Language Models (LLMs) have shown impressive proficiency in code generation. LLMs share a weakness with their human counterparts: producing code that inadvertently has security vulnerabilities. We propose Feedback-Driven Security Patching (FDSP), where LLMs automatically refine generated, vulnerable code.
arXiv Detail & Related papers (2023-11-13T08:54:37Z)
Attack Prompt Generation for Red Teaming and Defending Large Language Models [70.157691818224]
Large language models (LLMs) are susceptible to red teaming attacks, which can induce LLMs to generate harmful content. We propose an integrated approach that combines manual and automatic methods to economically generate high-quality attack prompts.
arXiv Detail & Related papers (2023-10-19T06:15:05Z)
Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. Recent works have proposed algorithms to detect LLM-generated text and protect LLMs. We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
arXiv Detail & Related papers (2023-05-31T10:08:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.