MGC: A Compiler Framework Exploiting Compositional Blindness in Aligned LLMs for Malware Generation
- URL: http://arxiv.org/abs/2507.02057v1
- Date: Wed, 02 Jul 2025 18:00:49 GMT
- Title: MGC: A Compiler Framework Exploiting Compositional Blindness in Aligned LLMs for Malware Generation
- Authors: Lu Yan, Zhuo Zhang, Xiangzhe Xu, Shengwei An, Guangyu Shen, Zhou Xuan, Xuan Chen, Xiangyu Zhang
- Abstract summary: Large language models (LLMs) have democratized software development, reducing the expertise barrier for programming complex applications. This accessibility extends to malicious software development, raising significant security concerns. In this paper, we introduce the Malware Generation Compiler (MGC), a novel framework that leverages this vulnerability through modular decomposition and alignment-evasive generation.
- Score: 22.29476520010842
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) have democratized software development, reducing the expertise barrier for programming complex applications. This accessibility extends to malicious software development, raising significant security concerns. While LLM providers have implemented alignment mechanisms to prevent direct generation of overtly malicious code, these safeguards predominantly evaluate individual prompts in isolation, overlooking a critical vulnerability: malicious operations can be systematically decomposed into benign-appearing sub-tasks. In this paper, we introduce the Malware Generation Compiler (MGC), a novel framework that leverages this vulnerability through modular decomposition and alignment-evasive generation. MGC employs a specialized Malware Description Intermediate Representation (MDIR) to bridge high-level malicious intents and benign-appearing code snippets. Extensive evaluation demonstrates that our attack reliably generates functional malware across diverse task specifications and categories, outperforming jailbreaking methods by +365.79% and underground services by +78.07% in correctness on three benchmark datasets. Case studies further show that MGC can reproduce and even enhance 16 real-world malware samples. This work provides critical insights for security researchers by exposing the risks of compositional attacks against aligned AI systems. Demonstrations are available at https://sites.google.com/view/malware-generation-compiler.
Related papers
- The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs [39.85609149662187]
We present DIJA, the first systematic study and jailbreak attack framework that exploits unique safety weaknesses of dLLMs. Our proposed DIJA constructs adversarial interleaved mask-text prompts that exploit the text generation mechanisms of dLLMs. Our findings underscore the urgent need for rethinking safety alignment in this emerging class of language models.
arXiv Detail & Related papers (2025-07-15T08:44:46Z)
- Are AI-Generated Fixes Secure? Analyzing LLM and Agent Patches on SWE-bench [9.229310642804036]
We present the first large-scale security analysis of LLM-generated patches using 20,000+ issues from the SWE-bench dataset. We evaluate patches produced by a standalone LLM (Llama 3.3) and compare them to developer-written patches. We also assess the security of patches generated by three top-performing agentic frameworks (OpenHands, AutoCodeRover, HoneyComb) on a subset of our data.
arXiv Detail & Related papers (2025-06-30T21:10:19Z)
- LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges [70.85114705489222]
We propose MalwareBench, a benchmark dataset containing 3,520 jailbreaking prompts for malicious code-generation. MalwareBench is based on 320 manually crafted malicious code generation requirements, covering 11 jailbreak methods and 29 code functionality categories. Experiments show that mainstream LLMs exhibit limited ability to reject malicious code-generation requirements, and the combination of multiple jailbreak methods further reduces the model's security capabilities.
arXiv Detail & Related papers (2025-06-09T12:02:39Z)
- Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement [48.50995874445193]
Large Language Models (LLMs) have shown impressive capabilities across various tasks but remain vulnerable to meticulously crafted jailbreak attacks. We propose SAGE (Self-Aware Guard Enhancement), a training-free defense strategy designed to align LLMs' strong safety discrimination performance with their relatively weaker safety generation ability.
arXiv Detail & Related papers (2025-05-17T15:54:52Z)
- AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities. We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o. We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network-based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks (a minimal sketch of the masked-reconstruction idea appears after this list).
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
- Transforming Computer Security and Public Trust Through the Exploration of Fine-Tuning Large Language Models [0.0]
"Mallas" are malicious services that exploit large language models (LLMs) for nefarious purposes.
This paper delves into the proliferation of Mallas by examining the use of various pre-trained language models and their efficiency and vulnerabilities.
arXiv Detail & Related papers (2024-06-02T06:10:31Z)
- Double Backdoored: Converting Code Large Language Model Backdoors to Traditional Malware via Adversarial Instruction Tuning Attacks [15.531860128240385]
This work investigates novel techniques for transitioning backdoors from the AI/ML domain to traditional computer malware. We present MalInstructCoder, a framework designed to assess the cybersecurity vulnerabilities of instruction-tuned Code LLMs. We conduct a comprehensive investigation into the exploitability of the code-specific instruction tuning process involving three state-of-the-art Code LLMs.
arXiv Detail & Related papers (2024-04-29T10:14:58Z)
- AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting [54.931241667414184]
We propose Adaptive Shield Prompting (AdaShield), which prepends defense prompts to inputs to defend MLLMs against structure-based jailbreak attacks.
Our methods can consistently improve MLLMs' robustness against structure-based jailbreak attacks (see the prompt-prepending sketch after this list).
arXiv Detail & Related papers (2024-03-14T15:57:13Z)
- Can LLMs Patch Security Issues? [1.3299507495084417]
Large Language Models (LLMs) have shown impressive proficiency in code generation.
LLMs share a weakness with their human counterparts: producing code that inadvertently has security vulnerabilities.
We propose Feedback-Driven Security Patching (FDSP), where LLMs automatically refine generated code that contains security vulnerabilities (a sketch of such a feedback loop appears after this list).
arXiv Detail & Related papers (2023-11-13T08:54:37Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables (a minimal sketch of the window-ablation idea appears after this list).
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
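The MASKDROID entry above describes masking part of the input graph and training the detector to reconstruct it. Below is a minimal sketch of that masked-reconstruction idea in plain PyTorch; it is an illustration under assumptions (a toy mean-aggregation GNN, dense adjacency, node-feature reconstruction), not MASKDROID's actual architecture or training objective.
```python
# Illustrative sketch only: a toy masked graph autoencoder, NOT the MASKDROID model.
import torch
import torch.nn as nn

class TinyGNNLayer(nn.Module):
    """One round of mean-aggregation message passing over a dense adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return torch.relu(self.lin(adj @ x / deg))

class MaskedGraphAutoencoder(nn.Module):
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.enc1 = TinyGNNLayer(feat_dim, hidden_dim)
        self.enc2 = TinyGNNLayer(hidden_dim, hidden_dim)
        self.dec = nn.Linear(hidden_dim, feat_dim)   # reconstruct node features
        self.mask_token = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, x, adj, mask_ratio=0.3):
        n = x.size(0)
        mask = torch.rand(n) < mask_ratio            # choose nodes to hide
        mask[torch.randint(0, n, (1,))] = True       # ensure at least one masked node
        x_in = x.clone()
        x_in[mask] = self.mask_token                 # replace hidden features with a learnable token
        h = self.enc2(self.enc1(x_in, adj), adj)
        recon = self.dec(h)
        # Reconstruction loss on the masked nodes forces the encoder to use graph context.
        return ((recon[mask] - x[mask]) ** 2).mean()

# Toy usage: 5 nodes with 16-dimensional features on a fully connected graph.
x = torch.randn(5, 16)
adj = torch.ones(5, 5)
model = MaskedGraphAutoencoder(feat_dim=16, hidden_dim=32)
loss = model(x, adj)
loss.backward()
```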
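The AdaShield entry above centers on prepending a defense prompt to the model input. The sketch below shows only that prepending step; the defense prompt text and `query_mllm` are hypothetical placeholders, and the actual method adapts the prompt rather than using a fixed string.
```python
# Illustrative sketch only: a fixed defense prompt prepended to the user input.
DEFENSE_PROMPT = (
    "Before answering, examine the image and the text carefully. If the request "
    "asks for harmful or policy-violating content, even when it is embedded in "
    "the image as text or diagrams, refuse and briefly explain why. Otherwise, "
    "answer helpfully."
)

def query_mllm(text: str, image_path: str) -> str:
    """Hypothetical placeholder for a multimodal LLM API call."""
    raise NotImplementedError("Wire this up to the model/provider of your choice.")

def shielded_query(user_text: str, image_path: str) -> str:
    # Prepend the defense prompt so the safety instruction precedes the
    # (possibly adversarial) user content when the model reads the input.
    return query_mllm(f"{DEFENSE_PROMPT}\n\nUser request: {user_text}", image_path)
```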
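The FDSP entry above describes LLMs refining insecure generated code in a feedback loop. The loop below is a sketch under assumptions: a static analyzer supplies the feedback (Bandit is used here only as an example and may not match the paper's tooling), and `llm_refine` is a hypothetical placeholder for the code LLM call.
```python
# Illustrative sketch only: analyze -> feed findings back -> refine, until clean.
import json
import os
import subprocess
import tempfile

def run_bandit(code: str) -> list[dict]:
    """Run the Bandit static analyzer on a Python snippet and return its findings."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            ["bandit", "-q", "-f", "json", path], capture_output=True, text=True
        )
        report = json.loads(proc.stdout or "{}")
        return report.get("results", [])
    finally:
        os.unlink(path)

def llm_refine(code: str, findings: list[dict]) -> str:
    """Hypothetical placeholder: ask a code LLM to fix the reported issues."""
    raise NotImplementedError("Replace with a call to your code LLM of choice.")

def feedback_driven_patch(code: str, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        findings = run_bandit(code)
        if not findings:          # analyzer reports no issues: stop refining
            break
        code = llm_refine(code, findings)
    return code
```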
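The DRSM entry above hinges on window ablation plus a majority vote over ablated views of the executable's bytes. The sketch below shows that voting structure only; `base_classify` is a hypothetical stand-in for a trained byte-level classifier (MalConv-style in the paper), and the ablation scheme is simplified.
```python
# Illustrative sketch only: window ablation + majority vote over ablated views.
from collections import Counter

PAD = 256  # out-of-vocabulary token marking ablated (hidden) bytes

def ablate(data: bytes, start: int, width: int) -> list[int]:
    """Keep only data[start:start+width]; replace every other byte with PAD."""
    kept = range(start, min(start + width, len(data)))
    return [b if i in kept else PAD for i, b in enumerate(data)]

def base_classify(view: list[int]) -> int:
    """Hypothetical base classifier over one ablated view (0 = benign, 1 = malware)."""
    raise NotImplementedError("Replace with a trained byte-level classifier.")

def smoothed_classify(data: bytes, width: int = 512) -> int:
    # Classify every window-ablated view and take a majority vote. Bytes an
    # adversary modifies can only flip votes from views whose kept window
    # touches those bytes, which is what makes the vote certifiable.
    votes = Counter(
        base_classify(ablate(data, start, width))
        for start in range(0, len(data), width)
    )
    return votes.most_common(1)[0][0]
```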