From Promise to Peril: Rethinking Cybersecurity Red and Blue Teaming in the Age of LLMs
- URL: http://arxiv.org/abs/2506.13434v1
- Date: Mon, 16 Jun 2025 12:52:19 GMT
- Title: From Promise to Peril: Rethinking Cybersecurity Red and Blue Teaming in the Age of LLMs
- Authors: Alsharif Abuadbba, Chris Hicks, Kristen Moore, Vasilios Mavroudis, Burak Hasircioglu, Diksha Goel, Piers Jennings,
- Abstract summary: Large Language Models (LLMs) are set to reshape cybersecurity by augmenting red and blue team operations. This position paper maps LLM applications across cybersecurity frameworks such as MITRE ATT&CK and the NIST Cybersecurity Framework (CSF). Key limitations include hallucinations, limited context retention, poor reasoning, and sensitivity to prompts. We recommend maintaining human-in-the-loop oversight, enhancing model explainability, integrating privacy-preserving mechanisms, and building systems robust to adversarial exploitation.
- Score: 5.438441265064793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are set to reshape cybersecurity by augmenting red and blue team operations. Red teams can exploit LLMs to plan attacks, craft phishing content, simulate adversaries, and generate exploit code. Conversely, blue teams may deploy them for threat intelligence synthesis, root cause analysis, and streamlined documentation. This dual capability introduces both transformative potential and serious risks. This position paper maps LLM applications across cybersecurity frameworks such as MITRE ATT&CK and the NIST Cybersecurity Framework (CSF), offering a structured view of their current utility and limitations. While LLMs demonstrate fluency and versatility across various tasks, they remain fragile in high-stakes, context-heavy environments. Key limitations include hallucinations, limited context retention, poor reasoning, and sensitivity to prompts, which undermine their reliability in operational settings. Moreover, real-world integration raises concerns around dual-use risks, adversarial misuse, and diminished human oversight. Malicious actors could exploit LLMs to automate reconnaissance, obscure attack vectors, and lower the technical threshold for executing sophisticated attacks. To ensure safer adoption, we recommend maintaining human-in-the-loop oversight, enhancing model explainability, integrating privacy-preserving mechanisms, and building systems robust to adversarial exploitation. As organizations increasingly adopt AI-driven cybersecurity, a nuanced understanding of LLMs' risks and operational impacts is critical to securing their defensive value while mitigating unintended consequences.
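As a concrete illustration of the structured view the abstract describes, the sketch below records hypothetical red- and blue-team LLM use cases against MITRE ATT&CK tactics and NIST CSF functions. The use cases, tactic and function assignments, and limitation notes are illustrative assumptions for this sketch, not the paper's actual mapping.
```python
# Minimal sketch, assuming an invented catalog: records candidate LLM use cases
# against MITRE ATT&CK tactics (offensive view) and NIST CSF functions
# (defensive view). Entries below are examples, not the paper's mapping.
from dataclasses import dataclass, field


@dataclass
class LLMUseCase:
    name: str                                                # e.g. "phishing content generation"
    team: str                                                # "red" or "blue"
    attack_tactics: list[str] = field(default_factory=list)  # MITRE ATT&CK tactics touched
    csf_functions: list[str] = field(default_factory=list)   # NIST CSF functions supported
    known_limitations: list[str] = field(default_factory=list)


CATALOG = [
    LLMUseCase(
        name="phishing content generation",
        team="red",
        attack_tactics=["Reconnaissance", "Initial Access"],
        known_limitations=["hallucinated details", "prompt sensitivity"],
    ),
    LLMUseCase(
        name="threat intelligence synthesis",
        team="blue",
        csf_functions=["Identify", "Detect"],
        known_limitations=["limited context retention"],
    ),
    LLMUseCase(
        name="root cause analysis drafting",
        team="blue",
        csf_functions=["Respond", "Recover"],
        known_limitations=["poor multi-step reasoning"],
    ),
]


def by_team(team: str) -> list[LLMUseCase]:
    """Return the catalogued use cases for one side of the exercise."""
    return [uc for uc in CATALOG if uc.team == team]


if __name__ == "__main__":
    for uc in by_team("blue"):
        print(uc.name, "->", ", ".join(uc.csf_functions))
```
A table of this shape makes the dual-use tension visible: the same catalog entry can be read as an offensive capability (which ATT&CK tactics it accelerates) or as a defensive aid (which CSF functions it supports), with its known limitations attached either way.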
Related papers
- The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover [0.18472148461613155]
Large Language Model (LLM) agents and multi-agent systems introduce unprecedented security vulnerabilities. This paper presents a comprehensive evaluation of the security of LLMs used as reasoning engines within autonomous agents. We focus on how different attack surfaces and trust boundaries can be leveraged to orchestrate such takeovers.
arXiv Detail & Related papers (2025-07-09T13:54:58Z)
- ROSE: Toward Reality-Oriented Safety Evaluation of Large Language Models [60.28667314609623]
Large Language Models (LLMs) are increasingly deployed as black-box components in real-world applications. We propose Reality-Oriented Safety Evaluation (ROSE), a novel framework that uses multi-objective reinforcement learning to fine-tune an adversarial LLM.
arXiv Detail & Related papers (2025-06-17T10:55:17Z)
- CoP: Agentic Red-teaming for Large Language Models using Composition of Principles [61.404771120828244]
This paper proposes an agentic workflow to automate and scale the red-teaming process of Large Language Models (LLMs). Human users provide a set of red-teaming principles as instructions to an AI agent to automatically orchestrate effective red-teaming strategies and generate jailbreak prompts. When tested against leading LLMs, CoP reveals unprecedented safety risks by finding novel jailbreak prompts and improving the best-known single-turn attack success rate by up to 19.0 times.
arXiv Detail & Related papers (2025-06-01T02:18:41Z)
- Safety Guardrails for LLM-Enabled Robots [82.0459036717193]
Traditional robot safety approaches do not address the novel vulnerabilities of large language models (LLMs). We propose RoboGuard, a two-stage guardrail architecture to ensure the safety of LLM-enabled robots. We show that RoboGuard reduces the execution of unsafe plans from 92% to below 2.5% without compromising performance on safe plans. (A generic two-stage gate in this spirit is sketched after this list.)
arXiv Detail & Related papers (2025-03-10T22:01:56Z)
- Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z)
- Emerging Security Challenges of Large Language Models [6.151633954305939]
Large language models (LLMs) have achieved record adoption in a short period of time across many different sectors. They are open-ended models trained on diverse data without being tailored for specific downstream tasks. Traditional Machine Learning (ML) models are vulnerable to adversarial attacks.
arXiv Detail & Related papers (2024-12-23T14:36:37Z)
- Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation [4.241100280846233]
AI agents, powered by large language models (LLMs), have transformed human-computer interactions by enabling seamless, natural, and context-aware communication. This paper investigates a critical vulnerability: adversarial attacks targeting the LLM core within AI agents.
arXiv Detail & Related papers (2024-12-05T18:38:30Z)
- Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z)
- Large language models in 6G security: challenges and opportunities [5.073128025996496]
We focus on the security aspects of Large Language Models (LLMs) from the viewpoint of potential adversaries.
This will include the development of a comprehensive threat taxonomy, categorizing various adversary behaviors.
Also, our research will concentrate on how LLMs can be integrated into cybersecurity efforts by defense teams, also known as blue teams.
arXiv Detail & Related papers (2024-03-18T20:39:34Z)
- AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks [13.955084410934694]
Large language models (LLMs) have demonstrated impressive results on natural language tasks.
As LLMs inevitably advance, they may be able to automate both the pre- and post-breach attack stages.
This research can help defensive systems and teams learn to detect novel attack behaviors preemptively before their use in the wild.
arXiv Detail & Related papers (2024-03-02T00:10:45Z)
- On the Vulnerability of LLM/VLM-Controlled Robotics [54.57914943017522]
We highlight vulnerabilities in robotic systems integrating large language models (LLMs) and vision-language models (VLMs) due to input modality sensitivities. Our results show that simple input perturbations reduce task execution success rates by 22.2% and 14.6% in two representative LLM/VLM-controlled robotic systems.
arXiv Detail & Related papers (2024-02-15T22:01:45Z)
- Attack Prompt Generation for Red Teaming and Defending Large Language Models [70.157691818224]
Large language models (LLMs) are susceptible to red teaming attacks, which can induce LLMs to generate harmful content.
We propose an integrated approach that combines manual and automatic methods to economically generate high-quality attack prompts.
arXiv Detail & Related papers (2023-10-19T06:15:05Z)
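The guardrail theme above, like the main paper's human-in-the-loop recommendation, boils down to gating LLM-proposed actions before execution. The following is a purely illustrative sketch of such a two-stage gate, with an invented toy policy and hypothetical function names; it is not RoboGuard's architecture or any cited system's implementation.
```python
# Hypothetical sketch of a two-stage gate for LLM-proposed actions:
# stage 1 screens the action against static policy rules, stage 2 requires
# explicit human approval. The blocked-keyword policy is a toy example.
from typing import Callable

BLOCKED_KEYWORDS = ("rm -rf", "disable logging", "exfiltrate")  # toy policy, not a real rule set


def policy_check(action: str) -> bool:
    """Stage 1: reject actions that match any blocked pattern."""
    lowered = action.lower()
    return not any(kw in lowered for kw in BLOCKED_KEYWORDS)


def human_approval(action: str, ask: Callable[[str], str] = input) -> bool:
    """Stage 2: a human operator must explicitly approve the action."""
    answer = ask(f"Approve LLM-proposed action? [{action}] (y/N): ")
    return answer.strip().lower() == "y"


def execute_with_guardrails(action: str, runner: Callable[[str], None]) -> bool:
    """Run an LLM-proposed action only if it passes both stages."""
    if not policy_check(action):
        print("Blocked by policy:", action)
        return False
    if not human_approval(action):
        print("Rejected by operator:", action)
        return False
    runner(action)
    return True


if __name__ == "__main__":
    execute_with_guardrails("summarise yesterday's IDS alerts", print)
```
Keeping the human decision as the final stage, rather than the first, matches the paper's point that oversight should cover what the model actually proposes, not just what it is asked.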
This list is automatically generated from the titles and abstracts of the papers on this site.