Out of the Cage: How Stochastic Parrots Win in Cyber Security
Environments
- URL: http://arxiv.org/abs/2308.12086v2
- Date: Mon, 28 Aug 2023 09:42:59 GMT
- Title: Out of the Cage: How Stochastic Parrots Win in Cyber Security
Environments
- Authors: Maria Rigaki, Ond\v{r}ej Luk\'a\v{s}, Carlos A. Catania, Sebastian
Garcia
- Abstract summary: Large Language Models (LLMs) have gained widespread popularity across diverse domains.
This paper introduces a novel application of pre-trained LLMs as agents within cybersecurity network environments.
We present an approach wherein pre-trained LLMs are leveraged as attacking agents in two reinforcement learning environments.
- Score: 0.5735035463793008
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) have gained widespread popularity across diverse
domains involving text generation, summarization, and various natural language
processing tasks. Despite their inherent limitations, LLM-based designs have
shown promising capabilities in planning and navigating open-world scenarios.
This paper introduces a novel application of pre-trained LLMs as agents within
cybersecurity network environments, focusing on their utility for sequential
decision-making processes.
We present an approach wherein pre-trained LLMs are leveraged as attacking
agents in two reinforcement learning environments. Our proposed agents
demonstrate similar or better performance against state-of-the-art agents
trained for thousands of episodes in most scenarios and configurations. In
addition, the best LLM agents perform similarly to human testers of the
environment without any additional training process. This design highlights the
potential of LLMs to efficiently address complex decision-making tasks within
cybersecurity.
Furthermore, we introduce a new network security environment named
NetSecGame. The environment is designed to eventually support complex
multi-agent scenarios within the network security domain. The proposed
environment mimics real network attacks and is designed to be highly modular
and adaptable for various scenarios.
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Multi-Agent Collaboration in Incident Response with Large Language Models [0.0]
Incident response (IR) is a critical aspect of cybersecurity, requiring rapid decision-making and coordinated efforts to address cyberattacks effectively.
Leveraging large language models (LLMs) as intelligent agents offers a novel approach to enhancing collaboration and efficiency in IR scenarios.
This paper explores the application of LLM-based multi-agent collaboration using the Backdoors & Breaches framework.
arXiv Detail & Related papers (2024-12-01T03:12:26Z) - MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z) - DynaSaur: Large Language Agents Beyond Predefined Actions [108.75187263724838]
Existing LLM agent systems typically select actions from a fixed and predefined set at every step.
We propose an LLM agent framework that enables the dynamic creation and composition of actions in an online manner.
Our experiments on the GAIA benchmark demonstrate that this framework offers significantly greater flexibility and outperforms previous methods.
arXiv Detail & Related papers (2024-11-04T02:08:59Z) - Entity-based Reinforcement Learning for Autonomous Cyber Defence [0.22499166814992438]
Key challenge for autonomous cyber defence is ensuring a defensive agent's ability to generalise across diverse network topologies and configurations.
Standard approaches to deep reinforcement learning expect fixed-size observation and action spaces.
In autonomous cyber defence, this makes it hard to develop agents that generalise to environments with network topologies different from those trained on.
arXiv Detail & Related papers (2024-10-23T08:04:12Z) - Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense [7.967738380932909]
We propose a hierarchical Proximal Policy Optimization (PPO) architecture that decomposes the cyber defense task into specific sub-tasks like network investigation and host recovery.
Our approach involves training sub-policies for each sub-task using PPO enhanced with domain expertise.
These sub-policies are then leveraged by a master defense policy that coordinates their selection to solve complex network defense tasks.
arXiv Detail & Related papers (2024-10-22T18:35:05Z) - Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments [0.5735035463793008]
Large Language Models (LLMs) have shown remarkable potential across various domains, including cybersecurity.
We present Hackphyr, a locally fine-tuned LLM to be used as a red-team agent within network security environments.
arXiv Detail & Related papers (2024-09-17T15:28:25Z) - Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z) - Large Language Models for Cyber Security: A Systematic Literature Review [14.924782327303765]
We conduct a comprehensive review of the literature on the application of Large Language Models in cybersecurity (LLM4Security)
We observe that LLMs are being applied to a wide range of cybersecurity tasks, including vulnerability detection, malware analysis, network intrusion detection, and phishing detection.
Third, we identify several promising techniques for adapting LLMs to specific cybersecurity domains, such as fine-tuning, transfer learning, and domain-specific pre-training.
arXiv Detail & Related papers (2024-05-08T02:09:17Z) - AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks.
We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
arXiv Detail & Related papers (2023-08-07T16:08:11Z) - Scenic4RL: Programmatic Modeling and Generation of Reinforcement
Learning Environments [89.04823188871906]
Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments.
Most of the existing simulators rely on randomly generating the environments.
We introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers.
arXiv Detail & Related papers (2021-06-18T21:49:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.