Towards a Generalisable Cyber Defence Agent for Real-World Computer Networks
- URL: http://arxiv.org/abs/2511.09114v1
- Date: Thu, 13 Nov 2025 01:32:53 GMT
- Title: Towards a Generalisable Cyber Defence Agent for Real-World Computer Networks
- Authors: Tim Dudman, Martyn Bull,
- Abstract summary: This research introduces a novel set of Topological Extensions for Reinforcement Learning Agents (TERLA)<n>TERLA provides generalisability for the defence of networks with differing topology and size, without the need for retraining.<n>Generalisability has been demonstrated by showing that all TERLA agents have the same network-agnostic neural network architecture.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in deep reinforcement learning for autonomous cyber defence have resulted in agents that can successfully defend simulated computer networks against cyber-attacks. However, many of these agents would need retraining to defend networks with differing topology or size, making them poorly suited to real-world networks where topology and size can vary over time. In this research we introduce a novel set of Topological Extensions for Reinforcement Learning Agents (TERLA) that provide generalisability for the defence of networks with differing topology and size, without the need for retraining. Our approach involves the use of heterogeneous graph neural network layers to produce a fixed-size latent embedding representing the observed network state. This representation learning stage is coupled with a reduced, fixed-size, semantically meaningful and interpretable action space. We apply TERLA to a standard deep reinforcement learning Proximal Policy Optimisation (PPO) agent model, and to reduce the sim-to-real gap, conduct our research using Cyber Autonomy Gym for Experimentation (CAGE) Challenge 4. This Cyber Operations Research Gym environment has many of the features of a real-world network, such as realistic Intrusion Detection System (IDS) events and multiple agents defending network segments of differing topology and size. TERLA agents retain the defensive performance of vanilla PPO agents whilst showing improved action efficiency. Generalisability has been demonstrated by showing that all TERLA agents have the same network-agnostic neural network architecture, and by deploying a single TERLA agent multiple times to defend network segments with differing topology and size, showing improved defensive performance and efficiency.
Related papers
- Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework [60.72591149679355]
The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges.<n>Traditional intrusion detection systems fail to tackle the unique characteristics of aerial IoT environments.<n>We introduce a large language model (LLM)-enabled agentic AI framework for enhancing intrusion detection in LAE-IoT networks.
arXiv Detail & Related papers (2026-01-25T12:47:25Z) - Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents [7.45063623129985]
Deep reinforcement learning is emerging as a viable strategy for automated cyber defense.<n>In this work, we frame ACD as a two-player context-based partially observable Markov decision problem.<n>We show that this approach outperforms the state-of-the-art by a wide margin.
arXiv Detail & Related papers (2025-09-19T16:57:27Z) - Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems.<n>An adversary who intercepts the intermediate features transmitted between them can still pose a serious threat.<n>We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z) - General Autonomous Cybersecurity Defense: Learning Robust Policies for Dynamic Topologies and Diverse Attackers [3.6956995102043164]
autonomous cybersecurity defense (ACD) systems have become essential for real-time threat detection and response with optional human intervention.<n>Existing ACD systems rely on limiting assumptions, particularly the stationarity of the underlying network dynamics.<n>This work explores methods for developing agents to learn generalizable policies across dynamic network environments.
arXiv Detail & Related papers (2025-06-28T01:12:13Z) - Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures.<n>Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve both higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z) - An Attentive Graph Agent for Topology-Adaptive Cyber Defence [1.0812794909131096]
We develop a custom version of the Cyber Operations Research Gym (CybORG) environment, encoding network state as a directed graph.<n>We employ a Graph Attention Network (GAT) architecture to process node, edge, and global features, and adapt its output to be compatible with policy gradient methods in reinforcement learning.<n>We demonstrate that GAT defensive policies can be trained using our low-level directed graph observations, even when unexpected connections arise during simulation.
arXiv Detail & Related papers (2025-01-24T18:22:37Z) - Entity-based Reinforcement Learning for Autonomous Cyber Defence [0.22499166814992438]
Key challenge for autonomous cyber defence is ensuring a defensive agent's ability to generalise across diverse network topologies and configurations.<n>Standard approaches to deep reinforcement learning expect fixed-size observation and action spaces.<n>In autonomous cyber defence, this makes it hard to develop agents that generalise to environments with network topologies different from those trained on.
arXiv Detail & Related papers (2024-10-23T08:04:12Z) - Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense [9.927281246704604]
We propose a hierarchical Proximal Policy Optimization (PPO) architecture that decomposes the cyber defense task into specific sub-tasks like network investigation and host recovery.<n>Our approach involves training sub-policies for each sub-task using PPO enhanced with cybersecurity domain expertise.<n>These sub-policies are then leveraged by a master defense policy that coordinates their selection to solve complex network defense tasks.
arXiv Detail & Related papers (2024-10-22T18:35:05Z) - Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [86.99017195607077]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.<n>Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Graph Neural Networks for Decentralized Multi-Agent Perimeter Defense [111.9039128130633]
We develop an imitation learning framework that learns a mapping from defenders' local perceptions and their communication graph to their actions.
We run perimeter defense games in scenarios with different team sizes and configurations to demonstrate the performance of the learned network.
arXiv Detail & Related papers (2023-01-23T19:35:59Z) - Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve
Adversarial Robustness [79.47619798416194]
Learn2Perturb is an end-to-end feature perturbation learning approach for improving the adversarial robustness of deep neural networks.
Inspired by the Expectation-Maximization, an alternating back-propagation training algorithm is introduced to train the network and noise parameters consecutively.
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.