Related papers: Scalable Learning of Intrusion Responses through Recursive Decomposition

Scalable Learning of Intrusion Responses through Recursive Decomposition

URL: http://arxiv.org/abs/2309.03292v2
Date: Fri, 15 Sep 2023 08:51:59 GMT
Title: Scalable Learning of Intrusion Responses through Recursive Decomposition
Authors: Kim Hammar and Rolf Stadler
Abstract summary: We study automated intrusion response for an IT infrastructure and the interaction between an attacker and a defender as a partially observed game. To solve the game we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. We introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns equilibria through approximation.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames which can be solved in parallel. Applying optimal stopping theory we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.

Related papers

Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning [58.533203990515034]
Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL)<n>We show that dynamic sparse training strategies provide module-specific benefits that complement the primary scalability foundation established by architectural improvements.<n>We finally distill these insights into Module-Specific Training (MST), a practical framework that exploits the benefits of architectural improvements and demonstrates substantial scalability gains across diverse RL algorithms without algorithmic modifications.
arXiv Detail & Related papers (2025-10-14T03:03:08Z)
Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models [57.42778606399764]
Diffusion language models (dLLMs) offer a promising, non-autoregressive paradigm for text generation.<n>Current reinforcement learning approaches often rely on sparse, outcome-based rewards.<n>We argue that this stems from a fundamental mismatch with the natural structure of reasoning.
arXiv Detail & Related papers (2025-10-02T00:34:15Z)
Binarized Neural Networks Converge Toward Algorithmic Simplicity: Empirical Support for the Learning-as-Compression Hypothesis [36.24954635616374]
We propose a shift toward algorithmic information theory, using Binarized Neural Networks (BNNs) as a first proxy.<n>We apply the Block Decomposition Method (BDM) and demonstrate it more closely tracks structural changes during training than entropy.<n>These results support the view of training as a process of algorithmic compression, where learning corresponds to the progressive internalization of structured regularities.
arXiv Detail & Related papers (2025-05-27T02:51:36Z)
Conjectural Online Learning with First-order Beliefs in Asymmetric Information Stochastic Games [13.33996350474556]
Asymmetric information games (AISGs) arise in many complex socio-technical systems. We propose conjectural online learning (COL), an online learning method under generic information structures in AISGs.
arXiv Detail & Related papers (2024-02-29T01:07:29Z)
Automated Security Response through Online Learning with Adaptive Conjectures [13.33996350474556]
We study automated security response for an IT infrastructure. We formulate the interaction between an attacker and a defender as a partially observed, non-stationary game.
arXiv Detail & Related papers (2024-02-19T20:06:15Z)
RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning. Our proposed method uses reinforcement learning with user intervention signals themselves as rewards. This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
Learning Near-Optimal Intrusion Responses Against Dynamic Attackers [0.0]
We study automated intrusion response and formulate the interaction between an attacker and a defender as an optimal stopping game. To obtain near-optimal defender strategies, we develop a fictitious self-play algorithm that learns Nashlibria through approximation. We argue that this approach can produce effective defender strategies for a practical IT infrastructure.
arXiv Detail & Related papers (2023-01-11T16:36:24Z)
Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients. We model players' strategies using artificial neural networks. This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications. We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA) Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation. We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
Learning Security Strategies through Game Play and Optimal Stopping [0.0]
We study automated intrusion prevention using reinforcement learning. We formulate the interaction between an attacker and a defender as an optimal stopping game. To obtain the optimal defender strategies, we introduce T-FP, a fictitious self-play algorithm.
arXiv Detail & Related papers (2022-05-29T15:30:00Z)
Mind Your Solver! On Adversarial Attack and Defense for Combinatorial Optimization [111.78035414744045]
We take an initiative on developing the mechanisms for adversarial attack and defense towards optimization solvers. We present a simple yet effective defense strategy to modify the graph structure to increase the robustness of solvers.
arXiv Detail & Related papers (2021-12-28T15:10:15Z)
Learning Generative Deception Strategies in Combinatorial Masking Games [27.2744631811653]
One way deception can be employed is through obscuring, or masking, some of the information about how systems are configured. We present a novel game-theoretic model of the resulting defender-attacker interaction, where the defender chooses a subset of attributes to mask, while the attacker responds by choosing an exploit to execute. We present a novel highly scalable approach for approximately solving such games by representing the strategies of both players as neural networks.
arXiv Detail & Related papers (2021-09-23T20:42:44Z)
Disturbing Reinforcement Learning Agents with Corrupted Rewards [62.997667081978825]
We analyze the effects of different attack strategies based on reward perturbations on reinforcement learning algorithms. We show that smoothly crafting adversarial rewards are able to mislead the learner, and that using low exploration probability values, the policy learned is more robust to corrupt rewards.
arXiv Detail & Related papers (2021-02-12T15:53:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.