Beyond CAGE: Investigating Generalization of Learned Autonomous Network
Defense Policies
- URL: http://arxiv.org/abs/2211.15557v2
- Date: Wed, 30 Nov 2022 14:35:42 GMT
- Title: Beyond CAGE: Investigating Generalization of Learned Autonomous Network
Defense Policies
- Authors: Melody Wolk, Andy Applebaum, Camron Dennler, Patrick Dwyer, Marina
Moskowitz, Harold Nguyen, Nicole Nichols, Nicole Park, Paul Rachwalski, Frank
Rau, Adrian Webster
- Abstract summary: This work evaluates several reinforcement learning approaches implemented in the second edition of the CAGE Challenge.
We find that the ensemble RL technique performs strongest, outperforming our other models and taking second place in the competition.
In unseen environments, all of our approaches perform worse, with varied degradation based on the type of environmental change.
- Score: 0.8785883427835897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advancements in reinforcement learning (RL) have inspired new directions in
intelligent automation of network defense. However, many of these advancements
have either outpaced their application to network security or have not
considered the challenges associated with implementing them in the real world.
To understand these problems, this work evaluates several RL approaches
implemented in the second edition of the CAGE Challenge, a public competition
to build an autonomous network defender agent in a high-fidelity network
simulator. Our approaches all build on the Proximal Policy Optimization (PPO)
family of algorithms, and include hierarchical RL, action masking, custom
training, and ensemble RL. We find that the ensemble RL technique performs
strongest, outperforming our other models and taking second place in the
competition. To understand applicability to real environments, we evaluate each
method's ability to generalize to unseen networks and against an unknown attack
strategy. In unseen environments, all of our approaches perform worse, with
degradation varying with the type of environmental change. Against an
unknown attacker strategy, we found that our models had reduced overall
performance even though the new strategy was less efficient than the ones our
models trained on. Together, these results highlight promising research
directions for autonomous network defense in the real world.
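The abstract names action masking as one of the PPO extensions but does not say how it was implemented. As a minimal sketch of the general idea only, not the authors' code, the function below zeroes out the probability of invalid actions before the policy samples. The name `masked_softmax` and the example logits and mask are hypothetical, chosen for illustration.

```python
import math


def masked_softmax(logits, mask):
    """Return action probabilities with invalid actions forced to zero.

    `logits` are raw policy scores; `mask[i]` is True when action i is valid.
    Both names are illustrative assumptions, not taken from the paper.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    # Invalid actions get zero weight, so they can never be sampled.
    weights = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: three defender actions, the second one unavailable.
probs = masked_softmax([2.0, 1.0, 0.5], [True, False, True])
```

Here the probability mass of the masked action is redistributed over the remaining valid actions, so the agent does not spend exploration on actions that cannot apply in the current state.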
Related papers
- Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense [7.967738380932909]
We propose a hierarchical Proximal Policy Optimization (PPO) architecture that decomposes the cyber defense task into specific sub-tasks like network investigation and host recovery.
Our approach involves training sub-policies for each sub-task using PPO enhanced with domain expertise.
These sub-policies are then leveraged by a master defense policy that coordinates their selection to solve complex network defense tasks.
arXiv Detail & Related papers (2024-10-22T18:35:05Z)
- Mastering the Digital Art of War: Developing Intelligent Combat Simulation Agents for Wargaming Using Hierarchical Reinforcement Learning [0.0]
This dissertation proposes a comprehensive approach, including targeted observation abstractions, multi-model integration, a hybrid AI framework, and an overarching hierarchical reinforcement learning framework.
Our localized observation abstraction using piecewise linear spatial decay simplifies the RL problem, enhancing computational efficiency and demonstrating superior efficacy over traditional global observation methods.
Our hybrid AI framework synergizes RL with scripted agents, leveraging RL for high-level decisions and scripted agents for lower-level tasks, enhancing adaptability, reliability, and performance.
arXiv Detail & Related papers (2024-08-23T18:50:57Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- LAS-AT: Adversarial Training with Learnable Attack Strategy [82.88724890186094]
"Learnable attack strategy", dubbed LAS-AT, learns to automatically produce attack strategies to improve the model robustness.
Our framework is composed of a target network that uses AEs for training to improve robustness and a strategy network that produces attack strategies to control the AE generation.
arXiv Detail & Related papers (2022-03-13T10:21:26Z)
- Improving Robustness of Reinforcement Learning for Power System Control
with Adversarial Training [71.7750435554693]
We show that several state-of-the-art RL agents proposed for power system control are vulnerable to adversarial attacks.
Specifically, we use an adversary Markov Decision Process to learn an attack policy, and demonstrate the potency of our attack.
We propose to use adversarial training to increase the robustness of RL agents against attacks and avoid infeasible operational decisions.
arXiv Detail & Related papers (2021-10-18T00:50:34Z)
- REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using
Reinforcement Learning Agents [0.0]
In this paper, we introduce a meta-learning scheme that shifts the objective of learning to solve a task into the objective of learning to learn to solve a task (or a set of tasks).
Our model, named REIN-2, is a meta-learning scheme formulated within the RL framework, the goal of which is to develop a meta-RL agent that learns how to produce other RL agents.
Compared to traditional state-of-the-art Deep RL algorithms, experimental results show remarkable performance of our model in popular OpenAI Gym environments.
arXiv Detail & Related papers (2021-10-11T10:13:49Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based
Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Robust Reinforcement Learning using Adversarial Populations [118.73193330231163]
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness.
We show that using a single adversary does not consistently yield robustness to dynamics variations under standard parametrizations of the adversary.
We propose a population-based augmentation to the Robust RL formulation in which we randomly initialize a population of adversaries and sample from the population uniformly during training.
arXiv Detail & Related papers (2020-08-04T20:57:32Z)
- The Adversarial Resilience Learning Architecture for AI-based Modelling,
Exploration, and Operation of Complex Cyber-Physical Systems [0.0]
We describe the concept of Adversarial Resilience Learning (ARL), which formulates a new approach to complex environment checking and resilient operation.
The quintessence of ARL lies in both agents exploring the system and training each other without any domain knowledge.
Here, we introduce the ARL software architecture, which allows the use of a wide range of model-free as well as model-based DRL algorithms.
arXiv Detail & Related papers (2020-05-27T19:19:57Z)