Reward Shaping for Happier Autonomous Cyber Security Agents
- URL: http://arxiv.org/abs/2310.13565v1
- Date: Fri, 20 Oct 2023 15:04:42 GMT
- Title: Reward Shaping for Happier Autonomous Cyber Security Agents
- Authors: Elizabeth Bates, Vasilios Mavroudis, Chris Hicks
- Abstract summary: One of the most promising directions uses deep reinforcement learning to train autonomous agents in computer network defense tasks.
This work studies the impact of the reward signal that is provided to the agents when training for this task.
- Score: 0.276240219662896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As machine learning models become more capable, they have exhibited increased
potential in solving complex tasks. One of the most promising directions uses
deep reinforcement learning to train autonomous agents in computer network
defense tasks. This work studies the impact of the reward signal that is
provided to the agents when training for this task. Due to the nature of
cybersecurity tasks, the reward signal is typically 1) in the form of penalties
(e.g., when a compromise occurs), and 2) distributed sparsely across each
defense episode. Such reward characteristics are atypical of classic
reinforcement learning tasks where the agent is regularly rewarded for progress
(cf. to getting occasionally penalized for failures). We investigate reward
shaping techniques that could bridge this gap so as to enable agents to train
more sample-efficiently and potentially converge to a better performance. We
first show that deep reinforcement learning algorithms are sensitive to the
magnitude of the penalties and their relative size. Then, we combine penalties
with positive external rewards and study their effect compared to penalty-only
training. Finally, we evaluate intrinsic curiosity as an internal positive
reward mechanism and discuss why it might not be as advantageous for high-level
network monitoring tasks.
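As a rough illustration of the reward-shaping ideas the abstract describes, the sketch below wraps a Gym-style defence environment so that sparse penalties are rescaled and every incident-free step earns a small positive external reward. The environment id, the wrapper class, and both coefficients are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch only, not the paper's implementation. Assumes a Gym-style
# network-defence environment ("NetworkDefence-v0" is a hypothetical id) that
# emits sparse negative rewards on compromise events.
import gymnasium as gym


class ShapedDefenceReward(gym.RewardWrapper):
    def __init__(self, env, penalty_scale=0.1, step_bonus=0.01):
        super().__init__(env)
        self.penalty_scale = penalty_scale  # shrinks the magnitude of penalties
        self.step_bonus = step_bonus        # small positive reward per clean step

    def reward(self, reward):
        if reward < 0:
            # Rescale sparse penalties (e.g., on a host compromise) so their
            # size does not dominate or destabilise learning.
            return self.penalty_scale * reward
        # Add a dense positive external reward for every step the network stays
        # healthy, giving the agent regular progress feedback instead of
        # penalties only.
        return reward + self.step_bonus


# env = ShapedDefenceReward(gym.make("NetworkDefence-v0"))  # hypothetical id
```

For the intrinsic-curiosity variant mentioned at the end of the abstract, an internal bonus of this general shape (prediction error of a learned forward model, in the spirit of ICM-style curiosity) would be added to the external reward; again, this is a generic sketch under those assumptions rather than the authors' setup.

```python
# Generic forward-model curiosity bonus (illustrative; omits the learned
# feature encoder and inverse model used by full ICM).
import torch
import torch.nn as nn


class ForwardModelCuriosity(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64, eta=0.1):
        super().__init__()
        self.eta = eta  # scales the intrinsic bonus
        self.model = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def bonus(self, obs, act, next_obs):
        # Intrinsic reward = prediction error of the forward model on (s, a) -> s'.
        pred = self.model(torch.cat([obs, act], dim=-1))
        return self.eta * (pred - next_obs).pow(2).mean(dim=-1)
```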
Related papers
- Multi Task Inverse Reinforcement Learning for Common Sense Reward [21.145179791929337]
We show that inverse reinforcement learning, even when it succeeds in training an agent, does not learn a useful reward function: training a new agent with the learned reward does not induce the desired behaviors.
We then show that multi-task inverse reinforcement learning can be applied to learn a useful reward function.
arXiv Detail & Related papers (2024-02-17T19:49:00Z)
- A State Augmentation based approach to Reinforcement Learning from Human Preferences [20.13307800821161]
Preference-Based Reinforcement Learning attempts to address the reward-design issue by utilizing binary feedback on queried trajectory pairs.
We present a state augmentation technique that allows the agent's reward model to be robust.
arXiv Detail & Related papers (2023-02-17T07:10:50Z)
- Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning [19.788336796981685]
We propose a novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL).
Our main idea is to design the multi-action-branch reward estimation and policy-weighted reward aggregation for stabilized training.
The superiority of the DRE-MARL is demonstrated using benchmark multi-agent scenarios, compared with the SOTA baselines in terms of both effectiveness and robustness.
arXiv Detail & Related papers (2022-10-14T08:31:45Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models [85.68751244243823]
Reward hacking -- where RL agents exploit gaps in misspecified reward functions -- has been widely observed, but not yet systematically studied.
We investigate reward hacking as a function of agent capabilities: model capacity, action space resolution, observation space noise, and training time.
We find instances of phase transitions: capability thresholds at which the agent's behavior qualitatively shifts, leading to a sharp decrease in the true reward.
arXiv Detail & Related papers (2022-01-10T18:58:52Z)
- Curious Exploration and Return-based Memory Restoration for Deep Reinforcement Learning [2.3226893628361682]
In this paper, we focus on training a single agent to score goals with a binary success/failure reward function.
The proposed method can be utilized to train agents in environments with fairly complex state and action spaces.
arXiv Detail & Related papers (2021-05-02T16:01:34Z)
- Disturbing Reinforcement Learning Agents with Corrupted Rewards [62.997667081978825]
We analyze the effects of different attack strategies based on reward perturbations on reinforcement learning algorithms.
We show that smoothly crafted adversarial rewards can mislead the learner, and that using low exploration probability values makes the learned policy more robust to corrupt rewards.
arXiv Detail & Related papers (2021-02-12T15:53:48Z)
- Semi-supervised reward learning for offline reinforcement learning [71.6909757718301]
Training agents usually requires reward functions, but rewards are seldom available in practice and their engineering is challenging and laborious.
We propose semi-supervised learning algorithms that learn from limited annotations and incorporate unlabelled data.
In our experiments with a simulated robotic arm, we greatly improve upon behavioural cloning and closely approach the performance achieved with ground truth rewards.
arXiv Detail & Related papers (2020-12-12T20:06:15Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)