Ablation Study of How Run Time Assurance Impacts the Training and
Performance of Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2207.04117v1
- Date: Fri, 8 Jul 2022 20:15:15 GMT
- Title: Ablation Study of How Run Time Assurance Impacts the Training and
Performance of Reinforcement Learning Agents
- Authors: Nathaniel Hamilton, Kyle Dunlap, Taylor T Johnson, Kerianne L Hobbs
- Abstract summary: We conduct an ablation study using evaluation best practices to investigate the impact of run time assurance (RTA) on effective learning.
Our conclusions shed light on the most promising directions of Safe Reinforcement Learning.
- Score: 5.801944210870593
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement Learning (RL) has become an increasingly important research
area as the success of machine learning algorithms and methods grows. To combat
the safety concerns surrounding the freedom given to RL agents while training,
there has been an increase in work concerning Safe Reinforcement Learning
(SRL). However, these new and safe methods have been held to less scrutiny than
their unsafe counterparts. For instance, comparisons among safe methods often
lack fair evaluation across similar initial condition bounds and hyperparameter
settings, use poor evaluation metrics, and cherry-pick the best training runs
rather than averaging over multiple random seeds. In this work, we conduct an
ablation study using evaluation best practices to investigate the impact of run
time assurance (RTA), which monitors the system state and intervenes to assure
safety, on effective learning. By studying multiple RTA approaches in both
on-policy and off-policy RL algorithms, we seek to understand which RTA methods
are most effective, whether the agents become dependent on the RTA, and the
importance of reward shaping versus safe exploration in RL agent training. Our
conclusions shed light on the most promising directions of SRL, and our
evaluation methodology lays the groundwork for creating better comparisons in
future SRL work.
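To make the RTA mechanism described above concrete, the following is a minimal sketch of a switching-style run time assurance filter: a monitor predicts whether the agent's proposed action keeps a toy system inside an assumed safe set, and if not, a backup controller's action is applied instead. The dynamics, the constraint, and every name here (step_dynamics, is_safe, backup_controller, rta_filter) are illustrative assumptions, not the implementation evaluated in the paper.

```python
import numpy as np

# Toy 1D double integrator: state = [position, velocity], action = acceleration.
# Assumed safety constraint: position must stay at or below POS_LIMIT.
DT = 0.1          # simulation time step
POS_LIMIT = 10.0  # position constraint
MAX_DECEL = 1.0   # braking authority of the backup controller


def step_dynamics(state, action):
    """Propagate the toy dynamics one step forward."""
    pos, vel = state
    new_vel = vel + DT * float(action)
    new_pos = pos + DT * new_vel
    return np.array([new_pos, new_vel])


def is_safe(state):
    """Monitor: stay inside the safe set, keeping enough margin to brake."""
    pos, vel = state
    braking_distance = vel ** 2 / (2.0 * MAX_DECEL) if vel > 0.0 else 0.0
    return pos + braking_distance <= POS_LIMIT


def backup_controller(state):
    """Backup policy: brake toward zero velocity (illustrative safe action)."""
    _, vel = state
    return -MAX_DECEL if vel > 0.0 else 0.0


def rta_filter(state, desired_action):
    """Switching RTA: pass the RL action through when its predicted outcome
    is safe; otherwise intervene with the backup action."""
    if is_safe(step_dynamics(state, desired_action)):
        return desired_action, False        # no intervention
    return backup_controller(state), True   # intervention


if __name__ == "__main__":
    # Wrap a stand-in "policy" (random actions) with the RTA filter.
    rng = np.random.default_rng(0)
    state = np.array([0.0, 0.0])
    interventions = 0
    for _ in range(200):
        agent_action = rng.uniform(-1.0, 1.0)  # stand-in for policy(state)
        action, intervened = rta_filter(state, agent_action)
        interventions += int(intervened)
        state = step_dynamics(state, action)
    print(f"final state: {state}, interventions: {interventions}")
```

In an ablation like the one described in the abstract, the same wrapper could be toggled on or off, or swapped for a different intervention rule, while holding the initial-condition bounds and hyperparameters fixed.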
Related papers
- Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding [5.5929450570003185]
Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain/task is unavailable.
We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training.
arXiv Detail & Related papers (2024-05-28T13:47:21Z)
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms [0.0]
We propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents.
Our approach efficiently discovers safety failures of action decisions in RL-based adaptive cruise control (ACC) applications.
arXiv Detail & Related papers (2022-10-29T11:34:17Z)
- On the Robustness of Safe Reinforcement Learning under Observational Perturbations [27.88525130218356]
We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL.
One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward.
This work sheds light on the inherited connection between observational robustness and safety in RL and provides a pioneer work for future safe RL studies.
arXiv Detail & Related papers (2022-05-29T15:25:03Z)
- Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking [12.719948223824483]
Ensuring the safety of reinforcement learning (RL) algorithms is crucial to unlock their potential for many real-world tasks.
However, vanilla RL and most safe RL approaches do not guarantee safety.
We introduce a categorization of existing provably safe RL methods, present the conceptual foundations for both continuous and discrete action spaces, and empirically benchmark existing methods.
We provide practical guidance on selecting provably safe RL approaches depending on the safety specification, RL algorithm, and type of action space.
arXiv Detail & Related papers (2022-05-13T16:34:36Z)
- Robust Reinforcement Learning on State Observations with Learned Optimal Adversary [86.0846119254031]
We study the robustness of reinforcement learning with adversarially perturbed state observations.
With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found.
For DRL settings, this leads to a novel empirical adversarial attack to RL agents via a learned adversary that is much stronger than previous ones.
arXiv Detail & Related papers (2021-01-21T05:38:52Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
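One theme the lead abstract raises, and that the benchmarking survey above reinforces, is evaluation practice: comparing methods under identical initial-condition bounds and reporting performance averaged over multiple random seeds rather than cherry-picking the best training run. The sketch below shows only that aggregation step; evaluate_over_seeds and the dummy training run are hypothetical placeholders to be swapped for a real training and evaluation loop.

```python
import random
import statistics


def evaluate_over_seeds(train_and_evaluate, seeds, init_bounds):
    """Run one method on every seed with shared initial-condition bounds and
    report mean and standard deviation of returns, not the single best run."""
    returns = [train_and_evaluate(seed, init_bounds) for seed in seeds]
    return statistics.mean(returns), statistics.pstdev(returns)


if __name__ == "__main__":
    def dummy_run(seed, init_bounds):
        # Stand-in for "train an agent, then report its mean episode return".
        random.seed(seed)
        lo, hi = init_bounds
        return random.uniform(lo, hi)

    mean_ret, std_ret = evaluate_over_seeds(dummy_run, seeds=range(10),
                                            init_bounds=(-1.0, 1.0))
    print(f"return: {mean_ret:.2f} +/- {std_ret:.2f} over 10 seeds")
```

Under this protocol, adding or removing an RTA configuration changes only the training-run callable; the seeds, initial-condition bounds, and reported statistics stay the same for every method being compared.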
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.