On Assessing The Safety of Reinforcement Learning algorithms Using
Formal Methods
- URL: http://arxiv.org/abs/2111.04865v2
- Date: Wed, 10 Nov 2021 02:05:16 GMT
- Title: On Assessing The Safety of Reinforcement Learning algorithms Using
Formal Methods
- Authors: Paulina Stevia Nouwou Mindom and Amin Nikanjam and Foutse Khomh and
John Mullins
- Abstract summary: Safety mechanisms such as adversarial training, adversarial detection, and robust learning are not always adapted to all the disturbances an agent may face once deployed.
It is therefore necessary to propose new solutions adapted to the learning challenges faced by the agent.
We use reward shaping and a modified Q-learning algorithm as defense mechanisms to improve the agent's policy when facing adversarial perturbations.
- Score: 6.2822673562306655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing adoption of Reinforcement Learning in safety-critical systems
domains such as autonomous vehicles, health, and aviation raises the need for
ensuring their safety. Existing safety mechanisms such as adversarial training,
adversarial detection, and robust learning are not always adapted to all the
disturbances an agent may face once deployed. Such disturbances include moving
adversaries whose behavior the agent cannot predict and which can, in fact,
harm its learning. Ensuring the safety of critical systems also
requires methods that give formal guarantees on the behavior of the agent
evolving in a perturbed environment. It is therefore necessary to propose new
solutions adapted to the learning challenges faced by the agent. In this paper,
we first generate adversarial agents that expose flaws in the agent's policy
by introducing moving adversaries. Second, we use reward shaping and a
modified Q-learning algorithm as defense mechanisms to improve the agent's
policy when facing adversarial perturbations. Finally, probabilistic model
checking is employed to evaluate the effectiveness of both mechanisms. We have
conducted experiments on a discrete grid world with a single agent facing
non-learning and learning adversaries. Our results show a reduction in the
number of collisions between the agent and the adversaries. Probabilistic model
checking provides lower and upper probabilistic bounds regarding the agent's
safety in the adversarial environment.
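The defense described in the abstract (reward shaping combined with Q-learning against a moving adversary on a discrete grid world) can be sketched roughly as follows. The grid size, reward values, and shaping term below are illustrative assumptions of ours, not the authors' exact formulation:

```python
import random

# Hedged sketch: tabular Q-learning on a small grid world with a reward-shaping
# term that penalizes proximity to a moving adversary. All constants here are
# illustrative assumptions, not the paper's exact defense mechanism.

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(pos, action, size):
    """Apply a move on an N x N grid, clamping at the borders."""
    r = min(max(pos[0] + action[0], 0), size - 1)
    c = min(max(pos[1] + action[1], 0), size - 1)
    return (r, c)

def shaped_reward(agent, adversary, goal):
    """Goal/step reward plus a shaping penalty that grows near the adversary."""
    base = 10.0 if agent == goal else -1.0
    dist = abs(agent[0] - adversary[0]) + abs(agent[1] - adversary[1])
    collision = -20.0 if dist == 0 else 0.0
    shaping = -2.0 / (1.0 + dist)  # softly pushes the agent away
    return base + collision + shaping

def train(size=4, episodes=300, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    Q = {}  # maps ((agent_pos, adversary_pos), action_index) -> value
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        agent, adversary = (0, 0), (size - 1, 0)
        for _ in range(40):
            s = (agent, adversary)
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q(s, i))
            agent = step(agent, ACTIONS[a], size)
            # Non-learning adversary: a random mover, as in the simplest
            # experimental setting described in the abstract.
            adversary = step(adversary, rng.choice(ACTIONS), size)
            r = shaped_reward(agent, adversary, goal)
            s2 = (agent, adversary)
            best = max(q(s2, i) for i in range(len(ACTIONS)))
            Q[(s, a)] = q(s, a) + alpha * (r + gamma * best - q(s, a))
            if agent == goal:
                break
    return Q

Q = train()
```

In the pipeline the abstract describes, the resulting policy would then be encoded as a probabilistic model and checked with a probabilistic model checker such as PRISM against a safety property (for example, bounds on the probability of ever colliding with an adversary), yielding the lower and upper probabilistic bounds mentioned above.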
Related papers
- PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety [70.84902425123406]
Multi-agent systems, when enhanced with Large Language Models (LLMs), exhibit profound capabilities in collective intelligence.
However, the potential misuse of this intelligence for malicious purposes presents significant risks.
We propose a framework (PsySafe) grounded in agent psychology, focusing on identifying how dark personality traits in agents can lead to risky behaviors.
Our experiments reveal several intriguing phenomena, such as the collective dangerous behaviors among agents, agents' self-reflection when engaging in dangerous behavior, and the correlation between agents' psychological assessments and dangerous behaviors.
arXiv Detail & Related papers (2024-01-22T12:11:55Z) - Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z) - Robustness Testing for Multi-Agent Reinforcement Learning: State
Perturbations on Critical Agents [2.5204420653245245]
Multi-Agent Reinforcement Learning (MARL) has been widely applied in many fields such as smart traffic and unmanned aerial vehicles.
This work proposes a novel Robustness Testing framework for MARL that attacks states of Critical Agents.
arXiv Detail & Related papers (2023-06-09T02:26:28Z) - Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy [32.1138935956272]
Reinforcement learning agents are susceptible to evasion attacks during deployment.
In this paper, we propose Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box adversarial policy learning.
arXiv Detail & Related papers (2023-05-04T07:24:12Z) - Learning to Generate All Feasible Actions [4.333208181196761]
We introduce action mapping, a novel approach that divides the learning process into two steps: first learning feasibility and then the objective.
This paper focuses on the feasibility part by learning to generate all feasible actions through self-supervised querying of the feasibility model.
We demonstrate the agent's proficiency in generating actions across disconnected feasible action sets.
arXiv Detail & Related papers (2023-01-26T23:15:51Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Coach-assisted Multi-Agent Reinforcement Learning Framework for
Unexpected Crashed Agents [120.91291581594773]
We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes.
We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training.
To the best of our knowledge, this work is the first to study unexpected crashes in the multi-agent system.
arXiv Detail & Related papers (2022-03-16T08:22:45Z) - Robust Reinforcement Learning via Genetic Curriculum [5.421464476555662]
Genetic curriculum is an algorithm that automatically identifies scenarios in which the agent currently fails and generates an associated curriculum.
Our empirical studies show improvement in robustness over the existing state-of-the-art algorithms, providing training curricula that result in agents being 2-8x less likely to fail.
arXiv Detail & Related papers (2022-02-17T01:14:20Z) - Heterogeneous-Agent Trajectory Forecasting Incorporating Class
Uncertainty [54.88405167739227]
We present HAICU, a method for heterogeneous-agent trajectory forecasting that explicitly incorporates agents' class probabilities.
We additionally present PUP, a new challenging real-world autonomous driving dataset.
We demonstrate that incorporating class probabilities in trajectory forecasting significantly improves performance in the face of uncertainty.
arXiv Detail & Related papers (2021-04-26T10:28:34Z) - Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
arXiv Detail & Related papers (2020-06-22T10:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.