Falsification-Based Robust Adversarial Reinforcement Learning
- URL: http://arxiv.org/abs/2007.00691v3
- Date: Mon, 20 Mar 2023 06:57:52 GMT
- Title: Falsification-Based Robust Adversarial Reinforcement Learning
- Authors: Xiao Wang, Saasha Nair, and Matthias Althoff
- Abstract summary: falsification-based RARL (FRARL) is the first generic framework for integrating temporal logic falsification in adversarial learning to improve policy robustness.
Our experimental results demonstrate that policies trained with a falsification-based adversary generalize better and show less violation of the safety specification in test scenarios.
- Score: 13.467693018395863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has achieved enormous progress in solving various
sequential decision-making problems, such as control tasks in robotics. Since
policies are overfitted to training environments, RL methods have often failed
to generalize to safety-critical test scenarios. Robust adversarial RL
(RARL) was previously proposed to train an adversarial network that applies
disturbances to a system, which improves the robustness in test scenarios.
However, an issue with neural network-based adversaries is that integrating
system requirements without handcrafting sophisticated reward signals is
difficult. Safety falsification methods allow one to find a set of initial
conditions and an input sequence, such that the system violates a given
property formulated in temporal logic. In this paper, we propose
falsification-based RARL (FRARL): this is the first generic framework for
integrating temporal logic falsification in adversarial learning to improve
policy robustness. By applying our falsification method, we do not need to
construct an extra reward function for the adversary. Moreover, we evaluate our
approach on a braking assistance system and an adaptive cruise control system
of autonomous vehicles. Our experimental results demonstrate that policies
trained with a falsification-based adversary generalize better and show less
violation of the safety specification in test scenarios than those trained
without an adversary or with an adversarial network.
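As a rough illustration of the idea described in the abstract, below is a minimal sketch (not taken from the paper or its code) of a falsification-driven adversary on a toy braking scenario: the safety requirement is the STL property G(d >= d_min) on the headway d, its quantitative robustness takes the place of any hand-crafted adversary reward, and plain random search stands in for the falsifier. The dynamics, parameter values, and function names (`simulate`, `robustness`, `falsify`) are illustrative assumptions only; the protagonist's RL update is indicated as a comment.

```python
# Sketch of a falsification-based adversary for a toy braking-assistance task.
# All dynamics, constants, and names are assumptions for illustration.
import numpy as np

D_MIN = 2.0          # safety spec: always keep at least 2 m headway
DT, HORIZON = 0.1, 50

def simulate(ego_policy, lead_decel_seq, d0=30.0, v_ego=20.0, v_lead=20.0):
    """Roll out a simple longitudinal car-following model and return the
    headway trace. `lead_decel_seq` is the adversary's disturbance input."""
    d = d0
    trace = [d]
    for a_lead in lead_decel_seq:
        a_ego = ego_policy(d, v_ego, v_lead)          # protagonist action
        v_lead = max(0.0, v_lead + a_lead * DT)
        v_ego = max(0.0, v_ego + a_ego * DT)
        d += (v_lead - v_ego) * DT
        trace.append(d)
    return np.array(trace)

def robustness(trace):
    """Quantitative semantics of the STL property G(d >= D_MIN):
    negative exactly when the safety specification is violated."""
    return float(np.min(trace - D_MIN))

def falsify(ego_policy, n_samples=500, rng=np.random.default_rng(0)):
    """Adversary as falsifier: search for a disturbance sequence that
    minimises robustness; no separate adversary reward is needed."""
    best_seq, best_rob = None, np.inf
    for _ in range(n_samples):
        seq = rng.uniform(-8.0, 0.0, size=HORIZON)    # lead decelerations
        rob = robustness(simulate(ego_policy, seq))
        if rob < best_rob:
            best_seq, best_rob = seq, rob
    return best_seq, best_rob

def naive_policy(d, v_ego, v_lead):
    """Stand-in for the learned protagonist: brake when the gap closes."""
    return -6.0 if d < 10.0 else 0.0

# FRARL-style outer loop (sketch): alternate protagonist training and
# falsification; falsifying traces become additional training scenarios.
policy = naive_policy
for iteration in range(3):
    # policy = rl_update(policy, replay_buffer)      # RL step omitted here
    adv_seq, rob = falsify(policy)
    print(f"iter {iteration}: worst-case robustness {rob:.2f}")
    if rob >= 0:
        break                                        # no counterexample found
    # else: add the falsifying scenario (adv_seq) to the training data
```

A real falsifier would replace the random sampling with a global optimizer (e.g., cross-entropy or simulated-annealing search) over initial conditions and disturbance sequences, but the interface is the same: it only needs the robustness value of the temporal-logic specification.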
Related papers
- Multi-agent Reinforcement Learning-based Network Intrusion Detection System [3.4636217357968904]
Intrusion Detection Systems (IDS) play a crucial role in ensuring the security of computer networks.
We propose a novel multi-agent reinforcement learning (RL) architecture, enabling automatic, efficient, and robust network intrusion detection.
Our solution introduces a resilient architecture designed to accommodate the addition of new attacks and effectively adapt to changes in existing attack patterns.
arXiv Detail & Related papers (2024-07-08T09:18:59Z) - Kick Bad Guys Out! Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification [22.078088272837068]
Federated Learning (FL) systems are susceptible to adversarial attacks.
Current defense methods are often impractical for real-world FL systems.
We propose a novel anomaly detection strategy that is designed for real-world FL systems.
arXiv Detail & Related papers (2023-10-06T07:09:05Z) - Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z) - Imitating, Fast and Slow: Robust learning from demonstrations via
decision-time planning [96.72185761508668]
Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z) - Robust Policy Learning over Multiple Uncertainty Sets [91.67120465453179]
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
We develop an algorithm that enjoys the benefits of both system identification and robust RL.
arXiv Detail & Related papers (2022-02-14T20:06:28Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Adversarial Training is Not Ready for Robot Learning [55.493354071227174]
Adversarial training is an effective method to train deep learning models that are resilient to norm-bounded perturbations.
We show theoretically and experimentally that neural controllers obtained via adversarial training are subjected to three types of defects.
Our results suggest that adversarial training is not yet ready for robot learning.
arXiv Detail & Related papers (2021-03-15T07:51:31Z) - Robust Reinforcement Learning using Adversarial Populations [118.73193330231163]
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness.
We show that using a single adversary does not consistently yield robustness to dynamics variations under standard parametrizations of the adversary.
We propose a population-based augmentation to the Robust RL formulation in which we randomly initialize a population of adversaries and sample from the population uniformly during training.
arXiv Detail & Related papers (2020-08-04T20:57:32Z)
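As a rough sketch of the uniform-sampling idea summarized in the last entry above: keep a pool of adversaries and draw one uniformly at random each episode. The adversary class, disturbance model, and update steps below are placeholders, not that paper's implementation.

```python
# Minimal sketch of population-based adversarial training (illustrative only).
import random

class RandomAdversary:
    """Stand-in adversary; in the paper each member would be a trained policy."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
    def perturb(self, action):
        # apply a bounded disturbance to the protagonist's action
        return action + self.rng.uniform(-0.1, 0.1)

population = [RandomAdversary(seed) for seed in range(10)]   # fixed pool

for episode in range(5):
    adversary = random.choice(population)   # sample uniformly each episode
    action = 1.0                            # protagonist action (placeholder)
    disturbed = adversary.perturb(action)
    print(f"episode {episode}: perturbed action = {disturbed:.3f}")
    # the protagonist and the sampled adversary would both be updated here
```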