Falsification-Based Robust Adversarial Reinforcement Learning
- URL: http://arxiv.org/abs/2007.00691v3
- Date: Mon, 20 Mar 2023 06:57:52 GMT
- Title: Falsification-Based Robust Adversarial Reinforcement Learning
- Authors: Xiao Wang, Saasha Nair, and Matthias Althoff
- Abstract summary: falsification-based RARL (FRARL) is the first generic framework for integrating temporal logic falsification in adversarial learning to improve policy robustness.
Our experimental results demonstrate that policies trained with a falsification-based adversary generalize better and show less violation of the safety specification in test scenarios.
- Score: 13.467693018395863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) has achieved enormous progress in solving various
sequential decision-making problems, such as control tasks in robotics. Since
policies are overfitted to training environments, RL methods have often failed
to generalize to safety-critical test scenarios. Robust adversarial RL
(RARL) was previously proposed to train an adversarial network that applies
disturbances to a system, which improves the robustness in test scenarios.
However, an issue with neural network-based adversaries is that integrating
system requirements without handcrafting sophisticated reward signals is
difficult. Safety falsification methods allow one to find a set of initial
conditions and an input sequence, such that the system violates a given
property formulated in temporal logic. In this paper, we propose
falsification-based RARL (FRARL): this is the first generic framework for
integrating temporal logic falsification in adversarial learning to improve
policy robustness. By applying our falsification method, we do not need to
construct an extra reward function for the adversary. Moreover, we evaluate our
approach on a braking assistance system and an adaptive cruise control system
of autonomous vehicles. Our experimental results demonstrate that policies
trained with a falsification-based adversary generalize better and show less
violation of the safety specification in test scenarios than those trained
without an adversary or with an adversarial network.
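As a rough illustration of the idea described in the abstract, below is a minimal sketch (not taken from the paper or its code) of a falsification-driven adversary on a toy braking scenario: the safety requirement is the STL property G(d >= d_min) on the headway d, its quantitative robustness takes the place of any hand-crafted adversary reward, and plain random search stands in for the falsifier. The dynamics, parameter values, and function names (`simulate`, `robustness`, `falsify`) are illustrative assumptions only; the protagonist's RL update is indicated as a comment.

```python
# Sketch of a falsification-based adversary for a toy braking-assistance task.
# All dynamics, constants, and names are assumptions for illustration.
import numpy as np

D_MIN = 2.0          # safety spec: always keep at least 2 m headway
DT, HORIZON = 0.1, 50

def simulate(ego_policy, lead_decel_seq, d0=30.0, v_ego=20.0, v_lead=20.0):
    """Roll out a simple longitudinal car-following model and return the
    headway trace. `lead_decel_seq` is the adversary's disturbance input."""
    d = d0
    trace = [d]
    for a_lead in lead_decel_seq:
        a_ego = ego_policy(d, v_ego, v_lead)          # protagonist action
        v_lead = max(0.0, v_lead + a_lead * DT)
        v_ego = max(0.0, v_ego + a_ego * DT)
        d += (v_lead - v_ego) * DT
        trace.append(d)
    return np.array(trace)

def robustness(trace):
    """Quantitative semantics of the STL property G(d >= D_MIN):
    negative exactly when the safety specification is violated."""
    return float(np.min(trace - D_MIN))

def falsify(ego_policy, n_samples=500, rng=np.random.default_rng(0)):
    """Adversary as falsifier: search for a disturbance sequence that
    minimises robustness; no separate adversary reward is needed."""
    best_seq, best_rob = None, np.inf
    for _ in range(n_samples):
        seq = rng.uniform(-8.0, 0.0, size=HORIZON)    # lead decelerations
        rob = robustness(simulate(ego_policy, seq))
        if rob < best_rob:
            best_seq, best_rob = seq, rob
    return best_seq, best_rob

def naive_policy(d, v_ego, v_lead):
    """Stand-in for the learned protagonist: brake when the gap closes."""
    return -6.0 if d < 10.0 else 0.0

# FRARL-style outer loop (sketch): alternate protagonist training and
# falsification; falsifying traces become additional training scenarios.
policy = naive_policy
for iteration in range(3):
    # policy = rl_update(policy, replay_buffer)      # RL step omitted here
    adv_seq, rob = falsify(policy)
    print(f"iter {iteration}: worst-case robustness {rob:.2f}")
    if rob >= 0:
        break                                        # no counterexample found
    # else: add the falsifying scenario (adv_seq) to the training data
```

A real falsifier would replace the random sampling with a global optimizer (e.g., cross-entropy or simulated-annealing search) over initial conditions and disturbance sequences, but the interface is the same: it only needs the robustness value of the temporal-logic specification.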
Related papers
- Multi-agent Reinforcement Learning-based Network Intrusion Detection System [3.4636217357968904]
Intrusion Detection Systems (IDS) play a crucial role in ensuring the security of computer networks.
We propose a novel multi-agent reinforcement learning (RL) architecture, enabling automatic, efficient, and robust network intrusion detection.
Our solution introduces a resilient architecture designed to accommodate the addition of new attacks and effectively adapt to changes in existing attack patterns.
arXiv Detail & Related papers (2024-07-08T09:18:59Z) - Kick Bad Guys Out! Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification [22.078088272837068]
Federated Learning (FL) systems are susceptible to adversarial attacks.
Current defense methods are often impractical for real-world FL systems.
We propose a novel anomaly detection strategy that is designed for real-world FL systems.
arXiv Detail & Related papers (2023-10-06T07:09:05Z) - Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z) - Imitating, Fast and Slow: Robust learning from demonstrations via
decision-time planning [96.72185761508668]
Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z) - Robust Policy Learning over Multiple Uncertainty Sets [91.67120465453179]
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
We develop an algorithm that enjoys the benefits of both system identification and robust RL.
arXiv Detail & Related papers (2022-02-14T20:06:28Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Adversarial Training is Not Ready for Robot Learning [55.493354071227174]
Adversarial training is an effective method to train deep learning models that are resilient to norm-bounded perturbations.
We show theoretically and experimentally that neural controllers obtained via adversarial training are subjected to three types of defects.
Our results suggest that adversarial training is not yet ready for robot learning.
arXiv Detail & Related papers (2021-03-15T07:51:31Z) - Robust Reinforcement Learning using Adversarial Populations [118.73193330231163]
Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness.
We show that using a single adversary does not consistently yield robustness to dynamics variations under standard parametrizations of the adversary.
We propose a population-based augmentation to the Robust RL formulation in which we randomly initialize a population of adversaries and sample from the population uniformly during training.
arXiv Detail & Related papers (2020-08-04T20:57:32Z)
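As a rough sketch of the uniform-sampling idea summarized in the last entry above: keep a pool of adversaries and draw one uniformly at random each episode. The adversary class, disturbance model, and update steps below are placeholders, not that paper's implementation.

```python
# Minimal sketch of population-based adversarial training (illustrative only).
import random

class RandomAdversary:
    """Stand-in adversary; in the paper each member would be a trained policy."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
    def perturb(self, action):
        # apply a bounded disturbance to the protagonist's action
        return action + self.rng.uniform(-0.1, 0.1)

population = [RandomAdversary(seed) for seed in range(10)]   # fixed pool

for episode in range(5):
    adversary = random.choice(population)   # sample uniformly each episode
    action = 1.0                            # protagonist action (placeholder)
    disturbed = adversary.perturb(action)
    print(f"episode {episode}: perturbed action = {disturbed:.3f}")
    # the protagonist and the sampled adversary would both be updated here
```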