Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
- URL: http://arxiv.org/abs/2408.09112v1
- Date: Sat, 17 Aug 2024 06:26:17 GMT
- Title: Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
- Authors: Manuel Wendl, Lukas Koller, Tobias Ladner, Matthias Althoff
- Abstract summary: We train neural networks utilizing entire sets of perturbed inputs and maximize the worst-case reward.
The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments.
- Score: 8.217552831952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes their deployment in safety-critical environments challenging. This work lifts recent results from formally verifying neural networks against such disturbances to reinforcement learning in continuous state and action spaces using reachability analysis. While previous work mainly focuses on adversarial attacks for robust reinforcement learning, we train neural networks utilizing entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them more applicable in safety-critical environments. This is demonstrated with an extensive empirical evaluation of four different benchmarks.
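To make the set-based idea concrete, below is a minimal sketch of propagating an entire perturbation set through a small value network using interval arithmetic. The paper itself uses reachability analysis in continuous state and action spaces, so the two-layer network, the `eps` radius, and the interval bounds here are illustrative stand-ins rather than the authors' implementation.

```python
import numpy as np

def interval_affine(l, u, W, b):
    """Propagate the box [l, u] through the affine map x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ l + W_neg @ u + b, W_pos @ u + W_neg @ l + b

def worst_case_value(x, eps, layers):
    """Lower-bound the network output over the input set
    {x' : ||x' - x||_inf <= eps} via interval bound propagation."""
    l, u = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        l, u = interval_affine(l, u, W, b)
        if i < len(layers) - 1:                # ReLU on hidden layers only
            l, u = np.maximum(l, 0.0), np.maximum(u, 0.0)
    return l                                    # sound lower bound per output

# Toy critic mapping a 4-dimensional state to a scalar value estimate.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(16, 4)), np.zeros(16)),
          (rng.normal(size=(1, 16)), np.zeros(1))]
state = rng.normal(size=4)
print(worst_case_value(state, eps=0.05, layers=layers))
```

A set-based trainer in this spirit would ascend the gradient of such a lower bound, i.e. the worst-case reward over the whole perturbation set, rather than the reward at the nominal input.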
Related papers
- Strengthening the Internal Adversarial Robustness in Lifted Neural Networks [9.781171732431169]
We first investigate how adversarial robustness in this framework can be further strengthened by solely modifying the training loss.
In a second step, we fix some remaining limitations and arrive at a novel training loss for lifted neural networks that combines targeted and untargeted adversarial perturbations.
arXiv Detail & Related papers (2025-03-10T20:00:38Z)
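As a generic illustration of combining targeted and untargeted perturbations in one training loss: the sketch below takes one FGSM-style step in each direction and averages the resulting losses. The lifted-network formulation of the paper above differs in detail, so `eps`, the single-step attacks, and the runner-up target class are assumptions.

```python
import torch
import torch.nn.functional as F

def combined_adversarial_loss(model, x, y, eps=8 / 255):
    """Average an untargeted and a targeted adversarial loss (a sketch
    only; assumes image inputs in [0, 1])."""
    x = x.detach().requires_grad_(True)
    logits = model(x)
    grad = torch.autograd.grad(F.cross_entropy(logits, y), x)[0]
    x_untgt = (x + eps * grad.sign()).clamp(0, 1).detach()   # push away from y
    # Targeted direction: move toward the most confusing wrong class.
    tgt = logits.detach().scatter(1, y.unsqueeze(1), float("-inf")).argmax(1)
    grad_t = torch.autograd.grad(F.cross_entropy(model(x), tgt), x)[0]
    x_tgt = (x - eps * grad_t.sign()).clamp(0, 1).detach()   # pull toward tgt
    return 0.5 * (F.cross_entropy(model(x_untgt), y)
                  + F.cross_entropy(model(x_tgt), y))
```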
- Set-Based Training for Neural Network Verification [8.97708612393722]
Small input perturbations can significantly affect the outputs of a neural network.
In safety-critical environments, the inputs often contain noisy sensor data.
We employ an end-to-end set-based training procedure that trains robust neural networks for formal verification.
arXiv Detail & Related papers (2024-01-26T15:52:41Z)
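A common end-to-end recipe for such set-based training is to propagate interval bounds through the network and optimize a blend of the nominal loss and a worst-case loss (IBP-style). The sketch below assumes a plain stack of `nn.Linear` layers with ReLUs and an illustrative mixing weight `kappa`; the cited procedure may differ.

```python
import torch
import torch.nn.functional as F

def ibp_bounds(layers, x, eps):
    """Interval bounds on the logits over an eps-ball around x (batched)."""
    mid, rad = x, torch.full_like(x, eps)
    for i, layer in enumerate(layers):
        mid = F.linear(mid, layer.weight, layer.bias)
        rad = F.linear(rad, layer.weight.abs())
        if i < len(layers) - 1:                  # ReLU on hidden layers
            l, u = (mid - rad).clamp(min=0), (mid + rad).clamp(min=0)
            mid, rad = (l + u) / 2, (u - l) / 2
    return mid - rad, mid + rad

def set_based_loss(layers, x, y, eps, kappa=0.5):
    """Blend nominal cross-entropy with the loss of the worst-case logits."""
    logits = x
    for i, layer in enumerate(layers):
        logits = layer(logits)
        if i < len(layers) - 1:
            logits = logits.relu()
    l, u = ibp_bounds(layers, x, eps)
    # Worst case: lower bound for the true class, upper bound elsewhere.
    worst = torch.where(F.one_hot(y, u.shape[-1]).bool(), l, u)
    return (kappa * F.cross_entropy(logits, y)
            + (1 - kappa) * F.cross_entropy(worst, y))
```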
- Towards Improving Robustness Against Common Corruptions in Object Detectors Using Adversarial Contrastive Learning [10.27974860479791]
This paper proposes an innovative adversarial contrastive learning framework to enhance neural network robustness simultaneously against adversarial attacks and common corruptions.
By focusing on improving performance under adversarial and real-world conditions, our approach aims to bolster the robustness of neural networks in safety-critical applications.
arXiv Detail & Related papers (2023-11-14T06:13:52Z)
- Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
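For context, the temporal difference update the analysis above concerns takes the following shape in its simplest tabular TD(0) form (a textbook sketch, not the paper's experimental setup):

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular TD(0) step: move V(s) toward the bootstrapped target
    r + gamma * V(s'). This bootstrapping is the mechanism the paper
    links to fitting non-smooth value components early in training."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error
```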
- Building Compact and Robust Deep Neural Networks with Toeplitz Matrices [93.05076144491146]
This thesis focuses on the problem of training neural networks which are compact, easy to train, reliable and robust to adversarial examples.
We leverage the properties of structured matrices from the Toeplitz family to build compact and secure neural networks.
arXiv Detail & Related papers (2021-09-02T13:58:12Z)
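To illustrate the compactness that Toeplitz structure buys, note that an n x n Toeplitz matrix is determined by only 2n - 1 parameters. The layer below is a hypothetical sketch in this spirit; the thesis's actual constructions are more elaborate.

```python
import torch
from torch import nn

class ToeplitzLinear(nn.Module):
    """Linear layer whose weight matrix is Toeplitz (constant diagonals),
    so an n x n map needs only 2n - 1 parameters."""
    def __init__(self, n):
        super().__init__()
        self.coeffs = nn.Parameter(torch.randn(2 * n - 1) / n ** 0.5)
        self.n = n

    def forward(self, x):
        idx = torch.arange(self.n)
        # W[i, j] = coeffs[i - j + n - 1] yields constant diagonals.
        W = self.coeffs[idx.unsqueeze(1) - idx.unsqueeze(0) + self.n - 1]
        return x @ W.T
```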
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance model robustness against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
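A toy version of such a coverage-based runtime monitor records per-neuron activation ranges on trusted data and flags inputs whose activations fall outside them. The paper's architecture and coverage criteria are more sophisticated, so the class name and the simple range criterion here are illustrative.

```python
import numpy as np

class RangeMonitor:
    """Record per-neuron activation ranges on trusted data; flag inputs
    whose activations leave the observed envelope at run time."""
    def __init__(self, dim):
        self.lo = np.full(dim, np.inf)
        self.hi = np.full(dim, -np.inf)

    def fit(self, acts):                     # acts: (n_samples, dim)
        self.lo = np.minimum(self.lo, acts.min(0))
        self.hi = np.maximum(self.hi, acts.max(0))

    def is_suspicious(self, act, tol=0.0):   # act: (dim,)
        return bool(((act < self.lo - tol) | (act > self.hi + tol)).any())
```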
- Improving Adversarial Robustness by Enforcing Local and Global Compactness [19.8818435601131]
Adversarial training is the most successful method that consistently resists a wide range of attacks.
We propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption.
The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network.
arXiv Detail & Related papers (2020-07-10T00:43:06Z)
- Rethinking Clustering for Robustness [56.14672993686335]
ClusTR is a clustering-based and adversary-free training framework to learn robust models.
ClusTR outperforms adversarially trained networks by up to 4% under strong PGD attacks.
arXiv Detail & Related papers (2020-06-13T16:55:51Z)
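For reference, the "strong PGD attacks" in that comparison refer to projected gradient descent (Madry et al.). A minimal l_inf version, with illustrative hyperparameters and image inputs in [0, 1] assumed, looks like this:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """l_inf PGD: random start, repeated signed-gradient ascent on the
    loss, projection back into the eps-ball and the valid pixel range."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project to ball
        x_adv = x_adv.clamp(0, 1)                              # valid image range
    return x_adv.detach()
```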
- Protecting the integrity of the training procedure of neural networks [0.0]
Neural networks are used for an ever-increasing number of applications.
One of the most striking IT security problems aggravated by the opacity of neural networks is the possibility of poisoning attacks during the training phase.
We propose an approach to this problem which allows provably verifying the integrity of the training procedure by making use of standard cryptographic mechanisms.
arXiv Detail & Related papers (2020-05-14T12:57:23Z)
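In the spirit of the standard cryptographic mechanisms mentioned above, one toy building block is a hash chain over training steps, so that a verifier can replay the recorded log and detect tampering. This is an illustrative stand-in, not the paper's protocol.

```python
import hashlib
import pickle

def commit_step(prev_digest: bytes, batch, weights) -> bytes:
    """Chain a SHA-256 commitment over (previous digest, batch, weights);
    modifying any earlier step changes every digest after it."""
    h = hashlib.sha256()
    h.update(prev_digest)
    h.update(hashlib.sha256(pickle.dumps(batch)).digest())
    h.update(hashlib.sha256(pickle.dumps(weights)).digest())
    return h.digest()

# Usage: fold over the training log and publish the final digest.
digest = b"\x00" * 32
for batch, weights in []:   # placeholder for the recorded training log
    digest = commit_step(digest, batch, weights)
```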
- Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness [79.47619798416194]
Learn2Perturb is an end-to-end feature perturbation learning approach for improving the adversarial robustness of deep neural networks.
Inspired by the Expectation-Maximization algorithm, an alternating back-propagation training scheme is introduced to train the network and noise parameters consecutively.
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
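A loose sketch of the alternating scheme described above: a layer with learnable noise scales plus a step function that alternates between updating network weights and noise parameters. The `NoisyLinear` module, the weight `lam`, and the bonus term encouraging larger perturbations are assumptions standing in for the paper's injection modules and objective.

```python
import torch
from torch import nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer that adds learnable, per-feature Gaussian noise
    during training (a stand-in for learned feature perturbations)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.sigma = nn.Parameter(torch.full((d_out,), 0.1))

    def forward(self, x):
        h = self.fc(x)
        return h + self.sigma * torch.randn_like(h) if self.training else h

def alternating_step(model, x, y, opt_weights, opt_noise, step, lam=0.01):
    """Even steps update the weights; odd steps update the noise scales,
    with a hypothetical bonus rewarding larger perturbations."""
    loss = F.cross_entropy(model(x), y)
    if step % 2:
        sigmas = [m.sigma.abs().mean() for m in model.modules()
                  if isinstance(m, NoisyLinear)]
        loss = loss - lam * torch.stack(sigmas).mean()
        opt = opt_noise
    else:
        opt = opt_weights
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```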