An Abstraction-based Method to Verify Multi-Agent Deep
Reinforcement-Learning Behaviours
- URL: http://arxiv.org/abs/2102.01434v1
- Date: Tue, 2 Feb 2021 11:12:30 GMT
- Title: An Abstraction-based Method to Verify Multi-Agent Deep
Reinforcement-Learning Behaviours
- Authors: Pierre El Mqirmi, Francesco Belardinelli and Borja G. León
- Abstract summary: Multi-agent reinforcement learning (RL) often struggles to ensure the safe behaviours of the learning agents.
We present a methodology that combines formal verification with (deep) RL algorithms to guarantee the satisfaction of formally-specified safety constraints.
- Score: 8.95294551927446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent reinforcement learning (RL) often struggles to ensure the safe
behaviours of the learning agents, and is therefore generally not suited to
safety-critical applications. To address this issue, we present a methodology
that combines formal verification with (deep) RL algorithms to guarantee the
satisfaction of formally-specified safety constraints both in training and
testing. The approach we propose expresses the constraints to verify in
Probabilistic Computation Tree Logic (PCTL) and builds an abstract
representation of the system to reduce the complexity of the verification step.
This abstract model allows model-checking techniques to identify a set of
abstract policies that meet the safety constraints expressed in PCTL. Then, the
agents' behaviours are restricted according to these safe abstract policies. We
provide formal guarantees that by using this method, the actions of the agents
always meet the safety constraints, and provide a procedure to generate an
abstract model automatically. We empirically evaluate and show the
effectiveness of our method in a multi-agent environment.
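To make the final restriction step concrete, below is a minimal sketch (not the authors' code) of gating a learned policy with verified safe abstract policies. The `abstract_of` map and the `SAFE_ACTIONS` table are hypothetical stand-ins for the abstraction and the PCTL-verified policy set the paper's procedure would produce.

```python
# Minimal sketch (not the authors' code): restrict agents' actions to a
# precomputed set of safe abstract policies, as described in the abstract.
import random

# Safe abstract policies: for each abstract state, the set of actions that
# model checking verified against the PCTL safety constraints (hypothetical).
SAFE_ACTIONS = {
    "far_from_hazard": {"up", "down", "left", "right"},
    "near_hazard": {"left", "right"},  # moving toward the hazard was pruned
}

def abstract_of(state):
    """Map a concrete state to its abstract state (hypothetical abstraction)."""
    x, y = state
    return "near_hazard" if abs(x) + abs(y) <= 1 else "far_from_hazard"

def restricted_action(agent_policy, state):
    """Let the RL policy choose, but only among actions the verified
    abstract policy allows in the current abstract state."""
    allowed = SAFE_ACTIONS[abstract_of(state)]
    ranked = agent_policy(state)          # actions sorted by preference
    for action in ranked:
        if action in allowed:
            return action
    return random.choice(tuple(allowed))  # fall back to any safe action

# Example: a fixed preference ordering standing in for a learned policy.
policy = lambda s: ["up", "left", "down", "right"]
print(restricted_action(policy, (1, 0)))  # near hazard -> "left"
```

The learned policy keeps its own preference ordering; the restriction only intervenes when the preferred action falls outside the verified set, which is why the safety constraints hold during both training and testing.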
Related papers
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time.
CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties.
We evaluate our approach in several robotic mapless navigation tasks and demonstrate that CROP achieves higher returns and fewer violations than previous Safe DRL approaches.
arXiv Detail & Related papers (2023-02-13T21:19:36Z)
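As a rough illustration of the CROP idea above, the sketch below accumulates unsafe interactions flagged by a cost signal into state-region properties. The region coarsening and the property format are assumptions for illustration, not the paper's actual design.

```python
# Hedged sketch of the CROP idea: use a cost signal observed during training
# to collect unsafe interactions and turn them into safety properties.
from collections import defaultdict

class OnlinePropertyCollector:
    def __init__(self, region_fn):
        self.region_fn = region_fn              # coarsens states into regions
        self.properties = defaultdict(set)      # region -> forbidden actions

    def observe(self, state, action, cost):
        """Record an unsafe interaction whenever the cost signal fires."""
        if cost > 0:
            self.properties[self.region_fn(state)].add(action)

    def violates(self, state, action):
        """Check a candidate step against the collected properties."""
        return action in self.properties[self.region_fn(state)]

# Usage: after a collision at x=0.93 while moving forward...
collector = OnlinePropertyCollector(region_fn=lambda s: round(s, 1))
collector.observe(0.93, "forward", cost=1.0)
print(collector.violates(0.91, "forward"))  # True: same coarse region
```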
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
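The "safety projection" half of USL can be pictured as nudging a proposed action until a learned state-wise cost estimate satisfies the constraint. The sketch below uses a toy quadratic cost in place of a learned cost critic, and finite-difference gradients for simplicity.

```python
# Illustrative sketch of safety projection: gradient-descend on a cost
# estimate until the state-wise constraint holds. `cost_fn` is a toy
# quadratic standing in for a learned cost critic.
import numpy as np

def project_action(action, cost_fn, threshold=0.0, lr=0.1, steps=50):
    """Nudge the action until cost_fn(action) <= threshold."""
    a = np.asarray(action, dtype=float)
    for _ in range(steps):
        if cost_fn(a) <= threshold:
            break
        eps = 1e-5                               # finite-difference gradient
        grad = np.array([(cost_fn(a + eps * e) - cost_fn(a)) / eps
                         for e in np.eye(len(a))])
        a -= lr * grad
    return a

# Toy cost: positive inside a ball of radius 1 around the unsafe point (2, 0).
unsafe = np.array([2.0, 0.0])
cost = lambda a: 1.0 - np.sum((a - unsafe) ** 2)
print(project_action([1.5, 0.0], cost))  # pushed outside the unsafe ball
```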
- Risk Consistent Multi-Class Learning from Label Proportions [64.0125322353281]
This study addresses a multiclass learning from label proportions (MCLLP) setting in which training instances are provided in bags.
Most existing MCLLP methods impose bag-wise constraints on the prediction of instances or assign them pseudo-labels.
A risk-consistent method is proposed for instance classification using the empirical risk minimization framework.
arXiv Detail & Related papers (2022-03-24T03:49:04Z)
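For intuition on the label-proportions setting, the sketch below shows the generic proportion-matching loss, where the mean predicted class distribution over a bag is matched to the bag's known proportions. This illustrates the setting, not the paper's specific risk-consistent estimator.

```python
# Hedged sketch of learning from label proportions: each bag only reveals
# its class proportions, and the model's mean predicted distribution over
# the bag is matched against them.
import numpy as np

def bag_proportion_loss(probs, proportions):
    """Cross-entropy between a bag's class proportions and the mean of the
    model's predicted class probabilities over that bag."""
    mean_pred = probs.mean(axis=0)                 # shape: (n_classes,)
    return -np.sum(proportions * np.log(mean_pred + 1e-12))

# A bag of 4 instances, 3 classes; true proportions: half class 0, half class 2.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.1, 0.1, 0.8]])
print(bag_proportion_loss(probs, np.array([0.5, 0.0, 0.5])))
```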
- Verified Probabilistic Policies for Deep Reinforcement Learning [6.85316573653194]
We tackle the problem of verifying probabilistic policies for deep reinforcement learning.
We propose an abstraction approach, based on interval Markov decision processes, that yields guarantees on a policy's execution.
We present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking.
arXiv Detail & Related papers (2022-01-10T23:55:04Z)
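A toy version of the interval-MDP computation: value iteration in which an adversary redistributes transition mass within the given intervals toward unsafe outcomes, yielding a pessimistic bound on the unsafe-reachability probability. The three-state model and the greedy mass assignment are illustrative simplifications, not the paper's abstraction or solvers.

```python
# Sketch of interval value iteration over an interval MDP.
# States: 0 = safe, 1 = risky, 2 = unsafe (absorbing).
# intervals[s][a] = list of (low, high) transition bounds per successor.
intervals = {
    0: {"stay": [(0.8, 1.0), (0.0, 0.2), (0.0, 0.0)],
        "move": [(0.3, 0.6), (0.3, 0.6), (0.0, 0.1)]},
    1: {"stay": [(0.0, 0.2), (0.5, 0.9), (0.1, 0.3)],
        "move": [(0.5, 0.8), (0.0, 0.2), (0.1, 0.4)]},
}

def worst_case_unsafe_prob(iters=100):
    v = [0.0, 0.0, 1.0]          # prob. of eventually reaching the unsafe state
    for _ in range(iters):
        new = v[:]
        for s, actions in intervals.items():
            vals = []
            for bounds in actions.values():
                # Adversarial nature: put as much mass as the intervals
                # allow on high-value (more unsafe) successors.
                order = sorted(range(3), key=lambda t: -v[t])
                mass = [lo for lo, _ in bounds]
                budget = 1.0 - sum(mass)
                for t in order:
                    extra = min(bounds[t][1] - mass[t], budget)
                    mass[t] += extra
                    budget -= extra
                vals.append(sum(mass[t] * v[t] for t in range(3)))
            new[s] = min(vals)   # the policy picks the safest action
        v = new
    return v[0]

print(worst_case_unsafe_prob())  # upper bound on the unsafe probability
```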
- Reinforcement Learning for Task Specifications with Action-Constraints [4.046919218061427]
We propose a method to learn optimal control policies for a finite-state Markov Decision Process.
We assume that the set of action sequences deemed unsafe and/or safe is given in terms of a finite-state automaton.
We present a version of the Q-learning algorithm for learning optimal policies in the presence of non-Markovian action-sequence and state constraints.
arXiv Detail & Related papers (2022-01-02T04:22:01Z)
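The construction can be sketched as Q-learning over the product of the environment state and the automaton state, with actions the automaton rejects masked out. The two-state automaton below ("no two consecutive fire actions") is an invented example, not the paper's benchmark.

```python
# Sketch: Q-learning over (environment state, automaton state), masking
# actions for which the automaton has no safe transition.
import random
from collections import defaultdict

ACTIONS = ["move", "fire"]
AUTOMATON = {                    # (automaton state, action) -> next state
    ("idle", "move"): "idle", ("idle", "fire"): "armed",
    ("armed", "move"): "idle",   # ("armed", "fire") is missing => unsafe
}

def safe_actions(q_state):
    return [a for a in ACTIONS if (q_state, a) in AUTOMATON]

Q = defaultdict(float)

def step_q_learning(s, q_state, env_step, alpha=0.1, gamma=0.9, eps=0.1):
    """One epsilon-greedy Q-learning update on the product state."""
    allowed = safe_actions(q_state)
    a = (random.choice(allowed) if random.random() < eps
         else max(allowed, key=lambda a: Q[(s, q_state, a)]))
    s2, r = env_step(s, a)
    q2 = AUTOMATON[(q_state, a)]
    best_next = max(Q[(s2, q2, a2)] for a2 in safe_actions(q2))
    Q[(s, q_state, a)] += alpha * (r + gamma * best_next - Q[(s, q_state, a)])
    return s2, q2

# Toy environment: position on a line, "fire" is rewarded at position 3.
env = lambda s, a: (s + 1, 0.0) if a == "move" else (s, 1.0 if s == 3 else 0.0)
s, q = 0, "idle"
for _ in range(1000):
    s, q = step_q_learning(s % 5, q, env)
```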
- Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report) [0.0]
We consider the challenge of policy simplification and verification in the context of policies learned through reinforcement learning (RL).
We derive new bisimulation bounds between the unknown environment and a learned discrete latent model of it.
We show how one can use a policy obtained via state-of-the-art RL to efficiently train a variational autoencoder that yields a discrete latent model with provably approximately correct bisimulation guarantees.
arXiv Detail & Related papers (2021-12-17T17:57:32Z)
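Very loosely, the distillation pipeline maps states visited under the RL policy into a small discrete latent space and fits latent transition probabilities. In the sketch below, k-means stands in for the paper's variational autoencoder, so it shows only the shape of the resulting discrete latent model, not the bisimulation-guaranteed training.

```python
# Simplified sketch of distilling a discrete latent model from trajectories.
# K-means is a stand-in for the paper's VAE encoder (a deliberate swap).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
states = rng.normal(size=(500, 4))            # states visited by the policy
next_states = states + 0.1 * rng.normal(size=(500, 4))

K = 8
encoder = KMeans(n_clusters=K, n_init=10, random_state=0).fit(states)
z, z_next = encoder.predict(states), encoder.predict(next_states)

# Empirical latent transition matrix P[z, z'] of the discrete model.
P = np.zeros((K, K))
for a, b in zip(z, z_next):
    P[a, b] += 1
P /= np.maximum(P.sum(axis=1, keepdims=True), 1)
print(np.round(P, 2))
```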
- Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis.
On standard benchmarks, our method obtains results comparable to those of formal verifiers.
Our approach enables safety properties of decision-making models to be evaluated efficiently in practical applications.
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
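The core operation behind interval-based verification is propagating an input box through the network to bound its outputs. Below is a minimal interval-arithmetic sketch with hand-picked weights (illustrative values, not the paper's benchmarks).

```python
# Minimal interval analysis: bound the outputs of a small ReLU network
# over a box of inputs, the basic step of interval-based verification.
import numpy as np

def interval_linear(lo, hi, W, b):
    """Exact interval image of an affine layer W @ x + b."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

def interval_forward(lo, hi, layers):
    *hidden, last = layers
    for W, b in hidden:
        lo, hi = interval_linear(lo, hi, W, b)
        lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)   # interval ReLU
    return interval_linear(lo, hi, *last)                # linear output layer

layers = [(np.array([[1.0, -1.0], [0.5, 2.0]]), np.zeros(2)),
          (np.array([[1.0, 1.0]]), np.array([-0.5]))]
lo, hi = interval_forward(np.array([0.0, 0.0]), np.array([0.1, 0.1]), layers)
print(lo, hi)  # if hi < 0 everywhere, the property "output > 0" cannot occur
```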
- Constrained Markov Decision Processes via Backward Value Functions [43.649330976089004]
We model the problem of learning with constraints as a Constrained Markov Decision Process.
A key contribution of our approach is to translate cumulative cost constraints into state-based constraints.
We provide theoretical guarantees under which the agent converges while ensuring safety over the course of training.
arXiv Detail & Related papers (2020-08-26T20:56:16Z)
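The translation from a cumulative budget into a state-based constraint can be pictured as follows: an action is admissible in a state if the cost accrued on the way to that state (a backward estimate) plus the expected cost from it onward (a forward estimate) stays within the budget. All numbers in the sketch are hand-set placeholders; in the paper both quantities are learned.

```python
# Hedged sketch of the backward-value-function idea: turn the cumulative
# constraint "total cost <= d" into a per-state admissibility check.
d = 1.0                                   # cost budget for the episode

# Forward cost-to-go per (state, action) and backward cost-so-far per state;
# illustrative constants standing in for learned value functions.
forward_cost = {("s1", "risky"): 0.8, ("s1", "safe"): 0.2}
backward_cost = {"s1": 0.3}

def admissible(state, action):
    """State-based constraint induced by the cumulative budget."""
    return backward_cost[state] + forward_cost[(state, action)] <= d

print([a for a in ("risky", "safe") if admissible("s1", a)])  # ['safe']
```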
- Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs [13.922754427601491]
We characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy.
Our main finding is that, compared to the best known bounds for the unconstrained regime, the sample complexity of constrained RL algorithms increases by a factor that is logarithmic in the number of constraints.
arXiv Detail & Related papers (2020-08-01T18:17:08Z)
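Schematically, the claim can be written as follows (a restatement of the abstract above, not the paper's exact bound):

```latex
% Schematic restatement of the abstract's claim, with N the number of
% safety constraints; the precise dependence is given in the paper.
\[
  m_{\text{constrained}}(\epsilon,\delta)
  \;=\; O\!\left( m_{\text{unconstrained}}(\epsilon,\delta)\cdot \log N \right),
\]
where $m(\epsilon,\delta)$ denotes the number of samples needed for
$\epsilon$-accuracy with probability $1-\delta$.
```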
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low- and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
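A loose sketch of the three SAMBA ingredients: a Gaussian-process dynamics model (probabilistic modelling), predictive variance as an information score (active exploration), and a pessimistic safety filter on candidate actions. The thresholds, the GP kernel, and the acquisition rule are all illustrative assumptions, not the paper's actual objective.

```python
# Loose sketch of SAMBA-style safe active learning: explore where the GP
# model is most uncertain, but only among actions whose pessimistic
# prediction respects a safety threshold.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(30, 2))          # (state, action) pairs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]           # observed next-state component
gp = GaussianProcessRegressor().fit(X, y)

def pick_action(state, candidates, unsafe_above=0.8):
    """Pick the most informative (highest-variance) action whose pessimistic
    prediction stays below the unsafe threshold; falls back to the first
    candidate if none passes the filter."""
    queries = np.array([[state, a] for a in candidates])
    mean, std = gp.predict(queries, return_std=True)
    scores = [s if m + 2 * s < unsafe_above else -np.inf   # safety filter
              for m, s in zip(mean, std)]
    return candidates[int(np.argmax(scores))]

print(pick_action(0.2, [-0.5, 0.0, 0.5]))
```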