Model-based Dynamic Shielding for Safe and Efficient Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2304.06281v1
- Date: Thu, 13 Apr 2023 06:08:10 GMT
- Title: Model-based Dynamic Shielding for Safe and Efficient Multi-Agent
Reinforcement Learning
- Authors: Wenli Xiao, Yiwei Lyu, John Dolan
- Abstract summary: Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize reward but do not have safety guarantees during the learning and deployment phases.
This work introduces Model-based Dynamic Shielding (MBDS) to support MARL algorithm design.
- Score: 7.103977648997475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize
reward but do not have safety guarantees during the learning and deployment
phases. Although shielding with Linear Temporal Logic (LTL) is a promising
formal method to ensure safety in single-agent Reinforcement Learning (RL), it
results in conservative behaviors when scaling to multi-agent scenarios.
Additionally, it poses computational challenges for synthesizing shields in
complex multi-agent environments. This work introduces Model-based Dynamic
Shielding (MBDS) to support MARL algorithm design. Our algorithm synthesizes
distributive shields, which are reactive systems running in parallel with each
MARL agent, to monitor and rectify unsafe behaviors. The shields can
dynamically split, merge, and recompute based on agents' states. This design
enables efficient synthesis of shields to monitor agents in complex
environments without coordination overheads. We also propose an algorithm to
synthesize shields without prior knowledge of the dynamics model. The proposed
algorithm obtains an approximate world model by interacting with the
environment during the early stage of exploration, making our MBDS enjoy formal
safety guarantees with high probability. We demonstrate in simulations that our
framework can surpass existing baselines in terms of safety guarantees and
learning performance.
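The abstract describes shields that run alongside each agent, override actions that an approximate world model predicts to be unsafe, and dynamically split and merge as agents move. The following Python sketch only illustrates that idea under stated assumptions; the class names, the interaction-radius grouping rule, and the one-step safety check are placeholders, not the authors' implementation.

```python
import itertools
import numpy as np

class DynamicShield:
    """Illustrative shield for a group of agents: it checks proposed joint
    actions against a learned (approximate) world model and rectifies them
    when the predicted next state is unsafe. Hypothetical interface."""

    def __init__(self, agent_ids, model, is_unsafe, safe_fallback):
        self.agent_ids = list(agent_ids)
        self.model = model                  # (states, actions) -> predicted next states
        self.is_unsafe = is_unsafe          # predicate on the predicted joint state
        self.safe_fallback = safe_fallback  # per-agent conservative action

    def rectify(self, states, proposed):
        """Keep the proposed actions if the model predicts a safe outcome,
        otherwise substitute conservative fallback actions for the group."""
        predicted = self.model(states, proposed)
        if self.is_unsafe(predicted):
            return {i: self.safe_fallback(states[i]) for i in self.agent_ids}
        return proposed


def regroup(positions, radius):
    """Split/merge rule: agents within `radius` of each other share a shield,
    agents far apart keep singleton shields (simple union-find grouping)."""
    parent = {i: i for i in positions}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in itertools.combinations(positions, 2):
        if np.linalg.norm(np.asarray(positions[i]) - np.asarray(positions[j])) <= radius:
            parent[find(i)] = find(j)

    groups = {}
    for i in positions:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

In this reading, each group returned by `regroup` would get its own `DynamicShield`, so shields are recomputed as agents move closer together or further apart, and the model would be the one estimated during early exploration.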
Related papers
- Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning [3.0846824529023382]
This paper introduces Shielded MARL (SMARL) and Probabilistic Logic Temporal Difference Learning (PLTD) to enable shielded independent Q-learning.
The authors show its positive effect and use as an equilibrium selection mechanism in various game-theoretic environments.
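The probabilistic-logic shield and PLTD are specified in the paper itself; as a rough illustration only, shielded independent Q-learning can be pictured as epsilon-greedy Q-learning with unsafe actions masked out. The `shield_allows` predicate below is a placeholder for the shield, not PLTD.

```python
import numpy as np

def shielded_epsilon_greedy(q_values, state, actions, shield_allows, epsilon=0.1, rng=None):
    """Epsilon-greedy action selection restricted to shield-approved actions.
    `shield_allows(state, a)` stands in for the (probabilistic-logic) shield."""
    rng = rng or np.random.default_rng()
    allowed = [a for a in actions if shield_allows(state, a)]
    if not allowed:              # if the shield rejects everything, fail open
        allowed = list(actions)
    if rng.random() < epsilon:
        return allowed[rng.integers(len(allowed))]
    return max(allowed, key=lambda a: q_values[(state, a)])
```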
arXiv Detail & Related papers (2024-11-07T16:59:32Z)
- Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning [5.593642806259113]
We model a meta-learning process where each task is synchronized with a safeguard that monitors safety and provides a reward signal to the agent.
The design of the safeguard is manual but it is high-level and model-agnostic, which gives rise to an end-to-end safe learning approach.
We evaluate our framework in a Minecraft-inspired Gridworld, a VizDoom game environment, and an LLM fine-tuning application.
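The safeguard described above monitors safety and feeds a reward signal back to the agent. A minimal, hypothetical gym-style wrapper (not the paper's code, and assuming the classic 4-tuple step interface) could look like this:

```python
class SafeguardWrapper:
    """Wrap an environment so a task-specific safeguard can monitor safety and
    shape the reward. The (is_safe, safety_reward) interface is an assumption."""

    def __init__(self, env, safeguard):
        self.env = env
        self.safeguard = safeguard  # callable: observation -> (is_safe, safety_reward)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)  # 4-tuple gym API assumed
        is_safe, safety_reward = self.safeguard(obs)
        info["safety_violation"] = not is_safe
        return obs, reward + safety_reward, done or not is_safe, info
```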
arXiv Detail & Related papers (2024-10-31T16:28:33Z)
- What Makes and Breaks Safety Fine-tuning? A Mechanistic Study [64.9691741899956]
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment.
We design a synthetic data generation framework that captures salient aspects of an unsafe input.
Using this, we investigate three well-known safety fine-tuning methods.
arXiv Detail & Related papers (2024-07-14T16:12:57Z)
- Shield Synthesis for LTL Modulo Theories [2.034732821736745]
We develop a novel approach for generating shields conforming to complex safety specifications.
To the best of our knowledge, this is the first approach for synthesizing shields for specifications of such expressiveness.
arXiv Detail & Related papers (2024-06-06T15:40:29Z)
- InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance [56.184255657175335]
We develop InferAligner, a novel inference-time alignment method that utilizes cross-model guidance for harmlessness alignment.
Experimental results show that our method can be very effectively applied to domain-specific models in finance, medicine, and mathematics.
It significantly diminishes the Attack Success Rate (ASR) of both harmful instructions and jailbreak attacks, while maintaining almost unchanged performance in downstream tasks.
arXiv Detail & Related papers (2024-01-20T10:41:03Z)
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
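The look-ahead shielding idea above can be pictured as rolling a learned model forward a few steps before committing to an action. The sketch below is a generic rendering of that idea; the horizon, the backup action, and the callable interfaces are assumptions rather than the paper's algorithm.

```python
def lookahead_safe(model, state, action, policy, is_unsafe, horizon=5):
    """Roll the learned model forward `horizon` steps from (state, action)
    under the current policy and report whether a violation is predicted."""
    s = model(state, action)
    for _ in range(horizon):
        if is_unsafe(s):
            return False
        s = model(s, policy(s))
    return not is_unsafe(s)


def shielded_action(model, state, policy, safe_action, is_unsafe, horizon=5):
    """Use the policy's action if the look-ahead check passes, else a backup."""
    a = policy(state)
    return a if lookahead_safe(model, state, a, policy, is_unsafe, horizon) else safe_action(state)
```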
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
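USL combines safety optimization with safety projection; the projection half can be illustrated with the textbook Euclidean projection of an action onto a half-space constraint g·a ≤ c. This is only the generic projection step, standing in for a state-wise safety constraint, not the learned layer from the paper.

```python
import numpy as np

def project_to_halfspace(action, g, c):
    """Euclidean projection of `action` onto the half-space {a : g.a <= c}."""
    violation = float(g @ action) - c
    if violation <= 0.0:
        return action
    return action - (violation / float(g @ g)) * g

# Example: the unsafe component along g is pulled back to the boundary.
a = np.array([1.0, 2.0])
g = np.array([1.0, 0.0])
print(project_to_halfspace(a, g, c=0.5))  # -> [0.5 2. ]
```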
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Automata Learning meets Shielding [1.1417805445492082]
Safety is still one of the major research challenges in reinforcement learning (RL).
In this paper, we address the problem of how to avoid safety violations of RL agents during exploration in probabilistic and partially unknown environments.
Our approach iteratively combines automata learning for Markov Decision Processes (MDPs) with shield synthesis.
arXiv Detail & Related papers (2022-12-04T14:58:12Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions [3.5897534810405403]
Reinforcement learning (RL) is a promising approach but has had limited success in real-world applications.
In this paper, we propose a learning-based control framework comprising several components.
We show that such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guarantees safety with high probability.
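As a rough sketch of the control-barrier-function idea in the entry above (the paper's ECBF formulation and modular framework are more involved), a discrete-time CBF filter keeps actions for which the barrier value does not decay faster than a chosen rate. Everything below, including the candidate-action scan and the `dynamics` and `h` callables, is an illustrative assumption.

```python
import numpy as np

def cbf_filter(rl_action, state, dynamics, h, candidates, alpha=0.5):
    """Discrete-time control-barrier-function filter: keep candidate actions
    satisfying h(f(x, a)) >= (1 - alpha) * h(x) and return the one closest to
    the RL policy's action; fall back to the RL action if none qualifies."""
    threshold = (1.0 - alpha) * h(state)
    feasible = [a for a in candidates if h(dynamics(state, a)) >= threshold]
    if not feasible:
        return rl_action
    return min(feasible, key=lambda a: np.linalg.norm(np.asarray(a) - np.asarray(rl_action)))
```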
arXiv Detail & Related papers (2021-09-07T00:51:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.