DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2403.06397v2
- Date: Tue, 12 Mar 2024 02:13:51 GMT
- Title: DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe
Multi-Agent Reinforcement Learning
- Authors: Xuefeng Wang, Henglin Pu, Hyung Jun Kim and Husheng Li
- Abstract summary: We propose a novel method called Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning (DeepSafeMPC).
The key insight of DeepSafeMPC is to leverage a centralized deep learning model to accurately predict environmental dynamics.
We demonstrate the effectiveness of our approach using the Safe Multi-agent MuJoCo environment.
- Score: 11.407941376728258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safe multi-agent reinforcement learning (safe MARL) has gained increasing
attention in recent years, emphasizing the need for agents to not only optimize
the global return but also adhere to safety requirements through behavioral
constraints. Some recent work has integrated control theory with multi-agent
reinforcement learning to address the challenge of ensuring safety. However,
there have been only very limited applications of Model Predictive Control
(MPC) methods in this domain, primarily due to the complex and implicit
dynamics characteristic of multi-agent environments. To bridge this gap, we
propose a novel method called Deep Learning-Based Model Predictive Control for
Safe Multi-Agent Reinforcement Learning (DeepSafeMPC). The key insight of
DeepSafeMPC is to leverage a centralized deep learning model to accurately
predict environmental dynamics. Our method applies MARL principles to search
for optimal solutions, while MPC concurrently restricts the agents' actions
to safe states. We demonstrate the effectiveness of
our approach using the Safe Multi-agent MuJoCo environment, showcasing
significant advancements in addressing safety concerns in MARL.
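As a rough illustration of the mechanism the abstract describes, the sketch below pairs a stand-in for the centralized learned dynamics predictor with a short-horizon MPC filter that keeps the joint action predicted to remain safe. The toy linear dynamics, the box-shaped safe set, the hold-the-action-over-the-horizon simplification, and all names here are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def predict_next_state(state, joint_action):
    # stand-in for the centralized deep dynamics predictor f(s, a) -> s'
    return 0.9 * state + 0.1 * joint_action

def safety_violation(state, limit=1.0):
    # > 0 when any coordinate of the joint state leaves the safe box |s_i| <= limit
    return float(np.maximum(np.abs(state) - limit, 0.0).sum())

def mpc_safety_filter(state, proposed, candidates, horizon=5):
    """Return the proposed joint action if its predicted rollout stays safe;
    otherwise fall back to the candidate with the least predicted violation."""
    def rollout_cost(action):
        s, total = state, 0.0
        for _ in range(horizon):
            s = predict_next_state(s, action)  # hold the same action over the horizon
            total += safety_violation(s)
        return total

    if rollout_cost(proposed) == 0.0:
        return proposed  # the MARL policy's joint action is predicted to stay safe
    return min(candidates, key=rollout_cost)

# usage: filter a risky joint action for two agents with 1-D states and actions
state = np.array([0.8, -0.5])
proposed = np.array([2.0, 0.0])  # predicted to push agent 0 past the limit
fallbacks = [proposed, np.array([0.0, 0.0]), np.array([-1.0, 0.0])]
print(mpc_safety_filter(state, proposed, fallbacks))
```

In the paper, MPC plays this corrective role on top of actions proposed by the MARL policy; the sketch compresses that into a single filtering step.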
Related papers
- Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning [3.0846824529023382]
This paper introduces Shielded MARL (SMARL), along with Probabilistic Logic Temporal Difference Learning (PLTD), to enable shielded independent Q-learning.
It shows SMARL's positive effect, and its use as an equilibrium selection mechanism, in various game-theoretic environments.
arXiv Detail & Related papers (2024-11-07T16:59:32Z) - ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on the latest advances in model-based RL.
arXiv Detail & Related papers (2024-10-12T10:46:02Z) - Diffusion Models for Offline Multi-agent Reinforcement Learning with Safety Constraints [0.0]
We introduce an innovative framework integrating diffusion models within the Multi-agent Reinforcement Learning paradigm.
This approach notably enhances the safety of actions taken by multiple agents through risk mitigation while modeling coordinated action.
arXiv Detail & Related papers (2024-06-30T16:05:31Z) - Multi-Agent Reinforcement Learning with Control-Theoretic Safety Guarantees for Dynamic Network Bridging [0.11249583407496219]
This work introduces a hybrid approach that integrates Multi-Agent Reinforcement Learning with control-theoretic methods to ensure safe and efficient distributed strategies.
Our contributions include a novel setpoint update algorithm that dynamically adjusts agents' positions to preserve safety conditions without compromising the mission's objectives.
arXiv Detail & Related papers (2024-04-02T01:30:41Z) - The Art of Defending: A Systematic Evaluation and Analysis of LLM
Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications.
This paper presents the Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z) - Safeguarded Progress in Reinforcement Learning: Safe Bayesian
Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z) - Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance [73.3242641337305]
Recent work learns risk measures that estimate the probability of constraint violations, which can then be used to enable safe behavior.
We cast safe exploration as an offline meta-RL problem, where the objective is to leverage examples of safe and unsafe behavior across a range of environments.
We then propose MEta-learning for Safe Adaptation (MESA), an approach for meta-learning a risk measure for safe RL.
arXiv Detail & Related papers (2021-12-07T08:57:35Z) - Multi-Agent Constrained Policy Optimisation [17.772811770726296]
We formulate the safe MARL problem as a constrained Markov game and solve it with policy optimisation methods.
Our solutions -- Multi-Agent Constrained Policy Optimisation (MACPO) and MAPPO-Lagrangian -- leverage theories from both constrained policy optimisation and multi-agent trust region learning (a minimal Lagrangian-style sketch appears after this list).
We develop the Safe Multi-Agent MuJoCo benchmark suite, which includes a variety of MARL baselines.
arXiv Detail & Related papers (2021-10-06T14:17:09Z) - Learning Safe Multi-Agent Control with Decentralized Neural Barrier
Certificates [19.261536710315028]
We study the multi-agent safe control problem, where agents must avoid collisions with static obstacles and with each other while reaching their goals.
Our core idea is to learn the multi-agent control policy jointly with learning the control barrier functions as safety certificates.
We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes (see the sketch after this list).
arXiv Detail & Related papers (2021-01-14T03:17:17Z)
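As referenced in the MACPO entry above, here is a minimal, hypothetical sketch of the Lagrangian idea such methods build on: a primal step ascends reward minus a penalty, while a dual step raises the multiplier whenever the safety constraint is violated. The toy 1-D objective, step sizes, and names are assumptions, not the paper's algorithm.

```python
# Toy problem: maximize reward -(theta - 2)^2 subject to cost max(theta - 1, 0) <= 0.
# Dual ascent drives lambda up until the policy parameter respects the constraint.

def cost(theta):
    return max(theta - 1.0, 0.0)  # constraint is violated once theta exceeds 1

def constrained_optimise(theta=0.0, lam=0.0, cost_limit=0.0,
                         lr=0.05, lr_lam=0.1, steps=2000):
    for _ in range(steps):
        # primal step: ascend the Lagrangian L = reward - lam * cost
        grad = -2.0 * (theta - 2.0) - lam * (1.0 if theta > 1.0 else 0.0)
        theta += lr * grad
        # dual step: raise lam while the constraint is violated, never below 0
        lam = max(0.0, lam + lr_lam * (cost(theta) - cost_limit))
    return theta, lam

theta, lam = constrained_optimise()
print(f"theta ~ {theta:.2f} (pushed from 2.0 toward 1.0), lambda ~ {lam:.2f}")
```

The unconstrained optimum sits at theta = 2; the growing multiplier settles the solution near the constraint boundary at theta = 1, which is the qualitative behavior Lagrangian safe-RL methods rely on.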
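And, as referenced in the neural barrier certificates entry above, this is a rough sketch of jointly training a barrier function h(x) with a policy: h must be positive on safe states, negative on unsafe ones, and non-decreasing (up to a margin) under the closed-loop dynamics. The networks, toy dynamics, margins, and sampling here are illustrative assumptions, not that paper's construction.

```python
import torch
import torch.nn as nn

h_net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))   # barrier candidate
pi_net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))  # policy

def dynamics(x, u):
    return x + 0.1 * u  # toy single-integrator stand-in for the true dynamics

def cbf_loss(safe_x, unsafe_x, alpha=0.5, eps=0.1):
    # (1) classification conditions: h >= eps on safe states, h <= -eps on unsafe ones
    loss_safe = torch.relu(eps - h_net(safe_x)).mean()
    loss_unsafe = torch.relu(eps + h_net(unsafe_x)).mean()
    # (2) discrete-time barrier condition along the policy's rollout:
    #     h(x') >= (1 - alpha) * h(x)
    next_x = dynamics(safe_x, pi_net(safe_x))
    loss_dec = torch.relu((1 - alpha) * h_net(safe_x) - h_net(next_x)).mean()
    return loss_safe + loss_unsafe + loss_dec

safe_x = torch.rand(64, 2) * 0.5           # toy samples inside a safe region
unsafe_x = torch.rand(64, 2) * 0.5 + 1.5   # toy samples in an unsafe region
loss = cbf_loss(safe_x, unsafe_x)
loss.backward()  # gradients flow to both the barrier and the policy
print(float(loss))
```

Minimizing all three terms to zero certifies (on the sampled states) that the policy keeps the system inside the region where h is positive, which is what makes the learned h act as a safety certificate.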