Solving Multi-Agent Safe Optimal Control with Distributed Epigraph Form MARL
- URL: http://arxiv.org/abs/2504.15425v1
- Date: Mon, 21 Apr 2025 20:34:55 GMT
- Title: Solving Multi-Agent Safe Optimal Control with Distributed Epigraph Form MARL
- Authors: Songyuan Zhang, Oswin So, Mitchell Black, Zachary Serlin, Chuchu Fan
- Abstract summary: Tasks for multi-robot systems often require the robots to collaborate and complete a team goal while maintaining safety. This problem is usually formalized as a constrained Markov decision process (CMDP), which targets minimizing a global cost and bringing the mean of constraint violation below a user-defined threshold. Inspired by real-world robotic applications, we define safety as zero constraint violation. We use the epigraph form for constrained optimization to improve training stability and prove that the centralized epigraph form problem can be solved in a distributed fashion by each agent. This results in a novel centralized training distributed execution MARL algorithm named Def-MARL.
- Score: 12.261657830457754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tasks for multi-robot systems often require the robots to collaborate and complete a team goal while maintaining safety. This problem is usually formalized as a constrained Markov decision process (CMDP), which targets minimizing a global cost and bringing the mean of constraint violation below a user-defined threshold. Inspired by real-world robotic applications, we define safety as zero constraint violation. While many safe multi-agent reinforcement learning (MARL) algorithms have been proposed to solve CMDPs, these algorithms suffer from unstable training in this setting. To tackle this, we use the epigraph form for constrained optimization to improve training stability and prove that the centralized epigraph form problem can be solved in a distributed fashion by each agent. This results in a novel centralized training distributed execution MARL algorithm named Def-MARL. Simulation experiments on 8 different tasks across 2 different simulators show that Def-MARL achieves the best overall performance, satisfies safety constraints, and maintains stable training. Real-world hardware experiments on Crazyflie quadcopters demonstrate that, compared to other methods, Def-MARL can safely coordinate agents to complete complex collaborative tasks.
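For reference, the epigraph reformulation named in the abstract is a standard trick in constrained optimization; the following is a generic sketch only, with illustrative symbols f, g, and z (the paper's actual derivation operates on the CMDP cost and constraint):

```latex
% Generic epigraph form (illustrative notation): the constrained problem
%   min_x f(x)  s.t.  g(x) <= 0
% is equivalent to optimizing an auxiliary upper bound z on the cost:
\min_{z,\,x}\; z
\quad \text{s.t.} \quad f(x) \le z,\;\; g(x) \le 0.
% Folding both conditions into a max yields an outer problem over z
% whose inner problem over x is unconstrained:
\min_{z}\; z
\quad \text{s.t.} \quad \min_{x}\, \max\bigl(f(x) - z,\; g(x)\bigr) \le 0.
```

The unconstrained inner problem is what makes a distributed, per-agent solution plausible; the paper proves this decomposition formally for its CMDP setting.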
Related papers
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align large language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
- Learning Efficient Flocking Control based on Gibbs Random Fields [8.715391538937707]
A multi-agent reinforcement learning framework built on Gibbs Random Fields (GRFs). An action attention module is introduced to implicitly anticipate the motion intentions of neighboring robots. The proposed framework enables learning an efficient distributed control policy for multi-robot systems in challenging environments, with a success rate of around 99%.
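The action attention module is only named in this summary; as a loose illustration, generic scaled dot-product attention over neighbor features looks like the sketch below (all shapes, names, and dimensions are assumptions, not taken from the paper):

```python
# Minimal sketch of attention over neighbor features (generic
# scaled dot-product attention, not the paper's exact module).
import numpy as np

def neighbor_attention(ego, neighbors, d=8):
    rng = np.random.default_rng(0)   # stand-in for learned projections
    Wq, Wk, Wv = (rng.standard_normal((ego.shape[-1], d)) for _ in range(3))
    q = ego @ Wq                     # (d,) query from the ego robot's state
    k = neighbors @ Wk               # (n, d) keys from neighbor states
    v = neighbors @ Wv               # (n, d) values from neighbor states
    scores = k @ q / np.sqrt(d)      # (n,) similarity scores
    w = np.exp(scores - scores.max())
    w /= w.sum()                     # softmax attention weights
    return w @ v                     # (d,) aggregated neighbor context

ctx = neighbor_attention(np.ones(4), np.ones((3, 4)))
```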
arXiv Detail & Related papers (2025-02-05T08:27:58Z)
- Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium [6.169364905804677]
Multi-agent reinforcement learning (MARL) has achieved notable success in cooperative tasks.
However, deploying MARL agents in real-world applications presents critical safety challenges.
We propose a novel theoretical framework for safe MARL with state-wise constraints, where safety requirements are enforced at every state the agents visit.
For practical deployment in complex high-dimensional systems, we also propose Multi-Agent Dual Actor-Critic (MADAC).
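To make the distinction concrete: a standard CMDP bounds constraint violation only in expectation, whereas a state-wise constraint must hold everywhere. In illustrative notation (symbols assumed, not the paper's):

```latex
% CMDP-style constraint: violations bounded only on average (threshold \delta):
\mathbb{E}\!\left[\textstyle\sum_{t} c(s_t, a_t)\right] \le \delta.
% State-wise constraint: must hold at every visited state:
c(s_t, a_t) \le 0 \quad \forall\, t.
```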
arXiv Detail & Related papers (2024-11-22T16:08:42Z)
- Safe Multi-Agent Reinforcement Learning with Bilevel Optimization in Autonomous Driving [3.5293763645151404]
We propose a safe MARL method grounded in a Stackelberg model with bi-level optimization.
We develop two practical algorithms, namely Constrained Stackelberg Q-learning (CSQ) and Constrained Stackelberg Multi-Agent Deep Deterministic Policy Gradient (CS-MADDPG).
Our algorithms, CSQ and CS-MADDPG, outperform several strong MARL baselines, such as Bi-AC, MACPO, and MAPPO-L, in terms of reward and safety performance.
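For context, the Stackelberg (leader-follower) structure behind such bi-level formulations can be sketched generically as follows; J_L, J_F, and the policy symbols are illustrative, not the paper's notation:

```latex
% Generic Stackelberg bi-level problem: the leader optimizes while
% anticipating the follower's best response.
\max_{\pi_L}\; J_L(\pi_L, \pi_F^{*})
\quad \text{s.t.} \quad
\pi_F^{*} \in \arg\max_{\pi_F} J_F(\pi_L, \pi_F).
```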
arXiv Detail & Related papers (2024-05-28T14:15:18Z)
- Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications [73.58451824894568]
The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states.
In safety-critical domains, such behaviors could lead to disastrous outcomes.
We propose Objective Suppression, a novel method that adaptively suppresses the task-reward-maximizing objective according to a safety critic.
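The one-line summary leaves the mechanism open; the following is a deliberately simplified, hypothetical rendering of the general idea only (the hard gating rule and threshold below are assumptions, not the paper's adaptive scheme):

```python
# Hypothetical sketch: gate the task objective with a safety critic.
# The 0/1 gate and threshold are illustrative assumptions; the paper's
# suppression is adaptive rather than this simple switch.
def suppressed_objective(task_obj: float, safety_obj: float,
                         safety_critic_value: float,
                         threshold: float = 0.0) -> float:
    gate = 1.0 if safety_critic_value <= threshold else 0.0  # suppress when risky
    return gate * task_obj + safety_obj
```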
arXiv Detail & Related papers (2024-02-23T23:22:06Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Maximum Entropy Heterogeneous-Agent Reinforcement Learning [45.377385280485065]
Multi-agent reinforcement learning (MARL) has been shown effective for cooperative games in recent years. We propose a unified framework for learning policies to resolve issues related to sample complexity, training instability, and the risk of converging to a suboptimal Nash Equilibrium. Based on the MaxEnt framework, we propose the Heterogeneous-Agent Soft Actor-Critic (HASAC) algorithm. We evaluate HASAC on six benchmarks: Bi-DexHands, Multi-Agent MuJoCo, StarCraft Challenge, Google Research Football, Multi-Agent Particle Environment, and Light Aircraft Game.
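As background, the maximum-entropy RL objective that soft actor-critic methods build on is standard; in the usual single-agent notation (temperature α), not HASAC's multi-agent formulation:

```latex
% Standard MaxEnt RL objective: trade off return against policy entropy.
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t} r(s_t, a_t)
       + \alpha\, \mathcal{H}\bigl(\pi(\cdot \mid s_t)\bigr)\right].
```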
arXiv Detail & Related papers (2023-06-19T06:22:02Z)
- Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning [7.103977648997475]
Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize reward but do not have safety guarantees during the learning and deployment phases.
We propose Model-based Dynamic Shielding (MBDS) to support MARL algorithm design.
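A shield, in general, intercepts actions that a model predicts to be unsafe. Below is a minimal generic sketch of that idea; the function names and the one-step safety check are assumptions, not MBDS itself:

```python
# Minimal generic action shield: roll the candidate action through a
# dynamics model and fall back to a backup policy if the predicted
# outcome is unsafe. All names here are illustrative assumptions.
def shielded_action(state, policy_action, dynamics_model, is_safe, backup_policy):
    predicted_next = dynamics_model(state, policy_action)
    if is_safe(predicted_next):
        return policy_action         # RL action passes the safety check
    return backup_policy(state)      # otherwise substitute a safe action
```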
arXiv Detail & Related papers (2023-04-13T06:08:10Z)
- Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698]
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled level of robustness.
The resulting data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML).
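One common way to formalize a controlled level of robustness over tasks is a tail-risk objective such as CVaR; the sketch below is generic, and its symbols are assumptions rather than necessarily the paper's exact objective:

```latex
% Risk-averse meta-objective (illustrative): instead of the mean return
% over tasks \tau, optimize the conditional value-at-risk over the
% worst \alpha-fraction of tasks.
\max_{\theta}\; \mathrm{CVaR}_{\alpha}\bigl[R(\theta;\tau)\bigr],
\qquad \tau \sim p(\tau).
```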
arXiv Detail & Related papers (2023-01-26T14:54:39Z)
- Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM [16.312625634442092]
We propose a novel safe and scalable decentralized solution for multi-agent control in the presence of disturbances.
Decentralization is achieved by augmenting each agent's optimization variables with copy variables for its neighbors.
To enable safe consensus solutions, we incorporate an ADMM-based approach.
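As a concrete reference point, consensus ADMM in its textbook global-consensus form looks as follows. This is a minimal sketch with a toy quadratic objective, not the paper's deep-FBSDE formulation or its neighbor-copy variable layout:

```python
# Textbook global-consensus ADMM sketch (toy problem, not the paper's
# method): each agent i minimizes f_i(x) = 0.5 * (x - a_i)^2 over a
# shared decision variable, agreeing on it through ADMM iterations.
import numpy as np

a = np.array([1.0, 3.0, 5.0])  # per-agent targets (assumed toy data)
rho = 1.0                      # ADMM penalty parameter
x = np.zeros_like(a)           # each agent's local copy of the variable
z = 0.0                        # consensus variable
u = np.zeros_like(a)           # scaled dual variables

for _ in range(100):
    # Local updates: argmin_x 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
    x = (a + rho * (z - u)) / (1.0 + rho)
    # Consensus update: average of local copies plus duals
    z = float(np.mean(x + u))
    # Dual ascent drives every x_i toward z
    u += x - z

print(z)  # converges to mean(a) = 3.0, the global minimizer
```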
arXiv Detail & Related papers (2022-02-22T03:57:23Z)
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to a family of methods that couple population-based optimization with reinforcement learning (RL) algorithms.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms can be impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.