Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
- URL: http://arxiv.org/abs/2507.14850v1
- Date: Sun, 20 Jul 2025 07:43:18 GMT
- Title: Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
- Authors: H. M. Sabbir Ahmad, Ehsan Sabouni, Alexander Wasilkoff, Param Budhraja, Zijian Guo, Songyuan Zhang, Chuchu Fan, Christos Cassandras, Wenchao Li
- Abstract summary: We propose a safe Hierarchical Multi-Agent Reinforcement Learning (HMARL) approach based on Control Barrier Functions (CBFs). Our hierarchical approach decomposes the overall reinforcement learning problem into two levels: learning joint cooperative behavior at the higher level, and learning safe individual behavior at the lower (agent) level conditioned on the high-level policy. Specifically, we propose a skill-based HMARL-CBF algorithm in which the higher-level problem involves learning a joint policy over the skills for all the agents and the lower-level problem involves learning policies to execute the skills safely with CBFs.
- Score: 42.86066246726967
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We address the problem of safe policy learning in multi-agent safety-critical autonomous systems. In such systems, each agent must meet the safety requirements at all times while also cooperating with the other agents to accomplish the task. Toward this end, we propose a safe Hierarchical Multi-Agent Reinforcement Learning (HMARL) approach based on Control Barrier Functions (CBFs). Our hierarchical approach decomposes the overall reinforcement learning problem into two levels: learning joint cooperative behavior at the higher level, and learning safe individual behavior at the lower (agent) level conditioned on the high-level policy. Specifically, we propose a skill-based HMARL-CBF algorithm in which the higher-level problem involves learning a joint policy over the skills for all the agents and the lower-level problem involves learning policies to execute the skills safely with CBFs. We validate our approach on challenging environment scenarios in which a large number of agents must safely navigate through conflicting road networks. Compared with existing state-of-the-art methods, our approach significantly improves safety, achieving a near-perfect (within 5%) success/safety rate, while also improving performance across all the environments.
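The lower-level step above amounts to filtering each agent's action through a CBF constraint. The sketch below is a minimal illustration of that idea, assuming control-affine dynamics x_dot = f(x) + g(x)u and a single barrier h; the closed-form projection solves the standard one-constraint CBF quadratic program. The function names and the single-integrator toy problem are illustrative assumptions, not the paper's skill-execution policies.

```python
import numpy as np

def cbf_safety_filter(u_nom, x, h, grad_h, f, g, alpha=1.0):
    """Minimally correct u_nom so the CBF condition
       dh/dx (f(x) + g(x) u) + alpha * h(x) >= 0
    holds; closed-form solution of the one-constraint CBF-QP.
    A standard stand-in for a learned safe low-level policy."""
    dh = grad_h(x)                    # gradient of the barrier at x
    lf_h = dh @ f(x)                  # Lie derivative along the drift
    lg_h = dh @ g(x)                  # Lie derivative along each input channel
    margin = lf_h + lg_h @ u_nom + alpha * h(x)
    if margin >= 0.0:                 # nominal action already satisfies the CBF
        return u_nom
    # Shift u_nom the minimum distance along lg_h onto the constraint boundary.
    return u_nom - margin * lg_h / (lg_h @ lg_h + 1e-9)

# Toy single-integrator agent (x_dot = u) that must stay outside the unit disk:
# h(x) = ||x||^2 - 1 > 0 encodes safety.
f = lambda x: np.zeros(2)
g = lambda x: np.eye(2)
h = lambda x: x @ x - 1.0
grad_h = lambda x: 2.0 * x

x = np.array([1.2, 0.0])
u_rl = np.array([-1.0, 0.0])          # hypothetical RL action aimed at the obstacle
u_safe = cbf_safety_filter(u_rl, x, h, grad_h, f, g)
print(u_safe)                          # roughly [-0.183, 0.0]: braked, not blocked
```

Because the correction is the minimum-norm shift onto the safe half-space, the filter leaves the learned policy untouched whenever the CBF condition already holds, which is what makes this kind of safety layer compatible with standard RL training.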
Related papers
- Towards provable probabilistic safety for scalable embodied AI systems [79.31011047593492]
Embodied AI systems are increasingly prevalent across various applications.
Ensuring their safety in complex operating environments remains a major challenge.
We introduce provable probabilistic safety, which aims to ensure that the residual risk of large-scale deployment remains below a predefined threshold; a generic concentration-bound sketch follows this entry.
arXiv Detail & Related papers (2025-06-05T15:46:25Z)
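The goal stated in the entry above, certifying that residual risk stays below a predefined threshold, can be made concrete with a standard concentration bound. The sketch below is a generic illustration, not necessarily the paper's method; the function name is hypothetical, and a one-sided Hoeffding bound over i.i.d. evaluation rollouts is only one way to obtain such a certificate.

```python
import math

def residual_risk_upper_bound(failures, trials, confidence=0.99):
    """One-sided Hoeffding upper bound on the true failure probability,
    given `failures` observed across `trials` i.i.d. evaluation rollouts:
    P(p > p_hat + t) <= exp(-2 * trials * t^2)."""
    p_hat = failures / trials
    slack = math.sqrt(math.log(1.0 / (1.0 - confidence)) / (2.0 * trials))
    return min(1.0, p_hat + slack)

# e.g. 3 failures in 10,000 rollouts -> bound of about 0.0155 at 99% confidence
print(residual_risk_upper_bound(3, 10_000))
```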
- Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics [9.263837897126871]
Multi-Agent Reinforcement Learning (MARL) has gained significant traction for solving complex real-world tasks.
The inherent uncertainty in these environments poses substantial challenges to efficient and robust policy learning.
We propose a novel approach that integrates distributional learning with a safety-focused loss function to improve convergence in cooperative MARL tasks.
arXiv Detail & Related papers (2025-01-21T11:31:01Z)
- Stacked Universal Successor Feature Approximators for Safety in Reinforcement Learning [1.2534672170380357]
We investigate the utility of a stacked, continuous-control variation of universal successor feature approximation (USFA) adapted for soft actor-critic (SAC).
Our method improves performance on secondary objectives compared to SAC baselines using an intervening secondary controller such as a runtime assurance (RTA) controller.
arXiv Detail & Related papers (2024-09-06T22:20:07Z)
- Safety-Aware Multi-Agent Learning for Dynamic Network Bridging [0.11249583407496219]
We focus on a dynamic network bridging task, where agents must learn to maintain a communication path between two moving targets.
We integrate a control-theoretic safety filter that enforces collision avoidance through local setpoint updates.
The results suggest that local safety enforcement and decentralized learning can be effectively combined in distributed multi-agent tasks.
arXiv Detail & Related papers (2024-04-02T01:30:41Z)
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Learning Adaptive Safety for Multi-Agent Systems [14.076785738848924]
We show how emergent behavior can be profoundly influenced by the CBF configuration.
We present ASRL, a novel adaptive safe RL framework, to enhance safety and long-term performance.
We evaluate ASRL in a multi-robot system and a competitive multi-agent racing scenario.
arXiv Detail & Related papers (2023-09-19T14:39:39Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Learning Safe Multi-Agent Control with Decentralized Neural Barrier Certificates [19.261536710315028]
We study the multi-agent safe control problem where agents must avoid collisions with static obstacles and with each other while reaching their goals.
Our core idea is to learn the multi-agent control policy jointly with learning the control barrier functions as safety certificates.
We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes; a minimal sketch of the shared certificate-learning objective appears after this list.
arXiv Detail & Related papers (2021-01-14T03:17:17Z)
- Neural Certificates for Safe Control Policies [108.4560749465701]
This paper develops an approach to learn a policy of a dynamical system that is guaranteed to be both provably safe and goal-reaching.
We show the effectiveness of the method to learn both safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.
arXiv Detail & Related papers (2020-06-15T15:14:18Z)
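The two preceding entries share the idea of learning a barrier certificate jointly with the control policy. Below is a minimal sketch of the certificate losses in PyTorch, with hypothetical class and function names; the three hinge terms enforce positivity on safe states, negativity on unsafe states, and a discrete-time decrease condition along sampled transitions. It is a generic formulation under those assumptions, not either paper's released code.

```python
import torch
import torch.nn as nn

class BarrierNet(nn.Module):
    """Scalar certificate h_theta(x); training pushes h > 0 on safe states."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def certificate_loss(h, x_safe, x_unsafe, x_t, x_next, alpha=0.9, eps=0.1):
    """Hinge losses for the three barrier-certificate conditions:
    (i) h >= eps on safe states, (ii) h <= -eps on unsafe states,
    (iii) h(x') >= (1 - alpha) * h(x) + eps along sampled transitions."""
    loss_safe = torch.relu(eps - h(x_safe)).mean()
    loss_unsafe = torch.relu(eps + h(x_unsafe)).mean()
    decrease = h(x_next) - (1.0 - alpha) * h(x_t)
    loss_dyn = torch.relu(eps - decrease).mean()
    return loss_safe + loss_unsafe + loss_dyn
```

Since x_next is produced by rolling out the current policy, minimizing this loss (together with a task-reward term for the policy) shapes the certificate and the controller jointly, which is the core of the joint-learning idea described above.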
This list is automatically generated from the titles and abstracts of the papers on this site.