MAD-PINN: A Decentralized Physics-Informed Machine Learning Framework for Safe and Optimal Multi-Agent Control
- URL: http://arxiv.org/abs/2509.23960v1
- Date: Sun, 28 Sep 2025 16:31:22 GMT
- Title: MAD-PINN: A Decentralized Physics-Informed Machine Learning Framework for Safe and Optimal Multi-Agent Control
- Authors: Manan Tayal, Aditya Singh, Shishir Kolathaya, Somil Bansal,
- Abstract summary: Co-optimizing safety and performance in large-scale multi-agent systems remains a fundamental challenge.<n>We propose MAD-PINN, a decentralized machine learning framework for solving the multi-agent state-constrained optimal control problem.<n> Experiments on multi-agent navigation tasks demonstrate that MAD-PINN achieves superior safety-performance trade-offs, maintains scalability as the number of agents grows, and consistently outperforms state-of-the-art baselines.
- Score: 13.531665564516155
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Co-optimizing safety and performance in large-scale multi-agent systems remains a fundamental challenge. Existing approaches based on multi-agent reinforcement learning (MARL), safety filtering, or Model Predictive Control (MPC) either lack strict safety guarantees, suffer from conservatism, or fail to scale effectively. We propose MAD-PINN, a decentralized physics-informed machine learning framework for solving the multi-agent state-constrained optimal control problem (MASC-OCP). Our method leverages an epigraph-based reformulation of SC-OCP to simultaneously capture performance and safety, and approximates its solution via a physics-informed neural network. Scalability is achieved by training the SC-OCP value function on reduced-agent systems and deploying them in a decentralized fashion, where each agent relies only on local observations of its neighbours for decision-making. To further enhance safety and efficiency, we introduce an Hamilton-Jacobi (HJ) reachability-based neighbour selection strategy to prioritize safety-critical interactions, and a receding-horizon policy execution scheme that adapts to dynamic interactions while reducing computational burden. Experiments on multi-agent navigation tasks demonstrate that MAD-PINN achieves superior safety-performance trade-offs, maintains scalability as the number of agents grows, and consistently outperforms state-of-the-art baselines.
Related papers
- BarrierSteer: LLM Safety via Learning Barrier Steering [83.12893815611052]
BarrierSteer is a novel framework that formalizes safety by embedding learned non-linear safety constraints directly into the model's latent representation space.<n>We show that BarrierSteer substantially reduces adversarial success rates, decreases unsafe generations, and outperforms existing methods.
arXiv Detail & Related papers (2026-02-23T18:19:46Z) - Safe Reinforcement Learning via Recovery-based Shielding with Gaussian Process Dynamics Models [57.006252510102506]
Reinforcement learning (RL) is a powerful framework for optimal decision-making and control but often lacks provable guarantees for safety-critical applications.<n>We introduce a novel recovery-based shielding framework that enables safe RL with a provable safety lower bound for unknown and non-linear continuous dynamical systems.
arXiv Detail & Related papers (2026-02-12T22:03:35Z) - UpSafe$^\circ$C: Upcycling for Controllable Safety in Large Language Models [67.91151588917396]
Large Language Models (LLMs) have achieved remarkable progress across a wide range of tasks, but remain vulnerable to safety risks such as harmful content generation and jailbreak attacks.<n>We propose UpSafe$circ$C, a unified framework for enhancing LLM safety through safety-aware upcycling.<n>Our results highlight a new direction for LLM safety: moving from static alignment toward dynamic, modular, and inference-aware control.
arXiv Detail & Related papers (2025-10-02T16:43:33Z) - Vulnerable Agent Identification in Large-Scale Multi-Agent Reinforcement Learning [49.31650627835956]
Partial agent failure becomes inevitable when systems scale up, making it crucial to identify the subset of agents whose compromise would most severely degrade overall performance.<n>In this paper, we study this Vulnerable Agent Identification (VAI) problem in large-scale multi-agent reinforcement learning (MARL)<n> Experiments show our method effectively identifies more vulnerable agents in large-scale MARL and the rule-based system, fooling system into worse failures, and learning a value function that reveals the vulnerability of each agent.
arXiv Detail & Related papers (2025-09-18T16:03:50Z) - Scalable Safe Multi-Agent Reinforcement Learning for Multi-Agent System [1.0124625066746598]
Existing Multi-Agent Reinforcement Learning (MARL) algorithms that rely solely on reward shaping are ineffective in ensuring safety.<n>We propose a novel framework, Scalable Safe MARL (SS-MARL), to enhance the safety and scalability of MARL methods.<n>We show that SS-MARL achieves a better trade-off between optimality and safety compared to baselines, and its scalability significantly outperforms the latest methods in scenarios with a large number of agents.
arXiv Detail & Related papers (2025-01-23T15:01:19Z) - Think Smart, Act SMARL! Analyzing Probabilistic Logic Shields for Multi-Agent Reinforcement Learning [3.7957452405531265]
Shielded Multi-Agent Reinforcement Learning (SMARL) is a general framework for steering MARL towards norm-compliant outcomes.<n>Our key contributions are:.<n>a novel Probabilistic Logic Temporal Difference (PLTD) update for shielded, independent Q-learning;.<n>a probabilistic logic policy gradient method for shielded PPO with formal safety guarantees for MARL;.<n> comprehensive evaluation across symmetric and asymmetrically shielded $n$ of player game-theoretic benchmarks.
arXiv Detail & Related papers (2024-11-07T16:59:32Z) - DePAint: A Decentralized Safe Multi-Agent Reinforcement Learning Algorithm considering Peak and Average Constraints [1.1549572298362787]
We propose a momentum-based decentralized gradient policy method, DePAint, to solve the problem.
This is the first privacy-preserving fully decentralized multi-agent reinforcement learning algorithm that considers both peak and average constraints.
arXiv Detail & Related papers (2023-10-22T16:36:03Z) - Learning Adaptive Safety for Multi-Agent Systems [14.076785738848924]
We show how emergent behavior can be profoundly influenced by the CBF configuration.
We present ASRL, a novel adaptive safe RL framework, to enhance safety and long-term performance.
We evaluate ASRL in a multi-robot system and a competitive multi-agent racing scenario.
arXiv Detail & Related papers (2023-09-19T14:39:39Z) - Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning [48.667697255912614]
Mean-field reinforcement learning addresses the policy of a representative agent interacting with the infinite population of identical agents.
We propose Safe-M$3$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions.
Our algorithm effectively meets the demand in critical areas while ensuring service accessibility in regions with low demand.
arXiv Detail & Related papers (2023-06-29T15:57:07Z) - Decentralized Safe Multi-agent Stochastic Optimal Control using Deep
FBSDEs and ADMM [16.312625634442092]
We propose a novel safe and scalable decentralized solution for multi-agent control in the presence of disturbances.
Decentralization is achieved by augmenting to each agent's optimization variables, copy variables, for its neighbors.
To enable safe consensus solutions, we incorporate an ADMM-based approach.
arXiv Detail & Related papers (2022-02-22T03:57:23Z) - Lyapunov-Based Reinforcement Learning for Decentralized Multi-Agent
Control [3.3788926259119645]
In decentralized multi-agent control, systems are complex with unknown or highly uncertain dynamics.
Deep reinforcement learning (DRL) is promising to learn the controller/policy from data without the knowing system dynamics.
Existing multi-agent reinforcement learning (MARL) algorithms cannot ensure the closed-loop stability of a multi-agent system.
We propose a new MARL algorithm for decentralized multi-agent control with a stability guarantee.
arXiv Detail & Related papers (2020-09-20T06:11:42Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.