Related papers: Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2510.11824v2
Date: Thu, 23 Oct 2025 08:39:40 GMT
Title: Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
Authors: Simin Li, Zihao Mao, Hanxiao Li, Zonglei Jing, Zhuohang bian, Jun Guo, Li Wang, Zhuoran Han, Ruixiao Xu, Xin Yu, Chengdong Ma, Yuqing Ma, Bo An, Yaodong Yang, Weifeng Lv, Xianglong Liu,
Abstract summary: Building trustworthy Multi-Agent Reinforcement Learning systems requires understanding of robustness.<n>We present a large-scale empirical study comprising over 82,620 experiments to evaluate cooperation, robustness, and resilience in MARL.
Score: 37.910012648322265
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In cooperative Multi-Agent Reinforcement Learning (MARL), it is a common practice to tune hyperparameters in ideal simulated environments to maximize cooperative performance. However, policies tuned for cooperation often fail to maintain robustness and resilience under real-world uncertainties. Building trustworthy MARL systems requires a deep understanding of robustness, which ensures stability under uncertainties, and resilience, the ability to recover from disruptions--a concept extensively studied in control systems but largely overlooked in MARL. In this paper, we present a large-scale empirical study comprising over 82,620 experiments to evaluate cooperation, robustness, and resilience in MARL across 4 real-world environments, 13 uncertainty types, and 15 hyperparameters. Our key findings are: (1) Under mild uncertainty, optimizing cooperation improves robustness and resilience, but this link weakens as perturbations intensify. Robustness and resilience also varies by algorithm and uncertainty type. (2) Robustness and resilience do not generalize across uncertainty modalities or agent scopes: policies robust to action noise for all agents may fail under observation noise on a single agent. (3) Hyperparameter tuning is critical for trustworthy MARL: surprisingly, standard practices like parameter sharing, GAE, and PopArt can hurt robustness, while early stopping, high critic learning rates, and Leaky ReLU consistently help. By optimizing hyperparameters only, we observe substantial improvement in cooperation, robustness and resilience across all MARL backbones, with the phenomenon also generalizing to robust MARL methods across these backbones. Code and results available at https://github.com/BUAA-TrustworthyMARL/adv_marl_benchmark .

Related papers

MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning [16.012761588513026]
Reinforcement Learning with Verifiable Rewards (RLVR) algorithms rely on rigid, uniform, and symmetric trust region mechanisms.<n>We propose Mass-Adaptive Soft Policy Optimization (MASPO), a unified framework designed to harmonize these three dimensions.<n> MASPO integrates a differentiable soft Gaussian gating to maximize gradient utility, a mass-adaptive limiter to balance exploration across the probability spectrum, and an asymmetric risk controller to align update magnitudes with signal confidence.
arXiv Detail & Related papers (2026-02-19T17:05:20Z)
RMBRec: Robust Multi-Behavior Recommendation towards Target Behaviors [26.88506691092044]
We propose Robust Multi-Behavior Recommendation towards Target Behaviors (RMBRec)<n>RMBRec is a robust multi-behavior recommendation framework grounded in an information-theoretic robustness principle.<n>We show that RMBRec outperforms state-of-the-art methods in accuracy and maintains remarkable stability under various noise perturbations.
arXiv Detail & Related papers (2026-01-13T16:34:17Z)
Harnessing Consistency for Robust Test-Time LLM Ensemble [88.55393815158608]
CoRE is a plug-and-play technique that harnesses model consistency for robust LLM ensemble.<n> Token-level consistency captures fine-grained disagreements by applying a low-pass filter to downweight uncertain tokens.<n>Model-level consistency models global agreement by promoting model outputs with high self-confidence.
arXiv Detail & Related papers (2025-10-12T04:18:45Z)
Understanding and Benchmarking the Trustworthiness in Multimodal LLMs for Video Understanding [59.50808215134678]
This study introduces Trust-videoLLMs, a first comprehensive benchmark evaluating 23 state-of-the-art videoLLMs.<n>Results reveal significant limitations in dynamic scene comprehension, cross-modal resilience and real-world risk mitigation.
arXiv Detail & Related papers (2025-06-14T04:04:54Z)
Adaptive Pruning with Module Robustness Sensitivity: Balancing Compression and Robustness [7.742297876120561]
This paper introduces Module Robustness Sensitivity (MRS), a novel metric that quantifies layer-wise sensitivity to adversarial perturbations.<n>We propose Module Robust Pruning and Fine-Tuning (MRPF), an adaptive pruning algorithm compatible with any adversarial training method.
arXiv Detail & Related papers (2024-10-19T18:35:52Z)
Reward-Robust RLHF in LLMs [25.31456438114974]
Large Language Models (LLMs) continue to progress toward more advanced forms of intelligence. The reliance on reward-model-based (RM-based) alignment methods introduces significant challenges. We introduce a reward-robust RLHF framework aimed at addressing these fundamental challenges.
arXiv Detail & Related papers (2024-09-18T02:35:41Z)
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty [40.55653383218379]
This work focuses on learning in distributionally robust Markov games (RMGs) We propose a sample-efficient model-based algorithm (DRNVI) with finite-sample complexity guarantees for learning robust variants of various notions of game-theoretic equilibria.
arXiv Detail & Related papers (2024-04-29T17:51:47Z)
Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms [79.61176746380718]
Multi-Agent Reinforcement Learning (MARL) has shown promising results across several domains. MARL policies often lack robustness and are sensitive to small changes in their environment. We show that we can gain robustness by controlling a policy's Lipschitz constant. We propose a new robust MARL framework, ERNIE, that promotes the Lipschitz continuity of the policies.
arXiv Detail & Related papers (2023-10-16T20:14:06Z)
Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation [57.351098530477124]
We consider one critical type of robustness against spurious correlation, where different portions of the state do not have correlations induced by unobserved confounders. A model that learns such useless or even harmful correlation could catastrophically fail when the confounder in the test case deviates from the training one. Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge.
arXiv Detail & Related papers (2023-07-15T23:53:37Z)
Maximum Entropy Heterogeneous-Agent Reinforcement Learning [45.377385280485065]
Multi-agent reinforcement learning (MARL) has been shown effective for cooperative games in recent years.<n>We propose a unified framework for learning policies to resolve issues related to sample complexity, training instability, and the risk of converging to a suboptimal Nash Equilibrium.<n>Based on the MaxEnt framework, we propose Heterogeneous-Agent Soft Actor-Critic (HASAC) algorithm.<n>We evaluate HASAC on six benchmarks: Bi-DexHands, Multi-Agent MuJoCo, StarCraft Challenge, Google Research Football, Multi-Agent Particle Environment, and Light Aircraft Game.
arXiv Detail & Related papers (2023-06-19T06:22:02Z)
Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment. We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.