Related papers: The Subtle Art of Defection: Understanding Uncooperative Behaviors in LLM based Multi-Agent Systems

URL: http://arxiv.org/abs/2511.15862v1
Date: Wed, 19 Nov 2025 20:39:19 GMT
Title: The Subtle Art of Defection: Understanding Uncooperative Behaviors in LLM based Multi-Agent Systems
Authors: Devang Kulshreshtha, Wanyu Du, Raghav Jain, Srikanth Doss, Hang Su, Sandesh Swamy, Yanjun Qi
Abstract summary: This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse multi-agent systems. Our framework includes two key components: (1) a game theory-based taxonomy of uncooperative agent behaviors, and (2) a multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve.
Score: 22.357102759752234
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces a novel framework for simulating and analyzing how uncooperative behaviors can destabilize or collapse LLM-based multi-agent systems. Our framework includes two key components: (1) a game theory-based taxonomy of uncooperative agent behaviors, addressing a notable gap in the existing literature; and (2) a structured, multi-stage simulation pipeline that dynamically generates and refines uncooperative behaviors as agents' states evolve. We evaluate the framework via a collaborative resource management setting, measuring system stability using metrics such as survival time and resource overuse rate. Empirically, our framework achieves 96.7% accuracy in generating realistic uncooperative behaviors, validated by human evaluations. Our results reveal a striking contrast: cooperative agents maintain perfect system stability (100% survival over 12 rounds with 0% resource overuse), while any uncooperative behavior can trigger rapid system collapse within 1 to 7 rounds. These findings demonstrate that uncooperative agents can significantly degrade collective outcomes, highlighting the need for designing more resilient multi-agent systems.

Related papers

Guided Collaboration in Heterogeneous LLM-Based Multi-Agent Systems via Entropy-Based Understanding Assessment and Experience Retrieval [35.96356869281219]
We describe a counterintuitive phenomenon in the strong-weak system: a strong-weak collaboration may under-perform weak-weak combinations. We propose an Entropy-Based Adaptive Guidance Framework that dynamically aligns the guidance with the cognitive state of each agent. Our approach consistently enhances the effectiveness and stability of heterogeneous collaboration.
arXiv Detail & Related papers (2026-02-14T07:10:04Z)
Beyond Task Performance: A Metric-Based Analysis of Sequential Cooperation in Heterogeneous Multi-Agent Destructive Foraging [41.439643274006364]
This work addresses the problem of analyzing cooperation in heterogeneous multi-agent systems. The proposed suite of metrics is structured into three main categories that jointly provide a multilevel characterization of cooperation. They have been validated in a realistic destructive foraging scenario inspired by dynamic aquatic surface cleaning using heterogeneous autonomous vehicles.
arXiv Detail & Related papers (2026-02-11T09:39:24Z)
MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning [53.37068897861388]
MedSAM-Agent is a framework that reformulates interactive segmentation as a multi-step autonomous decision-making process. We develop a two-stage training pipeline that integrates multi-turn, end-to-end outcome verification. Experiments across 6 medical modalities and 21 datasets demonstrate that MedSAM-Agent achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-02-03T09:47:49Z)
Bio-inspired Agentic Self-healing Framework for Resilient Distributed Computing Continuum Systems [4.003029907200818]
ReCiSt is a bio-inspired agentic self-healing framework designed to achieve resilience in Distributed Computing Continuum Systems (DCCS). ReCiSt reconstructs the biological phases of Hemostasis, Inflammation, Proliferation, and Remodeling into the computational layers Containment, Diagnosis, Meta-Cognitive, and Knowledge for DCCS. These four layers perform autonomous fault isolation, causal diagnosis, adaptive recovery, and long-term knowledge consolidation through Language Model (LM)-powered agents.
arXiv Detail & Related papers (2026-01-01T13:30:38Z)
Causal symmetrization as an empirical signature of operational autonomy in complex systems [0.0]
Theoretical biology has proposed that autonomous systems sustain their identity through reciprocal constraints between structure and activity. We empirically assess this framework in artificial sociotechnical systems by identifying a statistical signature consistent with operational autonomy.
arXiv Detail & Related papers (2025-12-09T10:32:39Z)
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia [100.74015791021044]
Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction. Existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains.
arXiv Detail & Related papers (2025-12-03T00:11:05Z)
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering [90.84806758077536]
We introduce LoCoBench-Agent, a comprehensive evaluation framework specifically designed to assess large language model (LLM) agents in realistic, long-context software engineering. Our framework extends LoCoBench's 8,000 scenarios into interactive agent environments, enabling systematic evaluation of multi-turn conversations. Our framework provides agents with 8 specialized tools (file operations, search, code analysis) and evaluates them across context lengths ranging from 10K to 1M tokens.
arXiv Detail & Related papers (2025-11-17T23:57:24Z)
Emergent Coordination in Multi-Agent Language Models [2.504366738288215]
We introduce an information-theoretic framework to test whether multi-agent systems show signs of higher-order structure. This information decomposition lets us measure whether dynamical emergence is present in multi-agent LLM systems. We apply our framework to experiments using a simple guessing game without direct agent communication.
arXiv Detail & Related papers (2025-10-05T11:26:41Z)
AgentCompass: Towards Reliable Evaluation of Agentic Workflows in Production [4.031479494871582]
We present AgentCompass, the first evaluation framework designed specifically for post-deployment monitoring and reasoning of agentic pipelines. AgentCompass achieves state-of-the-art results on key metrics, while uncovering critical issues missed in human annotations.
arXiv Detail & Related papers (2025-09-18T05:59:04Z)
Organ-Agents: Virtual Human Physiology Simulator via LLMs [66.40796430669158]
Organ-Agents is a multi-agent framework that simulates human physiology via LLM-driven agents. We curated data from 7,134 sepsis patients and 7,895 controls, generating high-resolution trajectories across 9 systems and 125 variables. Organ-Agents achieved high simulation accuracy on 4,509 held-out patients, with per-system MSEs 0.16 and robustness across SOFA-based severity strata.
arXiv Detail & Related papers (2025-08-20T01:58:45Z)
From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario [3.5262044630932254]
Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints. Despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited.
arXiv Detail & Related papers (2025-08-06T17:54:10Z)
Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems [0.0]
This report addresses the early stages of risk identification and analysis for multi-agent AI systems. We examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics.
arXiv Detail & Related papers (2025-08-06T06:06:57Z)
Multi-Agent Collaboration via Evolving Orchestration [55.574417128944226]
Large language models (LLMs) have achieved remarkable results across diverse downstream tasks, but their monolithic nature restricts scalability and efficiency in complex problem-solving. We propose a puppeteer-style paradigm for LLM-based multi-agent collaboration, where a centralized orchestrator ("puppeteer") dynamically directs agents ("puppets") in response to evolving task states. Experiments on closed- and open-domain scenarios show that this method achieves superior performance with reduced computational costs.
arXiv Detail & Related papers (2025-05-26T07:02:17Z)
Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis [55.13545823385091]
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. In real-world applications, each agent may experience slightly different transition dynamics, leading to inherent model mismatches. We show that even moderate levels of information sharing significantly mitigate environment-specific errors.
arXiv Detail & Related papers (2025-03-21T18:06:28Z)
Collaboration Dynamics and Reliability Challenges of Multi-Agent LLM Systems in Finite Element Analysis [3.437656066916039]
How interagent dynamics influence reasoning quality and verification reliability remains unclear. We study these mechanisms using an AutoGen-based multi-agent framework for linear-elastic Finite Element Analysis (FEA). From 1,120 controlled trials, we find that collaboration effectiveness depends more on functional complementarity than team size.
arXiv Detail & Related papers (2024-08-23T23:11:08Z)
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors [93.38830440346783]
We propose a multi-agent framework, AgentVerse, that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that the AgentVerse framework can effectively deploy multi-agent groups that outperform a single agent. In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.