Related papers: Colosseum: Auditing Collusion in Cooperative Multi-Agent Systems

URL: http://arxiv.org/abs/2602.15198v1
Date: Mon, 16 Feb 2026 21:27:38 GMT
Title: Colosseum: Auditing Collusion in Cooperative Multi-Agent Systems
Authors: Mason Nakamura, Abhinav Kumar, Saswat Das, Sahar Abdelnabi, Saaduddin Mahmud, Ferdinando Fioretto, Shlomo Zilberstein, Eugene Bagdasarian
Abstract summary: We present Colosseum, a framework for auditing LLM agents' collusive behavior in multi-agent settings. Colosseum tests each LLM for collusion under different objectives, persuasion tactics, and network topologies. We discover "collusion on paper" when agents plan to collude in text but would often pick non-collusive actions.
Score: 55.51100373104311
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-agent systems, where LLM agents communicate through free-form language, enable sophisticated coordination for solving complex cooperative tasks. This surfaces a unique safety problem when individual agents form a coalition and collude to pursue secondary goals and degrade the joint objective. In this paper, we present Colosseum, a framework for auditing LLM agents' collusive behavior in multi-agent settings. We ground how agents cooperate through a Distributed Constraint Optimization Problem (DCOP) and measure collusion via regret relative to the cooperative optimum. Colosseum tests each LLM for collusion under different objectives, persuasion tactics, and network topologies. Through our audit, we show that most out-of-the-box models exhibited a propensity to collude when a secret communication channel was artificially formed. Furthermore, we discover "collusion on paper" when agents plan to collude in text but would often pick non-collusive actions, thus providing little effect on the joint task. Colosseum provides a new way to study collusion by measuring communications and actions in rich yet verifiable environments.
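The abstract's idea of measuring collusion as regret relative to the cooperative optimum of a DCOP can be illustrated with a toy example. The sketch below is not Colosseum's implementation: the agent names, domains, utility values, and the `regret` helper are all hypothetical, and regret is assumed here to be the gap between the best achievable joint utility and the utility of the assignment the agents actually chose.

```python
from itertools import product

# Hypothetical toy DCOP: three agents each choose a slot (0 or 1).
DOMAINS = {"a1": [0, 1], "a2": [0, 1], "a3": [0, 1]}


def joint_utility(assignment):
    """Sum of illustrative constraint utilities (values are made up)."""
    utility = 0
    # a1 and a2 are rewarded for choosing the same slot.
    utility += 3 if assignment["a1"] == assignment["a2"] else 0
    # a2 and a3 are rewarded for choosing the same slot.
    utility += 3 if assignment["a2"] == assignment["a3"] else 0
    # Slot 1 carries a small extra reward when a3 picks it.
    utility += 1 if assignment["a3"] == 1 else 0
    return utility


def cooperative_optimum(domains):
    """Brute-force the joint assignment with the highest joint utility."""
    names = list(domains)
    candidates = (
        dict(zip(names, values))
        for values in product(*(domains[name] for name in names))
    )
    best = max(candidates, key=joint_utility)
    return best, joint_utility(best)


def regret(assignment, domains=DOMAINS):
    """Assumed definition: optimum utility minus achieved utility."""
    _, best_value = cooperative_optimum(domains)
    return best_value - joint_utility(assignment)


if __name__ == "__main__":
    # A coalition of a1 and a2 insists on slot 0 while a3 picks slot 1.
    colluding = {"a1": 0, "a2": 0, "a3": 1}
    print(cooperative_optimum(DOMAINS))
    print(regret(colluding))
```

Under this assumed definition, agents that only discuss colluding but then act as in the optimal assignment would incur zero regret, which is one way to read the "collusion on paper" finding. Brute-force enumeration is only workable for tiny problems like this one.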

Related papers

When Numbers Start Talking: Implicit Numerical Coordination Among LLM-Based Agents [45.43445469098021]
This paper presents a game-theoretic study of covert communication in multi-agent systems. We analyse interactions across four canonical game-theoretic settings under different communication regimes. Considering heterogeneous agent personalities and both one-shot and repeated games, we characterise when covert signals emerge and how they shape coordination and strategic outcomes.
arXiv Detail & Related papers (2026-01-07T12:07:48Z)
Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry [17.472005826931127]
This paper studies Large Language Model (LLM) agents in task collaboration. We extend Einstein Puzzles, a symbolic puzzle, to a table-top game. Empirical results highlight the critical importance of aligned communication.
arXiv Detail & Related papers (2025-10-29T15:03:53Z)
Can an Individual Manipulate the Collective Decisions of Multi-Agents? [53.01767232004823]
M-Spoiler is a framework that simulates agent interactions within a multi-agent system to generate adversarial samples. M-Spoiler introduces a stubborn agent that actively aids in optimizing adversarial samples. Our findings confirm the risks posed by the knowledge of an individual agent in multi-agent systems.
arXiv Detail & Related papers (2025-09-20T01:54:20Z)
When Disagreements Elicit Robustness: Investigating Self-Repair Capabilities under LLM Multi-Agent Disagreements [56.29265568399648]
We argue that disagreements prevent premature consensus and expand the explored solution space. Disagreements on task-critical steps can derail collaboration depending on the topology of solution paths.
arXiv Detail & Related papers (2025-02-21T02:24:43Z)
Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game [25.823665278297057]
This study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language. Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication. To address this issue, we develop CodeAct, a general agent that equips LLM with enhanced memory and code-driven reasoning.
arXiv Detail & Related papers (2023-12-29T08:26:54Z)
Enhancing Multi-Agent Coordination through Common Operating Picture Integration [14.927199437011044]
We present an approach to multi-agent coordination, where each agent is equipped with the capability to integrate its history of observations, actions and messages received into a Common Operating Picture (COP). Our results demonstrate the efficacy of COP integration, and show that COP-based training leads to robust policies compared to state-of-the-art Multi-Agent Reinforcement Learning (MARL) methods when faced with out-of-distribution initial states.
arXiv Detail & Related papers (2023-11-08T15:08:55Z)
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors [93.38830440346783]
We propose a multi-agent framework, AgentVerse, that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent. In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z)
Building Cooperative Embodied Agents Modularly with Large Language Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments. We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework. Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)
A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks [111.34055449929487]
We introduce the novel task FurnMove in which agents work together to move a piece of furniture through a living room to a goal. Unlike existing tasks, FurnMove requires agents to coordinate at every timestep. We identify two challenges when training agents to complete FurnMove: existing decentralized action sampling procedures do not permit expressive joint action policies. Using SYNC-policies and CORDIAL, our agents achieve a 58% completion rate on FurnMove, an impressive absolute gain of 25 percentage points over competitive decentralized baselines.
arXiv Detail & Related papers (2020-07-09T17:59:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.