Related papers: COMMA: A Communicative Multimodal Multi-Agent Benchmark

Related papers

MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games [70.37904949359938]
We evaluate language models in multi-turn interactions using a suite of collaborative games that require effective communication about private information.<n>We find that language models are unable to use interactive collaboration to improve over the non-interactive baseline scenario.<n>We analyze the linguistic features of these dialogues, assessing the roles of sycophancy, information density, and discourse coherence.
arXiv Detail & Related papers (2026-02-27T17:13:20Z)
AnyMAC: Cascading Flexible Multi-Agent Collaboration via Next-Agent Prediction [70.60422261117816]
We propose a new framework that rethinks multi-agent coordination through a sequential structure rather than a graph structure.<n>Our method focuses on two key directions: (1) Next-Agent Prediction, which selects the most suitable agent role at each step, and (2) Next-Context Selection, which enables each agent to selectively access relevant information from any previous step.
arXiv Detail & Related papers (2025-06-21T18:34:43Z)
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents [59.825725526176655]
Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents. Existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition. We introduce MultiAgentBench, a benchmark designed to evaluate LLM-based multi-agent systems across diverse, interactive scenarios.
arXiv Detail & Related papers (2025-03-03T05:18:50Z)
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems [23.379992200838053]
Large language model-based multi-agent systems have recently gained significant attention due to their potential for complex, collaborative, and intelligent problem-solving capabilities.<n>Existing surveys typically categorize LLM-MAS according to their application domains or architectures, overlooking the central role of communication in coordinating agent behaviors and interactions.<n>This review aims to help researchers and practitioners gain a clear understanding of the communication mechanisms in LLM-MAS, thereby facilitating the design and deployment of robust, scalable, and secure multi-agent systems.
arXiv Detail & Related papers (2025-02-20T07:18:34Z)
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z)
Towards Effective GenAI Multi-Agent Collaboration: Design and Evaluation for Enterprise Applications [15.480315462362531]
This report presents a comprehensive evaluation of coordination and routing capabilities in a novel multi-agent collaboration framework. For coordination capabilities, we demonstrate the effectiveness of inter-agent communication and payload referencing mechanisms, achieving end-to-end goal success rates of 90%. Our analysis yields several key findings: multi-agent collaboration enhances goal success rates by up to 70% compared to single-agent approaches in our benchmarks.
arXiv Detail & Related papers (2024-12-06T22:14:17Z)
Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph. We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
arXiv Detail & Related papers (2024-11-01T05:56:51Z)
BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems [15.159418172629701]
Large Language Models (LLMs) are becoming increasingly powerful and capable of handling complex tasks. Compared to single agents, multi-agent systems have higher requirements for the collaboration capabilities of language models. We propose a benchmark, called BattleAgentBench, which defines seven sub-stages of three varying difficulty levels.
arXiv Detail & Related papers (2024-08-28T17:43:55Z)
Improving Multi-Agent Debate with Sparse Communication Topology [9.041025703879905]
Multi-agent debate has proven effective in improving large language models quality for reasoning and factuality tasks. In this paper, we investigate the effect of communication connectivity in multi-agent systems. Our experiments on GPT and Mistral models reveal that multi-agent debates leveraging sparse communication topology can achieve comparable or superior performance.
arXiv Detail & Related papers (2024-06-17T17:33:09Z)
Scaling Large-Language-Model-based Multi-Agent Collaboration [75.5241464256688]
Pioneering advancements in large language model-powered agents have underscored the design pattern of multi-agent collaboration. Inspired by the neural scaling law, this study investigates whether a similar principle applies to increasing agents in multi-agent collaboration.
arXiv Detail & Related papers (2024-06-11T11:02:04Z)
Large Language Model Enhanced Multi-Agent Systems for 6G Communications [94.45712802626794]
We propose a multi-agent system with customized communication knowledge and tools for solving communication related tasks using natural language. We validate the effectiveness of the proposed multi-agent system by designing a semantic communication system.
arXiv Detail & Related papers (2023-12-13T02:35:57Z)
Multi-Agent Consensus Seeking via Large Language Models [6.922356864800498]
Multi-agent systems driven by large language models (LLMs) have shown promising abilities for solving complex tasks in a collaborative manner. This work considers a fundamental problem in multi-agent collaboration: consensus seeking.
arXiv Detail & Related papers (2023-10-31T03:37:11Z)
Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs) To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities. We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors [93.38830440346783]
We propose a multi-agent framework framework that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that framework framework can effectively deploy multi-agent groups that outperform a single agent. In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z)
Building Cooperative Embodied Agents Modularly with Large Language Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments. We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework. Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society [58.04479313658851]
This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents. We propose a novel communicative agent framework named role-playing. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems.
arXiv Detail & Related papers (2023-03-31T01:09:00Z)
The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning [6.18778092044887]
Many real-world problems require the coordination of multiple autonomous agents. Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination. We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allows it to significantly outperform a cooperative team of agents.
arXiv Detail & Related papers (2020-08-06T12:48:08Z)
Learning Individually Inferred Communication for Multi-Agent Cooperation [37.56115000150748]
We propose Individually Inferred Communication (I2C) to enable agents to learn a prior for agent-agent communication. The prior knowledge is learned via causal inference and realized by a feed-forward neural network. I2C can not only reduce communication overhead but also improve the performance in a variety of multi-agent cooperative scenarios.
arXiv Detail & Related papers (2020-06-11T14:07:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.