From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
- URL: http://arxiv.org/abs/2508.04691v1
- Date: Wed, 06 Aug 2025 17:54:10 GMT
- Title: From MAS to MARS: Coordination Failures and Reasoning Trade-offs in Hierarchical Multi-Agent Robotic Systems within a Healthcare Scenario
- Authors: Yuanchen Bai, Zijian Ding, Shaoyue Wen, Xiang Chang, Angelique Taylor,
- Abstract summary: Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints. Despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited.
- Score: 3.5262044630932254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-agent robotic systems (MARS) build upon multi-agent systems by integrating physical and task-related constraints, increasing the complexity of action execution and agent coordination. However, despite the availability of advanced multi-agent frameworks, their real-world deployment on robots remains limited, hindering the advancement of MARS research in practice. To bridge this gap, we conducted two studies to investigate performance trade-offs of hierarchical multi-agent frameworks in a simulated real-world multi-robot healthcare scenario. In Study 1, using CrewAI, we iteratively refine the system's knowledge base to systematically identify and categorize coordination failures (e.g., tool access violations, lack of timely handling of failure reports) not resolvable by providing contextual knowledge alone. In Study 2, using AutoGen, we evaluate a redesigned bidirectional communication structure and further measure the trade-offs between reasoning and non-reasoning models operating within the same robotic team setting. Drawing from our empirical findings, we emphasize the tension between autonomy and stability and the importance of edge-case testing to improve system reliability and safety for future real-world deployment. Supplementary materials, including code, task agent setup, trace outputs, and annotated examples of coordination failures and reasoning behaviors, are available at: https://byc-sophie.github.io/mas-to-mars/.
Related papers
- Agentic Web: Weaving the Next Web with AI Agents [109.13815627467514]
The emergence of AI agents powered by large language models (LLMs) marks a pivotal shift toward the Agentic Web. In this paradigm, agents interact directly with one another to plan, coordinate, and execute complex tasks on behalf of users. We present a structured framework for understanding and building the Agentic Web.
arXiv Detail & Related papers (2025-07-28T17:58:12Z)
- Deep Research Agents: A Systematic Examination And Roadmap [79.04813794804377]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks. In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
- From Virtual Agents to Robot Teams: A Multi-Robot Framework Evaluation in High-Stakes Healthcare Context [2.016235597066821]
Current frameworks treat agents as conceptual task executors rather than physically embodied entities. We propose three design guidelines emphasizing process transparency, proactive failure recovery, and contextual grounding. Our work informs the development of more resilient and robust multi-agent robotic systems.
arXiv Detail & Related papers (2025-06-04T04:05:38Z)
- Distinguishing Autonomous AI Agents from Collaborative Agentic Systems: A Comprehensive Framework for Understanding Modern Intelligent Architectures [0.0]
The emergence of large language models has catalyzed two distinct yet interconnected paradigms in artificial intelligence: standalone AI Agents and collaborative Agentic AI ecosystems. This study establishes a definitive framework for distinguishing these architectures through systematic analysis of their operational principles, structural compositions, and deployment methodologies.
arXiv Detail & Related papers (2025-06-02T08:52:23Z)
- SentinelAgent: Graph-based Anomaly Detection in Multi-Agent Systems [11.497269773189254]
We present a system-level anomaly detection framework tailored for large language model (LLM)-based multi-agent systems (MAS). We propose a graph-based framework that models agent interactions as dynamic execution graphs, enabling semantic anomaly detection at node, edge, and path levels. Second, we introduce a pluggable SentinelAgent, an LLM-powered oversight agent that observes, analyzes, and intervenes in MAS execution based on security policies and contextual reasoning.
arXiv Detail & Related papers (2025-05-30T04:25:19Z)
- From Glue-Code to Protocols: A Critical Analysis of A2A and MCP Integration for Scalable Agent Systems [0.8909482883800253]
Two open standards, Google's Agent to Agent (A2A) protocol for inter-agent communication and Anthropic's Model Context Protocol (MCP) for standardized tool access, promise to overcome the limitations of fragmented, custom integration approaches. This paper argues that effectively integrating A2A and MCP presents unique, emergent challenges at their intersection.
arXiv Detail & Related papers (2025-05-06T16:40:39Z)
- Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive. We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
- RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints [27.467048581838405]
We propose the concept of compositional constraints for embodied multi-agent systems. We design interfaces tailored to different types of constraints, enabling seamless interaction with the physical world. We introduce the first benchmark for embodied multi-agent manipulation, RoboFactory.
arXiv Detail & Related papers (2025-03-20T17:58:38Z)
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents [59.825725526176655]
Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents. Existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition. We introduce MultiAgentBench, a benchmark designed to evaluate LLM-based multi-agent systems across diverse, interactive scenarios.
arXiv Detail & Related papers (2025-03-03T05:18:50Z)
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
- MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy [68.7563053122698]
We propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet).
This is the first system that introduces the concept of multimodal redundancy to address sensor failure issues during deployment.
We present a new label-free multi-modal consistency (MC) score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty.
arXiv Detail & Related papers (2022-10-19T19:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.