FuguReport

Summary

This week's papers center on how to organize LLM-based multi-agent systems for complex, real-world tasks. The focus moves beyond single-agent designs toward frameworks that combine hierarchical planning, evolving coordination, or self-organization to improve adaptability and scalability.

Situation

Although LLMs have grown strong at reasoning, feedback handling, and some multimodal tasks, single agents remain limited when tasks require sustained coordination, tool interaction, and decomposition into many subtasks. AutoGen presents multi-agent conversation as a general abstraction for combining LLMs, tools, and humans, arguing that conversable agents and flexible conversation patterns can support applications across domains.

The newer representative papers push this agenda toward more robust organizational structures. AgentOrchestra emphasizes hierarchical planning with specialized sub-agents to address weak generalization, multimodal reasoning limits, and poor maintainability. AgentNet argues that centralized controllers and fixed workflows create bottlenecks in scalability, fault tolerance, and privacy-sensitive cross-organizational collaboration. Together, these papers mark a shift from simple multi-agent role playing toward systems designed around extensibility, dynamic coordination, and real-world operational constraints.

Infographic (English): LLM Multi-Agent Frameworks, situation

Progress

OrgAgent: Organize Your Multi-Agent System like a Company

OrgAgent proposes a company-style hierarchical framework for organizing multi-agent systems, mapping corporate structures onto agent coordination. In comparisons with flatter or ad hoc multi-agent designs, it reports evidence that an enterprise-like hierarchy can outperform other organizational structures.

Experience as a Compass: Multi-agent RAG with Evolving Orchestration and Agent Prompts

This paper introduces HERA, a hierarchical framework that jointly evolves orchestration patterns and role-specific agent prompts from accumulated experience. In contrast to fixed workflows, it reports emergent self-organization into compact, high-utility multi-agent networks, with substantial gains over recent baselines.

Drop the Hierarchy and Roles: How Self-Organizing LLM Agents Outperform Designed Structures

This study tests self-organizing LLM agents across 25,000 tasks, varying agent counts (4–256) and eight coordination protocols. Rather than assuming designed hierarchies are necessary, it finds that useful autonomous coordination already emerges across both open- and closed-source models.

An Empirical Study of Multi-Agent Collaboration for Automated Research

This empirical study compares single-agent, sub-agent, and agent-team structures for automated ML research optimization. Rather than proposing a new framework, it surfaces fundamental trade-offs between operational stability and exploratory breadth across multi-agent designs.

ClinicalAgents: Multi-Agent Orchestration for Clinical Decision Making with Dual-Memory

ClinicalAgents applies multi-agent orchestration to clinical decision making, using specialist-like agents with dual-memory support. Instead of a strict sequential chain, it employs an MCTS-based dynamic orchestrator that can generate hypotheses, verify evidence, and backtrack when information is missing.
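
To make the idea of an MCTS-based orchestrator more concrete, the sketch below shows only the selection step found in many Monte Carlo tree search variants: choosing the next action with the UCB1 rule. It is an illustration under stated assumptions, not the ClinicalAgents implementation. The summary above does not say which selection rule the paper uses, and the function name, action names, statistics, and exploration constant are all hypothetical.

```python
import math


def ucb1_select(stats, total_visits, c=1.4):
    """Return the action with the highest UCB1 score.

    stats maps an action name to (visits, total_reward).
    total_visits is the sum of visits over all actions (assumed >= 1
    whenever every action has been visited at least once).
    c is a hypothetical exploration constant, not taken from the paper.
    """
    best_action, best_score = None, float("-inf")
    for action, (visits, total_reward) in stats.items():
        if visits == 0:
            # Unvisited actions are tried before any scoring happens.
            return action
        mean_reward = total_reward / visits
        exploration = c * math.sqrt(math.log(total_visits) / visits)
        score = mean_reward + exploration
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Hypothetical orchestrator actions with made-up statistics.
stats = {
    "generate_hypothesis": (10, 6.0),
    "verify_evidence": (2, 1.5),
    "backtrack": (0, 0.0),
}
next_action = ucb1_select(stats, total_visits=12)
```

With these made-up numbers the unvisited action is chosen first. Once every action has been tried, the rule trades off average reward against how rarely an action has been explored, which is one plausible way an orchestrator could decide between gathering more evidence and backing out of a weak hypothesis.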

Outlook

Near-term progress in LLM multi-agent systems is likely to focus on coordination policies: adaptive routing, lightweight orchestration, and mechanisms for discovering or reassigning specialist agents in large heterogeneous pools. The future-work sections of representative papers call for optimal workflow design, decentralized routing, and efficiency improvements. This week's papers reinforce these directions through new hierarchical organizers, jointly evolved coordination structures, and large-scale evidence that self-organization can emerge without fully fixed roles.

A second direction is deployment-oriented robustness: richer domain-specific tool use, stronger data and memory infrastructure, and better observability and safety controls. This week's movement into automated research and clinical decision support aligns with representative papers that call for broader specialist ecosystems, improved multimodal tool generation, and clearer monitoring and human oversight as these systems approach real-world deployment.

Infographic (English): LLM Multi-Agent Frameworks, outlook

This page was created using generative AI such as GPT-5, Claude Opus 4, Gemini 3, Gemini 3.1 Flash Image, and their higher-end successor versions. No guarantee can be made regarding its contents.