FuguReport

OrgAgent: Organize Your Multi-Agent System like a Company

Authors: Yiru Wang, Xinyue Shen, Yaohui Han, Michael Backes, Pin-Yu Chen, Tsung-Yi Ho
Affiliations: CISPA Helmholtz Center for Information Security / IBM / The Chinese University of Hong Kong
Categories: Method / Multi-Agent Organization / Hierarchical corporate-style system design; Application / Multi-Agent Systems / Organizing agents like a company; Theory / Organizational Models / Comparative organizational structure analysis
License: CC BY 4.0

Abstract Overview

This paper introduces OrgAgent, a company-style hierarchical multi-agent system that separates collaboration into governance, execution, and compliance layers. The framework defines corporate-inspired roles (CEO, CTO, COO, Drafter, Reviewer, Specialist, CSO, CCO) and supports multiple execution modes (DIRECT, LIGHT MAS, FULL MAS) and policies (STRICT, BALANCE, NOCAP, AUTO). The authors evaluate hierarchical and flat organizations on MuSiQue, MuSR, and SQuAD 2.0 using GPT-5 mini, GPT-OSS-120B, and LLaMA-3.1-8B. Results indicate that hierarchical organization generally improves performance over flat and single-agent baselines on MuSiQue and SQuAD 2.0, and that it reduces token consumption relative to flat collaboration in all reported settings. Performance results on MuSR are mixed.
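The vocabulary named above (layers, roles, execution modes, policies) can be written down as a small configuration sketch. This is only an illustration of the terms listed in this overview, not the authors' implementation: the class names, the `OrgConfig` type, and the `all_configs` helper are hypothetical, and the overview does not state which roles belong to which layer or whether every mode/policy pairing is actually used in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    GOVERNANCE = "governance"
    EXECUTION = "execution"
    COMPLIANCE = "compliance"


class Role(Enum):
    # Role names as listed in the overview; their layer assignment is not
    # stated there, so no mapping is encoded here.
    CEO = "CEO"
    CTO = "CTO"
    COO = "COO"
    DRAFTER = "Drafter"
    REVIEWER = "Reviewer"
    SPECIALIST = "Specialist"
    CSO = "CSO"
    CCO = "CCO"


class ExecutionMode(Enum):
    DIRECT = "DIRECT"
    LIGHT_MAS = "LIGHT MAS"
    FULL_MAS = "FULL MAS"


class Policy(Enum):
    STRICT = "STRICT"
    BALANCE = "BALANCE"
    NOCAP = "NOCAP"
    AUTO = "AUTO"


@dataclass(frozen=True)
class OrgConfig:
    """Hypothetical container pairing one execution mode with one policy."""
    mode: ExecutionMode
    policy: Policy


def all_configs():
    """Enumerate every mode/policy pairing (an assumption: the paper may
    not evaluate or even permit all of them)."""
    return [OrgConfig(m, p) for m in ExecutionMode for p in Policy]
```

Under these assumptions the configuration space is the product of three modes and four policies, which gives a sense of how many settings a full sweep would cover.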

Novelty

The paper treats organizational structure itself as the central variable in multi-agent system design and evaluation, rather than focusing solely on local interaction mechanisms. It proposes a corporate-style hierarchy with explicit governance, execution, and compliance layers, combined with configurable execution modes and policies, and provides the first systematic empirical comparison of flat versus hierarchical multi-agent systems (MAS) on general reasoning tasks.

Results

Hierarchical OrgAgent achieves the strongest results on MuSiQue and SQuAD 2.0 for all three tested models, with reported gains over flat MAS ranging from +18.97% to +123.99% on MuSiQue and +58.96% to +120.47% on SQuAD 2.0. It consistently uses fewer tokens than flat MAS, with reductions ranging from 46.38% to 79.31% across all benchmarks and models. However, on MuSR, flat organization outperforms hierarchical coordination for GPT-OSS-120B and LLaMA-3.1-8B.
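The percentages above are easiest to read as changes relative to the flat MAS baseline. Gains above 100% could not be absolute percentage-point differences on a 0 to 100 score scale, which supports that reading, but this summary does not state the exact formula. The sketch below therefore assumes the standard relative-change definitions; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_gain_pct(hier_score: float, flat_score: float) -> float:
    """Relative improvement of the hierarchical score over the flat score,
    in percent (assumed definition)."""
    if flat_score == 0:
        raise ValueError("flat_score must be non-zero")
    return (hier_score - flat_score) / flat_score * 100.0


def token_reduction_pct(hier_tokens: float, flat_tokens: float) -> float:
    """Share of flat-MAS tokens saved by the hierarchical system,
    in percent (assumed definition)."""
    if flat_tokens == 0:
        raise ValueError("flat_tokens must be non-zero")
    return (flat_tokens - hier_tokens) / flat_tokens * 100.0


# Made-up inputs purely for illustration, not results from the paper.
example_gain = relative_gain_pct(30.0, 20.0)
example_saving = token_reduction_pct(250.0, 1000.0)
```

With these made-up inputs, a score rising from 20 to 30 is a 50% relative gain, and using 250 tokens instead of 1000 is a 75% reduction.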

Key Points

  1. OrgAgent structures multi-agent reasoning into governance, execution, and compliance layers with distinct corporate-style roles, a skill-based worker pool, and configurable execution modes and policies.
  2. Hierarchical coordination outperforms flat collaboration on MuSiQue and SQuAD 2.0 across all three models while consistently reducing token usage by 46–79%, though on MuSR flat organization remains better for GPT-OSS-120B and LLaMA-3.1-8B.
  3. Coordination behavior analysis reveals model-dependent skill specialization patterns and substantially higher abstention rates (up to 39.78%) on unanswerable SQuAD 2.0 questions under hierarchical policies compared to near-zero abstention in flat and baseline settings.
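The abstention rate in the third point can be read as the share of unanswerable questions on which the system declines to answer. A minimal sketch of that metric follows, assuming an abstention is represented as `None`; how the paper actually detects abstentions is not described in this summary, and the function name and sample data are hypothetical.

```python
def abstention_rate_pct(predictions, answerable):
    """Percent of unanswerable questions on which the system abstained.

    predictions: list of answers, with None standing for an abstention
                 (an assumed encoding).
    answerable:  list of booleans, True if the question has an answer.
    """
    unanswerable_preds = [p for p, a in zip(predictions, answerable) if not a]
    if not unanswerable_preds:
        raise ValueError("no unanswerable questions in the sample")
    abstained = sum(1 for p in unanswerable_preds if p is None)
    return abstained / len(unanswerable_preds) * 100.0


# Made-up sample: four unanswerable questions, one abstention.
sample_rate = abstention_rate_pct(
    [None, "Paris", "1999", "blue", "n/a"],
    [False, True, False, False, False],
)
```

A near-zero rate under this definition means the system almost always produces some answer even when none exists, which is the behavior the summary attributes to the flat and baseline settings.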

This page was created using generative AI such as GPT-5, Claude Opus 4, Gemini 3, Gemini 3.1 Flash Image, and their higher-end successor versions. No guarantee can be made regarding its contents.