Fugu-MT Paper Translation (Abstract): MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

Paper overview: MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

arxiv url: http://arxiv.org/abs/2603.09909v1
Date: Tue, 10 Mar 2026 17:03:11 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:24.487433
Title: MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems
Title (reference translation): MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems
Authors: Yunhang Qian, Xiaobin Hu, Jiaquan Yu, Siyang Xin, Xiaokun Chen, Jiangning Zhang, Peng-Tao Jiang, Jiawei Liu, Hongwei Bran Li,
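Abstract summary: Multi-agent systems (MAS) show potential for complex clinical decision support. Current medical MAS research suffers from non-uniform data ingestion and inconsistent visual-reasoning evaluation. We present MedMASLab, a unified framework and benchmarking platform for multimodal medical multi-agent systems.
Reference score (independently computed attention score): 38.36687601516826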
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While Multi-Agent Systems (MAS) show potential for complex clinical decision support, the field remains hindered by architectural fragmentation and the lack of standardized multimodal integration. Current medical MAS research suffers from non-uniform data ingestion pipelines, inconsistent visual-reasoning evaluation, and a lack of cross-specialty benchmarking. To address these challenges, we present MedMASLab, a unified framework and benchmarking platform for multimodal medical multi-agent systems. MedMASLab introduces: (1) A standardized multimodal agent communication protocol that enables seamless integration of 11 heterogeneous MAS architectures across 24 medical modalities. (2) An automated clinical reasoning evaluator, a zero-shot semantic evaluation paradigm that overcomes the limitations of lexical string-matching by leveraging large vision-language models to verify diagnostic logic and visual grounding. (3) The most extensive benchmark to date, spanning 11 organ systems and 473 diseases, standardizing data from 11 clinical benchmarks. Our systematic evaluation reveals a critical domain-specific performance gap: while MAS improves reasoning depth, current architectures exhibit significant fragility when transitioning between specialized medical sub-domains. We provide a rigorous ablation of interaction mechanisms and cost-performance trade-offs, establishing a new technical baseline for future autonomous clinical systems. The source code and data are publicly available at: https://github.com/NUS-Project/MedMASLab/
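Abstract (reference translation): Multi-Agent Systems (MAS) show potential for complex clinical decision support, but the field is held back by architectural fragmentation and a lack of standardized multimodal integration. Current medical MAS research suffers from non-uniform data ingestion pipelines, inconsistent visual-reasoning evaluation, and a lack of cross-specialty benchmarking. To address these challenges, we present MedMASLab, a unified framework and benchmarking platform for multimodal medical multi-agent systems. MedMASLab introduces: (1) a standardized multimodal agent communication protocol that seamlessly integrates 11 heterogeneous MAS architectures across 24 medical modalities; (2) an automated clinical reasoning evaluator, a zero-shot semantic evaluation paradigm that overcomes the limitations of lexical string matching by using large vision-language models to verify diagnostic logic and visual grounding; and (3) the most extensive benchmark to date, spanning 11 organ systems and 473 diseases and standardizing data from 11 clinical benchmarks. The systematic evaluation reveals a critical domain-specific performance gap: MAS improves reasoning depth, but current architectures show significant fragility when moving between specialized medical sub-domains. We provide a rigorous ablation of interaction mechanisms and cost-performance trade-offs, establishing a new technical baseline for future autonomous clinical systems. The source code and data are publicly available at: https://github.com/NUS-Project/MedMASLab/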

Related papers