Fugu-MT Paper Translation (Summary): Meissa: Multi-modal Medical Agentic Intelligence

Paper summary: Meissa: Multi-modal Medical Agentic Intelligence

arxiv url: http://arxiv.org/abs/2603.09018v1
Date: Mon, 09 Mar 2026 23:22:55 GMT
Status: Translation complete
System update date: 2026-03-11 15:25:23.894239
Title: Meissa: Multi-modal Medical Agentic Intelligence
Title (reference translation): Meissa: Multi-modal Medical Agentic Intelligence
Authors: Yixiong Chen, Xinyi Bai, Yue Pan, Zongwei Zhou, Alan Yuille
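Abstract summary: We present Meissa, a lightweight medical MM-LLM that brings agentic capability offline. Meissa learns both when to engage external interaction (strategy selection) and how to execute multi-step interaction (strategy execution) by distilling structured trajectories from frontier models. Meissa operates fully offline with 22x lower end-to-end latency compared to API-based deployment.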
Reference score (independently computed attention score): 24.222326685491648
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-modal large language models (MM-LLMs) have shown strong performance in medical image understanding and clinical reasoning. Recent medical agent systems extend them with tool use and multi-agent collaboration, enabling complex decision-making. However, these systems rely almost entirely on frontier models (e.g., GPT), whose API-based deployment incurs high cost, high latency, and privacy risks that conflict with on-premise clinical requirements. We present Meissa, a lightweight 4B-parameter medical MM-LLM that brings agentic capability offline. Instead of imitating static answers, Meissa learns both when to engage external interaction (strategy selection) and how to execute multi-step interaction (strategy execution) by distilling structured trajectories from frontier models. Specifically, we propose: (1) Unified trajectory modeling: trajectories (reasoning and action traces) are represented within a single state-action-observation formalism, allowing one model to generalize across heterogeneous medical environments. (2) Three-tier stratified supervision: the model's own errors trigger progressive escalation from direct reasoning to tool-augmented and multi-agent interaction, explicitly learning difficulty-aware strategy selection. (3) Prospective-retrospective supervision: pairing exploratory forward traces with hindsight-rationalized execution traces enables stable learning of effective interaction policies. Trained on 40K curated trajectories, Meissa matches or exceeds proprietary frontier agents in 10 of 16 evaluation settings across 13 medical benchmarks spanning radiology, pathology, and clinical reasoning. Using over 25x fewer parameters than typical frontier models like Gemini-3, Meissa operates fully offline with 22x lower end-to-end latency compared to API-based deployment. Data, models, and environments are released at https://github.com/Schuture/Meissa.
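Abstract (reference translation): Multi-modal large language models (MM-LLMs) have shown strong performance in medical image understanding and clinical reasoning. Recent medical agent systems extend them with tool use and multi-agent collaboration, enabling complex decision-making. However, these systems depend almost entirely on frontier models (e.g., GPT), and API-based deployment of such models brings high cost, high latency, and privacy risks that conflict with on-premise clinical requirements. We present Meissa, a lightweight 4B-parameter medical MM-LLM that brings agentic capability offline. Rather than imitating static answers, Meissa learns both when to engage external interaction (strategy selection) and how to execute multi-step interaction (strategy execution) by distilling structured trajectories from frontier models. Specifically, we propose: (1) Unified trajectory modeling: trajectories (reasoning and action traces) are represented within a single state-action-observation formalism, so that one model can generalize across heterogeneous medical environments. (2) Three-tier stratified supervision: the model's own errors trigger progressive escalation from direct reasoning to tool-augmented and multi-agent interaction, so that difficulty-aware strategy selection is learned explicitly. (3) Prospective-retrospective supervision: pairing exploratory forward traces with hindsight-rationalized execution traces enables stable learning of effective interaction policies. Trained on 40K curated trajectories, Meissa matches or exceeds proprietary frontier agents in 10 of 16 evaluation settings across 13 medical benchmarks spanning radiology, pathology, and clinical reasoning. Using over 25x fewer parameters than typical frontier models such as Gemini-3, Meissa operates fully offline with 22x lower end-to-end latency compared to API-based deployment. Data, models, and environments are released at https://github.com/Schuture/Meissa.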
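
Illustrative sketch (not from the paper): the abstract describes trajectories in a single state-action-observation form and an escalation from direct reasoning to tool-augmented and then multi-agent interaction when an answer is wrong. The Python below is one possible reading of that idea. The class names, the `escalate` function, the toy solvers, and the tool name `hypothetical_segmenter` are all assumptions made for illustration; the abstract does not specify how each tier is produced or how correctness is checked.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    state: str        # what the model sees before acting
    action: str       # e.g. "answer" or "call_tool:<name>"
    observation: str  # what the environment returned


@dataclass
class Trajectory:
    question: str
    tier: str                       # which interaction tier produced it
    steps: List[Step] = field(default_factory=list)
    answer: Optional[str] = None


# A solver maps a question to a trajectory (hypothetical interface).
Solver = Callable[[str], Trajectory]

# Tiers ordered from cheapest to most involved, following the abstract.
TIERS = ("direct", "tool", "multi_agent")


def escalate(question: str, gold: str,
             solvers: Dict[str, Solver]) -> Optional[Trajectory]:
    """Try each tier in order; keep the first trajectory that is correct.

    Assumption: correctness is an exact match against a gold answer.
    """
    for tier in TIERS:
        trajectory = solvers[tier](question)
        if trajectory.answer == gold:
            return trajectory
    return None  # no tier solved it; the sample is dropped in this sketch


# Toy solvers standing in for real models (purely illustrative).
def direct_solver(q: str) -> Trajectory:
    return Trajectory(q, "direct", [Step(q, "answer", "")], answer="unsure")


def tool_solver(q: str) -> Trajectory:
    steps = [
        Step(q, "call_tool:hypothetical_segmenter", "mask found"),
        Step("mask found", "answer", ""),
    ]
    return Trajectory(q, "tool", steps, answer="nodule")


def multi_agent_solver(q: str) -> Trajectory:
    return Trajectory(q, "multi_agent", [Step(q, "answer", "")], answer="nodule")


solvers = {
    "direct": direct_solver,
    "tool": tool_solver,
    "multi_agent": multi_agent_solver,
}

result = escalate("What does the chest X-ray show?", "nodule", solvers)
print(result.tier if result else "unsolved")
```

In this toy run the direct tier answers incorrectly, so the example escalates once and keeps the tool-tier trajectory. Because every tier shares the same `Step` record, trajectories from different tiers could be stored and trained on in one format, which is the point of the unified formalism as the abstract describes it.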

Related papers