Fugu-MT Paper Translation (Abstract): Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration

Paper overview: Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration

arxiv url: http://arxiv.org/abs/2605.25357v1
Date: Mon, 25 May 2026 02:22:53 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:19.250519
Title: Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration
Authors: Xiaotian Hu, Mingxuan Liu, Junwei Huang, Kasidit Anmahapong, Yifei Chen, Yiming Huang, Xuguang Bai, Zihan Li, Hongjia Yang, Yingqi Hao, Hong Xu, Yu Jiang, Tian Tian, Yi Liao, Haibo Qu, Qiyuan Tian
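Abstract summary: FetUSAgents is a tool-augmented multi-agent system for comprehensive fetal ultrasound interpretation. It supports visual question answering (VQA), report generation, image captioning, and video summarization. The authors also construct FetUS-VQA, a dedicated VQA benchmark for fetal ultrasound, comprising 1,892 images and 3,205 question-answer pairs.
Reference score (attention score computed by this site): 29.755530162930707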
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automated fetal ultrasound interpretation requires a workflow from visual perception, including plane recognition and anatomical segmentation, to clinical understanding, including biometric measurement and diagnostic reporting. However, the prevailing "one-task, one-model" paradigm limits systematic integration of evidence across this multi-step process. Although multimodal large language models (MLLMs) show promising visual understanding, their limited domain-specific grounding and hallucination risks restrict reliability in fetal ultrasound analysis. To address these limitations, we propose FetUSAgents, a tool-augmented multi-agent system for comprehensive fetal ultrasound interpretation, supporting visual question answering (VQA), report generation, image captioning, and video summarization. FetUSAgents coordinates task-specific visual tools through collaborative LLM agents and decomposes clinical queries into subtasks that progress from anatomical recognition to quantitative measurement. We further introduce Dual-Path Evidence Arbitration (DPEA), which integrates LLM-based deliberative reasoning with structured computational evidence from specialized visual tools. A retrieval-enhanced evidence bank consolidates intermediate findings to support traceable and clinically grounded conclusions. In addition, we construct FetUS-VQA, a dedicated VQA benchmark for fetal ultrasound, comprising 1,892 images and 3,205 question-answer pairs across 10 clinical tasks. Extensive out-of-distribution experiments show that FetUSAgents outperforms general and medical MLLMs, exceeding the strongest baseline by more than 25 percent in VQA accuracy. These results suggest a scalable route toward evidence-driven clinical assistants for prenatal imaging. Code is available.

Related papers

Echo-α: Large Agentic Multimodal Reasoning Model for Ultrasound Interpretation [76.6507710204181]
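We propose Echo-α, an agentic multimodal reasoning model for ultrasound interpretation. Echo-α is trained to coordinate organ-specific detection outputs, integrate them with the global visual context, and convert the resulting evidence into grounded diagnostic decisions. The results suggest that agentic multimodal reasoning can turn specialized detectors into verifiable clinical evidence.
Paper reference translation (metadata) (2026-04-30T15:31:00Z)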
FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video Analysis [25.102840990979985]
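We propose FetalAgents, the first multi-agent system for comprehensive fetal US analysis. Through a lightweight, agentic coordination framework, FetalAgents dynamically orchestrates specialized vision experts to maximize performance in diagnosis, measurement, and segmentation. FetalAgents also goes beyond static image analysis by supporting end-to-end video stream summarization.
Paper reference translation (metadata) (2026-03-10T14:37:28Z)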
Beyond Benchmarks of IUGC: Rethinking Requirements of Deep Learning Methods for Intrapartum Ultrasound Biometry from Fetal Ultrasound Videos [58.71502465551297]
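The Intrapartum Ultrasound Grand Challenge (IUGC) was launched in conjunction with MICCAI 2024. IUGC introduces a clinically oriented, multi-task automatic measurement framework that integrates standard plane classification, fetal head-pubic symphysis segmentation, and biometry. The challenge releases the largest multi-center intrapartum ultrasound video dataset to date, comprising 774 videos (68,106 frames) collected from three hospitals.
Paper reference translation (metadata) (2026-02-13T13:28:22Z)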
FETAL-GAUGE: A Benchmark for Assessing Vision-Language Models in Fetal Ultrasound [2.8097961263689406]
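Growing demand for prenatal ultrasound imaging has intensified a global shortage of trained sonographers. Deep learning has the potential to improve sonographers' efficiency and to support the training of new practitioners. We present Fetal-Gauge, the first and largest visual question answering benchmark designed to evaluate Vision-Language Models (VLMs). It comprises more than 42,000 images and 93,000 question-answer pairs, covering anatomical plane identification, visual grounding of anatomical structures, fetal orientation assessment, clinical view conformity, and clinical diagnosis.
Paper reference translation (metadata) (2025-12-25T04:54:37Z)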
Epistemic-aware Vision-Language Foundation Model for Fetal Ultrasound Interpretation [83.02147613524032]
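We present FetalMind, a medical AI system. We propose Salient Epistemic Disentanglement (SED), which injects an expert-curated bipartite graph into the model to decouple view-disease associations. FetalMind outperforms open-source and closed-source baselines across all gestational stages, with an average gain of +14% and a +61.2% gain on critical conditions.
Paper reference translation (metadata) (2025-10-14T19:57:03Z)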
MedVQA-TREE: A Multimodal Reasoning and Retrieval Framework for Sarcopenia Prediction [1.7775777785480917]
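MedVQA-TREE is a framework that integrates a hierarchical image interpretation module, a gated feature-level fusion mechanism, and a novel multi-hop, multi-query retrieval strategy. The gated fusion mechanism selectively integrates visual features with textual queries, and clinical knowledge is retrieved through a UMLS-guided pipeline that accesses PubMed and a sarcopenia-specific external knowledge base. Diagnostic accuracy reaches up to 99%, exceeding previous state-of-the-art methods by more than 10%.
Paper reference translation (metadata) (2025-08-26T13:31:01Z)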
Breast Ultrasound Report Generation using LangChain [58.07183284468881]
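We propose using LangChain with Large Language Models (LLMs) to integrate multiple image analysis tools into the breast ultrasound reporting process. The method accurately extracts relevant features from ultrasound images, interprets them in a clinical context, and generates comprehensive, standardized reports.
Paper reference translation (metadata) (2023-12-05T00:28:26Z)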
Hybrid Attention for Automatic Segmentation of Whole Fetal Head in Prenatal Ultrasound Volumes [52.53375964591765]
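We propose the first fully automated solution for segmenting the whole fetal head in US volumes. The segmentation task is first formulated as an end-to-end volumetric mapping under an encoder-decoder deep architecture. We then combine the segmentor with a hybrid attention scheme (HAS) to select discriminative features and suppress non-informative ones.
Paper reference translation (metadata) (2020-04-28T14:43:05Z)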

The related papers list is generated automatically from the titles and abstracts of papers on this site.

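This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from its use (including all information and translations).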