Fugu-MT Paper Translation (Summary): MMNavAgent: Multi-Magnification WSI Navigation Agent for Clinically Consistent Whole-Slide Analysis

Paper summary: MMNavAgent: Multi-Magnification WSI Navigation Agent for Clinically Consistent Whole-Slide Analysis

arxiv url: http://arxiv.org/abs/2603.02079v1
Date: Mon, 02 Mar 2026 17:02:44 GMT
Status: Translation complete
Last updated in system: 2026-03-03 19:50:56.99165
Title: MMNavAgent: Multi-Magnification WSI Navigation Agent for Clinically Consistent Whole-Slide Analysis
Authors: Zhengyang Xu, Han Li, Jingsong Liu, Linrui Xie, Xun Ma, Xin You, Shihui Zu, Ayako Ito, Xinyu Hao, Hongming Xu, Shaohua Kevin Zhou, Nassir Navab, Peter J. Schüffler
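Abstract summary: Recent AI navigation approaches aim to improve Whole-Slide Image (WSI) diagnosis by modeling spatial exploration and selecting diagnostically relevant regions. In clinical practice, pathologists examine slides at multiple magnifications and selectively inspect only the necessary scales. This mismatch prevents existing methods from modeling cross-magnification interactions and adaptive magnification selection.
Reference score (attention score computed by this site): 33.71496118045585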
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent AI navigation approaches aim to improve Whole-Slide Image (WSI) diagnosis by modeling spatial exploration and selecting diagnostically relevant regions, yet most operate at a single fixed magnification or rely on predefined magnification traversal. In clinical practice, pathologists examine slides across multiple magnifications and selectively inspect only the necessary scales, dynamically integrating global and cellular evidence in a sequential manner. This mismatch prevents existing methods from modeling the cross-magnification interactions and adaptive magnification selection inherent to real diagnostic workflows. To address these limitations, we propose a clinically consistent Multi-Magnification WSI Navigation Agent (MMNavAgent) that explicitly models multi-magnification interaction and adaptive magnification selection. Specifically, we introduce a Cross-Magnification navigation Tool (CMT) that aggregates contextual information from adjacent magnifications to enhance discriminative representations along the navigation path. We further introduce a Magnification Selection Tool (MST) that leverages memory-driven reasoning within the agent framework to enable interactive and adaptive magnification selection, mimicking the sequential decision process of pathologists. Extensive experiments on a public dataset demonstrate improved diagnostic performance, with gains of 1.45% in AUC and 2.93% in BACC over a non-agent baseline. Code will be made public upon acceptance.
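For readers unfamiliar with the two metrics the abstract reports, the following is a minimal sketch of how AUC and balanced accuracy (BACC) are commonly defined for a binary task. It is not the authors' evaluation code; the function names and the toy labels and scores are assumptions made purely for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up slide-level labels and predictions (not from the paper).
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.8]

print(balanced_accuracy(y_true, y_pred))  # (0.75 + 0.5) / 2 = 0.625
print(binary_auc(y_true, scores))         # 7 of 8 pairs ranked correctly = 0.875
```

Because balanced accuracy averages recall per class, it is less sensitive to class imbalance than plain accuracy, which is one reason it is often reported alongside AUC for slide-level classification.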

Related papers

MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning [53.37068897861388]
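MedSAM-Agent is a framework that reformulates interactive segmentation as a multi-step autonomous decision-making process. A two-stage training pipeline that integrates multi-turn, end-to-end outcome verification is developed. Experiments across 6 medical modalities and 21 datasets show that MedSAM-Agent achieves state-of-the-art performance.
Reference translation of paper (metadata) (2026-02-03T09:47:49Z)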
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection [59.04089915447622]
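ForenAgent is an interactive IFD framework in which an MLLM can autonomously generate, execute, and refine Python-based low-level tools for the detection target. Inspired by human reasoning, we design a dynamic reasoning loop comprising global perception, local focusing, iterative probing, and holistic judgment. Experiments show that ForenAgent exhibits emergent tool-use capabilities and reflective reasoning on IFD tasks.
Reference translation of paper (metadata) (2025-12-18T08:38:44Z)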
Cross-Enhanced Multimodal Fusion of Eye-Tracking and Facial Features for Alzheimer's Disease Diagnosis [9.111075363945892]
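Eye-tracking and facial features are important indicators of cognitive function, reflecting attention distribution and neurocognitive state. We propose a multimodal cross-enhanced fusion framework that leverages eye-tracking and facial features for Alzheimer's disease diagnosis. Our framework outperforms traditional late fusion and feature concatenation methods.
Reference translation of paper (metadata) (2025-10-25T13:30:24Z)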
NEARL-CLIP: Interacted Query Adaptation with Orthogonal Regularization for Medical Vision-Language Understanding [51.63264715941068]
NEARL-CLIP (iNteracted quEry Adaptation with oRthogonaL regularization) is a novel VLM-based cross-modality interaction framework.
Reference translation of paper (metadata) (2025-08-06T05:44:01Z)
CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic [23.488576700623966]
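We introduce CPathAgent, an innovative agent-based approach that mimics the diagnostic workflow of pathologists. We develop a multi-stage training strategy that unifies patch-level, region-level, and WSI-level capabilities in a single model. PathMMU-HR2 is the first expert-validated benchmark for large-region analysis.
Reference translation of paper (metadata) (2025-05-26T20:22:19Z)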
Adaptive Interactive Segmentation for Multimodal Medical Imaging via Selection Engine [12.594586161567259]
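This paper proposes a strategy-driven interactive model (SISeg) that improves segmentation performance across various medical imaging modalities. We develop an Adaptive Frame Selection Engine (AFSE) that dynamically selects the optimal prompt frames without requiring medical knowledge. Extensive experiments on 10 datasets covering 7 medical imaging modalities demonstrate the SISeg model's robust adaptability and its generalization across multimodal tasks.
Reference translation of paper (metadata) (2024-11-29T03:08:28Z)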
Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology [6.418265127069878]
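This paper proposes using omic embeddings in both early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnosis.
Reference translation of paper (metadata) (2024-11-26T13:25:53Z)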
Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
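Masked image modeling (MIM) is widely used because of its simplicity and effectiveness in recovering the original information from masked images. This paper proposes a decision-based MIM that uses reinforcement learning (RL) to automatically search for the optimal image masking ratio and masking strategy. The method shows a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
Reference translation of paper (metadata) (2023-10-06T10:40:46Z)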
Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training [55.56609500764344]
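This paper proposes a unified framework based on Multi-task Paired Masking with Alignment (MPMA). We also introduce a Memory-Augmented Cross-Modal Fusion (MA-CMF) module to fully integrate visual information and assist report reconstruction.
Reference translation of paper (metadata) (2023-05-13T13:53:48Z)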
RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation [3.57686754209902]
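Quantification of retinal fluid is needed for OCT-guided treatment. A novel convolutional neural architecture called RetiFluidNet is proposed for multi-class retinal fluid segmentation. The model benefits from hierarchical representation learning of textural, contextual, and edge features.
Reference translation of paper (metadata) (2022-09-26T07:18:00Z)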

The related papers list is generated automatically from the titles and abstracts of papers on this site.

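This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).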