Fugu-MT Paper Translation (Summary): MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis

Paper summary: MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis

arxiv url: http://arxiv.org/abs/2603.06378v1
Date: Fri, 06 Mar 2026 15:28:07 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:46.076964
Title: MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis
Title (reference translation): MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis
Authors: Dongqing Xie, Yonghuang Wu,
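Abstract summary: MoEMambaMIL is a structure-aware framework for whole-slide image (WSI) analysis. It integrates region-nested selective scanning with mixture-of-experts (MoE) modeling. MoEMambaMIL achieves the best performance across 9 downstream tasks.
Reference score (independently computed attention score): 0.7898424058509471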
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Whole-slide image (WSI) analysis is challenging due to the gigapixel scale of slides and their inherent hierarchical multi-resolution structure. Existing multiple instance learning (MIL) approaches often model WSIs as unordered collections of patches, which limits their ability to capture structured dependencies between global tissue organization and local cellular patterns. Although recent State Space Models (SSMs) enable efficient modeling of long sequences, how to structure WSI tokens to fully exploit their spatial hierarchy remains an open problem. We propose MoEMambaMIL, a structure-aware SSM framework for WSI analysis that integrates region-nested selective scanning with mixture-of-experts (MoE) modeling. Leveraging multi-resolution preprocessing, MoEMambaMIL organizes patch tokens into region-aware sequences that preserve spatial containment across resolutions. On top of this structured sequence, we decouple resolution-aware encoding and region-adaptive contextual modeling via a combination of static, resolution-specific experts and dynamic sparse experts with learned routing. This design enables efficient long-sequence modeling while promoting expert specialization across heterogeneous diagnostic patterns. Experiments demonstrate that MoEMambaMIL achieves the best performance across 9 downstream tasks.
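Abstract (reference translation): WSI analysis is difficult because of the gigapixel scale of slides and their inherent hierarchical multi-resolution structure. Existing multiple instance learning (MIL) approaches often model WSIs as unordered collections of patches. Recent state space models (SSMs) enable efficient modeling of long sequences, but how to structure WSI tokens to fully exploit their spatial hierarchy remains an open problem. We propose MoEMambaMIL, a structure-aware SSM framework for WSI analysis. Leveraging multi-resolution preprocessing, MoEMambaMIL organizes patch tokens into region-aware sequences that preserve spatial containment across resolutions. On top of this structured sequence, it decouples resolution-aware encoding from region-adaptive contextual modeling by combining static, resolution-specific experts with dynamic sparse experts that use learned routing. This design enables efficient long-sequence modeling while promoting expert specialization across heterogeneous diagnostic patterns. Experiments show that MoEMambaMIL achieves the best performance on 9 downstream tasks.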
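
Illustrative sketch (not from the paper): the abstract describes region-aware sequences that preserve spatial containment across resolutions. One possible reading, assuming only two resolution levels, an integer scale factor between them, and raster ordering of regions, is to emit each coarse patch followed by the fine patches it contains. The function name `region_nested_order` and its parameters are hypothetical, and the actual scanning scheme of MoEMambaMIL may differ.

```python
from collections import defaultdict


def region_nested_order(coarse, fine, scale):
    """Order tokens so each coarse patch is followed by the fine patches it contains.

    coarse: list of (row, col) grid coordinates at the low resolution
    fine:   list of (row, col) grid coordinates at the high resolution
    scale:  fine patches per coarse patch along each axis (assumed to be an integer)
    """
    # Group fine patches under the coarse cell that spatially contains them.
    children = defaultdict(list)
    for r, c in fine:
        children[(r // scale, c // scale)].append((r, c))

    sequence = []
    # Raster order over regions is an assumption made for this sketch.
    for parent in sorted(coarse):
        sequence.append(("coarse", parent))
        for child in sorted(children.get(parent, [])):
            sequence.append(("fine", child))
    return sequence


# Toy example: two coarse patches, six fine patches, 2x2 containment.
coarse = [(0, 0), (0, 1)]
fine = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (1, 3)]
seq = region_nested_order(coarse, fine, scale=2)
for token in seq:
    print(token)
```

In this toy ordering, every fine token sits immediately after the coarse token of its containing region, so a sequence model scanning left to right sees global context before local detail. Fine patches whose parent region is missing from `coarse` are simply dropped here, which is a simplification of this sketch.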

Related papers