Fugu-MT Paper Translation (Abstract): Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology

Paper summary: Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology

arxiv url: http://arxiv.org/abs/2602.13944v1
Date: Sun, 15 Feb 2026 00:59:13 GMT
Status: Translation complete
Last updated in system: 2026-02-17 14:17:28.573767
Title: Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology
Title (reference translation): Fusing Pixels and Genes: Spatially-Aware Learning in Computational Pathology
Authors: Minghao Han, Dingkang Yang, Linhao Qu, Zizhi Chen, Gang Li, Han Wang, Jiacong Wang, Lihua Zhang
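Abstract summary: STAMP is a multimodal pathology representation learning framework augmented with spatial transcriptomics. The study shows that self-supervised, gene-guided training provides a robust and task-agnostic signal for learning pathology image representations. STAMP is validated across six datasets and four downstream tasks.
Reference score (independently computed attention metric): 46.83014413674925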
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent years have witnessed remarkable progress in multimodal learning within computational pathology. Existing models primarily rely on vision and language modalities; however, language alone lacks molecular specificity and offers limited pathological supervision, leading to representational bottlenecks. In this paper, we propose STAMP, a Spatial Transcriptomics-Augmented Multimodal Pathology representation learning framework that integrates spatially-resolved gene expression profiles to enable molecule-guided joint embedding of pathology images and transcriptomic data. Our study shows that self-supervised, gene-guided training provides a robust and task-agnostic signal for learning pathology image representations. Incorporating spatial context and multi-scale information further enhances model performance and generalizability. To support this, we constructed SpaVis-6M, the largest Visium-based spatial transcriptomics dataset to date, and trained a spatially-aware gene encoder on this resource. Leveraging hierarchical multi-scale contrastive alignment and cross-scale patch localization mechanisms, STAMP effectively aligns spatial transcriptomics with pathology images, capturing spatial structure and molecular variation. We validate STAMP across six datasets and four downstream tasks, where it consistently achieves strong performance. These results highlight the value and necessity of integrating spatially resolved molecular supervision for advancing multimodal learning in computational pathology. The code is included in the supplementary materials. The pretrained weights and SpaVis-6M are available at: https://github.com/Hanminghao/STAMP.
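Abstract (reference translation): Recent years have seen remarkable progress in multimodal learning in computational pathology. Existing models rely mainly on vision and language modalities, but language alone lacks molecular specificity and offers limited pathological supervision, which leads to representational bottlenecks. In this paper, we propose STAMP, a multimodal pathology representation learning framework augmented with spatial transcriptomics. Our study shows that self-supervised, gene-guided training provides a robust and task-agnostic signal for learning pathology image representations. Incorporating spatial context and multi-scale information further improves model performance and generalizability. To this end, we constructed SpaVis-6M, the largest Visium-based spatial transcriptomics dataset to date, and trained a spatially-aware gene encoder on it. Using hierarchical multi-scale contrastive alignment and cross-scale patch localization mechanisms, STAMP effectively aligns spatial transcriptomics with pathology images, capturing spatial structure and molecular variation. We validate STAMP across six datasets and four downstream tasks. These results highlight the value and necessity of integrating spatially resolved molecular supervision to advance multimodal learning in computational pathology. The code is included in the supplementary materials. The pretrained weights and SpaVis-6M are available at https://github.com/Hanminghao/STAMP.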


Related papers