Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images
- URL: http://arxiv.org/abs/2406.17225v1
- Date: Tue, 25 Jun 2024 02:18:35 GMT
- Title: Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images
- Authors: Songhan Jiang, Zhengyu Gan, Linghan Cai, Yifeng Wang, Yongbing Zhang
- Abstract summary: Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis.
Existing multimodal methods often rely on alignment strategies to integrate complementary information.
We propose a Multimodal Cross-Task Interaction (MCTI) framework to explore the intrinsic correlations between subtype classification and survival analysis tasks.
- Score: 10.996711454572331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis. Despite significant progress, precise survival analysis still faces two main challenges: (1) The massive number of pixels contained in whole slide images (WSIs) complicates the processing of pathological images, making it difficult to generate an effective representation of the tumor microenvironment (TME). (2) Existing multimodal methods often rely on alignment strategies to integrate complementary information, which may lead to information loss due to the inherent heterogeneity between pathology and genes. In this paper, we propose a Multimodal Cross-Task Interaction (MCTI) framework to explore the intrinsic correlations between subtype classification and survival analysis tasks. Specifically, to capture TME-related features in WSIs, we leverage the subtype classification task to mine tumor regions. Simultaneously, multi-head attention mechanisms are applied in genomic feature extraction, adaptively grouping genes to obtain task-related genomic embeddings. With the joint representation of pathological images and genomic data, we further introduce a Transport-Guided Attention (TGA) module that uses optimal transport theory to model the correlation between the subtype classification and survival analysis tasks, effectively transferring potential information. Extensive experiments demonstrate the superiority of our approach, with MCTI outperforming state-of-the-art frameworks on three public benchmarks. Code: https://github.com/jsh0792/MCTI.
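The abstract's core mechanism, using an optimal-transport plan to guide attention between the two tasks' features, can be sketched generically as follows. This is a minimal NumPy illustration of Sinkhorn-based transport reused as attention weights; the function names, uniform marginals, and squared-Euclidean cost are assumptions for illustration, not the paper's exact TGA design:

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=50):
    """Entropic-regularized optimal transport via Sinkhorn iterations.

    Returns a transport plan T coupling two token sets, with rows and
    columns (approximately) summing to uniform marginals.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)      # Gibbs kernel from the cost matrix
    a = np.ones(n) / n           # uniform source marginal (assumed)
    b = np.ones(m) / m           # uniform target marginal (assumed)
    v = np.ones(m) / m
    for _ in range(n_iters):
        u = a / (K @ v)          # scale rows toward marginal a
        v = b / (K.T @ u)        # scale columns toward marginal b
    return np.diag(u) @ K @ np.diag(v)

def transport_guided_attention(subtype_tokens, survival_tokens):
    """Hypothetical TGA-style step: the OT plan between the two tasks'
    token sets serves as attention weights, so survival features
    aggregate information transported from subtype features."""
    # Cost: pairwise squared Euclidean distance between token embeddings.
    diff = survival_tokens[:, None, :] - subtype_tokens[None, :, :]
    cost = (diff ** 2).sum(-1)                      # (n_surv, n_subtype)
    plan = sinkhorn(cost)
    attn = plan / plan.sum(axis=1, keepdims=True)   # row-normalize plan
    return attn @ subtype_tokens                    # transported features
```

Unlike plain softmax attention, the Sinkhorn normalization constrains both rows and columns of the coupling, which discourages many queries from collapsing onto a single key.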
Related papers
- GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation [68.63955715643974]
We propose an innovative Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o)
arXiv Detail & Related papers (2024-07-08T01:06:13Z)
- Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis [7.996257103473235]
We propose the Pathology-Genome Heterogeneous Graph (PGHG), which integrates whole slide images (WSIs) and bulk RNA-Seq expression data through a heterogeneous graph neural network for cancer survival analysis.
The PGHG consists of biological knowledge-guided representation learning network and pathology-genome heterogeneous graph.
We evaluate the model on low-grade gliomas, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas.
arXiv Detail & Related papers (2024-04-11T09:07:40Z)
- MGCT: Mutual-Guided Cross-Modality Transformer for Survival Outcome Prediction using Integrative Histopathology-Genomic Features [2.3942863352287787]
Mutual-Guided Cross-Modality Transformer (MGCT) is a weakly-supervised, attention-based multimodal learning framework.
We propose MGCT to combine histology features and genomic features to model the genotype-phenotype interactions within the tumor microenvironment.
arXiv Detail & Related papers (2023-11-20T10:49:32Z)
- Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
- Cross-Modal Translation and Alignment for Survival Analysis [7.657906359372181]
We present a framework to explore the intrinsic cross-modal correlations and transfer potential complementary information.
Our experiments on five public TCGA datasets demonstrate that our proposed framework outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T13:29:14Z)
- Multimodal Optimal Transport-based Co-Attention Transformer with Global Structure Consistency for Survival Prediction [5.445390550440809]
Survival prediction is a complicated ordinal regression task that aims to predict the ranking risk of death.
Due to the large size of pathological images, it is difficult to effectively represent gigapixel whole slide images (WSIs).
Interactions within tumor microenvironment (TME) in histology are essential for survival analysis.
arXiv Detail & Related papers (2023-06-14T08:01:24Z)
- Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction [3.2274401541163322]
We propose a memory-efficient multimodal Transformer that can model interactions between pathway and histology patch tokens.
Our proposed model, SURVPATH, achieves state-of-the-art performance when evaluated against both unimodal and multimodal baselines.
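The pathway-histology interaction described above can be illustrated generically with cross-attention in which a small set of pathway tokens queries a large set of patch tokens, keeping cost linear in the patch count. This is a minimal NumPy sketch of that generic pattern, not the exact SURVPATH architecture; all names and shapes here are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(pathway_tokens, patch_tokens):
    """Hypothetical sketch: P pathway tokens attend over N histology
    patch tokens. The score matrix is (P, N), so memory scales as
    O(P * N) instead of the O(N^2) of patch self-attention -- one
    generic route to memory efficiency in such multimodal Transformers.
    """
    d = pathway_tokens.shape[-1]
    scores = pathway_tokens @ patch_tokens.T / np.sqrt(d)  # (P, N)
    weights = softmax(scores, axis=-1)                     # rows sum to 1
    return weights @ patch_tokens                          # (P, d)
```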
arXiv Detail & Related papers (2023-04-13T21:02:32Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.