Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
- URL: http://arxiv.org/abs/2304.06819v2
- Date: Mon, 15 Apr 2024 13:24:46 GMT
- Title: Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
- Authors: Guillaume Jaume, Anurag Vaidya, Richard Chen, Drew Williamson, Paul Liang, Faisal Mahmood
- Abstract summary: We propose a memory-efficient multimodal Transformer that can model interactions between pathway and histology patch tokens.
Our proposed model, SURVPATH, achieves state-of-the-art performance when evaluated against both unimodal and multimodal baselines.
- Score: 3.2274401541163322
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Integrating whole-slide images (WSIs) and bulk transcriptomics for predicting patient survival can improve our understanding of patient prognosis. However, this multimodal task is particularly challenging due to the different nature of these data: WSIs represent a very high-dimensional spatial description of a tumor, while bulk transcriptomics represent a global description of gene expression levels within that tumor. In this context, our work aims to address two key challenges: (1) how can we tokenize transcriptomics in a semantically meaningful and interpretable way, and (2) how can we capture dense multimodal interactions between these two modalities? Specifically, we propose to learn biological pathway tokens from transcriptomics that can encode specific cellular functions. Together with histology patch tokens that encode the different morphological patterns in the WSI, we argue that they form appropriate reasoning units for downstream interpretability analyses. We propose fusing both modalities using a memory-efficient multimodal Transformer that can model interactions between pathway and histology patch tokens. Our proposed model, SURVPATH, achieves state-of-the-art performance when evaluated against both unimodal and multimodal baselines on five datasets from The Cancer Genome Atlas. Our interpretability framework identifies key multimodal prognostic factors, and, as such, can provide valuable insights into the interaction between genotype and phenotype, enabling a deeper understanding of the underlying biological mechanisms at play. We make our code public at: https://github.com/ajv012/SurvPath.
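The memory savings in this kind of pathway-histology fusion come from restricting which token pairs interact: the few pathway tokens can attend densely to everything, while the many patch tokens attend only back to the pathway tokens, so the quadratic patch-by-patch attention matrix is never materialized. The sketch below is an illustrative NumPy reconstruction of that attention pattern under stated assumptions (single head, no learned Q/K/V projections, function and variable names are hypothetical), not the authors' implementation; see the linked repository for the actual model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the chosen axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_multimodal_attention(pathway_tok, patch_tok):
    """Cross-modal attention sketch.

    pathway_tok: (N_path, d) pathway tokens from transcriptomics.
    patch_tok:   (N_patch, d) histology patch tokens, N_patch >> N_path.

    Pathway queries see all tokens (N_path x (N_path + N_patch) scores);
    patch queries see only pathway keys (N_patch x N_path scores). The
    N_patch x N_patch block is never formed, which is where the memory
    saving comes from.
    """
    d = pathway_tok.shape[-1]
    all_tok = np.concatenate([pathway_tok, patch_tok], axis=0)

    # Dense attention for the (few) pathway tokens.
    attn_path = softmax(pathway_tok @ all_tok.T / np.sqrt(d))
    out_pathway = attn_path @ all_tok

    # Restricted attention for the (many) patch tokens.
    attn_patch = softmax(patch_tok @ pathway_tok.T / np.sqrt(d))
    out_patch = attn_patch @ pathway_tok
    return out_pathway, out_patch
```

With, say, 50 pathway tokens and 15,000 patches, the largest score matrix here is 15,000 x 50 rather than 15,000 x 15,000.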
Related papers
- ViKL: A Mammography Interpretation Framework via Multimodal Aggregation of Visual-knowledge-linguistic Features [54.37042005469384]
We announce MVKL, the first multimodal mammography dataset encompassing multi-view images, detailed manifestations and reports.
Based on this dataset, we focus on the challenging task of unsupervised pretraining.
We propose ViKL, a framework that synergizes Visual, Knowledge, and Linguistic features.
arXiv Detail & Related papers (2024-09-24T05:01:23Z) - Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View [49.03501451546763]
We identify the importance of implicit correspondences across biological contexts for exploiting domain-invariant pathological composition.
We propose self-adaptive dynamic distillation to secure instance-aware trade-offs across different model constituents.
arXiv Detail & Related papers (2024-07-14T04:41:16Z) - Multimodal Prototyping for cancer survival prediction [45.61869793509184]
Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification.
Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes.
This process generates many tokens, which leads to high memory requirements for computing attention and complicates post-hoc interpretability analyses.
Our framework outperforms state-of-the-art methods with much less computation while unlocking new interpretability analyses.
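The memory argument in the summary above is easy to make concrete: full self-attention over N tokens stores an N x N score matrix, so tokenizing a WSI into more than 10,000 patches dominates the cost. A rough back-of-the-envelope calculation (assumptions: fp32 scores, single head, batch size 1; the prototype count of 16 is hypothetical, chosen only for illustration):

```python
def attn_matrix_bytes(n_tokens, bytes_per_elem=4):
    # Memory for one full self-attention score matrix over n_tokens.
    return n_tokens * n_tokens * bytes_per_elem

# ~10,000 patch tokens plus ~50 gene-group tokens vs. a hypothetical
# 16 morphological prototypes plus the same 50 gene-group tokens.
full_bytes = attn_matrix_bytes(10_000 + 50)   # ~0.4 GB per head, per layer
proto_bytes = attn_matrix_bytes(16 + 50)      # ~17 KB
```

Even before accounting for multiple heads and layers, collapsing patches into a small set of prototypes shrinks the attention matrix by several orders of magnitude.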
arXiv Detail & Related papers (2024-06-28T20:37:01Z) - Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images [10.996711454572331]
Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis.
Existing multimodal methods often rely on alignment strategies to integrate complementary information.
We propose a Multimodal Cross-Task Interaction (MCTI) framework to explore the intrinsic correlations between subtype classification and survival analysis tasks.
arXiv Detail & Related papers (2024-06-25T02:18:35Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction [43.1748594898772]
We propose a multimodal transformer (PathOmics) integrating pathology and genomics insights into colon-related cancer survival prediction.
We emphasize the unsupervised pretraining to capture the intrinsic interaction between tissue microenvironments in gigapixel whole slide images.
We evaluate our approach on both TCGA colon and rectum cancer cohorts, showing that the proposed approach is competitive and outperforms state-of-the-art studies.
arXiv Detail & Related papers (2023-07-22T00:59:26Z) - Agentività e telicità in GilBERTo: implicazioni cognitive (Agentivity and Telicity in GilBERTo: Cognitive Implications) [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z) - Learning Topological Interactions for Multi-Class Medical Image Segmentation [7.95994513875072]
We introduce a novel topological interaction module to encode the topological interactions into a deep neural network.
The implementation is completely convolution-based and thus can be very efficient.
We demonstrate the generalizability of the method on both proprietary and public challenge datasets.
arXiv Detail & Related papers (2022-07-20T05:09:43Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.