Adaptive Prototype Learning for Multimodal Cancer Survival Analysis
- URL: http://arxiv.org/abs/2503.04643v1
- Date: Thu, 06 Mar 2025 17:32:15 GMT
- Title: Adaptive Prototype Learning for Multimodal Cancer Survival Analysis
- Authors: Hong Liu, Haosen Yang, Federica Eduati, Josien P. W. Pluim, Mitko Veta
- Abstract summary: We propose Adaptive Prototype Learning (APL), a novel and effective approach for multimodal cancer survival analysis. APL adaptively learns representative prototypes in a data-driven manner, reducing redundancy while preserving critical information. Our method employs two sets of learnable query vectors that serve as a bridge between high-dimensional representations and survival prediction.
- Score: 8.179859593451285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging multimodal data, particularly the integration of whole-slide histology images (WSIs) and transcriptomic profiles, holds great promise for improving cancer survival prediction. However, excessive redundancy in multimodal data can degrade model performance. In this paper, we propose Adaptive Prototype Learning (APL), a novel and effective approach for multimodal cancer survival analysis. APL adaptively learns representative prototypes in a data-driven manner, reducing redundancy while preserving critical information. Our method employs two sets of learnable query vectors that serve as a bridge between high-dimensional representations and survival prediction, capturing task-relevant features. Additionally, we introduce a multimodal mixed self-attention mechanism to enable cross-modal interactions, further enhancing information fusion. Extensive experiments on five benchmark cancer datasets demonstrate the superiority of our approach over existing methods. The code is available at https://github.com/HongLiuuuuu/APL.
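The abstract describes the architecture only at a high level: each modality's high-dimensional token set (WSI patch features, transcriptomic features) is distilled into a compact set of prototypes by learnable query vectors, and a mixed self-attention step then lets the two prototype sets interact before survival prediction. The PyTorch sketch below illustrates one plausible reading of that description; the layer choices, dimensions, prototype counts, and the discrete-time hazard head are illustrative assumptions, not the authors' released implementation (see the linked repository for the actual code).

```python
# Minimal, illustrative sketch of the ideas in the APL abstract (NOT the
# authors' implementation; see https://github.com/HongLiuuuuu/APL for that).
import torch
import torch.nn as nn


class PrototypeLearner(nn.Module):
    """Compress a variable-length token set (e.g. WSI patch embeddings or
    transcriptomic token embeddings) into a small set of prototypes using
    learnable query vectors (assumed cross-attention formulation)."""

    def __init__(self, dim: int = 256, num_prototypes: int = 16, num_heads: int = 4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_prototypes, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, dim) -> prototypes: (B, num_prototypes, dim)
        q = self.queries.unsqueeze(0).expand(tokens.size(0), -1, -1)
        out, _ = self.attn(q, tokens, tokens)  # queries attend to the full token set
        return self.norm(out + q)


class APLSketch(nn.Module):
    """Two prototype learners (one per modality), self-attention over the mixed
    prototype sequence, and an assumed discrete-time survival head."""

    def __init__(self, dim: int = 256, num_prototypes: int = 16, num_bins: int = 4):
        super().__init__()
        self.wsi_prototypes = PrototypeLearner(dim, num_prototypes)
        self.gene_prototypes = PrototypeLearner(dim, num_prototypes)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=2 * dim, batch_first=True
        )
        self.fusion = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(dim, num_bins)  # logits for discrete hazard bins

    def forward(self, wsi_tokens: torch.Tensor, gene_tokens: torch.Tensor) -> torch.Tensor:
        p_wsi = self.wsi_prototypes(wsi_tokens)      # (B, P, dim)
        p_gene = self.gene_prototypes(gene_tokens)   # (B, P, dim)
        mixed = torch.cat([p_wsi, p_gene], dim=1)    # cross-modal tokens, attended jointly
        fused = self.fusion(mixed).mean(dim=1)       # (B, dim)
        return self.head(fused)                      # (B, num_bins)


if __name__ == "__main__":
    model = APLSketch()
    wsi = torch.randn(2, 1000, 256)   # e.g. 1000 patch embeddings per slide
    genes = torch.randn(2, 300, 256)  # e.g. 300 gene-group embeddings
    print(model(wsi, genes).shape)    # torch.Size([2, 4])
```

In this reading, the learnable queries act as the "bridge" the abstract mentions: attention pooling against a fixed number of learned queries removes the dependence on the large, variable number of input tokens while retaining task-relevant information, and concatenating both prototype sets before self-attention is one simple way to realize the described cross-modal mixing.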
Related papers
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
- A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis [15.10417643788382]
In this paper, a deep-learning-based model named UMPSNet is proposed.
UMPSNet integrates four types of important metadata (demographic information, cancer type information, treatment protocols, and diagnosis results) into text templates, and then introduces a text encoder to extract textual features.
By incorporating the multi-modality of patient data and joint training, UMPSNet outperforms all SOTA approaches.
arXiv Detail & Related papers (2025-01-13T02:29:42Z)
- M2EF-NNs: Multimodal Multi-instance Evidence Fusion Neural Networks for Cancer Survival Prediction [24.323961146023358]
We propose a neural network model called M2EF-NNs for accurate cancer survival prediction.
To capture global information in the images, we use a pre-trained Vision Transformer (ViT) model.
We are the first to apply the Dempster-Shafer evidence theory (DST) to cancer survival prediction.
arXiv Detail & Related papers (2024-08-08T02:31:04Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival [8.403756148610269]
Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach than single-modality methods.
This paper introduces SELECTOR, a heterogeneous graph-aware network based on convolutional mask encoders.
Our method significantly outperforms state-of-the-art methods in both modality-missing and intra-modality information-confirmed cases.
arXiv Detail & Related papers (2024-03-14T11:23:39Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- TMSS: An End-to-End Transformer-based Multimodal Network for Segmentation and Survival Prediction [0.0]
Oncologists do not analyze each data source in isolation; rather, they mentally fuse information from multiple sources, such as medical images and patient history.
This work proposes a deep learning method that mimics oncologists' analytical behavior when quantifying cancer and estimating patient survival.
arXiv Detail & Related papers (2022-09-12T06:22:05Z)
- A Multi-modal Fusion Framework Based on Multi-task Correlation Learning for Cancer Prognosis Prediction [8.476394437053477]
We present a multi-modal fusion framework based on multi-task correlation learning (MultiCoFusion) for survival analysis and cancer grade classification.
We systematically evaluate our framework using glioma datasets from The Cancer Genome Atlas (TCGA).
arXiv Detail & Related papers (2022-01-22T15:16:24Z)
- M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.