TTMFN: Two-stream Transformer-based Multimodal Fusion Network for
Survival Prediction
- URL: http://arxiv.org/abs/2311.07033v1
- Date: Mon, 13 Nov 2023 02:31:20 GMT
- Title: TTMFN: Two-stream Transformer-based Multimodal Fusion Network for
Survival Prediction
- Authors: Ruiquan Ge, Xiangyang Hu, Rungen Huang, Gangyong Jia, Yaqi Wang,
Renshu Gu, Changmiao Wang, Elazab Ahmed, Linyan Wang, Juan Ye, Ye Li
- Abstract summary: We propose a novel framework named Two-stream Transformer-based Multimodal Fusion Network for survival prediction (TTMFN)
In TTMFN, we present a two-stream multimodal co-attention transformer module to take full advantage of the complex relationships between different modalities.
The experiment results on four datasets from The Cancer Genome Atlas demonstrate that TTMFN can achieve the best performance or competitive results.
- Score: 7.646155781863875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Survival prediction plays a crucial role in assisting clinicians with the
development of cancer treatment protocols. Recent evidence shows that
multimodal data can help in the diagnosis of cancer disease and improve
survival prediction. Currently, deep learning-based approaches have experienced
increasing success in survival prediction by integrating pathological images
and gene expression data. However, most existing approaches overlook the
intra-modality latent information and the complex inter-modality correlations.
Furthermore, existing methods do not fully exploit the
representational capacity of neural networks for feature aggregation and
disregard the relationships between features. Addressing these issues is
therefore essential for improving prediction performance. We propose a novel
framework named Two-stream Transformer-based Multimodal Fusion Network for
survival prediction (TTMFN), which integrates pathological images and gene
expression data. In TTMFN, we present a two-stream multimodal co-attention
transformer module to take full advantage of the complex relationships between
different modalities and the potential connections within the modalities.
Additionally, we develop a multi-head attention pooling approach to effectively
aggregate the feature representations of the two modalities. The experiment
results on four datasets from The Cancer Genome Atlas demonstrate that TTMFN
can achieve the best performance or competitive results compared to the
state-of-the-art methods in predicting the overall survival of patients.
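The multi-head attention pooling idea can be sketched as follows. This is an illustrative NumPy sketch under assumed shapes, not the authors' implementation: each head holds a learnable query vector that attends over one modality's token features (e.g. WSI patch embeddings or gene-group embeddings), and the per-head pooled vectors are concatenated into a single modality representation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention_pool(tokens, queries):
    """Pool a variable-length set of token features into one vector.

    tokens:  (n_tokens, d)  per-patch or per-gene-group embeddings
    queries: (n_heads, d)   learnable pooling queries, one per head
    Returns a (n_heads * d,) pooled representation.
    """
    d = tokens.shape[1]
    # scaled dot-product scores between each head's query and every token
    scores = queries @ tokens.T / np.sqrt(d)   # (n_heads, n_tokens)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    pooled = weights @ tokens                  # (n_heads, d)
    return pooled.reshape(-1)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(7, 16))   # e.g. 7 patch embeddings of dim 16
queries = rng.normal(size=(4, 16))  # 4 pooling heads (hypothetical count)
vec = multihead_attention_pool(tokens, queries)
print(vec.shape)  # (64,)
```

In a trained model the queries would be learned parameters; here they are random only so the sketch runs stand-alone.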
Related papers
- M2EF-NNs: Multimodal Multi-instance Evidence Fusion Neural Networks for Cancer Survival Prediction [24.323961146023358]
We propose a neural network model called M2EF-NNs for accurate cancer survival prediction.
To capture global information in the images, we use a pre-trained Vision Transformer (ViT) model.
We are the first to apply the Dempster-Shafer evidence theory (DST) to cancer survival prediction.
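The Dempster-Shafer combination mentioned above can be illustrated on a minimal two-hypothesis frame. This is a generic sketch of Dempster's rule of combination (the frame, the mass values, and the function name are assumptions, not the paper's formulation), where `U` is the mass assigned to the full frame, i.e. "uncertain".

```python
def ds_combine(m1, m2):
    """Dempster's rule for two mass functions over {A: high risk, B: low risk},
    with 'U' the uncertain mass assigned to the whole frame."""
    # conflict K: mass the two sources place on contradictory hypotheses
    K = m1["A"] * m2["B"] + m1["B"] * m2["A"]
    norm = 1.0 - K
    return {
        "A": (m1["A"] * m2["A"] + m1["A"] * m2["U"] + m1["U"] * m2["A"]) / norm,
        "B": (m1["B"] * m2["B"] + m1["B"] * m2["U"] + m1["U"] * m2["B"]) / norm,
        "U": (m1["U"] * m2["U"]) / norm,
    }

# two modalities that mostly agree on the high-risk hypothesis
fused = ds_combine({"A": 0.6, "B": 0.1, "U": 0.3},
                   {"A": 0.5, "B": 0.2, "U": 0.3})
print(fused)  # belief in A sharpens, uncertainty shrinks
```

Agreement between sources sharpens the fused belief while shrinking the uncertain mass; strong conflict (large `K`) inflates the normalizer and is the usual caveat of Dempster's rule.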
arXiv Detail & Related papers (2024-08-08T02:31:04Z)
- Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
- FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival [3.4686401890974197]
We propose a new end-to-end framework, FORESEE, for robustly predicting patient survival by mining multimodal information.
Cross-fusion transformer effectively utilizes features at the cellular level, tissue level, and tumor heterogeneity level to correlate prognosis.
The hybrid attention encoder (HAE) uses the denoising contextual attention module to obtain the contextual relationship features.
We also propose an asymmetrically masked triplet masked autoencoder to reconstruct lost information within modalities.
arXiv Detail & Related papers (2024-05-13T12:39:08Z)
- SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival [8.403756148610269]
Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach.
This paper introduces SELECTOR, a heterogeneous graph-aware network based on convolutional mask encoders.
Our method significantly outperforms state-of-the-art methods in both modality-missing and intra-modality information-confirmed cases.
arXiv Detail & Related papers (2024-03-14T11:23:39Z)
- XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
- Cross-modality Attention-based Multimodal Fusion for Non-small Cell Lung Cancer (NSCLC) Patient Survival Prediction [0.6476298550949928]
We propose a cross-modality attention-based multimodal fusion pipeline designed to integrate modality-specific knowledge for patient survival prediction in non-small cell lung cancer (NSCLC)
Compared with single modality, which achieved c-index of 0.5772 and 0.5885 using solely tissue image data or RNA-seq data, respectively, the proposed fusion approach achieved c-index 0.6587 in our experiment.
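The c-index values quoted above can be computed with Harrell's concordance index; the sketch below is the standard pairwise definition for right-censored survival data (not code from the paper), counting a comparable pair as concordant when the patient who died earlier received the higher predicted risk.

```python
def c_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times:  observed follow-up times
    events: 1 if the event (death) was observed, 0 if censored
    risks:  predicted risk scores (higher = shorter expected survival)
    """
    concordant, ties, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable only if i had the event and j was followed longer
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable

print(c_index([2, 4, 6], [1, 1, 0], [0.9, 0.5, 0.1]))  # 1.0
```

A value of 0.5 corresponds to random ranking, which is why the single-modality scores near 0.58 leave clear headroom for the fused model's 0.6587.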
arXiv Detail & Related papers (2023-08-18T21:42:52Z)
- MaxCorrMGNN: A Multi-Graph Neural Network Framework for Generalized Multimodal Fusion of Medical Data for Outcome Prediction [3.2889220522843625]
We develop an innovative fusion approach called MaxCorr MGNN that models non-linear modality correlations within and across patients.
We then design, for the first time, a generalized multi-layered graph neural network (MGNN) for task-informed reasoning in multi-layered graphs.
We evaluate our model on an outcome prediction task on a Tuberculosis dataset, consistently outperforming several state-of-the-art neural, graph-based, and traditional fusion techniques.
arXiv Detail & Related papers (2023-07-13T23:52:41Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
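MoNIG builds on Normal-Inverse Gamma evidential regression; the sketch below shows only the standard NIG moment formulas behind such per-modality uncertainty estimates (a sketch of the underlying math, not the paper's mixture rule), where aleatoric uncertainty reflects data noise and epistemic uncertainty reflects lack of evidence.

```python
def nig_moments(gamma, nu, alpha, beta):
    """Point prediction and uncertainties of a Normal-Inverse Gamma posterior.

    For NIG(gamma, nu, alpha, beta) with alpha > 1:
      prediction         E[mu]      = gamma
      aleatoric (noise)  E[sigma^2] = beta / (alpha - 1)
      epistemic (model)  Var[mu]    = beta / (nu * (alpha - 1))
    """
    assert alpha > 1, "moments require alpha > 1"
    pred = gamma
    aleatoric = beta / (alpha - 1)
    epistemic = beta / (nu * (alpha - 1))
    return pred, aleatoric, epistemic

print(nig_moments(2.0, 4.0, 3.0, 1.0))  # (2.0, 0.5, 0.125)
```

Larger evidence `nu` shrinks only the epistemic term, which is what lets a fusion rule down-weight modalities that are merely under-observed rather than noisy.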
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.