AdaMSS: Adaptive Multi-Modality Segmentation-to-Survival Learning for Survival Outcome Prediction from PET/CT Images
- URL: http://arxiv.org/abs/2305.09946v3
- Date: Wed, 16 Oct 2024 01:10:44 GMT
- Title: AdaMSS: Adaptive Multi-Modality Segmentation-to-Survival Learning for Survival Outcome Prediction from PET/CT Images
- Authors: Mingyuan Meng, Bingxin Gu, Michael Fulham, Shaoli Song, Dagan Feng, Lei Bi, Jinman Kim
- Abstract summary: Survival models based on deep learning have been widely adopted to perform end-to-end survival prediction from medical images.
Recent deep survival models achieved promising performance by jointly performing tumor segmentation with survival prediction.
Existing deep survival models are unable to effectively leverage multi-modality images.
We propose a data-driven strategy to fuse multi-modality information, which realizes adaptive optimization of fusion strategies.
- Abstract: Survival prediction is a major concern for cancer management. Deep survival models based on deep learning have been widely adopted to perform end-to-end survival prediction from medical images. Recent deep survival models achieved promising performance by jointly performing tumor segmentation with survival prediction, where the models were guided to extract tumor-related information through Multi-Task Learning (MTL). However, these deep survival models have difficulties in exploring out-of-tumor prognostic information. In addition, existing deep survival models are unable to effectively leverage multi-modality images. Empirically-designed fusion strategies were commonly adopted to fuse multi-modality information via task-specific manually-designed networks, thus limiting the adaptability to different scenarios. In this study, we propose an Adaptive Multi-modality Segmentation-to-Survival model (AdaMSS) for survival prediction from PET/CT images. Instead of adopting MTL, we propose a novel Segmentation-to-Survival Learning (SSL) strategy, where our AdaMSS is trained for tumor segmentation and survival prediction sequentially in two stages. This strategy enables the AdaMSS to focus on tumor regions in the first stage and gradually expand its focus to include other prognosis-related regions in the second stage. We also propose a data-driven strategy to fuse multi-modality information, which realizes adaptive optimization of fusion strategies based on training data during training. With the SSL and data-driven fusion strategies, our AdaMSS is designed as an adaptive model that can self-adapt its focus regions and fusion strategy for different training stages. Extensive experiments with two large clinical datasets show that our AdaMSS outperforms state-of-the-art survival prediction methods.
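The abstract describes training sequentially in two stages: a segmentation objective first, then a survival objective. The sketch below illustrates that schedule with stand-in losses (a Dice loss for segmentation and a negative Cox partial log-likelihood for survival); the function names, shapes, and data are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|), a standard segmentation objective.
    inter = (pred * target).sum()
    return 1.0 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def cox_loss(risk, time, event):
    # Negative Cox partial log-likelihood, averaged over observed events.
    order = np.argsort(-time)              # descending time -> cumulative risk sets
    risk, event = risk[order], event[order]
    log_risk_set = np.log(np.cumsum(np.exp(risk)))
    return -np.sum((risk - log_risk_set) * event) / max(event.sum(), 1)

# Stage 1: optimize the segmentation objective (model focuses on tumor regions).
seg_pred = rng.random((4, 8, 8))
seg_gt = (rng.random((4, 8, 8)) > 0.5).astype(float)
stage1_loss = dice_loss(seg_pred, seg_gt)

# Stage 2: switch to the survival objective (focus can expand beyond the tumor).
risk = rng.standard_normal(4)              # per-patient predicted risk scores
time = rng.random(4) * 10                  # follow-up times
event = np.array([1, 0, 1, 1])             # 1 = event observed, 0 = censored
stage2_loss = cox_loss(risk, time, event)

print(float(stage1_loss), float(stage2_loss))
```

In a real two-stage setup the shared encoder weights learned in stage 1 would initialize stage 2, which is what lets the survival stage start from tumor-focused features.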
Related papers
- Enhanced Survival Prediction in Head and Neck Cancer Using Convolutional Block Attention and Multimodal Data Fusion [7.252280210331731]
This paper proposes a deep learning-based approach to predict survival outcomes in head and neck cancer patients.
Our method integrates feature extraction with a Convolutional Block Attention Module (CBAM) and a multi-modal data fusion layer.
The final prediction is achieved through a fully parametric discrete-time survival model.
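A fully parametric discrete-time survival model, as mentioned above, typically predicts a hazard per time bin and derives the survival curve as a running product. This is a minimal sketch of that idea; the bin count and logit values are illustrative assumptions.

```python
import numpy as np

def discrete_survival(logits):
    """Map per-interval logits to hazards and a survival curve."""
    hazards = 1.0 / (1.0 + np.exp(-logits))   # sigmoid: P(event in bin t | alive at t)
    survival = np.cumprod(1.0 - hazards)      # S(t) = prod_{k<=t} (1 - h_k)
    return hazards, survival

logits = np.array([-2.0, -1.0, 0.0, 1.0])     # e.g. four discrete time bins
hazards, survival = discrete_survival(logits)
print(np.round(survival, 3))                  # → [0.881 0.644 0.322 0.087]
```

The survival curve is monotonically decreasing by construction, which is the main advantage of parameterizing hazards rather than predicting survival probabilities directly.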
arXiv Detail & Related papers (2024-10-29T07:56:04Z)
- M2EF-NNs: Multimodal Multi-instance Evidence Fusion Neural Networks for Cancer Survival Prediction [24.323961146023358]
We propose a neural network model called M2EF-NNs for accurate cancer survival prediction.
To capture global information in the images, we use a pre-trained Vision Transformer (ViT) model.
We are the first to apply the Dempster-Shafer evidence theory (DST) to cancer survival prediction.
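Dempster-Shafer evidence theory, cited above, fuses per-modality beliefs via Dempster's rule of combination. The sketch below applies the rule to two mass functions over a toy binary frame; the masses are invented for illustration and unrelated to the paper's actual model.

```python
def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions over frozenset focal elements."""
    combined, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb                  # mass assigned to the empty set
    # Normalize by (1 - K), discarding the conflicting mass K.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

A, B = frozenset({"high_risk"}), frozenset({"low_risk"})
m1 = {A: 0.6, B: 0.1, A | B: 0.3}   # evidence from modality 1 (hypothetical)
m2 = {A: 0.5, B: 0.2, A | B: 0.3}   # evidence from modality 2 (hypothetical)
fused = dempster_combine(m1, m2)
print({tuple(sorted(k)): round(v, 3) for k, v in fused.items()})
```

Mass retained on the full frame `A | B` quantifies remaining uncertainty, which is what makes DST attractive when modalities disagree.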
arXiv Detail & Related papers (2024-08-08T02:31:04Z)
- Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model Interpretation [7.698783025721071]
We propose IMLSP, an Interpretable Multi-Label multi-modal deep Survival Prediction framework for predicting multiple HNC survival outcomes simultaneously.
We also present Grad-TEAM, a Gradient-weighted Time-Event Activation Mapping approach specifically developed for deep survival model visual explanation.
arXiv Detail & Related papers (2024-05-09T01:30:04Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Merging-Diverging Hybrid Transformer Networks for Survival Prediction in Head and Neck Cancer [10.994223928445589]
We propose a merging-diverging learning framework for survival prediction from multi-modality images.
This framework has a merging encoder to fuse multi-modality information and a diverging decoder to extract region-specific information.
Our framework is demonstrated on survival prediction from PET-CT images in Head and Neck (H&N) cancer.
arXiv Detail & Related papers (2023-07-07T07:16:03Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- TMSS: An End-to-End Transformer-based Multimodal Network for Segmentation and Survival Prediction [0.0]
Oncologists do not analyze imaging data in isolation; rather, they fuse information from multiple sources, such as medical images and patient history.
This work proposes a deep learning method that mimics oncologists' analytical behavior when quantifying cancer and estimating patient survival.
arXiv Detail & Related papers (2022-09-12T06:22:05Z)
- Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Shape-aware Meta-learning for Generalizing Prostate MRI Segmentation to Unseen Domains [68.73614619875814]
We present a novel shape-aware meta-learning scheme to improve the model generalization in prostate MRI segmentation.
Experimental results show that our approach outperforms many state-of-the-art generalization methods consistently across all six settings of unseen domains.
arXiv Detail & Related papers (2020-07-04T07:56:02Z)
- M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.