Improved Multimodal Fusion for Small Datasets with Auxiliary Supervision
- URL: http://arxiv.org/abs/2304.00379v1
- Date: Sat, 1 Apr 2023 20:07:10 GMT
- Title: Improved Multimodal Fusion for Small Datasets with Auxiliary Supervision
- Authors: Gregory Holste, Douwe van der Wal, Hans Pinckaers, Rikiya Yamashita,
Akinori Mitani, Andre Esteva
- Abstract summary: We propose three simple methods for improved multimodal fusion with small datasets.
The proposed methods are straightforward to implement and can be applied to any classification task with paired image and non-image data.
- Score: 3.8750633583374143
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Prostate cancer is one of the leading causes of cancer-related death in men
worldwide. Like many cancers, diagnosis involves expert integration of
heterogeneous patient information such as imaging, clinical risk factors, and
more. For this reason, there have been many recent efforts toward deep
multimodal fusion of image and non-image data for clinical decision tasks. Many
of these studies propose methods to fuse learned features from each patient
modality, providing significant downstream improvements with techniques like
cross-modal attention gating, Kronecker product fusion, orthogonality
regularization, and more. While these enhanced fusion operations can improve
upon feature concatenation, they often come with an extremely high learning
capacity, meaning they are likely to overfit when applied even to small or
low-dimensional datasets. Rather than designing a highly expressive fusion
operation, we propose three simple methods for improved multimodal fusion with
small datasets that aid optimization by generating auxiliary sources of
supervision during training: extra supervision, clinical prediction, and dense
fusion. We validate the proposed approaches on prostate cancer diagnosis from
paired histopathology imaging and tabular clinical features. The proposed
methods are straightforward to implement and can be applied to any
classification task with paired image and non-image data.
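The abstract only names the three strategies, so the PyTorch sketch below is one plausible reading rather than the authors' published architecture: unimodal heads provide the "extra supervision", an image-to-clinical regression head provides the "clinical prediction" task, and re-injecting the raw tabular features at the final layer stands in for "dense fusion". All module names, dimensions, and loss weights are assumptions.

```python
import torch
import torch.nn as nn

class AuxiliarySupervisedFusion(nn.Module):
    """Hypothetical fusion model with auxiliary supervision.

    Module names, dimensions, and loss weights are illustrative
    assumptions, not the authors' published architecture.
    """

    def __init__(self, img_dim=512, clin_dim=12, hidden=128, n_classes=2):
        super().__init__()
        self.img_proj = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.clin_proj = nn.Sequential(nn.Linear(clin_dim, hidden), nn.ReLU())
        # Main head; re-injecting the raw tabular features here stands in
        # for "dense fusion".
        self.fused_head = nn.Linear(2 * hidden + clin_dim, n_classes)
        # "Extra supervision": unimodal heads trained on the same label.
        self.img_head = nn.Linear(hidden, n_classes)
        self.clin_head = nn.Linear(hidden, n_classes)
        # "Clinical prediction": regress clinical features from the image.
        self.clin_pred_head = nn.Linear(hidden, clin_dim)

    def forward(self, img_feat, clin):
        zi, zc = self.img_proj(img_feat), self.clin_proj(clin)
        fused = torch.cat([zi, zc, clin], dim=1)
        return {
            "main": self.fused_head(fused),
            "aux_img": self.img_head(zi),
            "aux_clin": self.clin_head(zc),
            "clin_pred": self.clin_pred_head(zi),
        }

def training_loss(out, label, clin, w_aux=0.3, w_clin=0.3):
    ce = nn.functional.cross_entropy
    loss = ce(out["main"], label)                      # primary task
    loss = loss + w_aux * (ce(out["aux_img"], label)   # extra supervision
                           + ce(out["aux_clin"], label))
    return loss + w_clin * nn.functional.mse_loss(out["clin_pred"], clin)
```

At inference only the main head would be used; the auxiliary heads exist purely to generate extra training signal, which matches the abstract's framing of aiding optimization on small datasets.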
Related papers
- Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach leverages clinical knowledge from a large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z)
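One common way to realize text-guided supervision with a pre-trained VLM is a CLIP-style contrastive term that pulls an image's features toward the embedding of its paired report; the sketch below shows only that generic mechanism. The temperature value and how the embeddings are produced are assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def text_guided_loss(img_emb, txt_emb, temperature=0.07):
    """CLIP-style InfoNCE alignment between image features and embeddings
    of the matching report text; both inputs are (B, D) tensors."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature        # (B, B) similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # Symmetric loss: match each image to its report and vice versa.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```

A term like this is typically added to the ordinary classification loss with a small weight, so the report text supervises training but is not required at test time.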
- MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer [13.74067035373274]
We introduce a multi-modal heterogeneous graph-based conditional feature-guided diffusion model for lymph node metastasis diagnosis based on CT images.
We propose a masked relational representation learning strategy, aiming to uncover the latent prognostic correlations and priorities of primary tumor and lymph node image representations.
arXiv Detail & Related papers (2024-05-15T17:52:00Z)
- SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival [8.403756148610269]
Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach than relying on a single modality.
This paper introduces SELECTOR, a heterogeneous graph-aware network based on convolutional masked autoencoders.
Our method significantly outperforms state-of-the-art methods in both modality-missing and intra-modality information-confirmed cases.
arXiv Detail & Related papers (2024-03-14T11:23:39Z)
- Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
Experimental evaluations were conducted on the PAD-UFES-20 dataset using various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z)
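The auxiliary super-resolution idea in the entry above can be sketched as a shared trunk with two heads: one for classification and one reconstructing a higher-resolution image via sub-pixel convolution. The architecture, class count, and loss weighting below are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class ClassifyWithSRAux(nn.Module):
    """Shared CNN trunk with a classification head and an auxiliary
    super-resolution head (illustrative sketch)."""

    def __init__(self, n_classes=6):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_classes))
        # Sub-pixel convolution: trunk is at 1/4 resolution, so an 8x
        # PixelShuffle yields a 2x super-resolved output.
        self.sr_head = nn.Sequential(
            nn.Conv2d(64, 3 * (8 ** 2), 3, padding=1), nn.PixelShuffle(8))

    def forward(self, lr_img):
        z = self.trunk(lr_img)              # (B, 64, H/4, W/4)
        return self.cls_head(z), self.sr_head(z)

# Training would combine both objectives, e.g.:
# loss = ce(logits, y) + 0.1 * l1(sr_out, hr_img)   # hr_img: 2x target
```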
- A New Multimodal Medical Image Fusion based on Laplacian Autoencoder with Channel Attention [3.1531360678320897]
Deep learning models have achieved end-to-end image fusion with highly robust and accurate performance.
Most DL-based fusion models perform down-sampling on the input images to minimize the number of learnable parameters and computations.
We propose a new multimodal medical image fusion model based on integrated Laplacian-Gaussian concatenation with attention pooling.
arXiv Detail & Related papers (2023-10-18T11:29:53Z)
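The Laplacian-Gaussian idea in the entry above can be illustrated with classic Laplacian-pyramid fusion of two registered images. The max-absolute selection rule below is the textbook variant, standing in for the paper's learned attention pooling, and the pooling/upsampling choices are assumptions.

```python
import torch
import torch.nn.functional as F

def pyr_down(x):
    return F.avg_pool2d(x, 2)

def pyr_up(x, size):
    return F.interpolate(x, size=size, mode="bilinear", align_corners=False)

def laplacian_pyramid(x, levels=3):
    pyr = []
    for _ in range(levels):
        down = pyr_down(x)
        pyr.append(x - pyr_up(down, x.shape[-2:]))  # detail band
        x = down
    pyr.append(x)                                   # low-pass residual
    return pyr

def fuse(a, b, levels=3):
    """Fuse two registered images (B, C, H, W): keep the stronger detail
    coefficient per pixel, average the low-pass band, then reconstruct."""
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    fused = [torch.where(la.abs() > lb.abs(), la, lb)
             for la, lb in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))
    x = fused[-1]
    for band in reversed(fused[:-1]):
        x = pyr_up(x, band.shape[-2:]) + band
    return x
```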
- Cross-modality Attention-based Multimodal Fusion for Non-small Cell Lung Cancer (NSCLC) Patient Survival Prediction [0.6476298550949928]
We propose a cross-modality attention-based multimodal fusion pipeline designed to integrate modality-specific knowledge for patient survival prediction in non-small cell lung cancer (NSCLC).
Compared with the single-modality baselines, which achieved c-indices of 0.5772 and 0.5885 using tissue image data or RNA-seq data alone, respectively, the proposed fusion approach achieved a c-index of 0.6587 in our experiments.
arXiv Detail & Related papers (2023-08-18T21:42:52Z)
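A minimal sketch of cross-modality attention for the entry above, assuming tile-level histology embeddings attend over gene-expression tokens and vice versa; the dimensions, mean-pooling, and risk head are illustrative assumptions, not the authors' pipeline.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Each modality queries the other via multi-head cross-attention;
    the attended features are pooled and concatenated into a risk score."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.img2rna = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.rna2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.risk = nn.Linear(2 * dim, 1)   # survival risk score

    def forward(self, img_tok, rna_tok):
        # img_tok: (B, Ni, D) tile embeddings; rna_tok: (B, Nr, D) gene tokens
        a, _ = self.img2rna(img_tok, rna_tok, rna_tok)  # image queries RNA
        b, _ = self.rna2img(rna_tok, img_tok, img_tok)  # RNA queries image
        z = torch.cat([a.mean(dim=1), b.mean(dim=1)], dim=-1)
        return self.risk(z).squeeze(-1)
```

For survival, the risk score would typically feed a Cox-style loss; the reported c-index then measures how often predicted risks order patient pairs correctly.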
"Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image (MedAI 2021)" competitions.
We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic.
arXiv Detail & Related papers (2023-07-30T16:08:45Z)
- Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights.
Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion.
Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
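The practical payoff of a distribution over masks can be shown with a simple sampling loop: draw several masks from a stochastic segmenter and summarize their agreement. `sample_mask` below is a stand-in stub for a trained diffusion sampler, which is an assumption; only the aggregation logic is the point.

```python
import torch

def sample_mask(image):
    """Stub for one reverse-diffusion sample of a segmentation mask.
    A real model would denoise from Gaussian noise conditioned on `image`."""
    return (torch.rand_like(image[:, :1]) > 0.5).float()

@torch.no_grad()
def mask_distribution(image, n_samples=8):
    """Draw several plausible masks and summarize them: the mean mask is a
    soft consensus, the per-pixel variance flags ambiguous regions."""
    samples = torch.stack([sample_mask(image) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.var(dim=0)

img = torch.rand(1, 3, 64, 64)
mean_mask, uncertainty = mask_distribution(img)
```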
- RadioPathomics: Multimodal Learning in Non-Small Cell Lung Cancer for Adaptive Radiotherapy [1.8161758803237067]
We develop a multimodal late fusion approach to predict radiation therapy outcomes for non-small-cell lung cancer patients.
Experiments show that the proposed multimodal paradigm, with an AUC of 90.9%, outperforms each unimodal approach.
arXiv Detail & Related papers (2022-04-26T16:32:52Z)
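Late fusion as in the entry above can be as simple as a convex combination of unimodal predicted probabilities. The equal weighting and the toy numbers below are illustrative assumptions, not the paper's data; the AUC check uses scikit-learn.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def late_fuse(p_imaging, p_clinical, w=0.5):
    """Convex combination of unimodal predicted probabilities."""
    return w * np.asarray(p_imaging) + (1 - w) * np.asarray(p_clinical)

# Toy check with made-up predictions (not the paper's results).
y      = np.array([0, 0, 1, 1, 1, 0])
p_img  = np.array([0.2, 0.4, 0.7, 0.6, 0.8, 0.3])
p_clin = np.array([0.3, 0.1, 0.6, 0.9, 0.7, 0.4])
for name, p in [("imaging", p_img), ("clinical", p_clin),
                ("fused", late_fuse(p_img, p_clin))]:
    print(name, roc_auc_score(y, p))
```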
- Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest.
However, clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered intransparent and difficult to comprehend.
We propose a novel decision explanation scheme based on CycleGAN activation maximization, which generates high-quality visualizations of classifier decisions even on smaller datasets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)
- Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into modality-specific appearance codes and a modality-invariant content code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)
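A minimal sketch of the gated-fusion idea from the entry above, assuming one feature vector per MRI sequence: a learned sigmoid gate weights each modality's contribution, and an availability mask zeroes out absent modalities. Names and sizes are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Weights each modality's features by a learned gate, masks out
    missing modalities, and averages what remains (illustrative sketch)."""

    def __init__(self, dim=128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, feats, available):
        # feats: (B, M, D) per-modality features; available: (B, M) in {0, 1}
        gated = self.gate(feats) * feats       # element-wise gating
        mask = available.unsqueeze(-1)         # zero out missing modalities
        summed = (gated * mask).sum(dim=1)
        return summed / mask.sum(dim=1).clamp(min=1.0)
```

Because the denominator counts only the modalities actually present, the fused representation stays well-scaled when a sequence is missing, which is the robustness property the entry highlights.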