DRIM: Learning Disentangled Representations from Incomplete Multimodal Healthcare Data
- URL: http://arxiv.org/abs/2409.17055v2
- Date: Tue, 1 Oct 2024 15:47:14 GMT
- Title: DRIM: Learning Disentangled Representations from Incomplete Multimodal Healthcare Data
- Authors: Lucas Robinet, Ahmad Berjaoui, Ziad Kheil, Elizabeth Cohen-Jonathan Moyal
- Abstract summary: Real-life medical data is often multimodal and incomplete, fueling the need for advanced deep learning models.
We introduce DRIM, a new method for capturing shared and unique representations, despite data sparsity.
Our method outperforms state-of-the-art algorithms on glioma patient survival prediction tasks, while being robust to missing modalities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-life medical data is often multimodal and incomplete, fueling the growing need for advanced deep learning models capable of integrating it efficiently. The use of diverse modalities, including histopathology slides, MRI, and genetic data, offers unprecedented opportunities to improve prognosis prediction and to unveil new treatment pathways. Contrastive learning, widely used for deriving representations from paired data in multimodal tasks, assumes that different views contain the same task-relevant information and leverages only shared information. This assumption becomes restrictive when handling medical data, since each modality also harbors specific knowledge relevant to downstream tasks. We introduce DRIM, a new multimodal method for capturing these shared and unique representations despite data sparsity. More specifically, given a set of modalities, we aim to encode a representation for each one that can be divided into two components: one encapsulating patient-related information common across modalities and the other encapsulating modality-specific details. This is achieved by increasing the shared information among different patient modalities while minimizing the overlap between shared and unique components within each modality. Our method outperforms state-of-the-art algorithms on glioma patient survival prediction tasks, while being robust to missing modalities. To promote reproducibility, the code is made publicly available at https://github.com/Lucas-rbnt/DRIM
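A minimal sketch of the shared/unique objective described in the abstract is given below. It assumes one encoder per modality that emits a shared and a unique embedding, uses an InfoNCE-style contrastive term to increase the shared information across a patient's modalities, and a squared cosine-similarity penalty as a simple stand-in for minimizing the overlap between shared and unique codes. The names (SharedUniqueEncoder, info_nce, shared_unique_loss), the architecture, and the specific penalties are illustrative assumptions, not the objectives implemented in the DRIM repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedUniqueEncoder(nn.Module):
    """Encodes one modality into a shared code (patient-level, cross-modal)
    and a unique code (modality-specific). Hypothetical architecture."""

    def __init__(self, in_dim: int, emb_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.shared_head = nn.Linear(256, emb_dim)
        self.unique_head = nn.Linear(256, emb_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.shared_head(h), self.unique_head(h)


def info_nce(a, b, temperature: float = 0.1):
    """InfoNCE-style loss: same-patient shared codes from two modalities are
    positives, other patients in the batch are negatives."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)


def shared_unique_loss(shared, unique, present):
    """shared / unique: dicts modality -> (B, D) embeddings.
    present: dict modality -> (B,) bool tensor, True where the modality exists,
    so missing modalities simply drop out of the objective."""
    mods = list(shared)
    loss = 0.0
    # (1) increase shared information: align shared codes across modality pairs
    for i in range(len(mods)):
        for j in range(i + 1, len(mods)):
            both = present[mods[i]] & present[mods[j]]
            if both.sum() > 1:
                loss = loss + info_nce(shared[mods[i]][both], shared[mods[j]][both])
    # (2) reduce overlap between shared and unique codes within each modality
    #     (squared cosine similarity as a simple stand-in for an MI penalty)
    for m in mods:
        if present[m].any():
            s = F.normalize(shared[m][present[m]], dim=-1)
            u = F.normalize(unique[m][present[m]], dim=-1)
            loss = loss + (s * u).sum(-1).pow(2).mean()
    return loss
```

In a full pipeline, the per-modality shared and unique codes would be fused and passed to a downstream survival head; missing modalities are handled here simply by masking them out of every term.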
Related papers
- FlexCare: Leveraging Cross-Task Synergy for Flexible Multimodal Healthcare Prediction [34.732561455987145]
We propose a unified healthcare prediction model, named FlexCare, to flexibly accommodate incomplete multimodal inputs.
A task-agnostic multimodal information extraction module is presented to capture decorrelated representations of diverse intra- and inter-modality patterns.
Experimental results on multiple tasks from MIMIC-IV/MIMIC-CXR/MIMIC-NOTE datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-06-17T12:03:10Z) - MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems [12.914295902429]
We introduce a real-world multi-modal dataset called MMIST-ccRCC.
This dataset comprises 2 radiology modalities (CT and MRI), histopathology, genomics, and clinical data from 618 patients with clear cell renal cell carcinoma (ccRCC).
We show that even with severe rates of missing data, fusing modalities improves survival forecasting.
arXiv Detail & Related papers (2024-05-02T18:29:05Z) - NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose NativE, a comprehensive framework for multi-modal knowledge graph completion (MMKGC) in the wild.
NativE proposes a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities.
We construct a new benchmark called WildKGC with five datasets to evaluate our method.
arXiv Detail & Related papers (2024-03-28T03:04:00Z) - Dynamic Multimodal Information Bottleneck for Multimodality Classification [26.65073424377933]
We propose a dynamic multimodal information bottleneck framework for attaining a robust fused feature representation.
Specifically, our information bottleneck module serves to filter out the task-irrelevant information and noises in the fused feature.
Our method surpasses the state of the art and is significantly more robust, being the only method to maintain performance when large-scale noisy channels exist.
arXiv Detail & Related papers (2023-11-02T08:34:08Z) - Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z) - Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation [71.89867233426597]
In this paper, we aim to exploit the prior knowledge learned from one modality to improve the segmentation performance on another modality.
We propose a novel Mutual Knowledge Distillation scheme to thoroughly exploit the modality-shared knowledge.
Experimental results on the public multi-class cardiac segmentation data, i.e., MMWHS 2017, show that our method achieves large improvements on CT segmentation.
arXiv Detail & Related papers (2020-10-04T10:25:13Z) - M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z) - Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into a modality-specific appearance code and a modality-invariant content code, which are then combined by gated fusion (a generic sketch of gated fusion with missing modalities follows this list).
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
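Several of the entries above (CMIM's resilience to modality dropping, the feature-disentanglement-and-gated-fusion segmentation work, and DRIM's own robustness to missing modalities) rely on fusing whatever modalities are present at inference time. The sketch below shows one generic way to do this with a gated, presence-masked fusion module; the class name MaskedGatedFusion, the shapes, and the normalization are assumptions for illustration, not code from any of the cited papers.

```python
import torch
import torch.nn as nn


class MaskedGatedFusion(nn.Module):
    """Fuses a list of per-modality features (each B x D) into one B x D vector.
    Missing modalities are zeroed out by the mask and excluded from the gates."""

    def __init__(self, n_modalities: int, dim: int):
        super().__init__()
        self.gates = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid()) for _ in range(n_modalities)]
        )

    def forward(self, feats, mask):
        # feats: list of (B, D) tensors; mask: (B, n_modalities) float in {0, 1}
        fused, weight_sum = 0.0, 0.0
        for i, f in enumerate(feats):
            m = mask[:, i : i + 1]        # (B, 1), 0 where the modality is missing
            g = self.gates[i](f) * m      # channel-wise gate, zeroed for missing inputs
            fused = fused + g * f
            weight_sum = weight_sum + g
        # normalize by the summed gates so dropped modalities do not shrink the feature
        return fused / weight_sum.clamp(min=1e-6)
```

The fused vector would then feed whatever downstream head the task requires (segmentation decoder, survival model, classifier); at training time, randomly zeroing entries of the mask is a common way to simulate missing modalities.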