Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis
- URL: http://arxiv.org/abs/2502.11724v1
- Date: Mon, 17 Feb 2025 12:10:35 GMT
- Title: Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis
- Authors: Chengzhi Liu, Zile Huang, Zhe Chen, Feilong Tang, Yu Tian, Zhongxing Xu, Zihong Luo, Yalin Zheng, Yanda Meng,
- Abstract summary: We propose an Incomplete Modality Disentangled Representation (IMDR) strategy to disentangle features into explicit independent modal-common and modal-specific features.
Experiments on four multimodal datasets demonstrate that the proposed IMDR outperforms the state-of-the-art methods significantly.
- Score: 16.95583564875497
- License:
- Abstract: Ophthalmologists typically require multimodal data sources to improve diagnostic accuracy in clinical decisions. However, due to medical device shortages, low-quality data and data privacy concerns, missing data modalities are common in real-world scenarios. Existing deep learning methods tend to address it by learning an implicit latent subspace representation for different modality combinations. We identify two significant limitations of these methods: (1) implicit representation constraints that hinder the model's ability to capture modality-specific information and (2) modality heterogeneity, causing distribution gaps and redundancy in feature representations. To address these, we propose an Incomplete Modality Disentangled Representation (IMDR) strategy, which disentangles features into explicit independent modal-common and modal-specific features by guidance of mutual information, distilling informative knowledge and enabling it to reconstruct valuable missing semantics and produce robust multimodal representations. Furthermore, we introduce a joint proxy learning module that assists IMDR in eliminating intra-modality redundancy by exploiting the extracted proxies from each class. Experiments on four ophthalmology multimodal datasets demonstrate that the proposed IMDR outperforms the state-of-the-art methods significantly.
Related papers
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - AMM-Diff: Adaptive Multi-Modality Diffusion Network for Missing Modality Imputation [2.8498944632323755]
In clinical practice, full imaging is not always feasible, often due to complex acquisition protocols, stringent privacy regulations, or specific clinical needs.
A promising solution is missing data imputation, where absent modalities are generated from available ones.
We propose an Adaptive Multi-Modality Diffusion Network (AMM-Diff), a novel diffusion-based generative model capable of handling any number of input modalities and generating the missing ones.
arXiv Detail & Related papers (2025-01-22T12:29:33Z) - Mutual Information-based Representations Disentanglement for Unaligned Multimodal Language Sequences [25.73415065546444]
Key challenge in unaligned multimodal language sequences is to integrate information from various modalities to obtain a refined multimodal joint representation.
We propose a Mutual Information-based Representations Disentanglement (MIRD) method for unaligned multimodal language sequences.
arXiv Detail & Related papers (2024-09-19T02:12:26Z) - Prototypical Information Bottlenecking and Disentangling for Multimodal
Cancer Survival Prediction [7.772854118416362]
Multimodal learning significantly benefits cancer survival prediction.
Massive redundancy in multimodal data prevents it from extracting discriminative and compact information.
We propose a new framework, Prototypical Information Bottlenecking and Disentangling.
arXiv Detail & Related papers (2024-01-03T09:39:36Z) - Multimodal Explainability via Latent Shift applied to COVID-19 stratification [0.7831774233149619]
We present a deep architecture, which jointly learns modality reconstructions and sample classifications.
We validate our approach in the context of COVID-19 pandemic using the AIforCOVID dataset.
arXiv Detail & Related papers (2022-12-28T20:07:43Z) - Exploiting modality-invariant feature for robust multimodal emotion
recognition with missing modalities [76.08541852988536]
We propose to use invariant features for a missing modality imagination network (IF-MMIN)
We show that the proposed model outperforms all baselines and invariantly improves the overall emotion recognition performance under uncertain missing-modality conditions.
arXiv Detail & Related papers (2022-10-27T12:16:25Z) - Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under rigorously theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z) - Multi-Modal Mutual Information Maximization: A Novel Approach for
Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH)
We learn informative representations that can preserve both intra- and inter-modal similarities.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z) - ACN: Adversarial Co-training Network for Brain Tumor Segmentation with
Missing Modalities [26.394130795896704]
We propose a novel Adversarial Co-training Network (ACN) to solve this issue.
ACN enables a coupled learning process for both full modality and missing modality to supplement each other's domain.
Our proposed method significantly outperforms all state-of-the-art methods under any missing situation.
arXiv Detail & Related papers (2021-06-28T11:53:11Z) - Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement
and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into the modality-specific appearance code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with
Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.