Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning
- URL: http://arxiv.org/abs/2406.13979v1
- Date: Thu, 20 Jun 2024 04:01:35 GMT
- Title: Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning
- Authors: Yupei Zhang, Xiaofei Wang, Fangliangzi Meng, Jin Tang, Chao Li,
- Abstract summary: Multi-modal learning plays a crucial role in cancer diagnosis and prognosis.
Current deep learning based multi-modal approaches are often limited by their abilities to model the complex correlations between genomics and histology data.
We propose a biologically interpretative and robust multi-modal learning framework to efficiently integrate histology images and genomics.
- Score: 14.1062929554591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal learning plays a crucial role in cancer diagnosis and prognosis. Current deep learning based multi-modal approaches are often limited by their abilities to model the complex correlations between genomics and histology data, addressing the intrinsic complexity of tumour ecosystem where both tumour and microenvironment contribute to malignancy. We propose a biologically interpretative and robust multi-modal learning framework to efficiently integrate histology images and genomics by decomposing the feature subspace of histology images and genomics, reflecting distinct tumour and microenvironment features. To enhance cross-modal interactions, we design a knowledge-driven subspace fusion scheme, consisting of a cross-modal deformable attention module and a gene-guided consistency strategy. Additionally, in pursuit of dynamically optimizing the subspace knowledge, we further propose a novel gradient coordination learning strategy. Extensive experiments demonstrate the effectiveness of the proposed method, outperforming state-of-the-art techniques in three downstream tasks of glioma diagnosis, tumour grading, and survival analysis. Our code is available at https://github.com/helenypzhang/Subspace-Multimodal-Learning.
Related papers
- Automated Ensemble Multimodal Machine Learning for Healthcare [52.500923923797835]
We introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning.
AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies.
arXiv Detail & Related papers (2024-07-25T17:46:38Z) - GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation [68.63955715643974]
Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o)
We propose an innovative Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o)
arXiv Detail & Related papers (2024-07-08T01:06:13Z) - Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology [8.802214988309684]
We introduce a hierarchical attention structure to leverage shared and complementary features of both modalities of histology and genomics.
Our method surpasses previous state-of-the-art approaches in glioma diagnosis and prognosis tasks.
arXiv Detail & Related papers (2024-06-11T09:06:41Z) - Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE)
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z) - Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis [7.996257103473235]
We propose Pathology-Genome Heterogeneous Graph (PGHG) that integrates whole slide images (WSI) and bulk RNA-Seq expression data with heterogeneous graph neural network for cancer survival analysis.
The PGHG consists of biological knowledge-guided representation learning network and pathology-genome heterogeneous graph.
We evaluate the model on low-grade gliomas, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas.
arXiv Detail & Related papers (2024-04-11T09:07:40Z) - Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation [12.094890186803958]
We present a novel Modality Aware and Shift Mixer that integrates intra-modality and inter-modality dependencies of multi-modal images for effective and robust brain tumor segmentation.
Specifically, we introduce a Modality-Aware module according to neuroimaging studies for modeling the specific modality pair relationships at low levels, and a Modality-Shift module with specific mosaic patterns is developed to explore the complex relationships across modalities at high levels via the self-attention.
arXiv Detail & Related papers (2024-03-04T14:21:51Z) - Joint Self-Supervised and Supervised Contrastive Learning for Multimodal
MRI Data: Towards Predicting Abnormal Neurodevelopment [5.771221868064265]
We present a novel joint self-supervised and supervised contrastive learning method to learn the robust latent feature representation from multimodal MRI data.
Our method has the capability to facilitate computer-aided diagnosis within clinical practice, harnessing the power of multimodal data.
arXiv Detail & Related papers (2023-12-22T21:05:51Z) - Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review [0.0]
Integrating diverse data types can improve the accuracy and reliability of cancer diagnosis and treatment.
Deep neural networks have facilitated the development of sophisticated multimodal data fusion approaches.
Recent deep learning frameworks such as Graph Neural Networks (GNNs) and Transformers have shown remarkable success in multimodal learning.
arXiv Detail & Related papers (2023-03-11T17:52:03Z) - Multimodal foundation models are better simulators of the human brain [65.10501322822881]
We present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs.
We find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.
arXiv Detail & Related papers (2022-08-17T12:36:26Z) - Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z) - Learning Binary Semantic Embedding for Histology Image Classification
and Retrieval [56.34863511025423]
We propose a novel method for Learning Binary Semantic Embedding (LBSE)
Based on the efficient and effective embedding, classification and retrieval are performed to provide interpretable computer-assisted diagnosis for histology images.
Experiments conducted on three benchmark datasets validate the superiority of LBSE under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:36:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.