GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation
- URL: http://arxiv.org/abs/2407.05540v1
- Date: Mon, 8 Jul 2024 01:06:13 GMT
- Title: GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation
- Authors: Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan
- Abstract summary: We propose an innovative Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o).
- Score: 68.63955715643974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in multi-modal representation learning have seen success in biomedical domains. While established techniques can handle multi-modal information, challenges arise when they are extended to diverse clinical modalities and to practical modality-missing settings, owing to the inherent modality gaps. To tackle these, we propose an innovative Modality-prompted Heterogeneous Graph for Omnimodal Learning (GTP-4o), which embeds the numerous disparate clinical modalities into a unified representation, completes the deficient embedding of a missing modality, and reformulates cross-modal learning with graph-based aggregation. Specifically, we establish a heterogeneous graph embedding to explicitly capture the diverse semantic properties of both the modality-specific features (nodes) and the cross-modal relations (edges). Then, we design a modality-prompted completion that completes the inadequate graph representation of a missing modality through a graph prompting mechanism, which generates hallucinated graph topologies to steer the missing embedding towards the intact representation. On the completed graph, we develop a knowledge-guided hierarchical cross-modal aggregation consisting of a global meta-path neighbouring module, which uncovers potential heterogeneous neighbors along pathways driven by domain knowledge, and a local multi-relation aggregation module for comprehensive cross-modal interaction across the various heterogeneous relations. We assess the efficacy of our methodology in rigorous benchmarking experiments against prior state-of-the-art methods. In a nutshell, GTP-4o presents an initial foray into the intriguing realm of embedding, relating and perceiving the heterogeneous patterns from various clinical modalities holistically via graph theory. Project page: https://gtp-4-o.github.io/.
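The abstract describes three steps: embedding modalities as nodes of a heterogeneous graph, completing a missing modality through prompting, and aggregating across typed relations. The toy sketch below illustrates that general pattern only and is not the authors' implementation. Every name in it (`complete_missing`, `aggregate`, the modality names, the relation names and the weights) is hypothetical, and the completion rule (mean of observed embeddings plus a prompt vector) is a simplifying assumption standing in for the learned graph prompt.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]
Edge = Tuple[str, str, str]  # (source, target, relation); hypothetical layout


def complete_missing(features: Dict[str, Optional[Vector]],
                     prompts: Dict[str, Vector]) -> Dict[str, Vector]:
    """Fill each missing modality with the mean of observed embeddings plus its prompt.

    Assumption: this stands in for a learned prompt; it is not the paper's rule.
    """
    observed = [v for v in features.values() if v is not None]
    if not observed:
        raise ValueError("at least one modality must be observed")
    dim = len(observed[0])
    mean = [sum(v[i] for v in observed) / len(observed) for i in range(dim)]
    completed: Dict[str, Vector] = {}
    for name, vec in features.items():
        if vec is None:
            completed[name] = [m + p for m, p in zip(mean, prompts[name])]
        else:
            completed[name] = list(vec)
    return completed


def aggregate(nodes: Dict[str, Vector],
              edges: List[Edge],
              relation_weight: Dict[str, float]) -> Dict[str, Vector]:
    """One round of relation-aware message passing over modality nodes.

    Each node adds, per relation type, the weighted mean of its incoming neighbours.
    """
    out: Dict[str, Vector] = {}
    for target, vec in nodes.items():
        # Group incoming neighbour embeddings by relation type.
        by_relation: Dict[str, List[Vector]] = {}
        for src, dst, rel in edges:
            if dst == target:
                by_relation.setdefault(rel, []).append(nodes[src])
        updated = list(vec)
        for rel, msgs in by_relation.items():
            w = relation_weight[rel]
            for i in range(len(updated)):
                updated[i] += w * sum(m[i] for m in msgs) / len(msgs)
        out[target] = updated
    return out


# Toy example: the "text" modality is missing and gets completed before aggregation.
features = {"image": [1.0, 0.0], "genomics": [0.0, 1.0], "text": None}
prompts = {"text": [0.1, 0.1]}
edges = [("image", "text", "describes"),
         ("genomics", "text", "describes"),
         ("text", "image", "explains")]
relation_weight = {"describes": 0.5, "explains": 1.0}

completed = complete_missing(features, prompts)
out = aggregate(completed, edges, relation_weight)
print(out)
```

In this sketch each relation type carries a single scalar weight. A trained model would normally use a learned transformation per relation, and would choose neighbours along domain-specific meta-paths rather than from a hand-written edge list.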
Related papers
- ViKL: A Mammography Interpretation Framework via Multimodal Aggregation of Visual-knowledge-linguistic Features [54.37042005469384]
We announce MVKL, the first multimodal mammography dataset encompassing multi-view images, detailed manifestations and reports.
Based on this dataset, we focus on the challenging task of unsupervised pretraining.
We propose ViKL, a framework that synergizes Visual, Knowledge, and Linguistic features.
arXiv Detail & Related papers (2024-09-24T05:01:23Z)
- PRAGA: Prototype-aware Graph Adaptive Aggregation for Spatial Multi-modal Omics Analysis [1.1619559582563954]
We propose a novel spatial multi-modal omics resolved framework, termed PRototype-Aware Graph Adaptative Aggregation for Spatial Multi-modal Omics Analysis (PRAGA).
PRAGA constructs a dynamic graph to capture latent semantic relations and comprehensively integrate spatial information and feature semantics.
The learnable graph structure can also denoise perturbations by learning cross-modal knowledge.
arXiv Detail & Related papers (2024-09-19T12:53:29Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Graph Relation Distillation for Efficient Biomedical Instance Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
- Multi-modal Multi-kernel Graph Learning for Autism Prediction and Biomarker Discovery [29.790200009136825]
We propose a novel method to offset the negative impact between modalities in the process of multi-modal integration and extract heterogeneous information from graphs.
Our method is evaluated on the benchmark Autism Brain Imaging Data Exchange (ABIDE) dataset and outperforms the state-of-the-art methods.
In addition, discriminative brain regions associated with autism are identified by our model, providing guidance for the study of autism pathology.
arXiv Detail & Related papers (2023-03-03T07:09:17Z)
- Multi-modal Dynamic Graph Network: Coupling Structural and Functional Connectome for Disease Diagnosis and Classification [8.67028273829113]
We propose a Multi-modal Dynamic Graph Convolution Network (MDGCN) for structural and functional brain network learning.
Our method benefits from modeling inter-modal representations and relating attentive multi-model associations into dynamic graphs.
arXiv Detail & Related papers (2022-10-25T02:41:32Z)
- Latent Heterogeneous Graph Network for Incomplete Multi-View Learning [57.49776938934186]
We propose a novel Latent Heterogeneous Graph Network (LHGN) for incomplete multi-view learning.
By learning a unified latent representation, a trade-off between consistency and complementarity among different views is implicitly realized.
To avoid any inconsistencies between the training and test phases, a transductive learning technique is applied based on graph learning for classification tasks.
arXiv Detail & Related papers (2022-08-29T15:14:21Z)
- Multi-modal Graph Learning for Disease Prediction [35.4310911850558]
We propose an end-to-end Multimodal Graph Learning framework (MMGL) for disease prediction.
Instead of defining the adjacency matrix manually as existing methods do, the latent graph structure is captured through a novel form of adaptive graph learning.
arXiv Detail & Related papers (2021-07-01T03:59:22Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.