MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging
- URL: http://arxiv.org/abs/2406.00631v1
- Date: Sun, 2 Jun 2024 06:20:45 GMT
- Title: MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging
- Authors: Jiaying Zhou, Mingzhou Jiang, Junde Wu, Jiayuan Zhu, Ziyue Wang, Yueming Jin
- Abstract summary: We propose a multimodal pre-training framework that jointly incorporates genomics and medical images for downstream tasks.
We align medical images and genes using a self-supervised contrastive learning approach that combines Mamba as the genetic encoder and a Vision Transformer (ViT) as the medical image encoder.
- Score: 16.325123491357203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medicine is inherently a multimodal discipline. Medical images can reflect the pathological changes of cancer and tumors, while the expression of specific genes can influence their morphological characteristics. However, most deep learning models employed for these medical tasks are unimodal, making predictions using either image data or genomic data exclusively. In this paper, we propose a multimodal pre-training framework that jointly incorporates genomics and medical images for downstream tasks. To address the high computational complexity and the difficulty of capturing long-range dependencies when modeling gene sequences with MLP or Transformer architectures, we utilize Mamba to model these long genomic sequences. We align medical images and genes using a self-supervised contrastive learning approach that combines Mamba as the genetic encoder and a Vision Transformer (ViT) as the medical image encoder. We pre-trained the model on the TCGA dataset using paired gene expression and imaging data, and fine-tuned it for downstream tumor segmentation tasks. The results show that our model outperformed a wide range of related methods.
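The alignment step described above is a symmetric (CLIP-style) contrastive objective between the two encoders. Below is a minimal PyTorch sketch of that objective, assuming generic `gene_encoder` (e.g., Mamba) and `image_encoder` (e.g., ViT) modules; the projection dimension, temperature initialization, and module names are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveAligner(nn.Module):
    """CLIP-style alignment of a genomic encoder and an image encoder.

    `gene_encoder` and `image_encoder` are stand-ins for the paper's Mamba
    and ViT backbones; any modules mapping inputs to (B, D) features work.
    """
    def __init__(self, gene_encoder, image_encoder, dim, proj_dim=256):
        super().__init__()
        self.gene_encoder = gene_encoder
        self.image_encoder = image_encoder
        self.gene_proj = nn.Linear(dim, proj_dim)
        self.image_proj = nn.Linear(dim, proj_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # log(1/0.07), a common init

    def forward(self, genes, images):
        g = F.normalize(self.gene_proj(self.gene_encoder(genes)), dim=-1)
        v = F.normalize(self.image_proj(self.image_encoder(images)), dim=-1)
        logits = self.logit_scale.exp() * g @ v.t()         # (B, B) similarity matrix
        targets = torch.arange(g.size(0), device=g.device)  # matched pairs on the diagonal
        # Symmetric InfoNCE: genes -> images and images -> genes
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
```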
Related papers
- Translating Imaging to Genomics: Leveraging Transformers for Predictive Modeling [9.403446155541346]
We aim to bridge the gap between imaging and genomics data by leveraging transformer networks.
We propose using only available CT/MRI images to predict genomic sequences.
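Taken at face value, predicting genomic profiles from images reduces to an image encoder with a per-gene regression head. The sketch below is a hedged illustration under that assumption; `Image2Genomics`, the feature dimension, and the MSE objective are our additions, not necessarily the paper's formulation.

```python
import torch
import torch.nn as nn

class Image2Genomics(nn.Module):
    """Hypothetical sketch: regress a gene-expression vector from imaging features."""
    def __init__(self, image_encoder, feat_dim, n_genes):
        super().__init__()
        self.image_encoder = image_encoder       # e.g., a ViT over CT/MRI inputs
        self.head = nn.Sequential(
            nn.LayerNorm(feat_dim),
            nn.Linear(feat_dim, n_genes),        # one output per gene
        )

    def forward(self, images):
        return self.head(self.image_encoder(images))

# Training would minimize a regression loss against measured profiles, e.g.:
# loss = torch.nn.functional.mse_loss(model(ct_batch), expression_batch)
```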
arXiv Detail & Related papers (2024-08-01T06:14:37Z)
- End-to-end autoencoding architecture for the simultaneous generation of medical images and corresponding segmentation masks [3.1133049660590615]
We present an end-to-end architecture based on the Hamiltonian Variational Autoencoder (HVAE).
This approach yields an improved posterior approximation compared to traditional Variational Autoencoders (VAEs).
Our method outperforms generative adversarial counterparts, showing improved quality of the synthesized images.
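A generic ingredient of Hamiltonian variational inference is a leapfrog integrator that refines latent samples along the gradient of the log-density. The following is a minimal sketch of that step only, not the paper's full architecture.

```python
import torch

def leapfrog(z, p, grad_log_density, step_size, n_steps):
    """One Hamiltonian trajectory in latent space (generic HVAE ingredient).

    grad_log_density(z) should return d log p(x, z) / dz for the current batch.
    """
    p = p + 0.5 * step_size * grad_log_density(z)   # half-step for momentum
    for _ in range(n_steps - 1):
        z = z + step_size * p                       # full position step
        p = p + step_size * grad_log_density(z)     # full momentum step
    z = z + step_size * p                           # final position step
    p = p + 0.5 * step_size * grad_log_density(z)   # final momentum half-step
    return z, p
```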
arXiv Detail & Related papers (2023-11-17T11:56:53Z)
- BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys [99.7082441544384]
We present BiomedJourney, a novel method for counterfactual biomedical image generation by instruction-learning.
We use GPT-4 to process the corresponding imaging reports and generate a natural language description of disease progression.
The resulting triples are then used to train a latent diffusion model for counterfactual biomedical image generation.
arXiv Detail & Related papers (2023-10-16T18:59:31Z)
- Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
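For context, a common way to maximize mutual information in contrastive representation learning is via the InfoNCE lower bound; the bound below is standard background added here for illustration, not a formula quoted from the GIM paper.

```latex
% InfoNCE bound: for N paired samples (x_i, y_i) and a critic f,
% minimizing the contrastive loss tightens a lower bound on MI.
I(X; Y) \;\ge\; \log N - \mathcal{L}_{\mathrm{InfoNCE}}, \qquad
\mathcal{L}_{\mathrm{InfoNCE}} =
-\,\mathbb{E}\left[ \log
  \frac{\exp\big(f(x_i, y_i)\big)}{\sum_{j=1}^{N} \exp\big(f(x_i, y_j)\big)}
\right]
```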
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
- Gene-induced Multimodal Pre-training for Image-omic Classification [20.465959546613554]
This paper proposes a Gene-induced Multimodal Pre-training framework, which jointly incorporates genomics and Whole Slide Images (WSIs) for classification tasks.
Experimental results on the TCGA dataset show the superiority of our network architectures and our pre-training framework, achieving 99.47% accuracy in image-omic classification.
arXiv Detail & Related papers (2023-09-06T04:30:15Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
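A patient-level embedding from a cellular graph can be illustrated with one round of degree-normalized message passing followed by pooling. This is a generic GNN sketch under our own assumptions, not AMIGO's sparse multi-modal graph transformer.

```python
import torch
import torch.nn as nn

class CellGraphPool(nn.Module):
    """Generic sketch: cell features + adjacency -> one patient embedding."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)

    def forward(self, x, adj):
        # x: (N_cells, in_dim) node features; adj: (N_cells, N_cells) binary adjacency
        a = adj + torch.eye(adj.size(0), device=adj.device)  # add self-loops
        deg = a.sum(dim=1, keepdim=True)
        h = torch.relu(self.lin(a @ x / deg))                # mean-aggregate neighbors
        return h.mean(dim=0)                                 # pool to a patient vector
```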
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics [4.907551775445731]
We propose ContIG, a self-supervised method that can learn from large datasets of unlabeled medical images and genetic data.
Our approach aligns images and several genetic modalities in the feature space using a contrastive loss.
We also perform genome-wide association studies on the features learned by our models, uncovering interesting relationships between images and genetic data.
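With several genetic modalities, one natural reading of "aligns images and several genetic modalities" is to sum a pairwise contrastive loss between the image embedding and each genetic embedding. The sketch below follows that assumption; it is illustrative, not ContIG's exact loss.

```python
import torch
import torch.nn.functional as F

def multimodal_contrastive_loss(image_emb, genetic_embs, temperature=0.07):
    """Average of symmetric InfoNCE losses between images and each genetic modality.

    image_emb: (B, D); genetic_embs: list of (B, D) tensors, one per modality.
    """
    v = F.normalize(image_emb, dim=-1)
    targets = torch.arange(v.size(0), device=v.device)
    total = 0.0
    for g in genetic_embs:
        g = F.normalize(g, dim=-1)
        logits = v @ g.t() / temperature
        total = total + 0.5 * (F.cross_entropy(logits, targets) +
                               F.cross_entropy(logits.t(), targets))
    return total / len(genetic_embs)
```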
arXiv Detail & Related papers (2021-11-26T11:06:12Z)
- TransMed: Transformers Advance Multi-modal Medical Image Classification [4.500880052705654]
Convolutional neural networks (CNNs) have shown very competitive performance in medical image analysis tasks.
Transformers have been applied to computer vision and achieved remarkable success on large-scale datasets.
TransMed combines the advantages of CNNs and Transformers to efficiently extract low-level image features.
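The CNN-then-Transformer pattern can be sketched as a convolutional stem whose spatial positions become tokens for a Transformer encoder. The hyperparameters and module layout below are placeholders, not TransMed's exact design.

```python
import torch
import torch.nn as nn

class CNNTransformerHybrid(nn.Module):
    """Illustrative CNN + Transformer hybrid for image classification."""
    def __init__(self, in_ch=3, dim=256, n_heads=8, n_layers=4, n_classes=2):
        super().__init__()
        self.stem = nn.Sequential(                       # CNN extracts low-level features
            nn.Conv2d(in_ch, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        f = self.stem(x)                                 # (B, dim, H', W')
        tokens = f.flatten(2).transpose(1, 2)            # (B, H'*W', dim) patch tokens
        return self.head(self.encoder(tokens).mean(dim=1))
```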
arXiv Detail & Related papers (2021-03-10T08:57:53Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
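Axial attention factorizes 2-D self-attention into a height pass and a width pass. In the simplified sketch below the gates are learnable scalars on each pass's output; the actual Medical Transformer instead gates positional terms inside the attention computation, so treat this as an assumption-laden stand-in.

```python
import torch
import torch.nn as nn

class GatedAxialAttention(nn.Module):
    """Simplified sketch: attention along H, then W, each scaled by a learnable gate."""
    def __init__(self, dim, n_heads=4):
        super().__init__()
        self.attn_h = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.attn_w = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.gate_h = nn.Parameter(torch.zeros(1))  # gates start closed
        self.gate_w = nn.Parameter(torch.zeros(1))

    def forward(self, x):                            # x: (B, H, W, C)
        b, h, w, c = x.shape
        cols = x.permute(0, 2, 1, 3).reshape(b * w, h, c)   # attend along height
        x = x + self.gate_h * self.attn_h(cols, cols, cols)[0] \
                .reshape(b, w, h, c).permute(0, 2, 1, 3)
        rows = x.reshape(b * h, w, c)                        # attend along width
        x = x + self.gate_w * self.attn_w(rows, rows, rows)[0].reshape(b, h, w, c)
        return x
```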
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
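The Prototypical Network classifies a query by its distance to class prototypes, the mean embeddings of each class's support examples. A minimal sketch of that episode step (generic ProtoNet, not the paper's full Select-ProtoNet):

```python
import torch

def prototypical_logits(support, support_labels, query, n_classes):
    """support: (N_s, D) embeddings; query: (N_q, D); labels in [0, n_classes).

    Returns (N_q, n_classes) logits as negative squared distance to prototypes.
    """
    protos = torch.stack([support[support_labels == c].mean(dim=0)
                          for c in range(n_classes)])   # (n_classes, D) class means
    return -torch.cdist(query, protos).pow(2)           # closer -> higher logit
```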
arXiv Detail & Related papers (2020-09-02T02:50:30Z)