Gene-induced Multimodal Pre-training for Image-omic Classification
- URL: http://arxiv.org/abs/2309.02702v1
- Date: Wed, 6 Sep 2023 04:30:15 GMT
- Title: Gene-induced Multimodal Pre-training for Image-omic Classification
- Authors: Ting Jin and Xingran Xie and Renjie Wan and Qingli Li and Yan Wang
- Abstract summary: This paper proposes a Gene-induced Multimodal Pre-training framework, which jointly incorporates genomics and Whole Slide Images (WSIs) for classification tasks.
Experimental results on the TCGA dataset show the superiority of our network architectures and our pre-training framework, achieving 99.47% in accuracy for image-omic classification.
- Score: 20.465959546613554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Histology analysis of the tumor micro-environment integrated with genomic
assays is the gold standard for most cancers in modern medicine. This paper
proposes a Gene-induced Multimodal Pre-training (GiMP) framework, which jointly
incorporates genomics and Whole Slide Images (WSIs) for classification tasks.
Our work aims at dealing with the main challenges of multi-modality image-omic
classification w.r.t. (1) the patient-level feature extraction difficulties
from gigapixel WSIs and tens of thousands of genes, and (2) effective fusion
considering high-order relevance modeling. Concretely, we first propose a group
multi-head self-attention gene encoder to capture global structured features in
gene expression cohorts. We design a masked patch modeling paradigm (MPM) to
capture the latent pathological characteristics of different tissues. The mask
strategy is randomly masking a fixed-length contiguous subsequence of patch
embeddings of a WSI. Finally, we combine the classification tokens of paired
modalities and propose a triplet learning module to learn high-order relevance
and discriminative patient-level information.After pre-training, a simple
fine-tuning can be adopted to obtain the classification results. Experimental
results on the TCGA dataset show the superiority of our network architectures
and our pre-training framework, achieving 99.47% in accuracy for image-omic
classification. The code is publicly available at
https://github.com/huangwudiduan/GIMP.
Related papers
- FUSECAPS: Investigating Feature Fusion Based Framework for Capsule Endoscopy Image Classification [0.0]
This work offers a strong methodology for classifying endoscopic images.
We suggest a hybrid feature extraction method that combines convolutional neural networks (CNNs), multi-layer perceptrons (MLPs), and radiomics.
We have achieved a validation accuracy of 76.2% in the capsule endoscopy video frame classification task.
arXiv Detail & Related papers (2024-11-04T21:55:52Z) - MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation [3.2846676620336632]
Ophthalmic image segmentation serves as a critical foundation for ocular disease diagnosis.
Transformer-based models address these limitations but introduce substantial computational overhead.
We introduce MM-UNet, an efficient Mixed model tailored for ophthalmic image segmentation.
arXiv Detail & Related papers (2024-08-16T08:34:50Z) - Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective [32.93871326428446]
Recent advances in artificial intelligence (AI) are revolutionizing medical imaging and computational pathology.
A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation.
This study conducts a benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks.
arXiv Detail & Related papers (2024-07-10T17:00:57Z) - MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging [16.325123491357203]
We propose a multimodal pre-training framework that jointly incorporates genomics and medical images for downstream tasks.
We align medical images and genes using a self-supervised contrastive learning approach which combines the Mamba as a genetic encoder and the Vision Transformer (ViT) as a medical image encoder.
arXiv Detail & Related papers (2024-06-02T06:20:45Z) - Genetic InfoMax: Exploring Mutual Information Maximization in
High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the celluar graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Hierarchical Transformer for Survival Prediction Using Multimodality
Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z) - Application of Transfer Learning and Ensemble Learning in Image-level
Classification for Breast Histopathology [9.037868656840736]
In Computer-Aided Diagnosis (CAD), traditional classification models mostly use a single network to extract features.
This paper proposes a deep ensemble model based on image-level labels for the binary classification of benign and malignant lesions.
Result: In the ensemble network model with accuracy as the weight, the image-level binary classification achieves an accuracy of $98.90%$.
arXiv Detail & Related papers (2022-04-18T13:31:53Z) - Modality Completion via Gaussian Process Prior Variational Autoencoders
for Multi-Modal Glioma Segmentation [75.58395328700821]
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan.
MGP-VAE can leverage the Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to utilize the subjects/patients and sub-modalities correlations.
We show the applicability of MGP-VAE on brain tumor segmentation where either, two, or three of four sub-modalities may be missing.
arXiv Detail & Related papers (2021-07-07T19:06:34Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.