DEDUCE: Multi-head attention decoupled contrastive learning to discover
cancer subtypes based on multi-omics data
- URL: http://arxiv.org/abs/2307.04075v2
- Date: Mon, 6 Nov 2023 13:11:05 GMT
- Title: DEDUCE: Multi-head attention decoupled contrastive learning to discover
cancer subtypes based on multi-omics data
- Authors: Liangrui Pan, Dazhen Liu, Yutao Dou, Lian Wang, Zhichao Feng, Pengfei
Rong, Liwen Xu, Shaoliang Peng
- Abstract summary: The identification and discovery of cancer subtypes are crucial for the diagnosis, treatment, and prognosis of cancer.
We propose a generalization framework based on attention mechanisms for unsupervised contrastive learning.
The proposed framework includes a decoupled contrastive learning model (DEDUCE) based on a multi-head attention mechanism.
- Score: 4.082329244680199
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the high heterogeneity and clinical characteristics of cancer, there
are significant differences in multi-omics data and clinical features among
subtypes of different cancers. Therefore, the identification and discovery of
cancer subtypes are crucial for the diagnosis, treatment, and prognosis of
cancer. In this study, we proposed a generalization framework based on
attention mechanisms for unsupervised contrastive learning to analyze cancer
multi-omics data for the identification and characterization of cancer
subtypes. The framework contains a symmetric unsupervised multi-head attention
encoder, which can deeply extract contextual features and long-range
dependencies of multi-omics data, reducing the impact of noise in multi-omics
data. Importantly, the proposed framework includes a decoupled contrastive
learning model (DEDUCE) based on a multi-head attention mechanism to learn
multi-omics data features, cluster samples, and identify cancer subtypes. This
method clusters subtypes by calculating the similarity between samples in the
feature space and sample space of multi-omics data. The basic idea is to
decouple different attributes of multi-omics data features and learn them as
contrasting terms. A contrastive loss function is constructed to measure the
difference between positive examples and negative examples, and this
difference is minimized, thereby encouraging the model to learn better feature
representations. The DEDUCE model was evaluated in large-scale experiments on
simulated multi-omics data sets, single-cell multi-omics data sets and cancer
multi-omics data sets, and its results are better than those of 10 deep
learning models. Finally, we
used the DEDUCE model to reveal six cancer subtypes of AML. By analyzing GO
functional enrichment, subtype-specific biological functions and GSEA of AML,
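
The abstract describes the contrastive objective only in words, so the following is a minimal sketch of one common "decoupled" contrastive loss, not the authors' implementation. It assumes two embedded views of the same samples (for example, two encoder passes over the same multi-omics profiles), cosine similarity, and a loss in which the positive pair is left out of the denominator. The function names, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def decoupled_contrastive_loss(view_a, view_b, temperature=0.5):
    """Average decoupled contrastive loss over two views of n >= 2 samples.

    view_a[i] and view_b[i] are embeddings of the same sample (positive pair).
    Every embedding of a different sample, in either view, is a negative.
    "Decoupled" here means the positive pair is excluded from the log-sum
    term (an assumption; the abstract does not give the formula).
    """
    n = len(view_a)
    total = 0.0
    # Symmetrize: use each view in turn as the anchor view.
    for anchors, others in ((view_a, view_b), (view_b, view_a)):
        for i in range(n):
            positive = cosine(anchors[i], others[i]) / temperature
            negatives = []
            for j in range(n):
                if j == i:
                    continue  # skip the sample itself and its positive
                negatives.append(math.exp(cosine(anchors[i], anchors[j]) / temperature))
                negatives.append(math.exp(cosine(anchors[i], others[j]) / temperature))
            total += -positive + math.log(sum(negatives))
    return total / (2 * n)


if __name__ == "__main__":
    # Toy embeddings: two samples, two dimensions (hypothetical values).
    view_a = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.9, 0.1], [0.1, 0.9]]      # each sample close to its own pair
    misaligned = [[0.1, 0.9], [0.9, 0.1]]   # each sample close to the other one
    print(decoupled_contrastive_loss(view_a, aligned))
    print(decoupled_contrastive_loss(view_a, misaligned))
```

In this sketch the loss is lower when each sample's two views agree with each other and differ from the other samples, which is the behavior a clustering step on the learned embeddings would rely on.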
Related papers
- PACS: Prediction and analysis of cancer subtypes from multi-omics data
based on a multi-head attention mechanism model [2.275409158519155]
We propose a supervised multi-head attention mechanism model (SMA) to classify cancer subtypes successfully.
The attention mechanism and feature sharing module of the SMA model can successfully learn the global and local feature information of multi-omics data.
The SMA model achieves the highest accuracy, macro F1, and weighted F1 scores when classifying cancer subtypes in simulated, single-cell, and cancer multi-omics datasets.
arXiv Detail & Related papers (2023-08-21T03:54:21Z)
- MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive
Learning with Omics-Inference Modeling [9.900594964709116]
We develop MoCLIM, a representation learning framework for cancer subtyping.
We show that our approach significantly improves data fit and subtyping performance in fewer high-dimensional cancer instances.
Our framework incorporates various medical evaluations as the final component, providing high interpretability in medical analysis.
arXiv Detail & Related papers (2023-08-17T10:49:48Z)
- CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self
Attention for multi-omics integration with incomplete multi-omics data [47.2764293508916]
Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data.
One obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost.
We propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention.
arXiv Detail & Related papers (2023-04-12T00:22:18Z)
- Pixel-Level Explanation of Multiple Instance Learning Models in
Biomedical Single Cell Images [52.527733226555206]
We investigate the use of four attribution methods to explain multiple instance learning models.
We study two datasets of acute myeloid leukemia with over 100 000 single cell images.
We compare attribution maps with the annotations of a medical expert to see how the model's decision-making differs from the human standard.
arXiv Detail & Related papers (2023-03-15T14:00:11Z)
- Cancer Subtyping by Improved Transcriptomic Features Using Vector
Quantized Variational Autoencoder [10.835673227875615]
We propose Vector Quantized Variational AutoEncoder (VQ-VAE) to tackle the data issues and extract informative latent features that are crucial to the quality of subsequent clustering.
VQ-VAE does not impose strict assumptions and hence its latent features are better representations of the input, capable of yielding superior clustering performance with any mainstream clustering method.
arXiv Detail & Related papers (2022-07-20T09:47:53Z)
- Stacked Autoencoder Based Multi-Omics Data Integration for Cancer
Survival Prediction [3.083561980077652]
We propose a novel method to integrate multi-omics data for cancer survival prediction, called Stacked AutoEncoder-based Survival Prediction Neural Network (SAEsurv-net).
SAEsurv-net addresses the curse of dimensionality with a two-stage dimensionality reduction strategy and handles multi-omics heterogeneity with a stacked computation autoencoder model.
The experiments show that SAEsurv-net outperforms models based on a single type of data as well as other state-of-the-art methods.
arXiv Detail & Related papers (2022-07-08T13:53:11Z)
- SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data
for Cancer Type Classification [4.992154875028543]
Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making.
SubOmiEmbed produces comparable results to the baseline OmiEmbed with a much smaller network and by using just a subset of the data.
This work can be improved to integrate mutation-based genomic data as well.
arXiv Detail & Related papers (2022-02-03T16:39:09Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found RNA-seq features to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.