DEDUCE: Multi-head attention decoupled contrastive learning to discover
cancer subtypes based on multi-omics data
- URL: http://arxiv.org/abs/2307.04075v2
- Date: Mon, 6 Nov 2023 13:11:05 GMT
- Title: DEDUCE: Multi-head attention decoupled contrastive learning to discover
cancer subtypes based on multi-omics data
- Authors: Liangrui Pan, Dazhen Liu, Yutao Dou, Lian Wang, Zhichao Feng, Pengfei
Rong, Liwen Xu, Shaoliang Peng
- Abstract summary: The identification and discovery of cancer subtypes are crucial for the diagnosis, treatment, and prognosis of cancer.
We propose a generalization framework based on attention mechanisms for unsupervised contrastive learning.
The proposed framework includes a decoupled contrastive learning model (DEDUCE) based on a multi-head attention mechanism.
- Score: 4.082329244680199
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the high heterogeneity and clinical characteristics of cancer, there
are significant differences in multi-omics data and clinical features among
subtypes of different cancers. Therefore, the identification and discovery of
cancer subtypes are crucial for the diagnosis, treatment, and prognosis of
cancer. In this study, we proposed a generalization framework based on
attention mechanisms for unsupervised contrastive learning to analyze cancer
multi-omics data for the identification and characterization of cancer
subtypes. The framework contains a symmetric unsupervised multi-head attention
encoder, which can deeply extract contextual features and long-range
dependencies of multi-omics data, reducing the impact of noise in multi-omics
data. Importantly, the proposed framework includes a decoupled contrastive
learning model (DEDUCE) based on a multi-head attention mechanism to learn
multi-omics data features, cluster samples, and identify cancer subtypes. This
method clusters subtypes by calculating the similarity between samples in the
feature space and sample space of multi-omics data. The basic idea is to
decouple different attributes of multi-omics data features and learn them as
contrasting terms. A contrastive loss function is constructed to measure the
difference between positive and negative examples, and minimizing this loss
encourages the model to learn better feature representations. In large-scale
experiments on simulated, single-cell, and cancer multi-omics data sets, DEDUCE
outperforms 10 deep learning models. Finally, we
used the DEDUCE model to reveal six cancer subtypes of AML. By analyzing GO
functional enrichment, subtype-specific biological functions and GSEA of AML,
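The decoupling idea described in the abstract can be illustrated with a small sketch. The code below is not taken from the DEDUCE paper; it assumes one common form of a decoupled contrastive loss, in which the positive pair is left out of the normalizing sum, together with cosine similarity and a temperature of 0.5. The function names and the toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def decoupled_contrastive_loss(view_a, view_b, temperature=0.5):
    """Mean decoupled contrastive loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are two embeddings of the same sample (the
    positive pair); embeddings of every other sample act as negatives.
    Assumed form, not confirmed by the abstract.
    """
    n = len(view_a)
    if n < 2 or n != len(view_b):
        raise ValueError("need two equally sized batches of at least 2 samples")
    total = 0.0
    for i in range(n):
        positive = cosine(view_a[i], view_b[i]) / temperature
        # The normalizing sum runs over negatives only: the positive pair
        # is excluded, which is the decoupling step.
        negatives = 0.0
        for k in range(n):
            if k == i:
                continue
            negatives += math.exp(cosine(view_a[i], view_a[k]) / temperature)
            negatives += math.exp(cosine(view_a[i], view_b[k]) / temperature)
        total += -positive + math.log(negatives)
    return total / n


# Toy check: matching views should score lower than mismatched views.
aligned = decoupled_contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = decoupled_contrastive_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

In this sketch only the first view serves as the anchor; a symmetric variant would also anchor on the second view and average the two terms.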
Related papers
- Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification, An Interpretable Multi-Omics Approach [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples.
By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
- Adaptive Prototype Learning for Multimodal Cancer Survival Analysis [8.179859593451285]
We propose Adaptive Prototype Learning (APL), a novel and effective approach for multimodal cancer survival analysis.
APL adaptively learns representative prototypes in a data-driven manner, reducing redundancy while preserving critical information.
Our method employs two sets of learnable query vectors that serve as a bridge between high-dimensional representations and survival prediction.
arXiv Detail & Related papers (2025-03-06T17:32:15Z)
- Multi-omics data integration for early diagnosis of hepatocellular carcinoma (HCC) using machine learning [8.700808005009806]
We compare the performance of ensemble machine learning algorithms capable of late integration of multi-class data from different modalities.
Two boosted methods, PB-MVBoost and AdaBoost with a soft vote, were the overall best-performing models.
arXiv Detail & Related papers (2024-09-20T09:38:02Z)
- PACS: Prediction and analysis of cancer subtypes from multi-omics data based on a multi-head attention mechanism model [2.275409158519155]
We propose a supervised multi-head attention mechanism model (SMA) to classify cancer subtypes successfully.
The attention mechanism and feature sharing module of the SMA model can successfully learn the global and local feature information of multi-omics data.
The SMA model achieves the highest accuracy, macro-averaged F1, and weighted F1 when classifying cancer subtypes in simulated, single-cell, and cancer multi-omics datasets.
arXiv Detail & Related papers (2023-08-21T03:54:21Z)
- MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling [9.900594964709116]
We develop MoCLIM, a representation learning framework for cancer subtyping.
We show that our approach significantly improves data fit and subtyping performance in fewer high-dimensional cancer instances.
Our framework incorporates various medical evaluations as the final component, providing high interpretability in medical analysis.
arXiv Detail & Related papers (2023-08-17T10:49:48Z)
- Cancer Subtyping by Improved Transcriptomic Features Using Vector Quantized Variational Autoencoder [10.835673227875615]
We propose Vector Quantized Variational AutoEncoder (VQ-VAE) to tackle the data issues and extract informative latent features that are crucial to the quality of subsequent clustering.
VQ-VAE does not impose strict assumptions and hence its latent features are better representations of the input, capable of yielding superior clustering performance with any mainstream clustering method.
arXiv Detail & Related papers (2022-07-20T09:47:53Z)
- SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data for Cancer Type Classification [4.992154875028543]
Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making.
SubOmiEmbed produces comparable results to the baseline OmiEmbed with a much smaller network and by using just a subset of the data.
This work can be improved to integrate mutation-based genomic data as well.
arXiv Detail & Related papers (2022-02-03T16:39:09Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.