CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self
Attention for multi-omics integration with incomplete multi-omics data
- URL: http://arxiv.org/abs/2304.05542v1
- Date: Wed, 12 Apr 2023 00:22:18 GMT
- Title: CLCLSA: Cross-omics Linked embedding with Contrastive Learning and Self
Attention for multi-omics integration with incomplete multi-omics data
- Authors: Chen Zhao, Anqi Liu, Xiao Zhang, Xuewei Cao, Zhengming Ding, Qiuying
Sha, Hui Shen, Hong-Wen Deng, Weihua Zhou
- Abstract summary: Integration of heterogeneous and high-dimensional multi-omics data is becoming increasingly important in understanding genetic data.
One obstacle faced when performing multi-omics data integration is the existence of unpaired multi-omics data due to instrument sensitivity and cost.
We propose a deep learning method for multi-omics integration with incomplete data by Cross-omics Linked unified embedding with Contrastive Learning and Self Attention.
- Score: 47.2764293508916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Integration of heterogeneous and high-dimensional multi-omics data is
becoming increasingly important in understanding genetic data. Each omics
technique only provides a limited view of the underlying biological process and
integrating heterogeneous omics layers simultaneously would lead to a more
comprehensive and detailed understanding of diseases and phenotypes. However,
one obstacle faced when performing multi-omics data integration is the
existence of unpaired multi-omics data due to instrument sensitivity and cost.
Studies may fail if certain aspects of the subjects are missing or incomplete.
In this paper, we propose a deep learning method for multi-omics integration
with incomplete data by Cross-omics Linked unified embedding with Contrastive
Learning and Self Attention (CLCLSA). Utilizing complete multi-omics data as
supervision, the model employs cross-omics autoencoders to learn the feature
representation across different types of biological data. The multi-omics
contrastive learning, which is used to maximize the mutual information between
different types of omics, is employed before latent feature concatenation. In
addition, the feature-level self-attention and omics-level self-attention are
employed to dynamically identify the most informative features for multi-omics
data integration. Extensive experiments were conducted on four public
multi-omics datasets. The experimental results indicated that the proposed
CLCLSA outperformed the state-of-the-art approaches for multi-omics data
classification using incomplete multi-omics data.
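As a rough illustration of two ideas named in the abstract, the Python sketch below pairs a symmetric InfoNCE-style contrastive loss (a commonly used lower-bound surrogate for mutual information) with a softmax weighting over omics types before concatenation. It is not the authors' implementation: the abstract does not specify the loss or the attention form, so the function names `info_nce` and `attention_fuse`, the temperature value, and the per-omics scores are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(z_a, z_b, temperature=0.5):
    """Symmetric InfoNCE loss between two omics embeddings of the same samples.

    z_a[i] and z_b[i] are latent vectors of sample i from two omics types.
    Matching rows are treated as positives, all other rows as negatives.
    (Assumed loss form; the abstract only says mutual information is maximized.)
    """
    a = [l2_normalize(v) for v in z_a]
    b = [l2_normalize(v) for v in z_b]
    n = len(a)
    # Temperature-scaled cosine similarity between every pair of samples.
    sim = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(logits):
        # Mean cross-entropy where the correct class of row i is column i.
        total = 0.0
        for i, row in enumerate(logits):
            m = max(row)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(s - m) for s in row))
            total += log_sum - row[i]
        return total / len(logits)

    sim_t = [list(col) for col in zip(*sim)]  # b-to-a direction
    return 0.5 * (cross_entropy_diag(sim) + cross_entropy_diag(sim_t))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, scores):
    """Weight each omics embedding by a softmax over per-omics scores,
    then concatenate. In a real model the scores would be learned;
    here they are passed in directly (hypothetical simplification).
    """
    weights = softmax(scores)
    fused = []
    for w, emb in zip(weights, embeddings):
        fused.extend(w * x for x in emb)
    return fused, weights
```

With paired embeddings, `info_nce` should return a lower value when row i of both inputs comes from the same sample than when the rows of one input are shuffled, which is the property a contrastive objective relies on.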
Related papers
- MVKTrans: Multi-View Knowledge Transfer for Robust Multiomics Classification [14.533025681231294]
We propose the multi-view knowledge transfer learning framework, which transfers intra- and inter-omics knowledge in an adaptive manner.
Specifically, we design a graph contrastive module that is trained on unlabeled data to effectively learn and transfer the underlying intra-omics patterns to the supervised task.
In light of the varying discriminative capacities of modalities across different diseases and/or samples, we introduce an adaptive and bi-directional cross-omics distillation module.
arXiv Detail & Related papers (2024-11-13T15:45:46Z)
- UNICORN: A Deep Learning Model for Integrating Multi-Stain Data in Histopathology [2.9389205138207277]
UNICORN is a multi-modal transformer capable of processing multi-stain histopathology for atherosclerosis severity class prediction.
The architecture comprises a two-stage, end-to-end trainable model with specialized modules utilizing transformer self-attention blocks.
UNICORN achieved a classification accuracy of 0.67, outperforming other state-of-the-art models.
arXiv Detail & Related papers (2024-09-26T12:13:52Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Integrate Any Omics: Towards genome-wide data integration for patient stratification [6.893309898200498]
IntegrAO is an unsupervised framework for integrating incomplete multi-omics data and classifying new samples.
IntegrAO's ability to handle heterogeneous and incomplete data makes it an essential tool for precision oncology.
arXiv Detail & Related papers (2024-01-15T19:57:07Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- scICML: Information-theoretic Co-clustering-based Multi-view Learning for the Integrative Analysis of Single-cell Multi-omics data [0.0]
We develop a novel information-theoretic co-clustering-based multi-view learning (scICML) method for multi-omics single-cell data integration.
scICML utilizes co-clusterings to aggregate similar features for each view of data and uncover the common clustering pattern for cells.
Our experiments on four real-world datasets demonstrate that scICML improves the overall clustering performance and provides biological insights into the data analysis of peripheral blood mononuclear cells.
arXiv Detail & Related papers (2022-05-19T12:41:55Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, which involves identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A generalized kernel machine approach to identify higher-order composite effects in multi-view datasets [4.579719459619913]
We propose a novel kernel machine approach to identify higher-order composite effects in multi-view biomedical datasets.
The proposed method can effectively identify higher-order composite effects and suggests that the corresponding features function in concert.
arXiv Detail & Related papers (2020-04-29T08:56:02Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.