OmiEmbed: reconstruct comprehensive phenotypic information from
multi-omics data using multi-task deep learning
- URL: http://arxiv.org/abs/2102.02669v1
- Date: Wed, 3 Feb 2021 07:34:29 GMT
- Title: OmiEmbed: reconstruct comprehensive phenotypic information from
multi-omics data using multi-task deep learning
- Authors: Xiaoyu Zhang, Kai Sun, Yike Guo
- Abstract summary: High-dimensional omics data contains intrinsic biomedical information crucial for personalised medicine.
This information is challenging to capture from genome-wide data because of the large number of molecular features and the small number of available samples.
We proposed a unified multi-task deep learning framework called OmiEmbed to capture a holistic and relatively precise profile of phenotype from high-dimensional omics data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-dimensional omics data contains intrinsic biomedical information that is
crucial for personalised medicine. Nevertheless, it is challenging to capture
this information from genome-wide data due to the large number of molecular features
and small number of available samples, which is also called "the curse of
dimensionality" in machine learning. To tackle this problem and pave the way
for machine learning aided precision medicine, we proposed a unified multi-task
deep learning framework called OmiEmbed to capture a holistic and relatively
precise profile of phenotype from high-dimensional omics data. The deep
embedding module of OmiEmbed learnt an omics embedding that mapped multiple
omics data types into a latent space with lower dimensionality. Based on the
new representation of multi-omics data, different downstream networks of
OmiEmbed were trained together with the multi-task strategy to predict the
comprehensive phenotype profile of each sample. We trained the model on two
publicly available omics datasets to evaluate the performance of OmiEmbed. The
OmiEmbed model achieved promising results for multiple downstream tasks
including dimensionality reduction, tumour type classification, multi-omics
integration, demographic and clinical feature reconstruction, and survival
prediction. Instead of training and applying different downstream networks
separately, the multi-task strategy combined them together and conducted
multiple tasks simultaneously and efficiently. The model achieved better
performance with the multi-task strategy than when the downstream networks
were trained individually. OmiEmbed is a powerful tool to accurately capture
comprehensive phenotypic information from high-dimensional omics data and has great
potential to facilitate more accurate and personalised clinical decision
making.
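The multi-task idea in the abstract (one shared embedding feeding several downstream networks trained together) can be illustrated with a minimal sketch. This is not the authors' implementation: the plain dense encoder, the two example tasks (a tumour type classifier and an age regressor), the toy omics blocks, the weighted-sum loss, and all function names are assumptions made for illustration only.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Randomly initialise a dense layer as (weights, bias)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def embed(omics_blocks, encoder):
    """Concatenate the omics types and map them to a lower-dimensional latent vector."""
    x = [v for block in omics_blocks for v in block]
    return [math.tanh(h) for h in linear(x, *encoder)]

def multitask_loss(latent, heads, targets, task_weights):
    """Weighted sum of per-task losses, all computed from the same embedding (assumed form)."""
    probs = softmax(linear(latent, *heads["tumour_type"]))
    cls_loss = -math.log(probs[targets["tumour_type"]])   # cross-entropy
    age_pred = linear(latent, *heads["age"])[0]
    reg_loss = (age_pred - targets["age"]) ** 2            # squared error
    losses = {"tumour_type": cls_loss, "age": reg_loss}
    total = sum(task_weights[name] * value for name, value in losses.items())
    return total, losses

rng = random.Random(0)
expr = [rng.random() for _ in range(12)]   # hypothetical gene expression block
meth = [rng.random() for _ in range(8)]    # hypothetical DNA methylation block
encoder = init_layer(20, 4, rng)           # 20 input features -> 4 latent dimensions
heads = {"tumour_type": init_layer(4, 3, rng), "age": init_layer(4, 1, rng)}

latent = embed([expr, meth], encoder)
total, losses = multitask_loss(
    latent, heads,
    targets={"tumour_type": 1, "age": 0.5},
    task_weights={"tumour_type": 1.0, "age": 0.5},
)
```

Because every task loss is computed from the same latent vector, minimising the combined loss would update the shared encoder with signal from all tasks at once, which is the property the abstract contrasts with training each downstream network separately.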
Related papers
- MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language [0.24434823694833652]
MAMMAL is a versatile multi-task multi-align foundation model that learns from large-scale biological datasets.
We introduce a prompt syntax that supports a wide range of classification, regression, and generation tasks.
We evaluate the model on 11 diverse downstream tasks spanning different steps within a typical drug discovery pipeline.
arXiv Detail & Related papers (2024-10-28T20:45:52Z)
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is robust to missing information, to the extent that it can achieve the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- CustOmics: A versatile deep-learning based strategy for multi-omics
integration [0.0]
This paper presents a novel strategy to build a customizable autoencoder model that adapts to the dataset used in the case of high-dimensional multi-source integration.
We will assess the impact of integration strategies on the latent representation and combine the best strategies to propose a new method, CustOmics.
arXiv Detail & Related papers (2022-09-12T14:20:29Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data
for Cancer Type Classification [4.992154875028543]
Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making.
SubOmiEmbed produces comparable results to the baseline OmiEmbed with a much smaller network and by using just a subset of the data.
This work can be improved to integrate mutation-based genomic data as well.
arXiv Detail & Related papers (2022-02-03T16:39:09Z)
- Multi-task Semi-supervised Learning for Pulmonary Lobe Segmentation [2.8016091833446617]
Pulmonary lobe segmentation is an important preprocessing task for the analysis of lung diseases.
Deep learning based methods can outperform these traditional approaches.
Deep multi-task learning is expected to utilize labels of multiple different structures.
arXiv Detail & Related papers (2021-04-22T12:33:30Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with
Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)