A Cross-Level Information Transmission Network for Predicting Phenotype
from New Genotype: Application to Cancer Precision Medicine
- URL: http://arxiv.org/abs/2010.04824v1
- Date: Fri, 9 Oct 2020 22:01:00 GMT
- Title: A Cross-Level Information Transmission Network for Predicting Phenotype
from New Genotype: Application to Cancer Precision Medicine
- Authors: Di He, Lei Xie
- Abstract summary: We propose a novel Cross-LEvel Information Transmission network (CLEIT) framework.
Inspired by domain adaptation, CLEIT first learns the latent representation of the high-level domain, then uses it as a ground-truth embedding.
We demonstrate the effectiveness and performance boost of CLEIT in predicting anti-cancer drug sensitivity from somatic mutations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An unsolved fundamental problem in biology and ecology is to predict
observable traits (phenotypes) from a new genetic constitution (genotype) of an
organism under environmental perturbations (e.g., drug treatment). The
emergence of multiple omics data provides new opportunities but imposes great
challenges in the predictive modeling of genotype-phenotype associations.
Firstly, the high-dimensionality of genomics data and the lack of labeled data
often make the existing supervised learning techniques less successful.
Secondly, it is a challenging task to integrate heterogeneous omics data from
different resources. Finally, the information transmission from DNA to
phenotype involves multiple intermediate levels of RNA, protein, metabolite,
etc. The higher-level features (e.g., gene expression) usually have stronger
discriminative power than the lower-level features (e.g., somatic mutation). To
address the above issues, we propose a novel Cross-LEvel Information Transmission
network (CLEIT) framework. CLEIT aims to explicitly model the asymmetrical
multi-level organization of the biological system. Inspired by domain
adaptation, CLEIT first learns the latent representation of the high-level domain,
then uses it as a ground-truth embedding to improve the representation learning
of the low-level domain through a contrastive loss. In addition, we adopt
a pre-training and fine-tuning approach that leverages unlabeled heterogeneous
omics data to improve the generalizability of CLEIT. We demonstrate the
effectiveness and performance boost of CLEIT in predicting anti-cancer drug
sensitivity from somatic mutations, with the assistance of gene expression,
compared with state-of-the-art methods.
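
The abstract describes the alignment step only as a contrastive loss that pulls the low-level (mutation) embedding toward a fixed high-level (gene expression) embedding. A minimal sketch of that idea follows. It assumes an InfoNCE-style loss with cosine similarity and a temperature parameter. None of these choices is stated in the abstract, and the function names and toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_alignment_loss(low_emb, high_emb, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's stated form).

    Row i of low_emb should match row i of high_emb; the other rows of
    high_emb act as negatives. high_emb is treated as a fixed target,
    mirroring the "ground-truth embedding" role in the abstract.
    """
    total = 0.0
    for i, z_low in enumerate(low_emb):
        logits = [cosine(z_low, z_high) / temperature for z_high in high_emb]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / len(low_emb)


# Toy embeddings for three samples (hypothetical values).
high = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]

loss_aligned = contrastive_alignment_loss(aligned, high)
loss_shuffled = contrastive_alignment_loss(shuffled, high)
print(loss_aligned < loss_shuffled)
```

In this toy setup the loss is lower when each low-level embedding sits near its own high-level target than when the pairing is shuffled, which is the behavior such an alignment objective is meant to reward.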
Related papers
- Interpreting artificial neural networks to detect genome-wide association signals for complex traits [0.0]
Investigating the genetic architecture of complex diseases is challenging due to the highly polygenic and interactive landscape of genetic and environmental factors.
We trained artificial neural networks for predicting complex traits using both simulated and real genotype/phenotype datasets.
arXiv Detail & Related papers (2024-07-26T15:20:42Z)
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performance.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity [3.972930262155919]
We propose a framework taking advantage of existing large models for gene vectorization to predict habitat specificity from entire microbial genome sequences.
We train and validate our approach on a large dataset of high quality microbiome genomes from different habitats.
arXiv Detail & Related papers (2024-05-09T09:34:51Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- A Comparative Analysis of Gene Expression Profiling by Statistical and Machine Learning Approaches [1.8954222800767324]
We discuss the biological and the methodological limitations of machine learning models to classify cancer samples.
Gene rankings are obtained from explainability methods adapted to these models.
We observe that the information learned by black-box neural networks is related to the notion of differential expression.
arXiv Detail & Related papers (2024-02-01T18:17:36Z)
- Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits.
Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS.
We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction [8.828330486848753]
We develop a new method, called NeurHap, that combines graph representation learning with optimization.
Our experiments demonstrate substantially better performance of NeurHap in real and synthetic datasets compared to competing approaches.
arXiv Detail & Related papers (2022-10-21T12:53:09Z)
- Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs [6.708052194104378]
We extend a popular approach for probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, to scale to massive single-cell datasets.
The key idea is to use an augmented kernel which preserves the factorisability of the lower bound allowing for fast variational inference.
arXiv Detail & Related papers (2022-09-14T15:25:15Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.