Longitudinal prediction of DNA methylation to forecast epigenetic
outcomes
- URL: http://arxiv.org/abs/2312.13302v1
- Date: Tue, 19 Dec 2023 22:15:27 GMT
- Title: Longitudinal prediction of DNA methylation to forecast epigenetic
outcomes
- Authors: Arthur Leroy, Ai Ling Teh, Frank Dondelinger, Mauricio A. Alvarez,
Dennis Wang
- Abstract summary: We introduce a probabilistic and longitudinal machine learning framework based on multi-mean Gaussian processes (GPs).
Our model is trained on a birth cohort of children with methylation profiled at ages 0-4, and we demonstrated that the status of methylation sites for each child can be accurately predicted at ages 5-7.
This approach encourages epigenetic studies to move towards longitudinal design for investigating epigenetic changes during development, ageing and disease progression.
- Score: 2.5936539522838506
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Interrogating the evolution of biological changes at early stages of life
requires longitudinal profiling of molecules, such as DNA methylation, which
can be challenging with children. We introduce a probabilistic and longitudinal
machine learning framework based on multi-mean Gaussian processes (GPs),
accounting for individual and gene correlations across time. This method
provides future predictions of DNA methylation status at different individual
ages while accounting for uncertainty. Our model is trained on a birth cohort
of children with methylation profiled at ages 0-4, and we demonstrated that the
status of methylation sites for each child can be accurately predicted at ages
5-7. We show that methylation profiles predicted by multi-mean GPs can be used
to estimate other phenotypes, such as epigenetic age, and enable comparison to
other health measures of interest. This approach encourages epigenetic studies
to move towards longitudinal design for investigating epigenetic changes during
development, ageing and disease progression.
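The forecasting idea in the abstract (train on ages 0-4, predict ages 5-7 with uncertainty) can be illustrated with a minimal sketch. This is not the authors' multi-mean GP: it is a plain single-output Gaussian process with a constant prior mean and an RBF kernel, written in NumPy. The function names, kernel hyperparameters, and the methylation values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, variance=0.05, lengthscale=2.0):
    """Squared-exponential covariance between two 1-D arrays of ages."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_forecast(train_ages, train_values, test_ages, prior_mean=0.5, noise=1e-3):
    """Posterior mean and standard deviation of a GP at test_ages.

    Assumes a constant prior mean; the paper instead learns shared mean
    processes across individuals and genes, which is not reproduced here.
    """
    K = rbf_kernel(train_ages, train_ages) + noise * np.eye(len(train_ages))
    K_s = rbf_kernel(test_ages, train_ages)
    K_ss = rbf_kernel(test_ages, test_ages)

    # Cholesky factorisation for a numerically stable solve of K^{-1} y.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_values - prior_mean))
    mean = prior_mean + K_s @ alpha

    # Posterior covariance: K_ss - K_s K^{-1} K_s^T.
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


# Hypothetical methylation beta values for one CpG site in one child.
train_ages = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
train_beta = np.array([0.42, 0.47, 0.51, 0.54, 0.56])
test_ages = np.array([5.0, 6.0, 7.0])

mean, std = gp_forecast(train_ages, train_beta, test_ages)
for age, m, s in zip(test_ages, mean, std):
    print(f"age {age:.0f}: predicted beta {m:.3f} +/- {1.96 * s:.3f}")
```

The predictive standard deviation widens as the forecast moves further from the last observed age, which is the kind of uncertainty quantification the abstract refers to. A multi-mean GP would replace the constant `prior_mean` with mean processes shared across individuals and genes.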
Related papers
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.
The model adheres to the central dogma of molecular biology, accurately generating protein-coding sequences.
It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of promoter sequences.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models [108.94237816552024]
This paper introduces G2PDiffusion, the first-of-its-kind diffusion model designed for genotype-to-phenotype generation across multiple species.
We use images to represent morphological phenotypes across species and redefine phenotype prediction as conditional image generation.
arXiv Detail & Related papers (2025-02-07T06:16:31Z)
- iTARGET: Interpretable Tailored Age Regression for Grouped Epigenetic Traits [0.0]
We propose a novel two-phase algorithm to accurately predict chronological age from DNA methylation patterns.
Our method not only improves prediction accuracy but also reveals key age-related CpG sites, detects age-specific changes in aging rates, and identifies pairwise interactions between CpG sites.
Experimental results show that our approach outperforms traditional epigenetic clocks and machine learning models.
arXiv Detail & Related papers (2025-01-04T23:06:46Z)
- U-learning for Prediction Inference via Combinatory Multi-Subsampling: With Applications to LASSO and Neural Networks [5.587500517608073]
Epigenetic aging clocks play a pivotal role in estimating an individual's biological age through the examination of DNA methylation patterns.
We introduce a novel U-sampling approach via multi-sublearning for making ensemble predictions.
More specifically, our approach conceptualizes the ensemble estimators within the framework of generalized U-statistics.
We apply our approach to two commonly used predictive algorithms, Lasso and deep neural networks (DNNs), and illustrate the validity of inferences with extensive numerical studies.
arXiv Detail & Related papers (2024-07-22T00:03:51Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE).
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Neurodevelopmental Phenotype Prediction: A State-of-the-Art Deep Learning Model [0.0]
We apply a deep neural network to analyse the cortical surface data of neonates.
Our goal is to identify neurodevelopmental biomarkers and to predict gestational age at birth based on these biomarkers.
arXiv Detail & Related papers (2022-11-16T11:15:23Z)
- Human Age Estimation from Gene Expression Data using Artificial Neural Networks [27.900947531352983]
We propose a new framework for human age estimation using information from human dermal fibroblast gene expression data.
Our experimental results suggest the superiority of the proposed framework over state-of-the-art age estimation methods.
arXiv Detail & Related papers (2021-11-04T08:57:35Z)
- An Information-Theoretic Framework for Identifying Age-Related Genes Using Human Dermal Fibroblast Transcriptome Data [0.8122270502556371]
We develop an information-theoretic framework for identifying genes that are associated with aging.
We use unsupervised and semi-supervised learning techniques on human dermal fibroblast gene expression data.
Performance assessment for both unsupervised and semi-supervised methods shows the effectiveness of the framework.
arXiv Detail & Related papers (2021-11-04T02:41:33Z)