Deep Phenotyping of Non-Alcoholic Fatty Liver Disease Patients with
Genetic Factors for Insights into the Complex Disease
- URL: http://arxiv.org/abs/2311.08428v1
- Date: Mon, 13 Nov 2023 19:31:12 GMT
- Title: Deep Phenotyping of Non-Alcoholic Fatty Liver Disease Patients with
Genetic Factors for Insights into the Complex Disease
- Authors: Tahmina Sultana Priya, Fan Leng, Anthony C. Luehrs, Eric W. Klee,
Alina M. Allen, Konstantinos N. Lazaridis, Danfeng (Daphne) Yao, Shulan Tian
- Abstract summary: Non-alcoholic fatty liver disease (NAFLD) is a prevalent chronic liver disorder characterized by the excessive accumulation of fat in the liver.
We aim to identify subgroups of NAFLD patients based on demographic, clinical, and genetic characteristics.
- Score: 1.7527259446915058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-alcoholic fatty liver disease (NAFLD) is a prevalent chronic liver
disorder characterized by the excessive accumulation of fat in the liver in
individuals who do not consume significant amounts of alcohol; its risk
factors include obesity, insulin resistance, and type 2 diabetes. We aim to
identify subgroups of NAFLD patients based on demographic, clinical, and
genetic characteristics for precision medicine. The genomic and phenotypic data
(3,408 cases and 4,739 controls) for this study were gathered from participants
in the Mayo Clinic Tapestry Study (IRB#19-000001) and their electronic health
records, including their demographic, clinical, and comorbidity data, and the
genotype information through whole exome sequencing performed at Helix using
the Exome+® Assay according to standard procedure
(https://www.helix.com/). Factors highly relevant to
NAFLD were determined by the chi-square test and stepwise backward-forward
regression model. Latent class analysis (LCA) was performed on NAFLD cases
using significant indicator variables to identify subgroups. The optimal
clustering revealed 5 latent subgroups from 2,013 NAFLD patients (mean age 60.6
years and 62.1% women), while a polygenic risk score based on 6
single-nucleotide polymorphism (SNP) variants and disease outcomes were used to
analyze the subgroups. The groups are characterized by metabolic syndrome,
obesity, different comorbidities, psychoneurological factors, and genetic
factors. Odds ratios were utilized to compare the risk of complex diseases,
such as fibrosis, cirrhosis, and hepatocellular carcinoma (HCC), as well as
liver failure between the clusters. Cluster 2 has a significantly higher
risk of complex disease outcomes than the other clusters.
Keywords: Fatty liver disease; Polygenic risk score; Precision medicine;
Deep phenotyping; NAFLD comorbidities; Latent class analysis.
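The abstract does not say how the polygenic risk score was weighted or how the odds ratios between clusters were computed. As a rough illustration only, the sketch below assumes an additive score (risk-allele counts multiplied by per-SNP weights) and a Wald confidence interval on the log odds ratio. The function names and every number used with them are hypothetical and are not taken from the study.

```python
import math


def polygenic_risk_score(allele_counts, weights):
    """Additive PRS: sum of risk-allele counts (0, 1, or 2) times per-SNP weights.

    Assumption: the study's actual weighting scheme is not given in the abstract.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: outcome present in the cluster of interest
    b: outcome absent in the cluster of interest
    c: outcome present in the comparison cluster(s)
    d: outcome absent in the comparison cluster(s)
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical genotype for 6 SNPs and made-up weights.
    print(polygenic_risk_score([0, 1, 2, 1, 0, 2], [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]))
    # Hypothetical counts: 20 of 100 patients with an outcome versus 10 of 100.
    print(odds_ratio(20, 80, 10, 90))
```

An interval that excludes 1 would suggest a difference in risk between the two groups, under the usual large-sample assumptions for the Wald interval.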
Related papers
- Functional Analysis of Variance for Association Studies [0.624151172311885]
We propose a functional analysis of variance (FANOVA) method for testing an association of sequence variants in a genomic region with a qualitative trait.
FANOVA has a number of advantages: (1) it tests for a joint effect of gene variants, including both common and rare; (2) it fully utilizes linkage disequilibrium and genetic position information; and (3) it allows for either protective or risk-increasing causal variants.
arXiv Detail & Related papers (2025-08-14T21:02:45Z) - Survey and Improvement Strategies for Gene Prioritization with Large Language Models [61.24568051916653]
Large language models (LLMs) have performed well in medical exams, but their effectiveness in diagnosing rare genetic diseases has not been assessed.
We used multi-agent and Human Phenotype Ontology (HPO) classification to categorize patients based on phenotypes and solvability levels.
At baseline, GPT-4 outperformed other LLMs, achieving near 30% accuracy in ranking causal genes correctly.
arXiv Detail & Related papers (2025-01-30T23:03:03Z) - Identifying latent disease factors differently expressed in patient subgroups using group factor analysis [54.67330718129736]
We propose a novel approach to uncover subgroup-specific and subgroup-common latent factors.
The proposed approach, sparse Group Factor Analysis (GFA) with regularised horseshoe priors, was implemented with probabilistic programming.
arXiv Detail & Related papers (2024-10-10T13:12:14Z) - From Glucose Patterns to Health Outcomes: A Generalizable Foundation Model for Continuous Glucose Monitor Data Analysis [50.80532910808962]
We present GluFormer, a generative foundation model on biomedical temporal data based on a transformer architecture.
GluFormer generalizes to 15 different external datasets, including 4936 individuals across 5 different geographical regions.
It can also predict onset of future health outcomes even 4 years in advance.
arXiv Detail & Related papers (2024-08-20T13:19:06Z) - Assessing and Enhancing Large Language Models in Rare Disease Question-answering [64.32570472692187]
We introduce a rare disease question-answering (ReDis-QA) dataset to evaluate the performance of Large Language Models (LLMs) in diagnosing rare diseases.
We collected 1360 high-quality question-answer pairs within the ReDis-QA dataset, covering 205 rare diseases.
We then benchmarked several open-source LLMs, revealing that diagnosing rare diseases remains a significant challenge for these models.
Experiment results demonstrate that ReCOP can effectively improve the accuracy of LLMs on the ReDis-QA dataset by an average of 8%.
arXiv Detail & Related papers (2024-08-15T21:09:09Z) - Is plantar thermography a valid digital biomarker for characterising diabetic foot ulceration risk? [1.9029675742486807]
In the absence of prospective data on diabetic foot ulcers (DFU), cross-sectional associations with causal risk factors could be used to establish the validity of plantar thermography for DFU risk stratification.
We investigated the associations between intrinsic thermography clusters and several DFU risk factors using an unsupervised deep-learning framework.
arXiv Detail & Related papers (2024-07-05T17:39:03Z) - Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z) - Identifying acute illness phenotypes via deep temporal interpolation and
clustering network on physiologic signatures [6.315312816818801]
Initial hours of hospital admission impact clinical trajectory, but early clinical decisions often suffer due to data paucity.
We created a single-center, longitudinal EHR dataset for 75,762 adults admitted to a tertiary care center for 6+ hours.
We proposed a deep temporal interpolation and clustering network to extract latent representations from sparse, irregularly sampled vital sign data.
arXiv Detail & Related papers (2023-07-27T21:05:23Z) - Prevalence and Major Risk Factors of Non-communicable Diseases: A
Machine Learning based Cross-Sectional Study [0.0]
The most frequently reported NCD was cardiovascular issues (CVD), which was present in 83.56% of all participants.
Our study showed that chronic respiratory illness was more frequent in middle-aged participants than in younger or elderly individuals.
arXiv Detail & Related papers (2023-03-03T21:58:35Z) - Gene-SGAN: a method for discovering disease subtypes with imaging and
genetic signatures via multi-view weakly-supervised deep clustering [6.79528256151419]
Gene-SGAN is a multi-view, weakly-supervised deep clustering method.
It dissects disease heterogeneity by jointly considering phenotypic and genetic data.
Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery.
arXiv Detail & Related papers (2023-01-25T10:08:30Z) - rfPhen2Gen: A machine learning based association study of brain imaging
phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z) - Identifying the Risks of Chronic Diseases Using BMI Trajectories [0.0]
We use a machine learning approach to subtype individuals' risk of developing 18 major chronic diseases by using their BMI trajectories.
We define nine new interpretable and evidence-based variables based on the BMI trajectories to cluster the patients into subgroups.
In our experiments, direct relationships of obesity with diabetes, hypertension, Alzheimer's, and dementia were found to conform to or complement the existing body of knowledge.
arXiv Detail & Related papers (2021-11-09T19:52:22Z) - Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.