Predicting Gene Disease Associations in Type 2 Diabetes Using Machine Learning on Single-Cell RNA-Seq Data
- URL: http://arxiv.org/abs/2602.09036v1
- Date: Fri, 30 Jan 2026 03:27:06 GMT
- Title: Predicting Gene Disease Associations in Type 2 Diabetes Using Machine Learning on Single-Cell RNA-Seq Data
- Authors: Maria De La Luz Lomboy Toledo, Daniel Onah
- Abstract summary: Diabetes is a chronic metabolic disorder characterized by elevated blood glucose levels due to impaired insulin production or function. Two main forms are recognized: type 1 diabetes (T1D), which involves autoimmune destruction of insulin-producing beta-cells, and type 2 diabetes (T2D), which arises from insulin resistance and progressive beta-cell dysfunction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diabetes is a chronic metabolic disorder characterized by elevated blood glucose levels due to impaired insulin production or function. Two main forms are recognized: type 1 diabetes (T1D), which involves autoimmune destruction of insulin-producing β-cells, and type 2 diabetes (T2D), which arises from insulin resistance and progressive β-cell dysfunction. Understanding the molecular mechanisms underlying these diseases is essential for the development of improved therapeutic strategies, particularly those targeting β-cell dysfunction. To investigate these mechanisms in a controlled and biologically interpretable setting, mouse models have played a central role in diabetes research. Owing to their genetic and physiological similarity to humans, together with the ability to precisely manipulate their genome, mice enable detailed investigation of disease progression and gene function. In particular, mouse models have provided critical insights into β-cell development, cellular heterogeneity, and functional failure under diabetic conditions. Building on these experimental advances, this study applies machine learning methods to single-cell transcriptomic data from mouse pancreatic islets. Specifically, we evaluate two supervised approaches identified in the literature, Extra Trees Classifier (ETC) and Partial Least Squares Discriminant Analysis (PLS-DA), to assess their ability to identify T2D-associated gene expression signatures at single-cell resolution. Model performance is evaluated using standard classification metrics, with an emphasis on interpretability and biological relevance.
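The PLS-DA approach named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' pipeline: it assumes a binary label, a single latent component, synthetic "cells" with one hypothetical marker gene, and a 0.5 decision threshold. All function names are made up for illustration. The metrics shown (accuracy, precision, recall, F1) are common choices, but the abstract does not say which ones the study reports. The Extra Trees Classifier is not shown here.

```python
import math
import random


def make_toy_cells(n_per_class, n_genes=5, shift=3.0, seed=0):
    """Synthetic 'cells': gene 0 is a hypothetical marker shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += shift * label
            X.append(row)
            y.append(label)
    return X, y


def fit_plsda_one_component(X, y):
    """One-component PLS regression on a 0/1 label (a minimal PLS-DA)."""
    n_genes = len(X[0])
    x_mean = [sum(row[j] for row in X) / len(X) for j in range(n_genes)]
    y_mean = sum(y) / len(y)
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: unit direction in gene space with maximal covariance
    # with the centered label. Large |w[j]| flags gene j as discriminative.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(n_genes)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Latent scores t = Xc w, then regress the centered label on t.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}


def predict(model, X, threshold=0.5):
    """Project each cell onto the latent direction and threshold the fit."""
    preds = []
    for row in X:
        t = sum((v - m) * wj
                for v, m, wj in zip(row, model["x_mean"], model["w"]))
        preds.append(1 if model["y_mean"] + model["q"] * t >= threshold else 0)
    return preds


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, b in pairs if a == 1 and b == 1)
    fp = sum(1 for a, b in pairs if a == 0 and b == 1)
    fn = sum(1 for a, b in pairs if a == 1 and b == 0)
    tn = sum(1 for a, b in pairs if a == 0 and b == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    X_train, y_train = make_toy_cells(50, seed=0)
    X_test, y_test = make_toy_cells(50, seed=1)
    model = fit_plsda_one_component(X_train, y_train)
    print(classification_metrics(y_test, predict(model, X_test)))
    print("gene weights:", [round(v, 3) for v in model["w"]])
```

In this toy setting the weight vector is dominated by the marker gene, which shows how a PLS-DA fit can be read for interpretability. Real single-cell data would normally need normalization, more than one component, and cross-validation.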
Related papers
- R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer's Disease Progression [63.97617759805451]
Early detection of Alzheimer's disease requires models capable of integrating macro-scale neuroanatomical alterations with micro-scale genetic susceptibility. We introduce R-GenIMA, an interpretable multimodal large language model that couples a novel ROI-wise vision transformer with genetic prompting. R-GenIMA achieves state-of-the-art performance in four-way classification across normal cognition, subjective memory concerns, mild cognitive impairment, and AD.
arXiv Detail & Related papers (2025-12-22T02:54:10Z)
- GastroDL-Fusion: A Dual-Modal Deep Learning Framework Integrating Protein-Ligand Complexes and Gene Sequences for Gastrointestinal Disease Drug Discovery [2.1880525779004563]
GastroDL-Fusion is a dual-modal deep learning framework that integrates protein-ligand complex data with disease-associated gene sequence information. We evaluate the model on benchmark datasets of GI disease-related targets. Results confirm that incorporating both structural and genetic features yields more accurate predictions of binding affinities.
arXiv Detail & Related papers (2025-11-07T21:32:58Z)
- Use of Continuous Glucose Monitoring with Machine Learning to Identify Metabolic Subphenotypes and Inform Precision Lifestyle Changes [4.643854266548864]
The classification of diabetes and prediabetes by static glucose thresholds obscures the pathophysiological dysglycemia heterogeneity. We show that continuous glucose monitoring and wearable technologies enable a paradigm shift towards non-invasive, dynamic metabolic phenotyping.
arXiv Detail & Related papers (2025-11-06T02:15:08Z)
- Gene-Metabolite Association Prediction with Interactive Knowledge Transfer Enhanced Graph for Metabolite Production [49.814615043389864]
We propose a new task, Gene-Metabolite Association Prediction based on metabolic graphs.
We present the first benchmark containing 2474 metabolites and 1947 genes of two commonly used microorganisms.
Our proposed methodology outperforms baselines by up to 12.3% across various link prediction frameworks.
arXiv Detail & Related papers (2024-10-24T06:54:27Z)
- From Glucose Patterns to Health Outcomes: A Generalizable Foundation Model for Continuous Glucose Monitor Data Analysis [47.23780364438969]
We present GluFormer, a generative foundation model for CGM data that learns nuanced glycemic patterns and translates them into predictive representations of metabolic health. GluFormer generalizes to 19 external cohorts spanning different ethnicities and ages, 5 countries, 8 CGM devices, and diverse pathophysiological states. In a longitudinal study of 580 adults with CGM data and 12-year follow-up, GluFormer identifies individuals at elevated risk of developing diabetes more effectively than blood HbA1C%.
arXiv Detail & Related papers (2024-08-20T13:19:06Z)
- Exploring Biomarker Relationships in Both Type 1 and Type 2 Diabetes Mellitus Through a Bayesian Network Analysis Approach [1.004996690798013]
This study applies Bayesian network structure learning to analyze the Shanghai Type 1 and Type 2 diabetes mellitus datasets.
The constructed Bayesian network presented notable predictive accuracy, particularly for Type 2 diabetes mellitus, with root mean squared error (RMSE) of 18.23 mg/dL.
arXiv Detail & Related papers (2024-06-24T19:27:34Z)
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most of current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
- Machine Learning based prediction of Glucose Levels in Type 1 Diabetes Patients with the use of Continuous Glucose Monitoring Data [0.0]
Continuous Glucose Monitoring (CGM) devices offer detailed, non-intrusive and real time insights into a patient's blood glucose concentrations.
Leveraging advanced Machine Learning (ML) models to predict future glucose levels gives rise to substantial quality-of-life improvements.
arXiv Detail & Related papers (2023-02-24T19:10:40Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Temporal patterns in insulin needs for Type 1 diabetes [0.0]
Type 1 Diabetes (T1D) is a chronic condition where the body produces little or no insulin.
Finding the right insulin dose and time remains a complex, challenging and as yet unsolved control task.
In this study, we use the OpenAPS Data Commons dataset to discover temporal patterns in insulin need driven by well-known factors.
arXiv Detail & Related papers (2022-11-14T14:19:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.