Explainable machine learning-based prediction model for diabetic
nephropathy
- URL: http://arxiv.org/abs/2309.16730v2
- Date: Tue, 24 Oct 2023 14:47:41 GMT
- Title: Explainable machine learning-based prediction model for diabetic
nephropathy
- Authors: Jing-Mei Yin, Yang Li, Jun-Tang Xue, Guo-Wei Zong, Zhong-Ze Fang, and
Lang Zou
- Abstract summary: The aim of this study is to analyze the effect of serum metabolites on diabetic nephropathy (DN) and predict the prevalence of DN through a machine learning approach.
The dataset consists of 548 patients from April 2018 to April 2019 in Second Affiliated Hospital of Dalian Medical University (SAHDMU).
- Score: 1.874014847588016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The aim of this study is to analyze the effect of serum metabolites on
diabetic nephropathy (DN) and predict the prevalence of DN through a machine
learning approach. The dataset consists of 548 patients from April 2018 to
April 2019 in Second Affiliated Hospital of Dalian Medical University (SAHDMU).
We select the optimal 38 features through a least absolute shrinkage and
selection operator (LASSO) regression model with 10-fold cross-validation. We
compare four machine learning algorithms, including eXtreme Gradient Boosting
(XGB), random forest, decision tree and logistic regression, by AUC-ROC curves,
decision curves, and calibration curves. We quantify feature importance and
interaction effects in the optimal predictive model by Shapley Additive
exPlanations (SHAP) method. The XGB model performs best at screening
for DN, with the highest AUC value of 0.966. The XGB model also yields a greater
clinical net benefit than the others and fits the data better. In
addition, there are significant interactions between serum metabolites and
duration of diabetes. We develop a predictive model using the XGB algorithm to
screen for DN. C2, C5DC, Tyr, Ser, Met, C24, C4DC, and Cys contribute strongly to
the model and are possible biomarkers for DN.
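The pipeline described in the abstract (LASSO feature selection with 10-fold cross-validation, then a comparison of classifiers by AUC-ROC and by decision-curve net benefit) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (only the 548-row sample size comes from the abstract, the 60 candidate features are invented), a plain linear LASSO is fitted on the 0/1 label although the paper may have used a logistic variant, scikit-learn's `GradientBoostingClassifier` stands in for XGB, and the SHAP step is omitted. The names `selected`, `aucs` and `net_benefit` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort. 548 rows matches the abstract;
# the number of candidate features (60) is an assumption.
X, y = make_classification(
    n_samples=548, n_features=60, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardise so the L1 penalty treats all features on the same scale.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# LASSO with 10-fold cross-validation chooses the penalty strength;
# features whose coefficients stay non-zero are kept.
lasso = LassoCV(cv=10, random_state=0).fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_)

# Four classifiers; gradient boosting is a stand-in for XGB (assumption).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the selected features and score it by held-out AUC-ROC.
aucs = {}
for name, model in models.items():
    model.fit(X_train_s[:, selected], y_train)
    prob = model.predict_proba(X_test_s[:, selected])[:, 1]
    aucs[name] = roc_auc_score(y_test, prob)


def net_benefit(y_true, prob, threshold):
    """Decision-curve net benefit at one threshold probability:
    TP/N - FP/N * threshold / (1 - threshold)."""
    pred = prob >= threshold
    n = len(y_true)
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)


for name, auc in sorted(aucs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

Evaluating `net_benefit` over a grid of thresholds for each model would trace out a decision curve; the ranking printed here applies only to the synthetic data and says nothing about the paper's results.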
Related papers
- Multimodal Survival Modeling and Fairness-Aware Clinical Machine Learning for 5-Year Breast Cancer Risk Prediction [4.750682174151462]
We present a fully reproducible machine learning framework for 5-year overall survival prediction in breast cancer.
We integrate clinical variables with high-dimensional transcriptomic and copy-number alteration (CNA) features from the METABRIC cohort.
Performance was assessed using time-dependent area under the ROC curve (AUC), average precision (AP), calibration curves, Brier score, and bootstrapped 95 percent confidence intervals.
arXiv Detail & Related papers (2026-02-25T07:20:43Z) - An Improved Ensemble-Based Machine Learning Model with Feature Optimization for Early Diabetes Prediction [0.0]
Diabetes is a serious worldwide health issue, and successful intervention depends on early detection.
The aim is to use extensive health survey data to create a machine learning framework for diabetes classification that is both accurate and comprehensible.
In our study, we proposed and developed a React Native-based application with a Python Flask backend to support early diabetes prediction.
arXiv Detail & Related papers (2025-11-15T07:42:31Z) - Risk Prediction of Cardiovascular Disease for Diabetic Patients with Machine Learning and Deep Learning Techniques [0.0]
This study proposes an efficient CVD risk prediction model for diabetic patients using machine learning (ML) and hybrid deep learning (DL) approaches.
Several ML models, including Decision Trees (DT), Random Forest (RF), k-Nearest Neighbors (KNN), Support Vector Machine (SVM), AdaBoost, and XGBoost, were implemented.
High accuracy and F1 scores demonstrate these models' potential to improve personalized risk management and preventive strategies.
arXiv Detail & Related papers (2025-11-07T04:14:30Z) - Methodology for Comparing Machine Learning Algorithms for Survival Analysis [55.65997641180011]
Six machine learning models for survival analysis were evaluated.
XGB-AFT achieved the best performance (C-Index = 0.7618; IPCW = 0.7532), followed by GBSA and RSF.
arXiv Detail & Related papers (2025-10-28T14:42:28Z) - Enhancing Bagging Ensemble Regression with Data Integration for Time Series-Based Diabetes Prediction [0.5399800035598186]
This study begins with a data engineering process to integrate diabetes-related datasets from 2011 to 2021.
We then introduce an enhanced bagging ensemble regression model (EBMBag+) for time series forecasting to predict diabetes prevalence across U.S. cities.
The experimental results demonstrate that EBMBag+ achieved the best performance, with an MAE of 0.41, RMSE of 0.53, MAPE of 4.01, and an R2 of 0.9.
arXiv Detail & Related papers (2025-06-11T04:21:50Z) - Can Copulas Be Used for Feature Selection? A Machine Learning Study on Diabetes Risk Prediction [0.0]
We introduce a feature-selection framework using the upper-tail dependence coefficient (lambdaU) of the novel A2 copula.
Our method prioritizes five predictors based on upper tail dependencies.
These features match or outperform MI and GA selected subsets across four classifiers.
arXiv Detail & Related papers (2025-05-28T16:34:58Z) - Predicting Diabetes Using Machine Learning: A Comparative Study of Classifiers [0.0]
Diabetes remains a significant health challenge globally, contributing to severe complications like kidney disease, vision loss, and heart issues.
Our study introduces an innovative diabetes prediction framework, leveraging both traditional ML techniques and advanced ensemble methods.
Central to our approach is the development of a novel model, DNet, a hybrid architecture combining Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) layers.
arXiv Detail & Related papers (2025-05-11T16:14:31Z) - Large Language Models for Medical Forecasting -- Foresight 2 [0.573038865401108]
Foresight 2 (FS2) is a large language model fine-tuned on hospital data for modelling patient timelines.
It can understand patients' clinical notes and predict SNOMED codes for a wide range of biomedical use cases.
arXiv Detail & Related papers (2024-12-14T14:45:28Z) - Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas.
This study aims to utilize a specially designed MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z) - Comparative Performance Analysis of Transformer-Based Pre-Trained Models for Detecting Keratoconus Disease [0.0]
This study compares eight pre-trained CNNs for diagnosing keratoconus, a degenerative eye disease.
MobileNetV2 was the most accurate model in identifying keratoconus and normal cases, with few misclassifications.
arXiv Detail & Related papers (2024-08-16T20:15:24Z) - Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - Comparative Analysis of LSTM Neural Networks and Traditional Machine Learning Models for Predicting Diabetes Patient Readmission [0.0]
This study uses the Diabetes 130-US Hospitals dataset for analysis and prediction of readmission patients by various machine learning models.
LightGBM turned out to be the best traditional model, while XGBoost was the runner-up.
This study demonstrates that model selection, validation, and interpretability are key steps in predictive healthcare modeling.
arXiv Detail & Related papers (2024-06-28T15:06:22Z) - Contrast-agent-induced deterministic component of CT-density in the
abdominal aorta during routine angiography: proof of concept study [0.0]
We develop a model describing the dynamic behavior of the contrast agent in the vessel.
It can be useful for both increasing the diagnostic value of a particular study and improving the CT data processing tools.
arXiv Detail & Related papers (2023-10-31T07:59:57Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - An Efficient End-to-End Deep Neural Network for Interstitial Lung
Disease Recognition and Classification [0.5424799109837065]
This paper introduces an end-to-end deep convolution neural network (CNN) for classifying ILDs patterns.
The proposed model comprises four convolutional layers with different kernel sizes and Rectified Linear Unit (ReLU) activation function.
A dataset consisting of 21328 image patches of 128 CT scans with five classes is taken to train and assess the proposed model.
arXiv Detail & Related papers (2022-04-21T06:36:10Z) - On the explainability of hospitalization prediction on a large COVID-19
patient dataset [45.82374977939355]
We develop various AI models to predict hospitalization on a large (over 110$k$) cohort of COVID-19 positive-tested US patients.
Despite high data unbalance, the models reach average precision 0.96-0.98 (0.75-0.85), recall 0.96-0.98 (0.74-0.85), and $F_1$ score 0.97-0.98 (0.79-0.83) on the non-hospitalized (or hospitalized) class.
arXiv Detail & Related papers (2021-10-28T10:23:38Z) - A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Efficient and Visualizable Convolutional Neural Networks for COVID-19
Classification Using Chest CT [0.0]
COVID-19 has infected over 65 million people worldwide as of December 4, 2020.
Deep learning has emerged as a promising diagnosis technique.
In this paper, we evaluate and compare 40 different convolutional neural network architectures for COVID-19 diagnosis.
arXiv Detail & Related papers (2020-12-22T07:09:48Z) - CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors
and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z) - A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.