SACDNet: Towards Early Type 2 Diabetes Prediction with Uncertainty for
Electronic Health Records
- URL: http://arxiv.org/abs/2301.04844v1
- Date: Thu, 12 Jan 2023 07:14:47 GMT
- Title: SACDNet: Towards Early Type 2 Diabetes Prediction with Uncertainty for
Electronic Health Records
- Authors: Tayyab Nasir and Muhammad Kamran Malik
- Abstract summary: This study proposes a novel neural network architecture for early T2DM prediction using multi-headed self-attention and dense layers.
The proposed technique is called the Self-Attention for Comorbid Disease Net (SACDNet), achieving an accuracy of 89.3% and an F1-Score of 89.1%.
A T2DM prediction dataset is also built as part of this study which is based on real-world routine Electronic Health Record (EHR) data comprising 4,124 diabetic and 181,767 non-diabetic examples.
- Score: 0.951828574518325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Type 2 diabetes mellitus (T2DM) is one of the most common diseases and a
leading cause of death. The problem of early diagnosis of T2DM is challenging
and necessary to prevent serious complications. This study proposes a novel
neural network architecture for early T2DM prediction using multi-headed
self-attention and dense layers to extract features from historic diagnoses,
patient vitals, and demographics. The proposed technique is called the
Self-Attention for Comorbid Disease Net (SACDNet), achieving an accuracy of
89.3% and an F1-score of 89.1%, a 1.6% increase in accuracy and a 1.3%
increase in F1-score over the baseline techniques. Monte Carlo (MC)
Dropout is applied to SACDNet to obtain a Bayesian approximation. A T2DM
prediction framework based on the MC Dropout SACDNet is proposed to quantify
the uncertainty associated with the predictions. A T2DM prediction dataset is
also built as part of this study which is based on real-world routine
Electronic Health Record (EHR) data comprising 4,124 diabetic and 181,767
non-diabetic examples, collected from 295 different EHR systems running in
different parts of the United States of America. This dataset is further used
to evaluate 7 different machine learning and 3 deep learning-based models.
Finally, a detailed analysis of the fairness of every technique against
different patient demographic groups is performed to validate the unbiased
generalization of the techniques and the diversity of the data.
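The abstract names MC Dropout as the source of the uncertainty estimate but gives no implementation details. The following is a minimal sketch of the general idea only, assuming a toy two-layer network with random weights (it is not SACDNet, and every name, shape, and parameter here is hypothetical): dropout stays active at prediction time, the forward pass is repeated many times, and the spread of the outputs serves as the uncertainty signal.

```python
import numpy as np

def mc_dropout_predict(x, w1, b1, w2, b2, drop_rate=0.5, n_samples=100, seed=0):
    """Run n_samples stochastic forward passes with dropout left on.

    Returns the mean predicted probability and its standard deviation
    across passes. The network is a toy stand-in, not the paper's model.
    """
    rng = np.random.default_rng(seed)
    probs = np.empty(n_samples)
    for i in range(n_samples):
        h = np.maximum(0.0, x @ w1 + b1)           # hidden layer with ReLU
        mask = rng.random(h.shape) >= drop_rate    # fresh dropout mask per pass
        h = h * mask / (1.0 - drop_rate)           # inverted-dropout rescaling
        logit = h @ w2 + b2
        probs[i] = 1.0 / (1.0 + np.exp(-logit))    # sigmoid -> probability
    return probs.mean(), probs.std()

# Hypothetical input: 8 features for one patient, random illustrative weights.
demo_rng = np.random.default_rng(42)
x = demo_rng.normal(size=8)
w1, b1 = demo_rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = demo_rng.normal(size=16), 0.0

mean_p, std_p = mc_dropout_predict(x, w1, b1, w2, b2)
print(f"mean probability: {mean_p:.3f}, std across passes: {std_p:.3f}")
```

A larger standard deviation across passes indicates a less certain prediction. How the paper's framework turns that spread into a decision is not stated in the abstract.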
Related papers
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- First Experiences with the Identification of People at Risk for Diabetes in Argentina using Machine Learning Techniques [0.27488316163114823]
This article describes the development and assessment of predictive models to identify people at risk for type 2 diabetes (T2D) and prediabetes (PD), specifically in Argentina.
The results show that some of these models performed very well on two datasets.
arXiv Detail & Related papers (2024-03-27T14:38:02Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD)
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - Secure and Privacy-Preserving Automated Machine Learning Operations into
End-to-End Integrated IoT-Edge-Artificial Intelligence-Blockchain Monitoring
System for Diabetes Mellitus Prediction [0.5825410941577593]
This paper proposes an IoT-edge-Artificial Intelligence (AI)-blockchain system for diabetes prediction based on risk factors.
The proposed system is underpinned by the blockchain to obtain a cohesive view of the risk factors data from patients across different hospitals.
Numerical experiments and comparative analysis were carried out between our proposed system, using the most accurate random forest (RF) model.
arXiv Detail & Related papers (2022-11-13T13:57:14Z) - SynthA1c: Towards Clinically Interpretable Patient Representations for
Diabetes Risk Stratification [0.5551483435671848]
Early diagnosis of Type 2 Diabetes Mellitus (T2DM) is crucial to enable timely therapeutic interventions and lifestyle modifications.
We show that image-derived phenotypes and physical examination data together can accurately predict diabetes risk.
arXiv Detail & Related papers (2022-09-20T23:39:52Z) - A novel solution of deep learning for enhanced support vector machine
for predicting the onset of type 2 diabetes [32.25039205521283]
This research aims to increase the accuracy and Area Under the Curve (AUC) metric while improving the processing time for predicting the onset of Type 2 Diabetes.
The proposed solution provides an average accuracy of 86.31% and an average AUC of 0.8270 (82.70%), with an improvement of 3.8 milliseconds in processing time.
arXiv Detail & Related papers (2022-08-05T18:15:40Z) - Supervised multi-specialist topic model with applications on large-scale
electronic health record data [3.322262654060203]
We present MixEHR-S to jointly infer specialist-disease topics from the EHR data.
For efficient inference, we developed a closed-form collapsed variational inference algorithm.
In three applications, MixEHR-S conferred clinically meaningful latent topics among the most predictive latent topics.
arXiv Detail & Related papers (2021-05-04T01:27:11Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines, by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors
and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z) - Short Term Blood Glucose Prediction based on Continuous Glucose
Monitoring Data [53.01543207478818]
This study explores the use of Continuous Glucose Monitoring (CGM) data as input for digital decision support tools.
We investigate how Recurrent Neural Networks (RNNs) can be used for Short Term Blood Glucose (STBG) prediction.
arXiv Detail & Related papers (2020-02-06T16:39:44Z)