Deep Learning Framework with Uncertainty Quantification for Survey Data: Assessing and Predicting Diabetes Mellitus Risk in the American Population
- URL: http://arxiv.org/abs/2403.19752v1
- Date: Thu, 28 Mar 2024 18:06:11 GMT
- Title: Deep Learning Framework with Uncertainty Quantification for Survey Data: Assessing and Predicting Diabetes Mellitus Risk in the American Population
- Authors: Marcos Matabuena, Juan C. Vidal, Rahul Ghosal, Jukka-Pekka Onnela
- Abstract summary: This paper proposes a general predictive framework for regression and classification using neural network (NN) modeling.
We apply this framework to assess the risk of Diabetes Mellitus in the US population, utilizing data from the NHANES 2011-2014 cohort.
While focused on diabetes, this NN predictive framework is adaptable to create clinical models for a diverse range of diseases and medical cohorts.
- Score: 2.3849116823891987
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex survey designs are commonly employed in many medical cohorts. In such scenarios, developing case-specific predictive risk score models that reflect the unique characteristics of the study design is essential. This approach is key to minimizing potential selective biases in results. The objectives of this paper are: (i) To propose a general predictive framework for regression and classification using neural network (NN) modeling, which incorporates survey weights into the estimation process; (ii) To introduce an uncertainty quantification algorithm for model prediction, tailored for data from complex survey designs; (iii) To apply this method in developing robust risk score models to assess the risk of Diabetes Mellitus in the US population, utilizing data from the NHANES 2011-2014 cohort. The theoretical properties of our estimators are designed to ensure minimal bias and statistical consistency, thereby ensuring that our models yield reliable predictions and contribute novel scientific insights in diabetes research. While focused on diabetes, this NN predictive framework is adaptable to create clinical models for a diverse range of diseases and medical cohorts. The software and the data used in this paper are publicly available on GitHub.
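To illustrate what incorporating survey weights into the estimation process can look like, here is a minimal sketch of a survey-weighted empirical risk. It is not the authors' implementation: the function names `weighted_log_loss` and `fit_weighted_logistic` are hypothetical, and a plain logistic model fitted by gradient descent stands in for the neural network described in the abstract. The only idea it demonstrates is that each unit's loss is multiplied by its sampling weight before averaging.

```python
import numpy as np


def weighted_log_loss(y, p, w):
    """Survey-weighted binary cross-entropy: sum_i w_i * loss_i / sum_i w_i."""
    y, p, w = (np.asarray(a, dtype=float) for a in (y, p, w))
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    per_unit = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return float(np.sum(w * per_unit) / np.sum(w))


def fit_weighted_logistic(X, y, w, lr=0.1, n_iter=2000):
    """Minimize the weighted loss above for a logistic model (stand-in for a NN)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(Xb.shape[1])
    w_norm = w / w.sum()  # normalized weights sum to one
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        grad = Xb.T @ (w_norm * (p - y))  # gradient of the weighted loss
        beta -= lr * grad
    return beta
```

With this loss, a unit carrying weight 2 contributes exactly as much as two identical units carrying weight 1, which is the sense in which the fitted model targets the population rather than the sample.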
Related papers
- Towards a Transportable Causal Network Model Based on Observational Healthcare Data [1.333879175460266]
We propose a novel approach that combines selection diagrams, missingness graphs, causal discovery and prior knowledge into a single graphical model.
We learn this model from data comprising two different cohorts of patients.
The resulting causal network model is validated by expert clinicians in terms of risk assessment, accuracy and explainability.
arXiv Detail & Related papers (2023-11-13T13:23:31Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Stability of clinical prediction models developed using statistical or machine learning methods [0.5482532589225552]
Clinical prediction models estimate an individual's risk of a particular health outcome, conditional on their values of multiple predictors.
Many models are developed using small datasets that lead to instability in the model and its predictions (estimated risks).
We show instability in a model's estimated risks is often considerable, and manifests itself as miscalibration of predictions in new data.
arXiv Detail & Related papers (2022-11-02T11:55:28Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Statistical quantification of confounding bias in predictive modelling [0.0]
I propose the partial and full confounder tests, which probe the null hypotheses of unconfounded and fully confounded models.
The tests provide a strict control for Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions.
arXiv Detail & Related papers (2021-11-01T10:35:24Z)
- Learning to Predict with Supporting Evidence: Applications to Clinical Risk Prediction [9.199022926064009]
The impact of machine learning models on healthcare will depend on the degree of trust that healthcare professionals place in the predictions made by these models.
We present a method to provide people with clinical expertise with domain-relevant evidence about why a prediction should be trusted.
arXiv Detail & Related papers (2021-03-04T00:26:32Z)
- An Optimal Control Approach to Learning in SIDARTHE Epidemic model [67.22168759751541]
We propose a general approach for learning time-variant parameters of dynamic compartmental models from epidemic data.
We forecast the epidemic evolution in Italy and France.
arXiv Detail & Related papers (2020-10-28T10:58:59Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breaking down and health care structures collapsing.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
- Simulation of Covid-19 epidemic evolution: are compartmental models really predictive? [0.0]
This paper addresses the question of whether a SIR epidemiological model, enriched with asymptomatic and dead individual compartments, could provide reliable predictions on the epidemic evolution.
A machine learning approach based on particle swarm optimization (PSO) is proposed to automatically identify the model parameters based on a training set of data of progressively increasing size.
The analysis of the scatter in the forecasts shows that model predictions are quite sensitive to the size of the dataset used for training, and that further data are still required to achieve convergent -- and therefore reliable -- predictions.
arXiv Detail & Related papers (2020-04-14T08:42:11Z)
- Short Term Blood Glucose Prediction based on Continuous Glucose Monitoring Data [53.01543207478818]
This study explores the use of Continuous Glucose Monitoring (CGM) data as input for digital decision support tools.
We investigate how Recurrent Neural Networks (RNNs) can be used for Short Term Blood Glucose (STBG) prediction.
arXiv Detail & Related papers (2020-02-06T16:39:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.