Analyzing Impact of Socio-Economic Factors on COVID-19 Mortality
Prediction Using SHAP Value
- URL: http://arxiv.org/abs/2303.00517v1
- Date: Mon, 27 Feb 2023 21:33:04 GMT
- Title: Analyzing Impact of Socio-Economic Factors on COVID-19 Mortality
Prediction Using SHAP Value
- Authors: Redoan Rahman, Jooyeong Kang, Justin F Rousseau, Ying Ding
- Abstract summary: This paper applies machine learning algorithms to a dataset of de-identified COVID-19 patients.
The dataset consists of 20,878 COVID-positive patients, among which 9,177 patients died in the year 2020.
According to our analysis, a patient's household annual and disposable income, age, education, and employment status significantly impact a machine learning model's prediction.
- Score: 4.372054218052678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper applies multiple machine learning (ML) algorithms to a dataset of
de-identified COVID-19 patients provided by the COVID-19 Research Database. The
dataset consists of 20,878 COVID-positive patients, among which 9,177 patients
died in the year 2020. This paper aims to understand and interpret the
association of socio-economic characteristics of patients with their mortality
instead of maximizing prediction accuracy. According to our analysis, a
patient's household annual and disposable income, age, education, and
employment status significantly impact a machine learning model's prediction.
We also examine several individual patient records, which gives us insight into
how the feature values impact the prediction for that data point. This paper
analyzes the global and local interpretation of machine learning models on
socio-economic data of COVID patients.
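To illustrate the difference between local and global interpretation mentioned in the abstract, here is a minimal sketch of exact Shapley values computed by brute force. It makes several assumptions: the abstract does not name the models or the SHAP explainer used, so `toy_risk`, its coefficients, the three feature names, the patient rows, and the baseline are all hypothetical and are not taken from the paper's data. The sketch also does not use the `shap` library. It enumerates feature coalitions directly, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single data point x.

    Features missing from a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score; the coefficients are invented for illustration."""
    age, income_k, employed = z
    return (0.01 * age - 0.002 * income_k
            + 0.2 * (1 - employed) + 0.001 * age * (1 - employed))


feature_names = ["age", "income_k", "employed"]  # hypothetical features
patients = [[80, 20, 0], [35, 90, 1], [60, 45, 0]]  # made-up rows
baseline = [50, 50, 1]  # made-up reference patient

# Local interpretation: one attribution vector per patient.
local = [shapley_values(toy_risk, p, baseline) for p in patients]

# Global interpretation: mean absolute attribution per feature.
global_importance = [
    sum(abs(phi[i]) for phi in local) / len(local)
    for i in range(len(feature_names))
]

for name, score in zip(feature_names, global_importance):
    print(name, round(score, 4))
```

For each patient, the attributions sum to the difference between that patient's score and the baseline score. Averaging their absolute values across patients gives a simple global ranking of features.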
Related papers
- Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline
Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay.
The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction.
The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
arXiv Detail & Related papers (2023-12-31T16:01:48Z)
- Automatic prediction of mortality in patients with mental illness using
electronic health records [0.5957022371135096]
This paper addresses the persistent challenge of predicting mortality in patients with mental diagnoses.
Data from patients with mental disease diagnoses were extracted from the well-known clinical MIMIC-III data set.
Four machine learning algorithms were used, with results indicating that Random Forest and Support Vector Machine models outperformed others.
arXiv Detail & Related papers (2023-10-18T17:21:01Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Predicting Cardiovascular Complications in Post-COVID-19 Patients Using
Data-Driven Machine Learning Models [0.0]
The COVID-19 pandemic has globally posed numerous health challenges, notably the emergence of post-COVID-19 cardiovascular complications.
This study addresses this by utilizing data-driven machine learning models to predict such complications in 352 post-COVID-19 patients from Iraq.
arXiv Detail & Related papers (2023-09-27T22:52:08Z)
- COVID-Net Biochem: An Explainability-driven Framework to Building
Machine Learning Models for Predicting Survival and Kidney Injury of COVID-19
Patients from Clinical and Biochemistry Data [66.43957431843324]
We introduce COVID-Net Biochem, a versatile and explainable framework for constructing machine learning models.
We apply this framework to predict COVID-19 patient survival and the likelihood of developing Acute Kidney Injury during hospitalization.
arXiv Detail & Related papers (2022-04-24T07:38:37Z)
- Towards Trustworthy Cross-patient Model Development [3.109478324371548]
We study differences in model performance and explainability when trained for all patients and one patient at a time.
The results show that patients' demographics have a large impact on performance and explainability, and thus on trustworthiness.
arXiv Detail & Related papers (2021-12-20T10:51:04Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Deep learning-based COVID-19 pneumonia classification using chest CT
images: model generalizability [54.86482395312936]
Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries.
We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split.
The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
arXiv Detail & Related papers (2021-02-18T21:14:52Z)
- Classification supporting COVID-19 diagnostics based on patient survey
data [82.41449972618423]
Logistic regression and XGBoost classifiers that allow for effective screening of patients for COVID-19 were generated.
The obtained classification models provided the basis for the DECODE service (decode.polsl.pl), which can serve as support in screening patients with COVID-19 disease.
The data set consists of more than 3,000 examples and is based on questionnaires collected at a hospital in Poland.
arXiv Detail & Related papers (2020-11-24T17:44:01Z)
- Predicting Patient COVID-19 Disease Severity by means of Statistical and
Machine Learning Analysis of Blood Cell Transcriptome Data [3.5699804146136676]
We investigated how such data from the peripheral blood of COVID-19 patients might be used to predict clinical outcomes.
Our work revealed several clinical parameters measurable in blood samples, which discriminated between healthy people and COVID-19 positive patients.
We thus developed a number of analytic methods that showed accuracy and precision for disease severity and mortality outcome predictions that were above 90%.
arXiv Detail & Related papers (2020-11-19T10:32:46Z)
- Individualized Prediction of COVID-19 Adverse outcomes with MLHO [9.197411456718708]
We developed an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health outcomes.
We modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics.
Our results demonstrated that while demographic variables are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of past clinical records is vital for a reliable prediction model.
arXiv Detail & Related papers (2020-08-10T02:44:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.