Identifying and mitigating bias in algorithms used to manage patients in
a pandemic
- URL: http://arxiv.org/abs/2111.00340v1
- Date: Sat, 30 Oct 2021 21:10:56 GMT
- Title: Identifying and mitigating bias in algorithms used to manage patients in
a pandemic
- Authors: Yifan Li, Garrett Yoon, Mustafa Nasir-Moin, David Rosenberg, Sean
Neifert, Douglas Kondziolka, and Eric Karl Oermann
- Abstract summary: Logistic regression models were created to predict COVID-19 mortality, ventilator status and inpatient status using a real-world dataset.
Compared to naively trained models, calibrated models showed a 57% decrease in the number of biased trials.
After calibration, the average sensitivity of the predictive models increased from 0.527 to 0.955.
- Score: 4.756860520861679
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Numerous COVID-19 clinical decision support systems have been developed.
However, many of these systems lack validity because of methodological
shortcomings, including algorithmic bias. Methods: Logistic regression models
were created to predict COVID-19 mortality, ventilator status, and inpatient
status using a real-world dataset from four hospitals in New York City, and
were analyzed for biases against race, gender, and age. Simple thresholding
adjustments were applied in the training process to establish more equitable
models. Results: Compared to the naively trained models, the calibrated models
showed a 57% decrease in the number of biased trials, while predictive
performance, measured by area under the receiver operating characteristic curve
(AUC), remained unchanged. After calibration, the average sensitivity of the
predictive models increased from 0.527 to 0.955. Conclusion: We demonstrate that
naively training and deploying machine learning models on real-world data for
predictive analytics of COVID-19 has a high risk of bias. Simply implemented
adjustments or calibrations during model training can lead to substantial and
sustained gains in fairness on subsequent deployment.
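The abstract does not spell out how the thresholding adjustments were made, so the following is only a minimal sketch of one common approach: choosing a separate decision cutoff per demographic group so that each group reaches a target sensitivity. The function names, the toy data, the groups "A" and "B", and the target of 0.9 are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def sensitivity(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """True positive rate: share of actual positives scored at or above the cutoff."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not positives:
        raise ValueError("sensitivity is undefined without positive cases")
    return sum(s >= threshold for s in positives) / len(positives)


def threshold_for_sensitivity(y_true: Sequence[int], scores: Sequence[float], target: float) -> float:
    """Highest observed score that, used as a cutoff, reaches the target sensitivity."""
    for candidate in sorted(set(scores), reverse=True):
        if sensitivity(y_true, scores, candidate) >= target:
            return candidate
    return min(scores)  # only reached if target > 1.0


def split_by_group(
    y_true: Sequence[int], scores: Sequence[float], groups: Sequence[str]
) -> Dict[str, Tuple[List[int], List[float]]]:
    """Collect labels and scores separately for each group."""
    out: Dict[str, Tuple[List[int], List[float]]] = {}
    for y, s, g in zip(y_true, scores, groups):
        ys, ss = out.setdefault(g, ([], []))
        ys.append(y)
        ss.append(s)
    return out


# Toy data, invented for illustration: the model scores group "B" lower overall.
y_true = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.55, 0.45, 0.35, 0.3, 0.1]
groups = ["A"] * 5 + ["B"] * 5

per_group = split_by_group(y_true, scores, groups)

# Naive deployment: one shared cutoff of 0.5 for everyone.
before = {g: sensitivity(ys, ss, 0.5) for g, (ys, ss) in per_group.items()}

# Adjusted deployment: one cutoff per group, chosen to reach the target sensitivity.
thresholds = {g: threshold_for_sensitivity(ys, ss, 0.9) for g, (ys, ss) in per_group.items()}
after = {g: sensitivity(ys, ss, thresholds[g]) for g, (ys, ss) in per_group.items()}

print("sensitivity with shared cutoff:", before)
print("per-group cutoffs:", thresholds)
print("sensitivity with per-group cutoffs:", after)
```

In this toy setup the shared cutoff misses most positive cases in group "B", while the per-group cutoffs bring both groups to the same sensitivity. Because only the cutoffs move and the model's scores are untouched, a ranking-based metric such as AUC computed on those scores does not change, which is consistent with the abstract's report of unchanged AUC.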
Related papers
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
- Symptom-based Machine Learning Models for the Early Detection of COVID-19: A Narrative Review [0.0]
Machine learning models can analyze large datasets, incorporating patient-reported symptoms, clinical data, and medical imaging.
In this paper, we provide an overview of the landscape of symptoms-only machine learning models for predicting COVID-19, including their performance and limitations.
The review will also examine the performance of symptom-based models when compared to image-based models.
arXiv Detail & Related papers (2023-12-08T01:41:42Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- Systematic investigation into generalization of COVID-19 CT deep learning models with Gabor ensemble for lung involvement scoring [9.94980188821453]
This study investigates the generalizability of key published models using the publicly available COVID-19 Computed Tomography data.
We then assess the predictive ability of these models for COVID-19 severity using an independent new dataset.
arXiv Detail & Related papers (2021-04-20T03:49:48Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Comparative Analysis of Machine Learning Approaches to Analyze and Predict the Covid-19 Outbreak [10.307715136465056]
We present a comparative analysis of various machine learning (ML) approaches in predicting the COVID-19 outbreak in the epidemiological domain.
The results reveal the advantages of ML algorithms for supporting decision making of evolving short term policies.
arXiv Detail & Related papers (2021-02-11T11:57:33Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Individualized Prediction of COVID-19 Adverse outcomes with MLHO [9.197411456718708]
We developed an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health outcomes.
We modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics.
Our results demonstrated that while demographic variables are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of past clinical records is vital for a reliable prediction model.
arXiv Detail & Related papers (2020-08-10T02:44:52Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exam data collected from symptomatic patients.
The proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity, and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.