Statistical Tests for Replacing Human Decision Makers with Algorithms
- URL: http://arxiv.org/abs/2306.11689v1
- Date: Tue, 20 Jun 2023 17:09:04 GMT
- Title: Statistical Tests for Replacing Human Decision Makers with Algorithms
- Authors: Kai Feng, Han Hong, Ke Tang, Jingyuan Wang
- Abstract summary: The performance of each human decision maker is first benchmarked against machine predictions.
We then replace the decisions made by a subset of decision makers with the recommendation from the proposed artificial intelligence algorithm.
We find that our algorithm on a test dataset results in a higher overall true positive rate and a lower false positive rate than the diagnoses made by doctors only.
- Score: 32.877314377522524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a statistical framework with which artificial
intelligence can improve human decision making. The performance of each human
decision maker is first benchmarked against machine predictions; we then
replace the decisions made by a subset of the decision makers with the
recommendation from the proposed artificial intelligence algorithm. Using a
large nationwide dataset of pregnancy outcomes and doctor diagnoses from
prepregnancy checkups of reproductive-age couples, we experimented with both a
heuristic frequentist approach and a Bayesian posterior loss function approach
with an application to abnormal birth detection. We find that our algorithm on
a test dataset results in a higher overall true positive rate and a lower false
positive rate than the diagnoses made by doctors only. We also find that the
diagnoses of doctors from rural areas are more frequently replaceable,
suggesting that artificial intelligence assisted decision making tends to
improve precision more in less developed regions.
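The benchmark-then-replace idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it uses a plain dominance check on true and false positive rates in place of the frequentist or Bayesian tests the abstract mentions, and the function names, the record layout, the 0.5 threshold, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def rates(labels, decisions):
    """Return (true positive rate, false positive rate) for binary decisions."""
    tp = sum(1 for y, d in zip(labels, decisions) if y == 1 and d == 1)
    fp = sum(1 for y, d in zip(labels, decisions) if y == 0 and d == 1)
    pos = sum(labels)
    neg = len(labels) - pos
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr, fpr


def replaceable_doctors(records, threshold):
    """Flag doctors whose own cases the thresholded machine score handles better.

    records: iterable of (doctor_id, label, doctor_decision, machine_score).
    Hypothetical rule: flag a doctor when the machine has TPR >= and FPR <=
    the doctor's on that doctor's cases, and the two points are not identical.
    No statistical test is applied here.
    """
    by_doctor = defaultdict(list)
    for doc, y, d, s in records:
        by_doctor[doc].append((y, d, s))

    flagged = set()
    for doc, rows in by_doctor.items():
        labels = [y for y, _, _ in rows]
        doc_point = rates(labels, [d for _, d, _ in rows])
        machine_point = rates(labels, [1 if s >= threshold else 0 for _, _, s in rows])
        dominates = machine_point[0] >= doc_point[0] and machine_point[1] <= doc_point[1]
        if dominates and machine_point != doc_point:
            flagged.add(doc)
    return flagged


def combined_decisions(records, flagged, threshold):
    """Use the machine's decision for flagged doctors, the doctor's otherwise."""
    return [
        (1 if s >= threshold else 0) if doc in flagged else d
        for doc, _, d, s in records
    ]


# Toy data (invented): doctor "A" misses one positive and raises one false alarm.
records = [
    ("A", 1, 0, 0.9), ("A", 1, 1, 0.8), ("A", 0, 1, 0.2), ("A", 0, 0, 0.1),
    ("B", 1, 1, 0.3), ("B", 1, 1, 0.7), ("B", 0, 0, 0.6), ("B", 0, 0, 0.4),
]
labels = [y for _, y, _, _ in records]
flagged = replaceable_doctors(records, threshold=0.5)
doctors_only = rates(labels, [d for _, _, d, _ in records])
human_plus_machine = rates(labels, combined_decisions(records, flagged, 0.5))
print(flagged, doctors_only, human_plus_machine)
```

In practice the replacement decision would be made on held-out data and would account for sampling uncertainty, which this dominance check ignores.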
Related papers
- Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis.
We focus on the selective prediction approach, which allows the diagnosis system to abstain from providing a decision when it is not confident in the diagnosis.
We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z)
- Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: A comprehensive analysis [6.796017024594715]
We suggest two novel feature selection (FS) methods based upon an imperialist competitive algorithm (ICA) and a bat algorithm (BA).
This study aims to enhance diagnostic models' efficiency and present a comprehensive analysis to help clinical physicians make much more precise and reliable decisions than before.
arXiv Detail & Related papers (2024-07-19T19:07:53Z)
- A Survey of Artificial Intelligence in Gait-Based Neurodegenerative Disease Diagnosis [51.07114445705692]
Neurodegenerative diseases (NDs) traditionally require extensive healthcare resources and human effort for medical diagnosis and monitoring.
As a crucial disease-related motor symptom, human gait can be exploited to characterize different NDs.
Current advances in artificial intelligence (AI) models enable automatic gait analysis for ND identification and classification.
arXiv Detail & Related papers (2024-05-21T06:44:40Z)
- The Limits of Fair Medical Imaging AI In The Wild [43.97266228706059]
We investigate the extent to which medical AI utilizes demographic encodings.
We confirm that medical imaging AI leverages demographic shortcuts in disease classification.
We find that models with less encoding of demographic attributes are often most "globally optimal".
arXiv Detail & Related papers (2023-12-11T18:59:50Z)
- Auditing for Human Expertise [12.967730957018688]
We develop a statistical framework under which we can pose this question as a natural hypothesis test.
We propose a simple procedure which tests whether expert predictions are statistically independent from the outcomes of interest.
A rejection of our test thus suggests that human experts may add value to any algorithm trained on the available data.
arXiv Detail & Related papers (2023-06-02T16:15:24Z)
- An Improved Model Ensembled of Different Hyper-parameter Tuned Machine Learning Algorithms for Fetal Health Prediction [1.332560004325655]
We propose a robust ensemble model called ensemble of tuned Support Vector Machine and ExtraTrees for predicting fetal health.
Our proposed ETSE model outperformed the other models with 100% precision, 100% recall, 100% F1-score, and 99.66% accuracy.
arXiv Detail & Related papers (2023-05-26T16:40:44Z)
- Predicting Adverse Neonatal Outcomes for Preterm Neonates with Multi-Task Learning [51.487856868285995]
We first analyze the correlations between three adverse neonatal outcomes and then formulate the diagnosis of multiple neonatal outcomes as a multi-task learning (MTL) problem.
In particular, the MTL framework contains shared hidden layers and multiple task-specific branches.
arXiv Detail & Related papers (2023-03-28T00:44:06Z)
- Artificial Intelligence Model for Tumoral Clinical Decision Support Systems [0.0]
Comparative diagnosis in brain tumor evaluation makes it possible to use a medical center's available information to compare similar cases when a new patient is evaluated.
By leveraging Artificial Intelligence models, the proposed system is capable of retrieving the most similar cases of brain tumors for a given query.
arXiv Detail & Related papers (2023-01-09T22:15:18Z)
- Efficient error and variance estimation for randomized matrix computations [0.7366405857677227]
This paper proposes a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix.
Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and randomized Nyström approximation.
arXiv Detail & Related papers (2022-07-13T16:57:35Z)
- Post-hoc loss-calibration for Bayesian neural networks [25.05373000435213]
We develop methods for correcting approximate posterior predictive distributions encouraging them to prefer high-utility decisions.
In contrast to previous work, our approach is agnostic to the choice of the approximate inference algorithm.
arXiv Detail & Related papers (2021-06-13T13:53:27Z)
- The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
- Predicting Parkinson's Disease with Multimodal Irregularly Collected Longitudinal Smartphone Data [75.23250968928578]
Parkinson's Disease is a neurological disorder prevalent in elderly people.
Traditional ways to diagnose the disease rely on in-person subjective clinical evaluations on the quality of a set of activity tests.
We propose a novel time-series based approach to predicting Parkinson's Disease with raw activity test data collected by smartphones in the wild.
arXiv Detail & Related papers (2020-09-25T01:50:15Z)
- Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exam data collected from symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
- Anomaly Detection in Univariate Time-series: A Survey on the State-of-the-Art [0.0]
Anomaly detection for time-series data has been an important research field for a long time.
In recent years, an increasing number of machine learning algorithms have been developed to detect anomalies in time series.
Researchers tried to improve these techniques using (deep) neural networks.
arXiv Detail & Related papers (2020-04-01T13:22:34Z)
- A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions.
We first show that humans do alter their behavior when the tool is deployed.
We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)
- Overly Optimistic Prediction Results on Imbalanced Data: a Case Study of Flaws and Benefits when Applying Over-sampling [13.463035357173045]
We focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets.
We show how this causes the results to be biased using two artificial datasets and reproduce results of studies in which this flaw was identified.
arXiv Detail & Related papers (2020-01-15T12:53:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.