Does AI help humans make better decisions? A methodological framework for experimental evaluation
- URL: http://arxiv.org/abs/2403.12108v1
- Date: Mon, 18 Mar 2024 01:04:52 GMT
- Title: Does AI help humans make better decisions? A methodological framework for experimental evaluation
- Authors: Eli Ben-Michael, D. James Greiner, Melody Huang, Kosuke Imai, Zhichao Jiang, Sooahn Shin
- Abstract summary: We show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone.
We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of Artificial Intelligence (AI) based on data-driven algorithms has become ubiquitous in today's society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions compared to a human alone or AI alone. We introduce a new methodological framework that can be used to experimentally answer this question with no additional assumptions. We measure a decision maker's ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded experimental design, in which the provision of AI-generated recommendations is randomized across cases with a human making final decisions. Under this experimental design, we show how to compare the performance of three alternative decision-making systems--human-alone, human-with-AI, and AI-alone. We apply the proposed methodology to the data from our own randomized controlled trial of a pretrial risk assessment instrument. We find that AI recommendations do not improve the classification accuracy of a judge's decision to impose cash bail. Our analysis also shows that AI-alone decisions generally perform worse than human decisions with or without AI assistance. Finally, AI recommendations tend to impose cash bail on non-white arrestees more often than necessary when compared to white arrestees.
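To make the design concrete, here is a minimal sketch of how the three systems could be compared under the single-blinded randomization. This is an illustration only, not the paper's estimators (which are built on the baseline potential outcome and account for its selective observability); the DataFrame column names (`z`, `decision`, `ai_recommendation`, `outcome`) are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

def error_rates_by_system(df: pd.DataFrame) -> dict:
    """Compare misclassification rates of the three decision systems
    under a single-blinded design: the AI recommendation is shown
    (z == 1) or withheld (z == 0) at random for each case."""
    def error_rate(decisions: pd.Series, outcomes: pd.Series) -> float:
        # A decision counts as "correct" when it matches the binary
        # outcome label -- a deliberate simplification of the paper's
        # potential-outcome-based classification metrics.
        return float(np.mean(decisions != outcomes))

    human_alone = df.loc[df.z == 0]    # recommendation withheld
    human_with_ai = df.loc[df.z == 1]  # recommendation shown
    return {
        "human_alone": error_rate(human_alone.decision, human_alone.outcome),
        "human_with_ai": error_rate(human_with_ai.decision, human_with_ai.outcome),
        # The AI-alone system is evaluated on all cases, since the
        # recommendation is recorded even when it is not shown.
        "ai_alone": error_rate(df.ai_recommendation, df.outcome),
    }
```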
Related papers
- Towards a Cascaded LLM Framework for Cost-effective Human-AI Decision-Making [55.2480439325792]
We present a cascaded LLM decision framework that adaptively delegates tasks across multiple tiers of expertise.
First, a deferral policy determines whether to accept the base model's answer or regenerate it with the large model.
Second, an abstention policy decides whether the cascade model's response is sufficiently certain or requires human intervention.
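A hedged sketch of that two-stage control flow; the model callables, confidence field, and thresholds are hypothetical stand-ins, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Answer:
    text: str
    confidence: float  # assumed self-reported confidence in [0, 1]

def cascade(
    query: str,
    base_model: Callable[[str], Answer],
    large_model: Callable[[str], Answer],
    defer_threshold: float = 0.7,    # hypothetical value
    abstain_threshold: float = 0.9,  # hypothetical value
) -> Optional[Answer]:
    """Return a model answer, or None to signal human intervention."""
    answer = base_model(query)
    # Deferral policy: regenerate with the large model when the base
    # model's answer is not confident enough.
    if answer.confidence < defer_threshold:
        answer = large_model(query)
    # Abstention policy: hand the case to a human when even the
    # cascade's final answer remains too uncertain.
    if answer.confidence < abstain_threshold:
        return None
    return answer
```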
arXiv Detail & Related papers (2025-06-13T15:36:22Z) - Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z) - Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z) - Learning to Make Adherence-Aware Advice [8.419688203654948]
This paper presents a sequential decision-making model that takes into account the human's adherence level.
We provide learning algorithms that learn the optimal advice policy and make advice only at critical time stamps.
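One way to picture "advice only at critical time stamps" is a threshold rule: advise when the adherence-discounted gain clears the cost of interrupting. The quantities below are hypothetical illustrations, not the paper's learned policy:

```python
def should_advise(
    q_advised: float,    # estimated value of the advised action
    q_human: float,      # estimated value of the human's default action
    adherence: float,    # probability the human follows advice, in [0, 1]
    advice_cost: float,  # interruption/attention cost of advising
) -> bool:
    """Advise only when the adherence-discounted gain exceeds the cost,
    so time steps with small gains (non-critical) get no advice."""
    return adherence * (q_advised - q_human) > advice_cost
```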
arXiv Detail & Related papers (2023-10-01T23:15:55Z) - Using AI Uncertainty Quantification to Improve Human Decision-Making [14.878886078377562]
AI Uncertainty Quantification (UQ) has the potential to improve human decision-making beyond AI predictions alone.
We evaluated the impact on human decision-making for instance-level UQ, using a strict scoring rule, in two online behavioral experiments.
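For reference, a strictly proper scoring rule can be as simple as the Brier score; this is an illustration, and the experiments' exact rule may differ:

```python
import numpy as np

def brier_score(probs: np.ndarray, outcomes: np.ndarray) -> float:
    """Mean squared error between predicted probabilities and 0/1
    outcomes; strictly proper, so honest reporting of uncertainty
    minimizes the expected score."""
    return float(np.mean((probs - outcomes) ** 2))
```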
arXiv Detail & Related papers (2023-09-19T18:01:25Z) - Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z) - Algorithmic Assistance with Recommendation-Dependent Preferences [2.864550757598007]
We consider the effect and design of algorithmic recommendations when they affect choices.
We show that recommendation-dependent preferences create inefficiencies where the decision-maker is overly responsive to the recommendation.
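A stylized version of such recommendation-dependent preferences shows why the decision-maker can become overly responsive; the payoffs and deviation cost are hypothetical, not the paper's model:

```python
def chosen_action(payoff_a: float, payoff_b: float,
                  recommended: str, deviation_cost: float) -> str:
    """Choose between actions 'a' and 'b' when deviating from the
    recommendation carries an extra psychological cost, so the
    recommended action wins even when its payoff is slightly lower."""
    utility_a = payoff_a - (deviation_cost if recommended != "a" else 0.0)
    utility_b = payoff_b - (deviation_cost if recommended != "b" else 0.0)
    return "a" if utility_a >= utility_b else "b"
```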
arXiv Detail & Related papers (2022-08-16T09:24:47Z) - Randomized Classifiers vs Human Decision-Makers: Trustworthy AI May Have to Act Randomly and Society Seems to Accept This [0.8889304968879161]
We feel that, akin to human decisions, judgments of artificial agents should necessarily be grounded in some moral principles.
Yet a decision-maker can only make truly ethical (based on any ethical theory) and fair (according to any notion of fairness) decisions if full information on all the relevant factors on which the decision is based is available at the time of decision-making.
arXiv Detail & Related papers (2021-11-15T05:39:02Z) - Indecision Modeling [50.00689136829134]
It is important that AI systems act in ways which align with human values.
People are often indecisive, and especially so when their decision has moral implications.
arXiv Detail & Related papers (2020-12-15T18:32:37Z) - A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions.
We first show that humans do alter their behavior when the tool is deployed.
We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z) - Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making [53.62514158534574]
We study whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI.
We show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making.
arXiv Detail & Related papers (2020-01-07T15:33:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.