Related papers: Auditing for Human Expertise

Auditing for Human Expertise

URL: http://arxiv.org/abs/2306.01646v3
Date: Mon, 25 Nov 2024 13:59:11 GMT
Title: Auditing for Human Expertise
Authors: Rohan Alur, Loren Laine, Darrick K. Li, Manish Raghavan, Devavrat Shah, Dennis Shung
Abstract summary: We develop a statistical framework under which the question of whether human experts add value beyond an algorithmic predictor can be posed as a natural hypothesis test. We propose a simple procedure which tests whether expert predictions are statistically independent from the outcomes of interest after conditioning on the available inputs. A rejection of our test thus suggests that human experts may add value to any algorithm trained on the available data.
Score: 12.967730957018688
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: High-stakes prediction tasks (e.g., patient diagnosis) are often handled by trained human experts. A common source of concern about automation in these settings is that experts may exercise intuition that is difficult to model and/or have access to information (e.g., conversations with a patient) that is simply unavailable to a would-be algorithm. This raises the natural question of whether human experts add value which could not be captured by an algorithmic predictor. We develop a statistical framework under which we can pose this question as a natural hypothesis test. Indeed, as our framework highlights, detecting human expertise is more subtle than simply comparing the accuracy of expert predictions to those made by a particular learning algorithm. Instead, we propose a simple procedure which tests whether expert predictions are statistically independent from the outcomes of interest after conditioning on the available inputs ('features'). A rejection of our test thus suggests that human experts may add value to any algorithm trained on the available data, and has direct implications for whether human-AI 'complementarity' is achievable in a given prediction task. We highlight the utility of our procedure using admissions data collected from the emergency department of a large academic hospital system, where we show that physicians' admit/discharge decisions for patients with acute gastrointestinal bleeding (AGIB) appear to be incorporating information that is not available to a standard algorithmic screening tool. This is despite the fact that the screening tool is arguably more accurate than physicians' discretionary decisions, highlighting that, even absent normative concerns about accountability or interpretability, accuracy is insufficient to justify algorithmic automation.
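Illustration: the kind of conditional independence test described in the abstract can be sketched as a permutation test. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes features are discrete and hashable, so that cases with identical features can be grouped exactly, and it uses 0/1 loss as the test statistic. The function name `conditional_permutation_test` and its parameters are hypothetical. The idea is that, if expert predictions carry no information about outcomes beyond the features, shuffling predictions among cases with identical features should not make them systematically worse.

```python
import random
from collections import defaultdict


def conditional_permutation_test(features, expert_preds, outcomes,
                                 n_resamples=2000, seed=0):
    """Approximate p-value for H0: predictions are independent of outcomes given features.

    Assumption (not from the paper): features are hashable and discrete,
    so exact-match strata exist.
    """
    rng = random.Random(seed)

    # Group row indices by identical feature value.
    strata = defaultdict(list)
    for i, x in enumerate(features):
        strata[x].append(i)

    def loss(preds):
        # 0/1 loss: number of cases where the prediction differs from the outcome.
        return sum(p != y for p, y in zip(preds, outcomes))

    observed = loss(expert_preds)

    # Count resamples whose loss is at least as small as the observed loss.
    at_least_as_good = 0
    for _ in range(n_resamples):
        shuffled = list(expert_preds)
        for idx in strata.values():
            vals = [expert_preds[i] for i in idx]
            rng.shuffle(vals)  # permute predictions only within a stratum
            for i, v in zip(idx, vals):
                shuffled[i] = v
        if loss(shuffled) <= observed:
            at_least_as_good += 1

    # Add-one correction keeps the p-value strictly positive.
    return (1 + at_least_as_good) / (1 + n_resamples)


# Toy data (synthetic, for illustration only): two feature strata, and an
# expert whose predictions track outcomes in a way the features cannot explain.
toy_features = [i % 2 for i in range(200)]
toy_outcomes = [(i // 2) % 2 for i in range(200)]
toy_expert = list(toy_outcomes)
p_value = conditional_permutation_test(toy_features, toy_expert, toy_outcomes)
```

A small p-value here is evidence against conditional independence, which in the abstract's terms suggests the expert may be using information beyond the features. With continuous or high-dimensional features, exact matching is rarely possible and some form of approximate matching would be needed; that case is outside this sketch.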

Related papers

Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis. We focus on the selection prediction approach that allows the diagnosis system to abstain from providing the decision if it is not confident in the diagnosis. We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z)
AI-Assisted Decision Making with Human Learning [8.598431584462944]
In many cases, despite the algorithm's superior performance, the final decision remains in human hands. This paper studies such AI-assisted decision-making settings, where the human learns through repeated interactions with the algorithm. We observe that the discrepancy between the algorithm's model and the human's model creates a fundamental tradeoff.
arXiv Detail & Related papers (2025-02-18T17:08:21Z)
Integrating Expert Judgment and Algorithmic Decision Making: An Indistinguishability Framework [12.967730957018688]
We introduce a novel framework for human-AI collaboration in prediction and decision tasks. Our approach leverages human judgment to distinguish inputs which are algorithmically indistinguishable, or "look the same" to any feasible predictive algorithm.
arXiv Detail & Related papers (2024-10-11T13:03:53Z)
SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA. Existing predictive models are usually trained on high-quality data with little missing information. For the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z)
TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design. We provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
Leveraging graph neural networks for supporting Automatic Triage of Patients [5.864579168378686]
We propose an AI-based module to manage patients' emergency code assignments in emergency departments. Data containing relevant patient information, such as vital signs, symptoms, and medical history, are used to accurately classify patients into triage categories.
arXiv Detail & Related papers (2024-03-11T09:54:35Z)
Human Expertise in Algorithmic Prediction [16.104330706951004]
We introduce a novel framework for incorporating human expertise into algorithmic predictions. Our approach leverages human judgment to distinguish inputs which are algorithmically indistinguishable, or "look the same" to predictive algorithms.
arXiv Detail & Related papers (2024-02-01T17:23:54Z)
Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients' clinical state. We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability. Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
Scheduling with Predictions [0.0]
Modern learning techniques have made it possible to detect abnormalities in medical images within minutes. Machine-assisted diagnoses cannot yet reliably replace human reviews of images by a radiologist. We study this scenario by formulating it as a learning-augmented online scheduling problem.
arXiv Detail & Related papers (2022-12-20T17:10:06Z)
Uncertainty estimation for out-of-distribution detection in computational histopathology [0.0]
We show that a distance-aware uncertainty estimation method outperforms commonly used approaches. We also investigate the use of uncertainty thresholding to reject out-of-distribution samples for selective prediction.
arXiv Detail & Related papers (2022-10-18T14:49:44Z)
Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks. Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets. We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey. An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys. We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We first show that humans do alter their behavior when the tool is deployed. We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.