Tool or Tutor? Experimental evidence from AI deployment in cancer diagnosis
- URL: http://arxiv.org/abs/2502.16411v3
- Date: Sun, 30 Mar 2025 09:36:10 GMT
- Title: Tool or Tutor? Experimental evidence from AI deployment in cancer diagnosis
- Authors: Vivianna Fang He, Sihan Li, Phanish Puranam, Feng Lin,
- Abstract summary: We propose that AI-driven training and AI-assisted task completion can have a joint effect on human capability.<n>In a field experiment with 336 medical students, we manipulated AI deployment in training, in practice, and in both.
- Score: 3.0641365294595815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Professionals increasingly use Artificial Intelligence (AI) to enhance their capabilities and assist with task execution. While prior research has examined these uses separately, their potential interaction remains underexplored. We propose that AI-driven training ("tutor") and AI-assisted task completion ("tool") can have a joint effect on human capability and test this hypothesis in the context of lung cancer diagnosis. In a field experiment with 336 medical students, we manipulated AI deployment in training, in practice, and in both. Our findings reveal that while AI-integrated training and AI assistance independently improved diagnostic performance, their combination yielded the highest accuracy. These results underscore AI's dual role in enhancing human performance through both learning and real-time support, offering insights into AI deployment in professional settings where human expertise remains essential.
Related papers
- Explainable AI as a Double-Edged Sword in Dermatology: The Impact on Clinicians versus The Public [46.86429592892395]
explainable AI (XAI) addresses this by providing AI decision-making insight.<n>We present results from two large-scale experiments combining a fairness-based diagnosis AI model and different XAI explanations.
arXiv Detail & Related papers (2025-12-14T00:06:06Z) - Evolving Diagnostic Agents in a Virtual Clinical Environment [75.59389103511559]
We present a framework for training large language models (LLMs) as diagnostic agents with reinforcement learning.<n>Our method acquires diagnostic strategies through interactive exploration and outcome-based feedback.<n>DiagAgent significantly outperforms 10 state-of-the-art LLMs, including DeepSeek-v3 and GPT-4o.
arXiv Detail & Related papers (2025-10-28T17:19:47Z) - Reverse Physician-AI Relationship: Full-process Clinical Diagnosis Driven by a Large Language Model [71.40113970879219]
We propose a paradigm shift that reverses the relationship between physicians and AI.<n>We present DxDirector-7B, an LLM endowed with advanced deep thinking capabilities, enabling it to drive the full-process diagnosis with minimal physician involvement.<n>In evaluations across rare, complex, and real-world cases under full-process diagnosis setting, DxDirector-7B not only achieves significant superior diagnostic accuracy but also substantially reduces physician workload.
arXiv Detail & Related papers (2025-08-14T09:51:20Z) - Towards Effective Human-in-the-Loop Assistive AI Agents [15.11527529177358]
We introduce an evaluation framework and a dataset of human-AI interactions to assess how AI guidance affects procedural task performance.<n>We also develop an AR-equipped AI agent that provides interactive guidance in real-world tasks, from cooking to battlefield medicine.
arXiv Detail & Related papers (2025-07-24T12:50:46Z) - Explainable AI for Collaborative Assessment of 2D/3D Registration Quality [50.65650507103078]
We propose the first artificial intelligence framework trained specifically for 2D/3D registration quality verification.<n>Our explainable AI (XAI) approach aims to enhance informed decision-making for human operators.
arXiv Detail & Related papers (2025-07-23T15:28:57Z) - Toward the Autonomous AI Doctor: Quantitative Benchmarking of an Autonomous Agentic AI Versus Board-Certified Clinicians in a Real World Setting [0.0]
Globally we face a projected shortage of 11 million healthcare practitioners by 2030.<n>No end-to-end autonomous large language model (LLM)-based AI system has been rigorously evaluated in real-world clinical practice.
arXiv Detail & Related papers (2025-06-27T19:04:44Z) - Can Domain Experts Rely on AI Appropriately? A Case Study on AI-Assisted Prostate Cancer MRI Diagnosis [19.73932120146401]
We conduct an in-depth collaboration with radiologists in prostate cancer diagnosis based on MRI images.<n>We develop an interface and conduct two experiments to study how AI assistance and performance feedback shape the decision making of domain experts.
arXiv Detail & Related papers (2025-02-03T18:59:38Z) - Human-AI Collaborative Game Testing with Vision Language Models [0.0]
This study investigates how AI can improve game testing by developing and experimenting with an AI-assisted workflow.<n>We evaluate the effectiveness of AI assistance under four conditions: with or without AI support, and with or without detailed knowledge of defects and design documentation.<n>Results indicate that AI assistance significantly improves defect identification performance, particularly when paired with detailed knowledge.
arXiv Detail & Related papers (2025-01-20T23:14:23Z) - Leveraging AI for Automatic Classification of PCOS Using Ultrasound Imaging [0.0]
The AUTO-PCOS Classification Challenge seeks to advance the diagnostic capabilities of artificial intelligence (AI) in identifying Polycystic Ovary Syndrome (PCOS)<n>This report outlines our methodology for building a robust AI pipeline utilizing transfer learning with the InceptionV3 architecture to achieve high accuracy in binary classification.
arXiv Detail & Related papers (2024-12-30T11:56:11Z) - AI-Enhanced Sensemaking: Exploring the Design of a Generative AI-Based Assistant to Support Genetic Professionals [38.54324092761751]
Generative AI has the potential to transform knowledge work, but further research is needed to understand how knowledge workers envision using and interacting with generative AI.<n>Our research focused on designing a generative AI assistant to aid genetic professionals in analyzing whole genome sequences (WGS) and other clinical data for rare disease diagnosis.
arXiv Detail & Related papers (2024-12-19T22:54:49Z) - How Performance Pressure Influences AI-Assisted Decision Making [57.53469908423318]
We show how pressure and explainable AI (XAI) techniques interact with AI advice-taking behavior.<n>Our results show complex interaction effects, with different combinations of pressure and XAI techniques either improving or worsening AI advice taking behavior.
arXiv Detail & Related papers (2024-10-21T22:39:52Z) - AI Workflow, External Validation, and Development in Eye Disease Diagnosis [5.940140611616894]
AI shows promise in diagnosis accuracy but faces real-world application issues due to insufficient validation in clinical and diverse populations.<n>This study addresses gaps in medical AI downstream accountability through a case study on age-related macular degeneration (AMD) diagnosis and classification severity.
arXiv Detail & Related papers (2024-09-23T15:01:09Z) - Explainable AI Enhances Glaucoma Referrals, Yet the Human-AI Team Still Falls Short of the AI Alone [6.740852152639975]
We investigate how various AI explanations help providers distinguish between patients needing immediate or non-urgent specialist referrals.
We built explainable AI algorithms to predict glaucoma surgery needs from routine eyecare data as a proxy for identifying high-risk patients.
We incorporated intrinsic and post-hoc explainability and conducted an online study with optometrists to assess human-AI team performance.
arXiv Detail & Related papers (2024-05-24T03:01:20Z) - BO-Muse: A human expert and AI teaming framework for accelerated
experimental design [58.61002520273518]
Our algorithm lets the human expert take the lead in the experimental process.
We show that our algorithm converges sub-linearly, at a rate faster than the AI or human alone.
arXiv Detail & Related papers (2023-03-03T02:56:05Z) - Advancing Human-AI Complementarity: The Impact of User Expertise and
Algorithmic Tuning on Joint Decision Making [10.890854857970488]
Many factors can impact success of Human-AI teams, including a user's domain expertise, mental models of an AI system, trust in recommendations, and more.
Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled.
Our results show that while recommendations from an AI-Assistant can aid user decision making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly impact overall team performance.
arXiv Detail & Related papers (2022-08-16T21:39:58Z) - Robust and Efficient Medical Imaging with Self-Supervision [80.62711706785834]
We present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI.
We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data.
arXiv Detail & Related papers (2022-05-19T17:34:18Z) - Who Goes First? Influences of Human-AI Workflow on Decision Making in
Clinical Imaging [24.911186503082465]
This study explores the effects of providing AI assistance at the start of a diagnostic session in radiology versus after the radiologist has made a provisional decision.
We found that participants who are asked to register provisional responses in advance of reviewing AI inferences are less likely to agree with the AI regardless of whether the advice is accurate and, in instances of disagreement with the AI, are less likely to seek the second opinion of a colleague.
arXiv Detail & Related papers (2022-05-19T16:59:25Z) - Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z) - Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in
Artificial Intelligence [79.038671794961]
We launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), where the AI model can be distributedly trained and independently executed at each host institution.
Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK.
arXiv Detail & Related papers (2021-11-18T00:43:41Z) - Learning to Complement Humans [67.38348247794949]
A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks.
We demonstrate how an end-to-end learning strategy can be harnessed to optimize the combined performance of human-machine teams.
arXiv Detail & Related papers (2020-05-01T20:00:23Z) - Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accuracy AI may not lead to highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z) - Effect of Confidence and Explanation on Accuracy and Trust Calibration
in AI-Assisted Decision Making [53.62514158534574]
We study whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI.
We show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making.
arXiv Detail & Related papers (2020-01-07T15:33:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.