People over-trust AI-generated medical responses and view them to be as valid as doctors, despite low accuracy
- URL: http://arxiv.org/abs/2408.15266v1
- Date: Sun, 11 Aug 2024 23:41:28 GMT
- Title: People over-trust AI-generated medical responses and view them to be as valid as doctors, despite low accuracy
- Authors: Shruthi Shekar, Pat Pataranutaporn, Chethan Sarabu, Guillermo A. Cecchi, Pattie Maes
- Abstract summary: A total of 300 participants evaluated medical responses that were either written by a medical doctor on an online healthcare platform or generated by a large language model.
Results showed that participants could not effectively distinguish between AI-generated and doctors' responses.
- Score: 25.91497161129666
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a comprehensive analysis of how AI-generated medical responses are perceived and evaluated by non-experts. A total of 300 participants evaluated medical responses that were either written by a medical doctor on an online healthcare platform or generated by a large language model and labeled by physicians as having high or low accuracy. Results showed that participants could not effectively distinguish between AI-generated and doctors' responses and demonstrated a preference for AI-generated responses, rating High Accuracy AI-generated responses as significantly more valid, trustworthy, and complete/satisfactory. Low Accuracy AI-generated responses performed, on average, very similarly to doctors' responses, if not better. Participants not only found these low-accuracy AI-generated responses to be valid, trustworthy, and complete/satisfactory but also indicated a high tendency to follow the potentially harmful medical advice and to incorrectly seek unnecessary medical attention as a result of the response provided. This problematic reaction was comparable to, if not stronger than, the reaction they displayed towards doctors' responses. Such increased trust in inaccurate or inappropriate AI-generated medical advice can lead to misdiagnosis and harmful consequences for individuals seeking help. Further, participants were more trusting of High Accuracy AI-generated responses when told they were given by a doctor, and experts rated AI-generated responses significantly higher when the source of the response was unknown. Both experts and non-experts exhibited bias, finding AI-generated responses to be more thorough and accurate than doctors' responses while still valuing the involvement of a doctor in the delivery of their medical advice. Implementing AI systems in collaboration with medical professionals should be the future of using AI for the delivery of medical advice.
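As a rough illustration of the comparison at the heart of the study, the sketch below contrasts hypothetical Likert ratings across the three response sources with a one-way ANOVA and a pairwise follow-up test. All numbers, sample sizes, and variable names are invented placeholders, not the paper's data or analysis code.

```python
# Hypothetical sketch of the study's three-way comparison: participants rate
# responses from each source, and per-source rating distributions are compared.
# The ratings below are invented placeholders, not the paper's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Placeholder 1-5 Likert ratings for one dimension (e.g., trustworthiness),
# one array per response source.
ratings = {
    "doctor":           rng.integers(1, 6, size=100),
    "ai_high_accuracy": rng.integers(1, 6, size=100),
    "ai_low_accuracy":  rng.integers(1, 6, size=100),
}

# Omnibus test: do mean ratings differ across the three sources?
f_stat, p_value = stats.f_oneway(*ratings.values())
print(f"one-way ANOVA: F={f_stat:.2f}, p={p_value:.3f}")

# Pairwise follow-up (e.g., doctor vs. low-accuracy AI), Welch's t-test.
t_stat, p_pair = stats.ttest_ind(
    ratings["doctor"], ratings["ai_low_accuracy"], equal_var=False
)
print(f"doctor vs. low-accuracy AI: t={t_stat:.2f}, p={p_pair:.3f}")
```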
Related papers
- Explainable AI Enhances Glaucoma Referrals, Yet the Human-AI Team Still Falls Short of the AI Alone [6.740852152639975]
We investigate how various AI explanations help providers distinguish between patients needing immediate or non-urgent specialist referrals.
We built explainable AI algorithms to predict glaucoma surgery needs from routine eyecare data as a proxy for identifying high-risk patients.
We incorporated intrinsic and post-hoc explainability and conducted an online study with optometrists to assess human-AI team performance.
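For a concrete sense of what a post-hoc explanation can look like in this kind of referral setting, here is a minimal sketch using permutation importance on a stand-in surgery-risk classifier. The feature names, data, and model are invented; the paper's own intrinsic and post-hoc algorithms are not reproduced here.

```python
# Illustrative post-hoc explanation (permutation importance) for a stand-in
# glaucoma surgery-risk classifier. All features and data are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
features = ["intraocular_pressure", "cup_disc_ratio", "visual_field_loss", "age"]

X = rng.normal(size=(500, len(features)))   # placeholder routine-eyecare data
y = (X[:, 1] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

# Post-hoc explanation: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in sorted(zip(features, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name:>22}: {score:.3f}")
```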
arXiv Detail & Related papers (2024-05-24T03:01:20Z)
- The impact of responding to patient messages with large language model assistance [4.243020918808522]
Documentation burden is a major contributor to clinician burnout.
Many hospitals are actively integrating large language model assistance into electronic medical record systems.
We are the first to examine the utility of large language models in helping clinicians draft responses to patient questions.
arXiv Detail & Related papers (2023-10-26T18:03:46Z)
- Explainable AI applications in the Medical Domain: a systematic review [1.4419517737536707]
The field of Medical AI faces various challenges in terms of building user trust, complying with regulations, and using data ethically.
This paper presents a literature review on the recent developments of XAI solutions for medical decision support, based on a representative sample of 198 articles published in recent years.
arXiv Detail & Related papers (2023-08-10T08:12:17Z)
- Understanding how the use of AI decision support tools affect critical thinking and over-reliance on technology by drug dispensers in Tanzania [0.0]
Drug shop dispensers used AI-powered technologies when determining a differential diagnosis for a presented clinical case vignette.
We found that dispensers relied on the decision made by the AI 25 percent of the time, even when the AI provided no explanation for its decision.
arXiv Detail & Related papers (2023-02-19T05:59:06Z)
- Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patient's clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
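As a hedged sketch of how context could be assembled around a risk model's inference before handing it to an LLM, consider the following. The fields, wording, and the `build_context_prompt` helper are hypothetical illustrations, not the paper's actual prompts, models, or clinical data.

```python
# Minimal sketch of wrapping a risk model's inference in a context prompt
# for an LLM, in the spirit of the contextual explanations described above.
# All fields and values are invented placeholders.
def build_context_prompt(patient_state: dict, risk_score: float,
                         top_factors: list[str]) -> str:
    factors = ", ".join(top_factors)
    return (
        f"A comorbidity risk model estimates a {risk_score:.0%} risk for this "
        f"patient.\nClinical state: {patient_state}.\n"
        f"Most influential factors: {factors}.\n"
        "Explain, for a clinician, what context should be considered when "
        "interpreting this prediction."
    )

prompt = build_context_prompt(
    {"hba1c": 8.1, "bmi": 31, "years_with_t2d": 6},   # hypothetical values
    risk_score=0.27,
    top_factors=["hba1c", "bmi"],
)
print(prompt)  # this string would then be sent to an LLM of choice
```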
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- What Do End-Users Really Want? Investigation of Human-Centered XAI for Mobile Health Apps [69.53730499849023]
We present a user-centered persona concept to evaluate explainable AI (XAI).
Results show that users' demographics and personality, as well as the type of explanation, impact explanation preferences.
Our insights bring an interactive, human-centered XAI closer to practical application.
arXiv Detail & Related papers (2022-10-07T12:51:27Z)
- Towards the Use of Saliency Maps for Explaining Low-Quality Electrocardiograms to End Users [45.62380752173638]
When using medical images for diagnosis, it is important that the images are of high quality.
In telemedicine, a common problem is that the quality issue is only flagged once the patient has left the clinic, meaning they must return in order to have the exam redone.
This paper reports on the development of an AI system for flagging and explaining low-quality medical images in real-time.
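One common way to explain such a flagging model is a gradient saliency map. The minimal PyTorch sketch below shows the idea on a toy image-quality classifier; the network and input are placeholders, not the paper's ECG system.

```python
# A minimal sketch of vanilla gradient saliency for a stand-in image-quality
# classifier, assuming PyTorch. The network and input are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(   # toy classifier: good vs. low quality
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)
model.eval()

image = torch.randn(1, 1, 64, 64, requires_grad=True)   # placeholder ECG image

logits = model(image)
logits[0, logits.argmax()].backward()   # gradient of the predicted class score

# Saliency: the pixels whose perturbation most changes the prediction.
saliency = image.grad.abs().squeeze()
print(saliency.shape)   # (64, 64) heatmap to overlay on the image
```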
arXiv Detail & Related papers (2022-07-06T14:53:26Z)
- Explainable AI for medical imaging: Explaining pneumothorax diagnoses with Bayesian Teaching [4.707325679181196]
We introduce and evaluate explanations based on Bayesian Teaching.
We find that medical experts exposed to explanations successfully predict the AI's diagnostic decisions.
These results show that Explainable AI can be used to support human-AI collaboration in medical imaging.
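The core selection step of Bayesian Teaching can be sketched in a few lines: pick the example that most moves a simplified learner's posterior toward the AI's diagnosis. The two-hypothesis learner and all numbers below are illustrative assumptions, not the paper's model for pneumothorax images.

```python
# Hedged sketch of the core Bayesian Teaching idea: choose the explanation
# example most likely to make a simplified "learner" infer the AI's decision.
import numpy as np

rng = np.random.default_rng(0)

# Candidate examples, each with the learner's likelihood of the target
# diagnosis vs. the alternative (placeholder values).
examples = rng.uniform(size=(10, 2))   # columns: P(x|target), P(x|other)
prior = np.array([0.5, 0.5])

# Learner's posterior belief in the target diagnosis after each example.
posterior_target = (examples[:, 0] * prior[0]) / (examples @ prior)

# Bayesian Teaching: show the example that drives the learner's posterior
# toward the AI's diagnosis.
best = int(np.argmax(posterior_target))
print(f"show example #{best} (posterior in target = {posterior_target[best]:.2f})")
```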
arXiv Detail & Related papers (2021-06-08T20:49:11Z)
- Semi-Supervised Variational Reasoning for Medical Dialogue Generation [70.838542865384]
Two key characteristics are relevant for medical dialogue generation: patient states and physician actions.
We propose an end-to-end variational reasoning approach to medical dialogue generation.
A physician policy network composed of an action-classifier and two reasoning detectors is proposed for augmented reasoning ability.
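A minimal sketch of such a multi-head policy network, assuming PyTorch and invented dimensions; the head semantics below are guesses at the two reasoning detectors, not the paper's actual architecture.

```python
# Illustrative policy network: shared encoder, one action classifier, and two
# auxiliary detector heads, mirroring the structure described above.
import torch
import torch.nn as nn

class PhysicianPolicyNet(nn.Module):
    def __init__(self, state_dim=128, hidden=256, n_actions=20, n_entities=50):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.action_classifier = nn.Linear(hidden, n_actions)  # next physician action
        self.symptom_detector = nn.Linear(hidden, n_entities)  # reasoning head 1
        self.disease_detector = nn.Linear(hidden, n_entities)  # reasoning head 2

    def forward(self, patient_state):
        h = self.encoder(patient_state)
        return (self.action_classifier(h),
                self.symptom_detector(h),
                self.disease_detector(h))

net = PhysicianPolicyNet()
action_logits, symptoms, diseases = net(torch.randn(4, 128))
print(action_logits.shape, symptoms.shape, diseases.shape)
```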
arXiv Detail & Related papers (2021-05-13T04:14:35Z)
- Artificial Artificial Intelligence: Measuring Influence of AI 'Assessments' on Moral Decision-Making [48.66982301902923]
We examined the effect of feedback from a simulated ("false") AI on moral decision-making about donor kidney allocation.
We found some evidence that judgments about whether a patient should receive a kidney can be influenced by feedback on participants' own decision-making that was perceived to come from an AI.
arXiv Detail & Related papers (2020-01-13T14:15:18Z)