Related papers: How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?

How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?

URL: http://arxiv.org/abs/2405.13219v1
Date: Tue, 21 May 2024 22:00:13 GMT
Title: How Reliable AI Chatbots are for Disease Prediction from Patient Complaints?
Authors: Ayesha Siddika Nipu, K M Sajjadul Islam, Praveen Madiraju,
Abstract summary: This study examines the reliability of AI chatbots, specifically GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0, in predicting diseases from patient complaints in the emergency department. Results suggest that GPT 4.0 achieves high accuracy with increased few-shot data, while Gemini Ultra 1.0 performs well with fewer examples, and Claude 3 Opus maintains consistent performance.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Artificial Intelligence (AI) chatbots leveraging Large Language Models (LLMs) are gaining traction in healthcare for their potential to automate patient interactions and aid clinical decision-making. This study examines the reliability of AI chatbots, specifically GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0, in predicting diseases from patient complaints in the emergency department. The methodology includes few-shot learning techniques to evaluate the chatbots' effectiveness in disease prediction. We also fine-tune the transformer-based model BERT and compare its performance with the AI chatbots. Results suggest that GPT 4.0 achieves high accuracy with increased few-shot data, while Gemini Ultra 1.0 performs well with fewer examples, and Claude 3 Opus maintains consistent performance. BERT's performance, however, is lower than all the chatbots, indicating limitations due to limited labeled data. Despite the chatbots' varying accuracy, none of them are sufficiently reliable for critical medical decision-making, underscoring the need for rigorous validation and human oversight. This study reflects that while AI chatbots have potential in healthcare, they should complement, not replace, human expertise to ensure patient safety. Further refinement and research are needed to improve AI-based healthcare applications' reliability for disease prediction.

Related papers

Explainable AI for Collaborative Assessment of 2D/3D Registration Quality [50.65650507103078]
We propose the first artificial intelligence framework trained specifically for 2D/3D registration quality verification.<n>Our explainable AI (XAI) approach aims to enhance informed decision-making for human operators.
arXiv Detail & Related papers (2025-07-23T15:28:57Z)
Effect of Static vs. Conversational AI-Generated Messages on Colorectal Cancer Screening Intent: a Randomized Controlled Trial [5.429833789548265]
Large language model (LLM) chatbots show increasing promise in persuasive communication.<n>We enrolled 915 U.S. adults (ages 45-75) who had never completed colorectal cancer (CRC) screening.<n>Both AI interventions significantly increased stool test intentions by over 12 points (12.9-13.8/100), compared to a 7.5 gain for expert materials.
arXiv Detail & Related papers (2025-07-10T22:46:43Z)
Development and Evaluation of HopeBot: an LLM-based chatbot for structured and interactive PHQ-9 depression screening [48.355615275247786]
HopeBot administers the Patient Health Questionnaire-9 (PHQ-9) using retrieval-augmented generation and real-time clarification.<n>In a within-subject study, 132 adults in the United Kingdom and China completed both self-administered and chatbots versions.<n>Overall, 87.1% expressed willingness to reuse or recommend HopeBot.
arXiv Detail & Related papers (2025-07-08T13:41:22Z)
Seq2Seq Model-Based Chatbot with LSTM and Attention Mechanism for Enhanced User Interaction [1.937324318931008]
This work proposes a Sequence-to-Sequence (Seq2Seq) model with an encoder-decoder architecture that incorporates attention mechanisms and Long Short-Term Memory (LSTM) cells. The proposed Seq2Seq model-based robot is trained, validated, and tested on a dataset specifically for the tourism sector in Draa-Tafilalet, Morocco.
arXiv Detail & Related papers (2024-12-27T23:50:54Z)
Empathetic Response in Audio-Visual Conversations Using Emotion Preference Optimization and MambaCompressor [44.499778745131046]
Our study introduces a dual approach: firstly, we employ Emotional Preference Optimization (EPO) to train chatbots. This training enables the model to discern fine distinctions between correct and counter-emotional responses. Secondly, we introduce MambaCompressor to effectively compress and manage extensive conversation histories. Our comprehensive experiments across multiple datasets demonstrate that our model significantly outperforms existing models in generating empathetic responses and managing lengthy dialogues.
arXiv Detail & Related papers (2024-12-23T13:44:51Z)
Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models. Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
A General-purpose AI Avatar in Healthcare [1.5081825869395544]
This paper focuses on the role of chatbots in healthcare and explores the use of avatars to make AI interactions more appealing to patients. A framework of a general-purpose AI avatar application is demonstrated by using a three-category prompt dictionary and prompt improvement mechanism. A two-phase approach is suggested to fine-tune a general-purpose AI language model and create different AI avatars to discuss medical issues with users.
arXiv Detail & Related papers (2024-01-10T03:44:15Z)
The impact of responding to patient messages with large language model assistance [4.243020918808522]
Documentation burden is a major contributor to clinician burnout. Many hospitals are actively integrating such systems into electronic medical record systems. We are the first to examine the utility of large language models in assisting clinicians draft responses to patient questions.
arXiv Detail & Related papers (2023-10-26T18:03:46Z)
Assistive Chatbots for healthcare: a succinct review [0.0]
The focus on AI-enabled technology is because of its potential for enhancing the quality of human-machine interaction. There is a lack of trust on this technology regarding patient safety and data protection. Patients have expressed dissatisfaction with Natural Language Processing skills.
arXiv Detail & Related papers (2023-08-08T10:35:25Z)
Towards Healthy AI: Large Language Models Need Therapists Too [41.86344997530743]
We define Healthy AI to be safe, trustworthy and ethical. We present the SafeguardGPT framework that uses psychotherapy to correct for these harmful behaviors.
arXiv Detail & Related papers (2023-04-02T00:39:12Z)
The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed. The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
Can Machines Imitate Humans? Integrative Turing Tests for Vision and Language Demonstrate a Narrowing Gap [45.6806234490428]
We benchmark current AIs in their abilities to imitate humans in three language tasks and three vision tasks. Experiments involved 549 human agents plus 26 AI agents for dataset creation, and 1,126 human judges plus 10 AI judges. Results reveal that current AIs are not far from being able to impersonate humans in complex language and vision challenges.
arXiv Detail & Related papers (2022-11-23T16:16:52Z)
CheerBots: Chatbots toward Empathy and Emotionusing Reinforcement Learning [60.348822346249854]
This study presents a framework whereby several empathetic chatbots are based on understanding users' implied feelings and replying empathetically for multiple dialogue turns. We call these chatbots CheerBots. CheerBots can be retrieval-based or generative-based and were finetuned by deep reinforcement learning. To respond in an empathetic way, we develop a simulating agent, a Conceptual Human Model, as aids for CheerBots in training with considerations on changes in user's emotional states in the future to arouse sympathy.
arXiv Detail & Related papers (2021-10-08T07:44:47Z)
An Evaluation of Generative Pre-Training Model-based Therapy Chatbot for Caregivers [5.2116528363639985]
Generative-based approaches, such as the OpenAI GPT models, could allow for more dynamic conversations in therapy contexts. We built a chatbots using the GPT-2 model and fine-tuned it with 306 therapy session transcripts between family caregivers of individuals with dementia and therapists conducting Problem Solving Therapy. Results showed that the fine-tuned model created more non-word outputs than the pre-trained model.
arXiv Detail & Related papers (2021-07-28T01:01:08Z)
Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions. Our framework included a guiding robot and an interlocutor model that plays the role of humans. We examined our framework using three experimental setups and evaluate the guiding robot with four different metrics to demonstrated flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)
Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance. We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves. Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accuracy AI may not lead to highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.