Advice for Diabetes Self-Management by ChatGPT Models: Challenges and Recommendations
- URL: http://arxiv.org/abs/2501.07931v1
- Date: Tue, 14 Jan 2025 08:32:16 GMT
- Title: Advice for Diabetes Self-Management by ChatGPT Models: Challenges and Recommendations
- Authors: Waqar Hussain, John Grundy
- Abstract summary: We evaluate the responses of ChatGPT versions 3.5 and 4 to diabetes patient queries. Our findings reveal discrepancies in accuracy and embedded biases. We propose a commonsense evaluation layer for prompt evaluation and incorporating disease-specific external memory.
- Score: 4.321186293298159
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given their capacity for advanced reasoning, extensive contextual understanding, and robust question answering, large language models have become prominent in healthcare management research. Despite adeptly handling a broad spectrum of healthcare inquiries, these models face significant challenges in delivering accurate and practical advice for chronic conditions such as diabetes. We evaluate the responses of ChatGPT versions 3.5 and 4 to diabetes patient queries, assessing their depth of medical knowledge and their capacity to deliver personalized, context-specific advice for diabetes self-management. Our findings reveal discrepancies in accuracy and embedded biases, emphasizing the models' limitations in providing tailored advice unless activated by sophisticated prompting techniques. Additionally, we observe that both models often provide advice without seeking necessary clarification, a practice that can result in potentially dangerous advice. This underscores the limited practical effectiveness of these models without human oversight in clinical settings. To address these issues, we propose a commonsense evaluation layer for prompt evaluation and the incorporation of disease-specific external memory using an advanced Retrieval Augmented Generation technique. This approach aims to improve information quality and reduce misinformation risks, contributing to more reliable AI applications in healthcare settings. Our findings seek to influence the future direction of AI in healthcare, enhancing both the scope and quality of its integration.
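To make the proposed approach concrete, below is a minimal illustrative sketch of such a pipeline: a commonsense "clarification gate" that runs before retrieval over a small disease-specific external memory. The memory snippets, the gate rule, and the keyword-overlap retriever are all assumptions made for illustration; the paper's actual RAG setup and commonsense evaluation layer are not specified here.

```python
# Sketch only: a clarification gate followed by retrieval over a toy
# disease-specific memory. All rules, snippets, and names are illustrative
# assumptions, not the paper's implementation.

DIABETES_MEMORY = [
    "Hypoglycemia below 70 mg/dL is treated with 15 g of fast-acting carbohydrate.",
    "Insulin doses should not be adjusted without recent blood glucose readings.",
    "Regular foot checks help detect diabetic neuropathy complications early.",
]


def commonsense_gate(query: str) -> str | None:
    """Return a clarification request instead of advice when key context is missing."""
    q = query.lower()
    # Illustrative rule: dosing questions require a glucose reading.
    if ("insulin" in q or "dose" in q) and "glucose" not in q:
        return ("Before advising on dosing, please share your recent blood "
                "glucose readings and your current insulin regimen.")
    return None


def retrieve(query: str, memory: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank memory snippets by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(memory, key=lambda s: -len(q_words & set(s.lower().split())))
    return ranked[:k]


def answer(query: str) -> str:
    clarification = commonsense_gate(query)
    if clarification:  # refuse to guess; ask for the missing context first
        return clarification
    context = "\n".join(retrieve(query, DIABETES_MEMORY))
    # In a real system, the retrieved context would be prepended to the LLM prompt.
    return f"[Prompt to LLM]\nContext:\n{context}\nQuestion: {query}"


if __name__ == "__main__":
    print(answer("Should I take more insulin today?"))
    print(answer("My glucose is 60 mg/dL and I feel shaky. What should I do?"))
```

The design mirrors the observation above: rather than answering a dosing question that lacks essential context, the gate asks for clarification first, and only then is retrieved disease-specific memory added to the prompt.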
Related papers
- Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions.
We propose a novel approach utilizing structured medical reasoning.
Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z)
- Integrating Generative Artificial Intelligence in ADRD: A Framework for Streamlining Diagnosis and Care in Neurodegenerative Diseases [0.0]
We propose that large language models (LLMs) offer more immediately practical applications by enhancing clinicians' capabilities.
We present a framework for responsible AI integration that leverages LLMs' ability to communicate effectively with both patients and providers.
This approach prioritizes standardized, high-quality data collection to enable a system that learns from every patient encounter.
arXiv Detail & Related papers (2025-02-06T19:09:11Z)
- Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs.
Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder.
Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z)
- Artificial Intelligence-Driven Clinical Decision Support Systems [5.010570270212569]
The chapter emphasizes that creating trustworthy AI systems in healthcare requires careful consideration of fairness, explainability, and privacy.
The challenge of ensuring equitable healthcare delivery through AI is stressed, discussing methods to identify and mitigate bias in clinical predictive models.
The discussion then advances to an analysis of privacy vulnerabilities in medical AI systems, from data leakage in deep learning models to sophisticated attacks against model explanations.
arXiv Detail & Related papers (2025-01-16T16:17:39Z)
- Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models.
Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
- Detecting Bias and Enhancing Diagnostic Accuracy in Large Language Models for Healthcare [0.2302001830524133]
Biased AI-generated medical advice and misdiagnoses can jeopardize patient safety.
This study introduces new resources designed to promote ethical and precise AI in healthcare.
arXiv Detail & Related papers (2024-10-09T06:00:05Z)
- Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models [73.79091519226026]
Uncertainty of Thoughts (UoT) is an algorithm to augment large language models with the ability to actively seek information by asking effective questions.
In experiments on medical diagnosis, troubleshooting, and the 20 Questions game, UoT achieves an average performance improvement of 38.1% in the rate of successful task completion.
arXiv Detail & Related papers (2024-02-05T18:28:44Z)
- Self-Diagnosis and Large Language Models: A New Front for Medical Misinformation [8.738092015092207]
We evaluate the capabilities of large language models (LLMs) through the lens of a general user self-diagnosing.
We develop a testing methodology which can be used to evaluate responses to open-ended questions mimicking real-world use cases.
We reveal that a) these models perform worse than previously known, and b) they exhibit peculiar behaviours, including overconfidence when stating incorrect recommendations.
arXiv Detail & Related papers (2023-07-10T21:28:26Z)
- Deep Attention Q-Network for Personalized Treatment Recommendation [1.6631602844999724]
We propose the Deep Attention Q-Network for personalized treatment recommendations.
The Transformer architecture within a deep reinforcement learning framework efficiently incorporates all past patient observations.
We evaluated the model on real-world sepsis and acute hypotension cohorts, demonstrating its superiority to state-of-the-art models.
arXiv Detail & Related papers (2023-07-04T07:00:19Z)
- RECAP-KG: Mining Knowledge Graphs from Raw GP Notes for Remote COVID-19 Assessment in Primary Care [45.43645878061283]
We present a framework that performs knowledge graph construction from raw GP medical notes written during or after patient consultations.
Our knowledge graphs include information about existing patient symptoms, their duration, and their severity.
We apply our framework to consultation notes of COVID-19 patients in the UK.
arXiv Detail & Related papers (2023-06-17T23:35:51Z)
- Privacy-preserving machine learning for healthcare: open challenges and future perspectives [72.43506759789861]
We conduct a review of recent literature concerning Privacy-Preserving Machine Learning (PPML) for healthcare.
We primarily focus on privacy-preserving training and inference-as-a-service.
The aim of this review is to guide the development of private and efficient ML models in healthcare.
arXiv Detail & Related papers (2023-03-27T19:20:51Z)
- SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
- Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients' clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
- Leveraging Clinical Context for User-Centered Explainability: A Diabetes Use Case [4.520155732176645]
We implement a proof-of-concept (POC) in a type-2 diabetes (T2DM) use case where we assess the risk of chronic kidney disease (CKD).
Within the POC, we include risk prediction models for CKD, post-hoc explainers of the predictions, and other natural-language modules.
Our POC approach covers multiple knowledge sources and clinical scenarios, blends knowledge to explain data and predictions to PCPs, and received an enthusiastic response from our medical expert.
arXiv Detail & Related papers (2021-07-06T02:44:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.