Generative AI Is Not Ready for Clinical Use in Patient Education for Lower Back Pain Patients, Even With Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2409.15260v1
- Date: Mon, 23 Sep 2024 17:56:08 GMT
- Title: Generative AI Is Not Ready for Clinical Use in Patient Education for Lower Back Pain Patients, Even With Retrieval-Augmented Generation
- Authors: Yi-Fei Zhao, Allyn Bove, David Thompson, James Hill, Yi Xu, Yufan Ren, Andrea Hassman, Leming Zhou, Yanshan Wang
- Abstract summary: Low back pain (LBP) is a leading cause of disability globally.
Despite advancements in patient education strategies, significant gaps persist in delivering personalized, evidence-based information to patients with LBP.
Recent advancements in large language models (LLMs) and generative artificial intelligence (GenAI) have demonstrated the potential to enhance patient education.
- Score: 11.063824496698949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low back pain (LBP) is a leading cause of disability globally. Following the onset of LBP and subsequent treatment, adequate patient education is crucial for improving functionality and long-term outcomes. Despite advancements in patient education strategies, significant gaps persist in delivering personalized, evidence-based information to patients with LBP. Recent advancements in large language models (LLMs) and generative artificial intelligence (GenAI) have demonstrated the potential to enhance patient education. However, their application and efficacy in delivering educational content to patients with LBP remain underexplored and warrant further investigation. In this study, we introduce a novel approach utilizing LLMs with Retrieval-Augmented Generation (RAG) and few-shot learning to generate tailored educational materials for patients with LBP. Physical therapists manually evaluated our model responses for redundancy, accuracy, and completeness using a Likert scale. In addition, the readability of the generated education materials is assessed using the Flesch Reading Ease score. The findings demonstrate that RAG-based LLMs outperform traditional LLMs, providing more accurate, complete, and readable patient education materials with less redundancy. Having said that, our analysis reveals that the generated materials are not yet ready for use in clinical practice. This study underscores the potential of AI-driven models utilizing RAG to improve patient education for LBP; however, significant challenges remain in ensuring the clinical relevance and granularity of content generated by these models.
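The abstract names the Flesch Reading Ease score as its readability measure. For reference, the standard formula is 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word), with higher scores indicating easier text. The sketch below is illustrative only and is not the authors' implementation: the function names `count_syllables` and `flesch_reading_ease` are hypothetical, and the vowel-group syllable counter is a rough assumption that will miscount some English words.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "ease"), but keep "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier-to-read text."""
    # Naive tokenization: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "Stay active. Walk each day."
    dense = "Lumbar radiculopathy necessitates individualized rehabilitation."
    print(flesch_reading_ease(plain))
    print(flesch_reading_ease(dense))
```

Short sentences built from short words score high, while long clinical terms push the score down, which is why the measure is a common first check on patient-facing text. A production evaluation would more likely rely on an established readability library than on this hand-rolled syllable heuristic.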
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models [14.709233593021281]
The integration of external knowledge from Large Language Models (LLMs) presents a promising avenue for improving healthcare predictions.
We propose IntelliCare, a novel framework that leverages LLMs to provide high-quality patient-level external knowledge.
IntelliCare identifies patient cohorts and employs task-relevant statistical information to augment LLM understanding and generation.
arXiv Detail & Related papers (2024-08-23T13:56:00Z)
- LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction [38.11497959553319]
We investigate the feasibility of applying Large Language Models to convert structured patient visit data into natural language narratives.
We evaluate the zero-shot and few-shot performance of LLMs using various EHR-prediction-oriented prompting strategies.
Our results demonstrate that with the proposed approach, LLMs can achieve decent few-shot performance compared to traditional supervised learning methods in EHR-based disease predictions.
arXiv Detail & Related papers (2024-03-19T18:10:13Z)
- Enhancing Readmission Prediction with Deep Learning: Extracting Biomedical Concepts from Clinical Texts [0.26813152817733554]
This study focuses on predicting patient readmission within less than 30 days using text mining techniques.
Various machine learning and deep learning methods were employed to develop a classification model for this purpose.
arXiv Detail & Related papers (2024-03-12T09:03:44Z)
- Natural Language Programming in Medicine: Administering Evidence Based Clinical Workflows with Autonomous Agents Powered by Generative Large Language Models [29.05425041393475]
Generative Large Language Models (LLMs) hold significant promise in healthcare.
This study assessed the potential of LLMs to function as autonomous agents in a simulated tertiary care medical center.
arXiv Detail & Related papers (2024-01-05T15:09:57Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
- A Literature Review on Length of Stay Prediction for Stroke Patients using Machine Learning and Statistical Approaches [0.0]
Hospital length of stay (LOS) is one of the most essential healthcare metrics that reflects the hospital quality of service and helps improve hospital scheduling and management.
In this study, we reviewed papers on LOS prediction using machine learning and statistical approaches.
arXiv Detail & Related papers (2021-12-30T03:48:41Z)
- Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution [67.72545656557858]
We propose a new method that uses medical text of Electronic Health Records for prediction.
We represent discharge summaries of patients with multiview graphs enhanced by an external knowledge graph.
Experimental results prove the effectiveness of our method, yielding state-of-the-art performance.
arXiv Detail & Related papers (2021-12-19T01:45:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.