Impact of Large Language Model Assistance on Patients Reading Clinical
Notes: A Mixed-Methods Study
- URL: http://arxiv.org/abs/2401.09637v1
- Date: Wed, 17 Jan 2024 23:14:52 GMT
- Title: Impact of Large Language Model Assistance on Patients Reading Clinical
Notes: A Mixed-Methods Study
- Authors: Niklas Mannhardt, Elizabeth Bondi-Kelly, Barbara Lam, Chloe O'Connell,
Mercy Asiedu, Hussein Mozannar, Monica Agrawal, Alejandro Buendia, Tatiana
Urman, Irbaz B. Riaz, Catherine E. Ricciardi, Marzyeh Ghassemi, David Sontag
- Abstract summary: Complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety.
We developed a patient-facing tool to simplify, extract information from, and add context to notes.
Clinicians evaluated the augmentations for errors and found that misleading errors do occur.
- Score: 47.61555826813361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Patients derive numerous benefits from reading their clinical notes,
including an increased sense of control over their health and improved
understanding of their care plan. However, complex medical concepts and jargon
within clinical notes hinder patient comprehension and may lead to anxiety. We
developed a patient-facing tool to make clinical notes more readable,
leveraging large language models (LLMs) to simplify, extract information from,
and add context to notes. We prompt-engineered GPT-4 to perform these
augmentation tasks on real clinical notes donated by breast cancer survivors
and on synthetic notes generated by a clinician, 12 notes totaling 3,868
words. In June 2023, 200 female-identifying US-based participants were randomly
assigned three clinical notes with varying levels of augmentations using our
tool. Participants answered questions about each note, evaluating their
understanding of follow-up actions and self-reported confidence. We found that
augmentations were associated with a significant increase in action-understanding
score (0.63 $\pm$ 0.04 for select augmentations vs. 0.54 $\pm$ 0.02 for the
control; p = 0.002). In-depth interviews with self-identified breast cancer
patients (N = 7) were also conducted via video
conferencing. Augmentations, especially definitions, elicited positive
responses among the seven participants, with some concerns about relying on
LLMs. Clinicians evaluated the augmentations for errors; misleading errors do
occur and were more common in the real donated notes than in the synthetic
notes, illustrating the importance of carefully written clinical notes.
Augmentations improve some but not all readability metrics. This work
demonstrates the potential of LLMs to improve patients' experience with
clinical notes at a lower burden to clinicians. However, having a human in the
loop is important to correct potential model errors.
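The paper itself does not include code here; the following is a minimal, hypothetical sketch of the kind of prompt-engineered augmentation step the abstract describes, using the OpenAI Python SDK. The prompt wording, model settings, and output format are assumptions for illustration, not the authors' actual configuration.

```python
# Hypothetical sketch of the note-augmentation step described above.
# The prompt text, temperature, and output format are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

AUGMENT_PROMPT = (
    "You will be given an excerpt from a clinical note. First, rewrite it in "
    "plain language a patient can understand. Then list each piece of medical "
    "jargon with a one-sentence definition, and note any follow-up actions."
)

def augment_note(note_text: str) -> str:
    """Apply the three augmentation types named in the abstract:
    simplification, information extraction, and added context."""
    response = client.chat.completions.create(
        model="gpt-4",      # the study prompt-engineered GPT-4
        temperature=0.2,    # assumed: low temperature to limit drift
        messages=[
            {"role": "system", "content": AUGMENT_PROMPT},
            {"role": "user", "content": note_text},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical note fragment; real notes in the study were donated
    # by breast cancer survivors or written by a clinician.
    print(augment_note("Pt s/p lumpectomy with SLNB; plan adjuvant XRT."))
```

As the abstract stresses, output like this still needs clinician review before reaching patients, since misleading errors were observed even with careful prompting.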
Related papers
- The use of large language models to enhance cancer clinical trial educational materials [2.680807601066252]
GPT-4-generated trial summaries were both readable and comprehensive.
Multiple-choice questions demonstrated high accuracy and agreement with crowdsourced annotators.
For both resource types, hallucinations were identified that require ongoing human oversight.
arXiv Detail & Related papers (2024-12-02T20:31:27Z)
- Improving Clinical Note Generation from Complex Doctor-Patient Conversation [20.2157016701399]
We present three key contributions to the field of clinical note generation using large language models (LLMs).
First, we introduce CliniKnote, a dataset consisting of 1,200 complex doctor-patient conversations paired with their full clinical notes.
Second, we propose K-SOAP, which enhances traditional SOAP (Subjective, Objective, Assessment, and Plan) notes by adding a keyword section at the top, allowing for quick identification of essential information.
Third, we develop an automatic pipeline to generate K-SOAP notes from doctor-patient conversations and benchmark various modern LLMs using various metrics.
arXiv Detail & Related papers (2024-08-26T18:39:31Z)
- Dynamic Q&A of Clinical Documents with Large Language Models [3.021316686584699]
This work introduces a natural language interface using large language models (LLMs) for dynamic question-answering on clinical notes.
Experiments with various embedding models and advanced LLMs show Wizard Vicuna's superior accuracy, albeit with high compute demands (a generic sketch of this kind of retrieval-based interface appears after this list).
arXiv Detail & Related papers (2024-01-19T14:50:22Z)
- Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization [8.456700096020601]
Large language models (LLMs) have shown promise in natural language processing (NLP), but their effectiveness on a diverse range of clinical summarization tasks remains unproven.
In this study, we apply adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks.
A clinical reader study with ten physicians evaluates summary completeness, correctness, and conciseness; in a majority of cases, summaries from our best adapted LLMs are either equivalent (45%) or superior (36%) to summaries from medical experts.
arXiv Detail & Related papers (2023-09-14T05:15:01Z)
- Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models [6.252236971703546]
An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue.
This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks.
arXiv Detail & Related papers (2023-05-10T08:48:53Z)
- SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
- Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine [68.7814360102644]
We propose the Re$3$Writer method with retrieval-augmented generation and knowledge-grounded reasoning.
We demonstrate the effectiveness of our method in generating patient discharge instructions.
arXiv Detail & Related papers (2022-10-23T16:34:39Z)
- Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes.
We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors.
We find that a simple, character-based Levenshtein distance metric performs on par with, if not better than, common model-based metrics like BERTScore.
arXiv Detail & Related papers (2022-04-01T14:04:16Z)
- VBridge: Connecting the Dots Between Features, Explanations, and Data for Healthcare Models [85.4333256782337]
VBridge is a visual analytics tool that seamlessly incorporates machine learning explanations into clinicians' decision-making workflow.
We identified three key challenges: clinicians' unfamiliarity with ML features, a lack of contextual information, and the need for cohort-level evidence.
We demonstrated the effectiveness of VBridge through two case studies and expert interviews with four clinicians.
arXiv Detail & Related papers (2021-08-04T17:34:13Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)
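As a companion to the "Dynamic Q&A of Clinical Documents" entry above, here is a generic, hypothetical sketch of a retrieval-based Q&A interface over clinical notes. The embedding model, chat model, and prompt are assumptions; the cited paper benchmarks other LLMs (e.g., Wizard Vicuna) rather than this exact stack.

```python
# Hypothetical retrieval-based Q&A over clinical note chunks.
# Models and prompts are assumptions, not those evaluated in the paper.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str]) -> np.ndarray:
    """Embed text chunks with an assumed OpenAI embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def answer(question: str, note_chunks: list[str]) -> str:
    """Retrieve the most relevant chunk by cosine similarity, then answer from it."""
    chunk_vecs = embed(note_chunks)
    q_vec = embed([question])[0]
    # Cosine similarity between the question and each note chunk.
    sims = chunk_vecs @ q_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    context = note_chunks[int(np.argmax(sims))]
    resp = client.chat.completions.create(
        model="gpt-4",  # assumed stand-in for the LLMs compared in the paper
        messages=[
            {
                "role": "system",
                "content": "Answer the patient's question using only this "
                           "note excerpt:\n" + context,
            },
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content
```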