Related papers: Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study

Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study

URL: http://arxiv.org/abs/2401.09637v2
Date: Mon, 14 Oct 2024 23:49:48 GMT
Title: Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study
Authors: Niklas Mannhardt, Elizabeth Bondi-Kelly, Barbara Lam, Hussein Mozannar, Chloe O'Connell, Mercy Asiedu, Alejandro Buendia, Tatiana Urman, Irbaz B. Riaz, Catherine E. Ricciardi, Monica Agrawal, Marzyeh Ghassemi, David Sontag,
Abstract summary: We developed a patient-facing tool using large language models (LLMs) to make clinical notes more readable. We piloted the tool with clinical notes donated by patients with a history of breast cancer and synthetic notes from a clinician.
Score: 46.5728291706842
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have immense potential to make information more accessible, particularly in medicine, where complex medical jargon can hinder patient comprehension of clinical notes. We developed a patient-facing tool using LLMs to make clinical notes more readable by simplifying, extracting information from, and adding context to the notes. We piloted the tool with clinical notes donated by patients with a history of breast cancer and synthetic notes from a clinician. Participants (N=200, healthy, female-identifying patients) were randomly assigned three clinical notes in our tool with varying levels of augmentations and answered quantitative and qualitative questions evaluating their understanding of follow-up actions. Augmentations significantly increased their quantitative understanding scores. In-depth interviews were conducted with participants (N=7, patients with a history of breast cancer), revealing both positive sentiments about the augmentations and concerns about AI. We also performed a qualitative clinician-driven analysis of the model's error modes.

Related papers

The use of large language models to enhance cancer clinical trial educational materials [2.680807601066252]
GPT4-generated trial summaries were both readable and comprehensive. Multiple-choice questions demonstrated high accuracy and agreement with crowdsourced annotators. For both resource types, hallucinations were identified that require ongoing human oversight.
arXiv Detail & Related papers (2024-12-02T20:31:27Z)
Improving Clinical Note Generation from Complex Doctor-Patient Conversation [20.2157016701399]
We present three key contributions to the field of clinical note generation using large language models (LLMs) First, we introduce CliniKnote, a dataset consisting of 1,200 complex doctor-patient conversations paired with their full clinical notes. Second, we propose K-SOAP, which enhances traditional SOAPcitepodder20soap (Subjective, Objective, Assessment, and Plan) notes by adding a keyword section at the top, allowing for quick identification of essential information. Third, we develop an automatic pipeline to generate K-SOAP notes from doctor-patient conversations and benchmark various modern LLMs using various
arXiv Detail & Related papers (2024-08-26T18:39:31Z)
Dynamic Q&A of Clinical Documents with Large Language Models [3.021316686584699]
This work introduces a natural language interface using large language models (LLMs) for dynamic question-answering on clinical notes. Experiments, utilizing various embedding models and advanced LLMs, show Wizard Vicuna's superior accuracy, albeit with high compute demands.
arXiv Detail & Related papers (2024-01-19T14:50:22Z)
Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization [8.456700096020601]
Large language models (LLMs) have shown promise in natural language processing (NLP), but their effectiveness on a diverse range of clinical summarization tasks remains unproven. In this study, we apply adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks. A clinical reader study with ten physicians evaluates summary, completeness, correctness, and conciseness; in a majority of cases, summaries from our best adapted LLMs are either equivalent (45%) or superior (36%) compared to summaries from medical experts.
arXiv Detail & Related papers (2023-09-14T05:15:01Z)
Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models [6.252236971703546]
An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks.
arXiv Detail & Related papers (2023-05-10T08:48:53Z)
SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization. Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine [68.7814360102644]
We propose the Re$3$Writer method with retrieval-augmented generation and knowledge-grounded reasoning. We demonstrate the effectiveness of our method in generating patient discharge instructions.
arXiv Detail & Related papers (2022-10-23T16:34:39Z)
Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes. We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors. We find that a simple, character-based Levenshtein distance metric performs on par if not better than common model-based metrics like BertScore.
arXiv Detail & Related papers (2022-04-01T14:04:16Z)
VBridge: Connecting the Dots Between Features, Explanations, and Data for Healthcare Models [85.4333256782337]
VBridge is a visual analytics tool that seamlessly incorporates machine learning explanations into clinicians' decision-making workflow. We identified three key challenges, including clinicians' unfamiliarity with ML features, lack of contextual information, and the need for cohort-level evidence. We demonstrated the effectiveness of VBridge through two case studies and expert interviews with four clinicians.
arXiv Detail & Related papers (2021-08-04T17:34:13Z)
Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches. We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.