Deciphering Diagnoses: How Large Language Models Explanations Influence
Clinical Decision Making
- URL: http://arxiv.org/abs/2310.01708v1
- Date: Tue, 3 Oct 2023 00:08:23 GMT
- Title: Deciphering Diagnoses: How Large Language Models Explanations Influence
Clinical Decision Making
- Authors: D. Umerenkov, G. Zubkova, A. Nesterov
- Abstract summary: Large Language Models (LLMs) are emerging as a promising tool to generate plain-text explanations for medical decisions.
This study explores the effectiveness and reliability of LLMs in generating explanations for diagnoses based on patient complaints.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clinical Decision Support Systems (CDSS) utilize evidence-based knowledge and
patient data to offer real-time recommendations, with Large Language Models
(LLMs) emerging as a promising tool to generate plain-text explanations for
medical decisions. This study explores the effectiveness and reliability of
LLMs in generating explanations for diagnoses based on patient complaints.
Three experienced doctors evaluated LLM-generated explanations of the
connection between patient complaints and doctor- and model-assigned diagnoses
across several stages. Experimental results demonstrated that LLM explanations
significantly increased doctors' agreement rates with given diagnoses and
highlighted potential errors in LLM outputs, ranging from 5% to 30%. The study
underscores the potential and challenges of LLMs in healthcare and emphasizes
the need for careful integration and evaluation to ensure patient safety and
optimal clinical utility.
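The central measurement in the study is the doctors' agreement rate with a given diagnosis, compared with and without an LLM explanation. A minimal illustrative sketch of how such a rate and its change might be computed (all data below is hypothetical, not from the paper):

```python
# Hypothetical sketch: agreement rate between doctor verdicts and
# assigned diagnoses, before vs. after showing LLM explanations.

def agreement_rate(verdicts):
    """Fraction of cases where the doctor agreed with the diagnosis
    (1 = agree, 0 = disagree)."""
    return sum(verdicts) / len(verdicts)

# Hypothetical verdicts for the same 10 cases
before_explanation = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
after_explanation = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]

delta = agreement_rate(after_explanation) - agreement_rate(before_explanation)
print(f"before: {agreement_rate(before_explanation):.0%}")  # 60%
print(f"after:  {agreement_rate(after_explanation):.0%}")   # 80%
print(f"change: {delta:+.0%}")                              # +20%
```

In the actual study, such rates were collected across several evaluation stages from three doctors; the sketch only shows the arithmetic behind a single rater's rate.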
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language model (LLM) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models [14.709233593021281]
The integration of external knowledge from Large Language Models (LLMs) presents a promising avenue for improving healthcare predictions.
We propose IntelliCare, a novel framework that leverages LLMs to provide high-quality patient-level external knowledge.
IntelliCare identifies patient cohorts and employs task-relevant statistical information to augment LLM understanding and generation.
arXiv Detail & Related papers (2024-08-23T13:56:00Z)
- RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules.
We develop a medical dialogue dataset comprising rule-based communications between patients and physicians.
Experimental results demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-08-22T17:44:40Z)
- CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making [16.310913127940857]
We introduce CliBench, a novel benchmark developed from the MIMIC-IV dataset.
This benchmark offers a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis.
We conduct a zero-shot evaluation of leading LLMs to assess their proficiency in clinical decision-making.
arXiv Detail & Related papers (2024-06-14T11:10:17Z)
- Evaluating large language models in medical applications: a survey [1.5923327069574245]
Large language models (LLMs) have emerged as powerful tools with transformative potential across numerous domains.
However, evaluating the performance of LLMs in medical contexts presents unique challenges due to the complex and critical nature of medical information.
arXiv Detail & Related papers (2024-05-13T05:08:33Z)
- AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator [69.51568871044454]
We introduce AI Hospital, a framework simulating dynamic medical interactions between a Doctor, as the player, and NPCs.
This setup allows for realistic assessments of LLMs in clinical scenarios.
We develop the Multi-View Medical Evaluation benchmark, utilizing high-quality Chinese medical records and NPCs.
arXiv Detail & Related papers (2024-02-15T06:46:48Z)
- Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization [8.456700096020601]
Large language models (LLMs) have shown promise in natural language processing (NLP), but their effectiveness on a diverse range of clinical summarization tasks remains unproven.
In this study, we apply adaptation methods to eight LLMs, spanning four distinct clinical summarization tasks.
A clinical reader study with ten physicians evaluates summary completeness, correctness, and conciseness; in a majority of cases, summaries from our best adapted LLMs are either equivalent (45%) or superior (36%) to summaries from medical experts.
arXiv Detail & Related papers (2023-09-14T05:15:01Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
- VBridge: Connecting the Dots Between Features, Explanations, and Data for Healthcare Models [85.4333256782337]
VBridge is a visual analytics tool that seamlessly incorporates machine learning explanations into clinicians' decision-making workflow.
We identified three key challenges, including clinicians' unfamiliarity with ML features, lack of contextual information, and the need for cohort-level evidence.
We demonstrated the effectiveness of VBridge through two case studies and expert interviews with four clinicians.
arXiv Detail & Related papers (2021-08-04T17:34:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.