MedDM: LLM-executable clinical guidance tree for clinical decision-making
- URL: http://arxiv.org/abs/2312.02441v1
- Date: Tue, 5 Dec 2023 02:44:07 GMT
- Title: MedDM: LLM-executable clinical guidance tree for clinical decision-making
- Authors: Binbin Li and Tianxin Meng and Xiaoming Shi and Jie Zhai and Tong Ruan
- Abstract summary: There is no suitable clinical guidance tree dataset that can be used directly with LLMs.
We first propose the LLM-executable clinical guidance tree (CGT), which can be directly used by large language models.
We construct a medical diagnostic decision-making dataset (MedDM) from flowcharts in clinical practice guidelines.
- Score: 9.27804927412851
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The importance of LLM participation in clinical diagnostic decision-making is receiving increasing emphasis. However, current medical LLMs suffer from low specialization: they cannot provide specific medical advice and behave more like a medical Q&A system. Moreover, there is no suitable clinical guidance tree dataset that can be used directly with LLMs. To address this issue, we first propose the LLM-executable clinical guidance tree (CGT), which can be directly used by large language models, and construct a medical diagnostic decision-making dataset (MedDM) from flowcharts in clinical practice guidelines. We propose an approach to screen flowcharts from medical literature, followed by their identification and conversion into standardized diagnostic decision trees. We constructed a knowledge base of 1,202 decision trees, drawn from 5,000 pieces of medical literature and covering 12 hospital departments, including internal medicine, surgery, and psychiatry, and over 500 diseases. Moreover, we propose a method for reasoning over LLM-executable CGTs and a Patient-LLM multi-turn dialogue framework.
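The abstract does not publish a concrete schema, but the core idea of an LLM-executable guidance tree driven by a Patient-LLM multi-turn loop can be sketched as below. All names (`CGTNode`, `ask_patient`, `traverse`) and the literal answer-matching are hypothetical illustrations, not the authors' implementation; in MedDM an LLM would interpret the patient's free-text reply at each node.

```python
from dataclasses import dataclass, field

@dataclass
class CGTNode:
    """One node of an LLM-executable clinical guidance tree (hypothetical schema)."""
    question: str                                                  # question posed to the patient
    children: dict[str, "CGTNode"] = field(default_factory=dict)   # answer label -> next node
    conclusion: str | None = None                                  # set only at leaves

def ask_patient(question: str) -> str:
    """Stand-in for one Patient-LLM dialogue turn; here it just reads stdin."""
    return input(f"{question} ").strip().lower()

def traverse(root: CGTNode) -> str:
    """Walk the tree, one question per dialogue turn, until a leaf is reached."""
    node = root
    while node.conclusion is None:
        answer = ask_patient(node.question)
        # In MedDM's framework an LLM would map the free-text reply onto a
        # branch label; this sketch matches the label literally.
        node = node.children.get(answer, next(iter(node.children.values())))
    return node.conclusion

# Toy two-branch tree for illustration only; not clinical advice.
tree = CGTNode(
    question="Do you have a fever? (yes/no)",
    children={
        "yes": CGTNode(question="", conclusion="Consider an infectious workup."),
        "no": CGTNode(question="", conclusion="Evaluate other symptoms."),
    },
)

if __name__ == "__main__":
    print(traverse(tree))
```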
Related papers
- MedGUIDE: Benchmarking Clinical Decision-Making in Large Language Models [10.46932473088646]
We introduce MedGUIDE, a new benchmark for evaluating Large Language Models (LLMs) on their ability to make guideline-consistent clinical decisions.
MedGUIDE is constructed from 55 curated NCCN decision trees across 17 cancer types.
We apply a two-stage quality selection process, combining expert-labeled reward models and LLM-as-a-judge ensembles across ten clinical and linguistic criteria, to select 7,747 high-quality samples.
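As a rough illustration of the two-stage selection described above, a filtering pipeline might look like the following sketch; the reward model, judges, thresholds, and criterion names are all stand-ins, not MedGUIDE's actual components.

```python
from statistics import mean

CRITERIA = [f"criterion_{i}" for i in range(1, 11)]   # ten clinical/linguistic criteria (placeholders)

def reward_model_score(sample: dict) -> float:
    """Stage 1: score from an expert-labeled reward model (stubbed here)."""
    return 0.9

def judge_score(judge: str, sample: dict, criterion: str) -> float:
    """Stage 2: one LLM judge rates one criterion in [0, 1] (stubbed here)."""
    return 0.8

def select_samples(samples: list[dict], judges: list[str],
                   rm_threshold: float = 0.7, judge_threshold: float = 0.75) -> list[dict]:
    kept = []
    for s in samples:
        if reward_model_score(s) < rm_threshold:       # stage-1 filter
            continue
        ensemble = mean(judge_score(j, s, c) for j in judges for c in CRITERIA)
        if ensemble >= judge_threshold:                # stage-2 ensemble filter
            kept.append(s)
    return kept
```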
arXiv Detail & Related papers (2025-05-16T18:21:52Z) - Demystifying Large Language Models for Medicine: A Primer [50.83806796466396]
Large language models (LLMs) represent a transformative class of AI tools capable of revolutionizing various aspects of healthcare.
This tutorial aims to equip healthcare professionals with the tools necessary to effectively integrate LLMs into clinical practice.
arXiv Detail & Related papers (2024-10-24T15:41:56Z) - CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios [50.032101237019205]
CliMedBench is a comprehensive benchmark with 14 expert-guided core clinical scenarios.
The reliability of this benchmark has been confirmed in several ways.
arXiv Detail & Related papers (2024-10-04T15:15:36Z) - RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment [54.91736546490813]
We introduce the RuleAlign framework, designed to align Large Language Models with specific diagnostic rules.
We develop a medical dialogue dataset comprising rule-based communications between patients and physicians.
Experimental results demonstrate the effectiveness of the proposed approach.
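A hypothetical shape for one record in such a rule-aligned dialogue dataset (field names and content are illustrative, not RuleAlign's released format):

```python
# One training record pairing a dialogue exchange with the diagnostic rule it follows.
record = {
    "dialogue": [
        {"role": "patient", "text": "I've had chest pain for two hours."},
        {"role": "physician", "text": "Does the pain radiate to your left arm?"},
    ],
    "rule": "For acute chest pain, first rule out acute coronary syndrome.",
}
```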
arXiv Detail & Related papers (2024-08-22T17:44:40Z) - Stochastic Parrots or ICU Experts? Large Language Models in Critical Care Medicine: A Scoping Review [3.993456293626592]
Large language models (LLMs) have shown strong capabilities in natural language understanding, reasoning, and generation.
Critical care medicine provides diagnosis and treatment for critically ill patients who often require intensive monitoring and interventions in intensive care units (ICUs).
arXiv Detail & Related papers (2024-07-27T13:41:43Z) - CliBench: A Multifaceted and Multigranular Evaluation of Large Language Models for Clinical Decision Making [16.310913127940857]
We introduce CliBench, a novel benchmark developed from the MIMIC IV dataset.
This benchmark offers a comprehensive and realistic assessment of LLMs' capabilities in clinical diagnosis.
We conduct a zero-shot evaluation of leading LLMs to assess their proficiency in clinical decision-making.
arXiv Detail & Related papers (2024-06-14T11:10:17Z) - Large Language Models in the Clinic: A Comprehensive Benchmark [63.21278434331952]
We build ClinicBench, a benchmark for better understanding large language models (LLMs) in the clinic.
We first collect eleven existing datasets covering diverse clinical language generation, understanding, and reasoning tasks.
We then construct six novel datasets and clinical tasks that are complex but common in real-world practice.
We conduct an extensive evaluation of twenty-two LLMs under both zero-shot and few-shot settings.
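For context, the zero-shot vs. few-shot distinction amounts to whether demonstrations precede the test question; a minimal prompt-builder sketch (not ClinicBench's actual harness):

```python
from collections.abc import Sequence

def build_prompt(task: str, question: str,
                 exemplars: Sequence[tuple[str, str]] = ()) -> str:
    """Zero-shot when exemplars is empty; few-shot otherwise."""
    parts = [f"Task: {task}"]
    parts += [f"Q: {q}\nA: {a}" for q, a in exemplars]   # few-shot demonstrations
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# Zero-shot usage:
print(build_prompt("clinical diagnosis", "A 45-year-old presents with chest pain..."))
```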
arXiv Detail & Related papers (2024-04-25T15:51:06Z) - MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding [48.348511646407026]
We introduce the Medical dialogue with Knowledge enhancement and clinical Pathway encoding framework.
The framework integrates an external knowledge enhancement module through a medical knowledge graph and an internal clinical pathway encoding via medical entities and physician actions.
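A loose sketch of how one dialogue turn might be enriched, with a toy dictionary standing in for the medical knowledge graph and (entity, action) pairs for the clinical pathway; this is an assumption about the design, not MedKP's code:

```python
def enhance_turn(utterance: str,
                 kg: dict[str, list[str]],
                 pathway: list[tuple[str, str]]) -> str:
    """Prepend KG facts and the running clinical pathway to a dialogue turn."""
    entities = [e for e in kg if e in utterance]                     # naive entity linking
    facts = [fact for e in entities for fact in kg[e]]               # external knowledge
    history = "; ".join(f"{ent} -> {act}" for ent, act in pathway)   # internal pathway
    return (f"Knowledge: {' '.join(facts) or 'none'}\n"
            f"Pathway so far: {history or 'start'}\n"
            f"Patient: {utterance}")
```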
arXiv Detail & Related papers (2024-03-11T10:57:45Z) - Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds [32.99251005719732]
Clinical reasoning refers to the cognitive process that physicians employ in evaluating and managing patients.
In this study, we introduce a novel framework, In-Context Padding (ICP), designed to enhance LLMs with medical knowledge.
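A minimal sketch of the padding idea, where retrieved "knowledge seeds" are prepended to the case before querying the LLM (the function and prompt wording are hypothetical, not the paper's implementation):

```python
def in_context_padding(case: str, knowledge_seeds: list[str]) -> str:
    """Pad the prompt with retrieved knowledge seeds before the patient case."""
    seeds = "\n".join(f"- {k}" for k in knowledge_seeds)
    return ("Relevant medical knowledge:\n" + seeds + "\n\n"
            "Patient case: " + case + "\n"
            "Reason step by step and give a diagnosis.")
```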
arXiv Detail & Related papers (2024-03-11T10:53:20Z) - Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models [59.60384461302662]
We introduce Asclepius, a novel benchmark for evaluating Medical Multi-Modal Large Language Models (Med-MLLMs).
Asclepius rigorously and comprehensively assesses model capability in terms of distinct medical specialties and different diagnostic capacities.
We also provide an in-depth analysis of 6 Med-MLLMs and compare them with 5 human specialists.
arXiv Detail & Related papers (2024-02-17T08:04:23Z) - MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models [56.36916128631784]
We introduce MedBench, a comprehensive benchmark for the Chinese medical domain.
This benchmark is composed of four key components, including the Chinese Medical Licensing Examination, the Resident Standardization Training Examination, and real-world clinic cases.
We perform extensive experiments and conduct an in-depth analysis from diverse perspectives, which culminate in the following findings.
arXiv Detail & Related papers (2023-12-20T07:01:49Z) - Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine [4.773117448586697]
We develop novel diagnostic reasoning prompts to study whether large language models (LLMs) can perform clinical reasoning to accurately form a diagnosis.
We find GPT-4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy.
arXiv Detail & Related papers (2023-08-13T19:04:07Z)