Uncertainty-Aware Large Language Models for Explainable Disease Diagnosis
- URL: http://arxiv.org/abs/2505.03467v1
- Date: Tue, 06 May 2025 12:12:48 GMT
- Title: Uncertainty-Aware Large Language Models for Explainable Disease Diagnosis
- Authors: Shuang Zhou, Jiashuo Wang, Zidu Xu, Song Wang, David Brauer, Lindsay Welton, Jacob Cogan, Yuen-Hei Chung, Lei Tian, Zaifu Zhan, Yu Hou, Mingquan Lin, Genevieve B. Melton, Rui Zhang
- Abstract summary: We introduce ConfiDx, an uncertainty-aware large language model (LLM) created by fine-tuning open-source LLMs with diagnostic criteria. We formalized the task and assembled richly annotated datasets that capture varying degrees of diagnostic ambiguity.
- Score: 11.093388930528022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainable disease diagnosis, which leverages patient information (e.g., signs and symptoms) and computational models to generate probable diagnoses and reasonings, offers clear clinical values. However, when clinical notes encompass insufficient evidence for a definite diagnosis, such as the absence of definitive symptoms, diagnostic uncertainty usually arises, increasing the risk of misdiagnosis and adverse outcomes. Although explicitly identifying and explaining diagnostic uncertainties is essential for trustworthy diagnostic systems, it remains under-explored. To fill this gap, we introduce ConfiDx, an uncertainty-aware large language model (LLM) created by fine-tuning open-source LLMs with diagnostic criteria. We formalized the task and assembled richly annotated datasets that capture varying degrees of diagnostic ambiguity. Evaluating ConfiDx on real-world datasets demonstrated that it excelled in identifying diagnostic uncertainties, achieving superior diagnostic performance, and generating trustworthy explanations for diagnoses and uncertainties. To our knowledge, this is the first study to jointly address diagnostic uncertainty recognition and explanation, substantially enhancing the reliability of automatic diagnostic systems.
Related papers
- MedClarify: An information-seeking AI agent for medical diagnosis with case-specific follow-up questions [26.936554184582096]
We introduce MedClarify, an AI agent for information-seeking that can generate follow-up questions for iterative reasoning. Specifically, MedClarify computes a list of candidate diagnoses analogous to a differential diagnosis, and then proactively generates follow-up questions.
arXiv Detail & Related papers (2026-02-19T12:19:12Z)
- Thinking Like a Doctor: Conversational Diagnosis through the Exploration of Diagnostic Knowledge Graphs [12.612647781309098]
We propose a conversational diagnosis system that explores a diagnostic knowledge graph to reason in two steps. We use a realistic patient simulator that responds to the system's questions. Experiments show improved diagnostic accuracy and efficiency over strong baselines.
arXiv Detail & Related papers (2026-02-02T11:56:36Z)
- Modeling Clinical Uncertainty in Radiology Reports: from Explicit Uncertainty Markers to Implicit Reasoning Pathways [16.76473492794096]
Explicit uncertainty reflects doubt about the presence or absence of findings, conveyed through hedging phrases. Implicit uncertainty arises when radiologists omit parts of their reasoning, recording only key findings or diagnoses. Here, it is often unclear whether omitted findings are truly absent or simply unmentioned for brevity. We quantify explicit uncertainty by creating an expert-validated, LLM-based reference ranking of common hedging phrases, and mapping each finding to a probability value based on this reference. In addition, we model implicit uncertainty through an expansion framework that systematically adds characteristic sub-findings derived from expert-defined diagnostic pathways for 14 common diagnoses.
arXiv Detail & Related papers (2025-11-06T16:24:53Z)
- Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models [51.91760712805404]
We introduce VivaBench, a benchmark for evaluating sequential clinical reasoning in large language models (LLMs). Our dataset consists of 1762 physician-curated clinical vignettes structured as interactive scenarios that simulate an (oral) examination in medical training. Our analysis identified several failure modes that mirror common cognitive errors in clinical practice.
arXiv Detail & Related papers (2025-10-11T16:24:35Z)
- Probabilistic Machine Learning for Uncertainty-Aware Diagnosis of Industrial Systems [2.7946438090394903]
This work presents a framework that uses ensemble probabilistic machine learning to improve the diagnostic characteristics of data-driven, consistency-based diagnosis. The proposed method is evaluated across several case studies using both ablation and comparative analyses, showing consistent improvements across a range of diagnostic metrics.
arXiv Detail & Related papers (2025-09-23T08:59:20Z)
- End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning [52.12425911708585]
Deep-DxSearch is an agentic RAG system trained end-to-end with reinforcement learning (RL). In Deep-DxSearch, we first construct a large-scale medical retrieval corpus comprising patient records and reliable medical knowledge sources. Experiments demonstrate that our end-to-end RL training framework consistently outperforms prompt-engineering and training-free RAG approaches.
arXiv Detail & Related papers (2025-08-21T17:42:47Z)
- LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis [13.435898630240416]
We propose a trustworthy medical document analysis platform that fine-tunes a LLaMA-v3 using low-rank adaptation. Our approach utilizes DDXPlus, the largest benchmark dataset for differential diagnosis. The developed web-based platform allows users to submit their own unstructured medical documents and receive accurate, explainable diagnostic results.
arXiv Detail & Related papers (2025-06-24T15:12:42Z)
- Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis. We focus on the selective prediction approach, which allows the diagnosis system to abstain from providing a decision if it is not confident in the diagnosis. We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z)
- CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis [36.28995062833098]
Chain-of-Diagnosis (CoD) transforms the diagnostic process into a diagnostic chain that mirrors a physician's thought process.
CoD outputs the disease confidence distribution to ensure transparency in decision-making.
DiagnosisGPT is capable of diagnosing 9604 diseases.
arXiv Detail & Related papers (2024-07-18T09:06:27Z)
- Unified Uncertainty Estimation for Cognitive Diagnosis Models [70.46998436898205]
We propose a unified uncertainty estimation approach for a wide range of cognitive diagnosis models.
We decompose the uncertainty of diagnostic parameters into data aspect and model aspect.
Our method is effective and can provide useful insights into the uncertainty of cognitive diagnosis.
arXiv Detail & Related papers (2024-03-09T13:48:20Z)
- Towards Reducing Diagnostic Errors with Interpretable Risk Prediction [18.474645862061426]
We propose a method to use LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses.
Our ultimate aim is to increase access to evidence and reduce diagnostic errors.
arXiv Detail & Related papers (2024-02-15T17:05:48Z)
- A Foundational Framework and Methodology for Personalized Early and Timely Diagnosis [84.6348989654916]
We propose the first foundational framework for early and timely diagnosis.
It builds on decision-theoretic approaches to outline the diagnosis process.
It integrates machine learning and statistical methodology for estimating the optimal personalized diagnostic path.
arXiv Detail & Related papers (2023-11-26T14:42:31Z)
- Towards the Identifiability and Explainability for Personalized Learner Modeling: An Inductive Paradigm [36.60917255464867]
We propose an identifiable cognitive diagnosis framework (ID-CDF) based on a novel response-proficiency-response paradigm inspired by encoder-decoder models.
We show that ID-CDF can effectively address the problems without loss of diagnosis preciseness.
arXiv Detail & Related papers (2023-09-01T07:18:02Z)
- An Uncertainty-Informed Framework for Trustworthy Fault Diagnosis in Safety-Critical Applications [1.988145627448243]
Low trustworthiness of deep learning-based prognostic and health management (PHM) hinders its applications in safety-critical assets.
We propose an uncertainty-informed framework that diagnoses faults while also detecting out-of-distribution (OOD) data.
We show that the proposed framework is of particular advantage in tackling unknowns and enhancing the trustworthiness of fault diagnosis in safety-critical applications.
arXiv Detail & Related papers (2021-10-08T21:24:14Z)
- Anytime Diagnosis for Reconfiguration [52.77024349608834]
We introduce and analyze FlexDiag which is an anytime direct diagnosis approach.
We evaluate the algorithm with regard to performance and diagnosis quality using a configuration benchmark from the domain of feature models and an industrial configuration knowledge base from the automotive domain.
arXiv Detail & Related papers (2021-02-19T11:45:52Z)
- Inheritance-guided Hierarchical Assignment for Clinical Automatic Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
- Uncertainty aware and explainable diagnosis of retinal disease [0.0]
We perform uncertainty analysis of a deep learning model for diagnosis of four retinal diseases.
Explainability means showing the features that a system used to make a prediction, while uncertainty awareness is the ability of a system to highlight when it is not sure about its decision.
arXiv Detail & Related papers (2021-01-26T23:37:30Z)
- Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis [142.90770786804507]
Medical diagnosis assistant (MDA) aims to build an interactive diagnostic agent to sequentially inquire about symptoms for discriminating diseases.
This work attempts to address these critical issues in MDA by taking advantage of the causal diagram.
We propose a propensity-based patient simulator that effectively answers unrecorded inquiries by drawing knowledge from other records.
arXiv Detail & Related papers (2020-03-14T02:05:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.