Related papers: Mapping Clinical Doubt: Locating Linguistic Uncertainty in LLMs

URL: http://arxiv.org/abs/2511.22402v1
Date: Thu, 27 Nov 2025 12:26:06 GMT
Title: Mapping Clinical Doubt: Locating Linguistic Uncertainty in LLMs
Authors: Srivarshinee Sridhar, Raghav Kaushik Ravi, Kripabandhu Ghosh
Abstract summary: This work examines input-side representational sensitivity to linguistic uncertainty in medical text. We propose Model Sensitivity to Uncertainty (MSU), a layerwise probing metric that quantifies activation-level shifts induced by uncertainty cues.
Score: 4.360255198498071
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly used in clinical settings, where sensitivity to linguistic uncertainty can influence diagnostic interpretation and decision-making. Yet little is known about where such epistemic cues are internally represented within these models. Distinct from uncertainty quantification, which measures output confidence, this work examines input-side representational sensitivity to linguistic uncertainty in medical text. We curate a contrastive dataset of clinical statements varying in epistemic modality (e.g., 'is consistent with' vs. 'may be consistent with') and propose Model Sensitivity to Uncertainty (MSU), a layerwise probing metric that quantifies activation-level shifts induced by uncertainty cues. Our results show that LLMs exhibit structured, depth-dependent sensitivity to clinical uncertainty, suggesting that epistemic information is progressively encoded in deeper layers. These findings reveal how linguistic uncertainty is internally represented in LLMs, offering insight into their interpretability and epistemic reliability.
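
Illustrative sketch (not from the paper): the abstract describes MSU as a layerwise metric over activation shifts between contrastive statements but does not give its formula. The snippet below assumes one plausible stand-in, the cosine distance between the per-layer activations of a plain statement and its hedged counterpart. The names `cosine_distance` and `layerwise_shift`, the cosine-distance choice, and the synthetic activations are all assumptions made for illustration; real activations would come from a model's hidden states.

```python
import math
import random


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Treat a zero vector as "no measurable shift" to avoid dividing by zero.
    return 1.0 - dot / norm if norm > 0 else 0.0


def layerwise_shift(acts_plain, acts_hedged):
    """Per-layer shift between two activation stacks.

    Each argument is a list with one pooled activation vector per layer.
    ASSUMPTION: cosine distance stands in for the paper's MSU, whose exact
    definition is not stated in the abstract.
    """
    if len(acts_plain) != len(acts_hedged):
        raise ValueError("both inputs need the same number of layers")
    return [cosine_distance(u, v) for u, v in zip(acts_plain, acts_hedged)]


# Synthetic activations (hypothetical data, not model outputs): the hedged
# variant differs from the plain one by noise that grows with depth.
random.seed(0)
num_layers, hidden_dim = 6, 16
plain = [[random.gauss(0, 1) for _ in range(hidden_dim)]
         for _ in range(num_layers)]
hedged = [[x + (layer / (num_layers - 1)) * random.gauss(0, 1) for x in vec]
          for layer, vec in enumerate(plain)]

shifts = layerwise_shift(plain, hedged)
for layer, value in enumerate(shifts):
    print(f"layer {layer}: shift = {value:.3f}")
```

With real data, `plain` and `hedged` would hold pooled hidden states for a pair such as 'is consistent with' vs. 'may be consistent with', and the per-layer values could be averaged over the contrastive dataset to give a depth profile.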

Related papers

Lyapunov Spectral Analysis of Speech Embedding Trajectories in Psychosis [63.56564189749175]
We analyze speech embeddings from structured clinical interviews of psychotic patients and healthy controls. Lyapunov exponent (LE) spectra are computed from word-level and answer-level embeddings.
arXiv Detail & Related papers (2026-02-18T08:46:46Z)
Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation [68.106428321492]
Large language models (LLMs) demonstrate advanced reasoning abilities, enabling robots to understand natural language instructions and generate high-level plans with appropriate grounding. LLM hallucinations present a significant challenge, often leading to overconfident yet potentially misaligned or unsafe plans. We present Combined Uncertainty estimation for Reliable Embodied planning (CURE), which decomposes the uncertainty into epistemic and intrinsic uncertainty, each estimated separately.
arXiv Detail & Related papers (2025-10-09T10:26:58Z)
Embeddings to Diagnosis: Latent Fragility under Agentic Perturbations in Clinical LLMs [0.0]
We propose a geometry-aware evaluation framework, LAPD (Latent Agentic Perturbation Diagnostics), which probes the latent robustness of clinical LLMs under structured adversarial edits. Within this framework, we introduce Latent Diagnosis Flip Rate (LDFR), a model-agnostic diagnostic signal that captures representational instability when embeddings cross decision boundaries in PCA-reduced latent space. Our results reveal a persistent gap between surface robustness and semantic stability, underscoring the importance of geometry-aware auditing in safety-critical clinical AI.
arXiv Detail & Related papers (2025-07-27T16:48:53Z)
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes [15.754908203866284]
Rational speakers are supposed to know what they know and what they do not know. It is still a challenge for current large language models to generate corresponding utterances based on the assessment of facts and confidence in an uncertain real-world environment.
arXiv Detail & Related papers (2025-06-02T10:19:42Z)
Hesitation is defeat? Connecting Linguistic and Predictive Uncertainty [2.8186733524862158]
This paper investigates the relationship between predictive uncertainty and human/linguistic uncertainty, as estimated from free-text reports labelled by rule-based labellers. The results demonstrate good model performance, but also a modest correlation between predictive and linguistic uncertainty, highlighting the challenges in aligning machine uncertainty with human interpretation.
arXiv Detail & Related papers (2025-05-06T18:34:37Z)
The challenge of uncertainty quantification of large language models in medicine [0.0]
This study investigates uncertainty quantification in large language models (LLMs) for medical applications. Our research frames uncertainty not as a barrier but as an essential part of knowledge that invites a dynamic and reflective approach to AI design.
arXiv Detail & Related papers (2025-04-07T17:24:11Z)
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations [51.92795774118647]
We find that "verbal uncertainty" is governed by a single linear feature in the representation space of LLMs. We show that this has only moderate correlation with the actual "semantic uncertainty" of the model.
arXiv Detail & Related papers (2025-03-18T17:51:04Z)
Fact or Guesswork? Evaluating Large Language Models' Medical Knowledge with Structured One-Hop Judgments [108.55277188617035]
Large language models (LLMs) have been widely adopted in various downstream task domains, but their abilities to directly recall and apply factual medical knowledge remain under-explored. We introduce the Medical Knowledge Judgment dataset (MKJ), a dataset derived from the Unified Medical Language System (UMLS), a comprehensive repository of standardized vocabularies and knowledge graphs. Through a binary classification framework, MKJ evaluates LLMs' grasp of fundamental medical facts by having them assess the validity of concise, one-hop statements.
arXiv Detail & Related papers (2025-02-20T05:27:51Z)
Limitations of Large Language Models in Clinical Problem-Solving Arising from Inflexible Reasoning [3.3482359447109866]
Large Language Models (LLMs) have attained human-level accuracy on medical question-answer (QA) benchmarks. Their limitations in navigating open-ended clinical scenarios have recently been shown. We present the medical abstraction and reasoning corpus (M-ARC). We find that LLMs, including current state-of-the-art o1 and Gemini models, perform poorly compared to physicians on M-ARC.
arXiv Detail & Related papers (2025-02-05T18:14:27Z)
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important. We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability. In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling. Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
Navigating the Grey Area: How Expressions of Uncertainty and Overconfidence Affect Language Models [74.07684768317705]
LMs are highly sensitive to markers of certainty in prompts, with accuracies varying more than 80%. We find that expressions of high certainty result in a decrease in accuracy as compared to low-certainty expressions; similarly, factive verbs hurt performance, while evidentials benefit performance. These associations may suggest that LM behavior is based on observed language use, rather than truly reflecting uncertainty.
arXiv Detail & Related papers (2023-02-26T23:46:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.