Hallucinations in medical devices
- URL: http://arxiv.org/abs/2508.14118v1
- Date: Mon, 18 Aug 2025 12:31:55 GMT
- Title: Hallucinations in medical devices
- Authors: Jason Granstedt, Prabhat Kc, Rucha Deshpande, Victor Garcia, Aldo Badano
- Abstract summary: We introduce a practical and universal definition that denotes hallucinations as a type of error that is plausible and can be either impactful or benign to the task at hand. The definition aims at facilitating the evaluation of medical devices that suffer from hallucinations across product areas.
- Score: 0.37698262166557467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computer methods in medical devices are frequently imperfect and are known to produce errors in clinical or diagnostic tasks. However, when deep learning and data-based approaches yield output that exhibits errors, the devices are frequently said to hallucinate. Drawing from theoretical developments and empirical studies in multiple medical device areas, we introduce a practical and universal definition that denotes hallucinations as a type of error that is plausible and can be either impactful or benign to the task at hand. The definition aims at facilitating the evaluation of medical devices that suffer from hallucinations across product areas. Using examples from imaging and non-imaging applications, we explore how the proposed definition relates to evaluation methodologies and discuss existing approaches for minimizing the prevalence of hallucinations.
Related papers
- HACK: Hallucinations Along Certainty and Knowledge Axes [66.66625343090743]
We propose a framework for categorizing hallucinations along two axes: knowledge and certainty. We identify a particularly concerning subset of hallucinations where models hallucinate with certainty despite having the correct knowledge internally.
arXiv Detail & Related papers (2025-10-28T09:34:31Z)
- A novel hallucination classification framework [0.0]
This work introduces a novel methodology for the automatic detection of hallucinations generated during large language model (LLM) inference. The proposed approach is based on a systematic taxonomy and controlled reproduction of diverse hallucination types through prompt engineering.
arXiv Detail & Related papers (2025-10-06T09:54:20Z)
- Review of Hallucination Understanding in Large Language and Vision Models [65.29139004945712]
We present a framework for characterizing both image and text hallucinations across diverse applications. Our investigations reveal that hallucinations often stem from predictable patterns in data distributions and inherited biases. This survey provides a foundation for developing more robust and effective solutions to hallucinations in real-world generative AI systems.
arXiv Detail & Related papers (2025-09-26T09:23:08Z)
- Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities [3.1406146587437904]
Large Language Models (LLMs) are increasingly applied to medical imaging tasks. These models often produce hallucinations, which are confident but incorrect outputs that can mislead clinical decisions. This study examines hallucinations in two directions: image to text, where LLMs generate reports from X-ray, CT, or MRI scans, and text to image, where models create medical images from clinical prompts.
arXiv Detail & Related papers (2025-08-09T16:03:46Z)
- A Survey of Multimodal Hallucination Evaluation and Detection [52.03164192840023]
Multi-modal Large Language Models (MLLMs) have emerged as a powerful paradigm for integrating visual and textual information. These models often suffer from hallucination, producing content that appears plausible but contradicts the input content or established world knowledge. This survey offers an in-depth review of hallucination evaluation benchmarks and detection methods across Image-to-Text (I2T) and Text-to-Image (T2I) generation tasks.
arXiv Detail & Related papers (2025-07-25T07:22:42Z)
- Fact-Controlled Diagnosis of Hallucinations in Medical Text Summarization [8.057050705357973]
Hallucinations in large language models (LLMs) pose significant risks to patient care and clinical decision-making. General-domain detectors struggle to detect clinical hallucinations, and performance on fact-controlled hallucinations does not reliably predict effectiveness on natural hallucinations. We develop fact-based approaches that count hallucinations, offering explainability not available with existing methods.
arXiv Detail & Related papers (2025-05-31T08:04:37Z)
- Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations [82.42811602081692]
This paper introduces a subsequence association framework to systematically trace and understand hallucinations. The key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones. We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts.
arXiv Detail & Related papers (2025-04-17T06:34:45Z)
- MedHal: An Evaluation Dataset for Medical Hallucination Detection [2.5782420501870296]
We present MedHal, a novel large-scale dataset specifically designed to evaluate if models can detect hallucinations in medical texts. MedHal addresses gaps by: (1) incorporating diverse medical text sources and tasks; (2) providing a substantial volume of annotated samples suitable for training medical hallucination detection models; and (3) including explanations for factual inconsistencies to guide model learning.
arXiv Detail & Related papers (2025-04-11T14:55:15Z)
- Medical Hallucinations in Foundation Models and Their Impact on Healthcare [53.97060824532454]
Foundation Models that are capable of processing and generating multi-modal data have transformed AI's role in medicine. We define medical hallucination as any instance in which a model generates misleading medical content. Our results reveal that inference techniques such as Chain-of-Thought (CoT) and Search Augmented Generation can effectively reduce hallucination rates. These findings underscore the ethical and practical imperative for robust detection and mitigation strategies.
arXiv Detail & Related papers (2025-02-26T02:30:44Z)
- CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation [20.59298361626719]
We propose a chain-of-medical-thought approach (CoMT) to mitigate hallucinations in medical report generation. CoMT intends to imitate the cognitive process of human doctors by decomposing diagnostic procedures.
arXiv Detail & Related papers (2024-06-17T12:03:32Z)
- Do Androids Know They're Only Dreaming of Electric Sheep? [45.513432353811474]
We design probes trained on the internal representations of a transformer language model to predict its hallucinatory behavior.
Our probes are narrowly trained and we find that they are sensitive to their training domain.
We find that probing is a feasible and efficient alternative to language model hallucination evaluation when model states are available.
arXiv Detail & Related papers (2023-12-28T18:59:50Z)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method mitigates hallucinations by a relative 44.6% and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
- Towards Mitigating Hallucination in Large Language Models via Self-Reflection [63.2543947174318]
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks.
This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets.
arXiv Detail & Related papers (2023-10-10T03:05:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.