ReXErr: Synthesizing Clinically Meaningful Errors in Diagnostic Radiology Reports
- URL: http://arxiv.org/abs/2409.10829v1
- Date: Tue, 17 Sep 2024 01:42:39 GMT
- Title: ReXErr: Synthesizing Clinically Meaningful Errors in Diagnostic Radiology Reports
- Authors: Vishwanatha M. Rao, Serena Zhang, Julian N. Acosta, Subathra Adithan, Pranav Rajpurkar
- Abstract summary: We introduce ReXErr, a methodology that leverages Large Language Models to generate representative errors within chest X-ray reports.
We developed error categories that capture common mistakes in both human and AI-generated reports.
Our approach uses a novel sampling scheme to inject diverse errors while maintaining clinical plausibility.
- Score: 1.9106067578277455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately interpreting medical images and writing radiology reports is a critical but challenging task in healthcare. Both human-written and AI-generated reports can contain errors, ranging from clinical inaccuracies to linguistic mistakes. To address this, we introduce ReXErr, a methodology that leverages Large Language Models to generate representative errors within chest X-ray reports. Working with board-certified radiologists, we developed error categories that capture common mistakes in both human and AI-generated reports. Our approach uses a novel sampling scheme to inject diverse errors while maintaining clinical plausibility. ReXErr demonstrates consistency across error categories and produces errors that closely mimic those found in real-world scenarios. This method has the potential to aid in the development and evaluation of report correction algorithms, potentially enhancing the quality and reliability of radiology reporting.
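The abstract mentions a sampling scheme that injects diverse errors into a report, but gives no details of it. The sketch below is only one way such a step could look, under stated assumptions: the category names, their weights, and the helpers `sample_error_plan` and `build_prompt` are all hypothetical and are not taken from the paper. It draws distinct error categories by weighted sampling without replacement and assembles an instruction that could then be passed to an LLM; the LLM call itself is left out.

```python
import random

# Hypothetical category names and weights, chosen for illustration only.
# The paper's actual categories and sampling probabilities are not stated here.
ERROR_CATEGORIES = {
    "false_finding": 0.4,
    "changed_severity": 0.3,
    "wrong_laterality": 0.2,
    "spelling_mistake": 0.1,
}


def sample_error_plan(num_errors, rng):
    """Pick distinct error categories by weighted sampling without replacement."""
    remaining = dict(ERROR_CATEGORIES)
    plan = []
    for _ in range(min(num_errors, len(remaining))):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        choice = rng.choices(names, weights=weights, k=1)[0]
        plan.append(choice)
        # Remove the chosen category so each report gets diverse error types.
        del remaining[choice]
    return plan


def build_prompt(report, plan):
    """Assemble an instruction for an LLM (the model call is omitted)."""
    listed = ", ".join(plan)
    return (
        "Rewrite the chest X-ray report below, introducing one error of each "
        f"of these types: {listed}. Keep the report clinically plausible.\n\n"
        f"{report}"
    )


rng = random.Random(0)  # fixed seed so the sampled plan is reproducible
plan = sample_error_plan(2, rng)
prompt = build_prompt("No focal consolidation. Heart size is normal.", plan)
```

Sampling without replacement is used here so that a single report never receives the same error type twice; a scheme that allows repeats would be an equally valid assumption.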
Related papers
- Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection [62.942077348224046]
Speech recognition plays an important role in automatic detection of Alzheimer's disease (AD).
Recent studies have revealed a non-linear relationship between word error rates (WER) and AD detection performance.
This work presents a series of analyses to explore the effect of ASR transcription errors in BERT-based AD detection systems.
arXiv Detail & Related papers (2024-12-09T09:32:20Z)
- Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation [20.173287130474797]
Generative medical Vision Large Language Models (VLLMs) are prone to hallucinations and can produce inaccurate diagnostic information.
We introduce a novel Semantic Consistency-Based Uncertainty Quantification framework that provides both report-level and sentence-level uncertainties.
By abstaining from high-uncertainty reports, our approach improves factuality scores by 10%, achieved by rejecting 20% of reports.
arXiv Detail & Related papers (2024-12-05T20:43:39Z)
- MedAutoCorrect: Image-Conditioned Autocorrection in Medical Reporting [31.710972402763527]
In medical reporting, the accuracy of radiological reports, whether generated by humans or machine learning algorithms, is critical.
We tackle a new task in this paper: image-conditioned autocorrection of inaccuracies within these reports.
We propose a two-stage framework capable of pinpointing these errors and then making corrections, simulating an autocorrection process.
arXiv Detail & Related papers (2024-12-04T02:32:53Z)
- Resource-Efficient Medical Report Generation using Large Language Models [3.2627279988912194]
Medical report generation is the task of automatically writing radiology reports for chest X-ray images.
We propose a new framework leveraging vision-enabled Large Language Models (LLM) for the task of medical report generation.
arXiv Detail & Related papers (2024-10-21T05:08:18Z)
- RaTEScore: A Metric for Radiology Report Generation [59.37561810438641]
This paper introduces a novel, entity-aware metric, Radiological Report (Text) Evaluation (RaTEScore).
RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions.
Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z)
- Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation [32.26270073540666]
The worldwide shortage of radiologists restricts access to expert care and imposes heavy workloads.
Recent progress in automated report generation with vision-language models offers clear potential in ameliorating the situation.
We build a state-of-the-art report generation system for chest radiographs, Flamingo-CXR, by fine-tuning a well-known vision-language foundation model on radiology data.
arXiv Detail & Related papers (2023-11-30T05:38:34Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data [115.0747462486285]
ChatRadio-Valuer is a tailored model for automatic radiology report generation that learns generalizable representations.
The clinical dataset utilized in this study encompasses a total of 332,673 observations.
ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4.
arXiv Detail & Related papers (2023-10-08T17:23:17Z)
- Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation [3.3978173451092437]
Radiology report generation aims at generating descriptive text from radiology images automatically.
A typical setting consists of training encoder-decoder models on image-report pairs with a cross entropy loss.
We propose a novel weakly supervised contrastive loss for medical report generation.
arXiv Detail & Related papers (2021-09-25T00:06:23Z)
- Variational Topic Inference for Chest X-Ray Report Generation [102.04931207504173]
Report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice.
Recent work has shown that deep learning models can successfully caption natural images.
We propose variational topic inference for automatic report generation.
arXiv Detail & Related papers (2021-07-15T13:34:38Z)
- Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation [55.00308939833555]
The PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE) and Multi-domain Knowledge Distiller (MKD).
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z)