Improving Radiology Report Generation Systems by Removing Hallucinated
References to Non-existent Priors
- URL: http://arxiv.org/abs/2210.06340v2
- Date: Thu, 13 Oct 2022 14:33:45 GMT
- Title: Improving Radiology Report Generation Systems by Removing Hallucinated
References to Non-existent Priors
- Authors: Vignav Ramesh, Nathan Andrew Chi, Pranav Rajpurkar
- Abstract summary: We propose two methods to remove references to priors in radiology reports:
a GPT-3-based few-shot approach to rewrite medical reports without references to priors, and a BioBERT-based token classification approach to directly remove words referring to priors.
We find that our re-trained model--which we call CXR-ReDonE--outperforms previous report generation methods on clinical metrics, achieving an average BERTScore of 0.2351 (2.57% absolute improvement).
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current deep learning models trained to generate radiology reports from chest
radiographs are capable of producing clinically accurate, clear, and actionable
text that can advance patient care. However, such systems all succumb to the
same problem: making hallucinated references to non-existent prior reports.
Such hallucinations occur because these models are trained on datasets of
real-world patient reports that inherently refer to priors. To this end, we
propose two methods to remove references to priors in radiology reports: (1) a
GPT-3-based few-shot approach to rewrite medical reports without references to
priors; and (2) a BioBERT-based token classification approach to directly
remove words referring to priors. We use the aforementioned approaches to
modify MIMIC-CXR, a publicly available dataset of chest X-rays and their
associated free-text radiology reports; we then retrain CXR-RePaiR, a radiology
report generation system, on the adapted MIMIC-CXR dataset. We find that our
re-trained model--which we call CXR-ReDonE--outperforms previous report
generation methods on clinical metrics, achieving an average BERTScore of
0.2351 (2.57% absolute improvement). We expect our approach to be broadly
valuable in enabling current radiology report generation systems to be more
directly integrated into clinical pipelines.
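The token-level removal step described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the paper uses a trained BioBERT token classifier, whereas the sketch substitutes a hypothetical keyword lookup (PRIOR_CUES and toy_flag_tokens are invented names) so that it runs without any model. Only the final step, dropping every token flagged as referring to a prior, mirrors the approach the abstract describes.

```python
from typing import List, Sequence

# ASSUMPTION: a hand-picked cue list standing in for the trained
# BioBERT token classifier; it is not taken from the paper.
PRIOR_CUES = {"again", "prior", "previous", "previously", "unchanged", "compared"}


def toy_flag_tokens(tokens: Sequence[str]) -> List[int]:
    """Return 1 for tokens that look like references to a prior, else 0."""
    return [1 if t.strip(".,;:").lower() in PRIOR_CUES else 0 for t in tokens]


def remove_flagged_tokens(tokens: Sequence[str], flags: Sequence[int]) -> str:
    """Drop every token whose flag is 1 and rejoin the rest with spaces."""
    if len(tokens) != len(flags):
        raise ValueError("tokens and flags must have the same length")
    return " ".join(t for t, f in zip(tokens, flags) if f == 0)


def strip_prior_references(report: str) -> str:
    """Whitespace-tokenize a report, flag tokens, and remove flagged ones."""
    tokens = report.split()
    return remove_flagged_tokens(tokens, toy_flag_tokens(tokens))


if __name__ == "__main__":
    print(strip_prior_references("The heart is again enlarged."))
    # prints: The heart is enlarged.
```

Swapping toy_flag_tokens for a real classifier leaves the removal step unchanged. Note that deleting words alone can leave ungrammatical text (for example, removing "unchanged" from "The effusion is unchanged."), so the output of such a step may still need review.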
Related papers
- RaTEScore: A Metric for Radiology Report Generation [59.37561810438641]
This paper introduces a novel, entity-aware metric called Radiological Report (Text) Evaluation (RaTEScore).
RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions.
Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data [115.0747462486285]
ChatRadio-Valuer is a tailored model for automatic radiology report generation that learns generalizable representations.
The clinical dataset used in this study comprises a total of 332,673 observations.
ChatRadio-Valuer consistently outperforms state-of-the-art models, notably ChatGPT (GPT-3.5-Turbo) and GPT-4.
arXiv Detail & Related papers (2023-10-08T17:23:17Z)
- Radiology-Llama2: Best-in-Class Large Language Model for Radiology [71.27700230067168]
This paper introduces Radiology-Llama2, a large language model specialized for radiology through a process known as instruction tuning.
Quantitative evaluations using ROUGE metrics on the MIMIC-CXR and OpenI datasets demonstrate that Radiology-Llama2 achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-08-29T17:44:28Z)
- Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation [7.586632627817609]
Radiologists face high burnout rates, partly due to the increasing volume of Chest X-rays (CXRs) requiring interpretation and reporting.
Our proposed CXR report generator integrates elements of the workflow and introduces a novel reward for reinforcement learning.
Results from our study demonstrate that the proposed model generates reports that are more aligned with radiologists' reports than state-of-the-art models.
arXiv Detail & Related papers (2023-07-19T05:41:14Z)
- Replace and Report: NLP Assisted Radiology Report Generation [31.309987297324845]
We propose a template-based approach to generate radiology reports from radiographs.
This is the first attempt to generate chest X-ray radiology reports by first creating small sentences for abnormal findings and then replacing them in the normal report template.
arXiv Detail & Related papers (2023-06-19T10:04:42Z)
- Boosting Radiology Report Generation by Infusing Comparison Prior [7.054671146863795]
Recent transformer-based models have made significant strides in generating radiology reports from chest X-ray images.
These models often lack prior knowledge, resulting in the generation of synthetic reports that mistakenly reference non-existent prior exams.
We propose a novel approach that leverages a rule-based labeler to extract comparison prior information from radiology reports.
arXiv Detail & Related papers (2023-05-08T09:12:44Z)
- Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
- Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation [55.00308939833555]
PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE), and Multi-domain Knowledge Distiller (MKD).
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z)
- Generating Radiology Reports via Memory-driven Transformer [38.30011851429407]
We propose to generate radiology reports with a memory-driven Transformer.
Experimental results are reported on two prevailing radiology report datasets, IU X-Ray and MIMIC-CXR.
arXiv Detail & Related papers (2020-10-30T04:08:03Z)
- Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation [26.846912996765447]
We introduce two new simple rewards to encourage the generation of factually complete and consistent radiology reports.
We show that our system leads to generations that are more factually complete and consistent compared to the baselines.
arXiv Detail & Related papers (2020-10-20T05:42:47Z)