Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction
- URL: http://arxiv.org/abs/2406.15045v1
- Date: Fri, 21 Jun 2024 10:48:21 GMT
- Title: Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction
- Authors: Jinge Wu, Zhaolong Wu, Abul Hasan, Yunsoo Kim, Jason P. Y. Cheung, Teng Zhang, Honghan Wu,
- Abstract summary: This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques.
The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources.
The effectiveness of the approach is evaluated using a benchmark dataset created by corrupting real-world radiology reports with realistic errors.
- Score: 5.756731172979317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques. The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources. A three-stage inference process is introduced, decomposing the task into error detection, localization, and correction subtasks, which enhances the explainability and performance of the system. The effectiveness of the approach is evaluated using a benchmark dataset created by corrupting real-world radiology reports with realistic errors, guided by domain experts. Experimental results demonstrate the benefits of the proposed methods, with the combination of internal and external retrieval significantly improving the accuracy of error detection, localization, and correction across various state-of-the-art LLMs. The findings contribute to the development of more robust and reliable error correction systems for clinical documentation.
Related papers
- Automated Clinical Data Extraction with Knowledge Conditioned LLMs [7.935125803100394]
Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge.
We propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL)
Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules.
arXiv Detail & Related papers (2024-06-26T02:49:28Z) - WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction [5.7931394318054155]
We present our approach that achieved top performance in all three subtasks.
For the MS dataset, which contains subtle errors, we developed a retrieval-based system.
For the UW dataset, reflecting more realistic clinical notes, we created a pipeline of modules to detect, localize, and correct errors.
arXiv Detail & Related papers (2024-04-22T19:31:45Z) - Word-Level ASR Quality Estimation for Efficient Corpus Sampling and
Post-Editing through Analyzing Attentions of a Reference-Free Metric [5.592917884093537]
The potential of quality estimation (QE) metrics is introduced and evaluated as a novel tool to enhance explainable artificial intelligence (XAI) in ASR systems.
The capabilities of the NoRefER metric are explored in identifying word-level errors to aid post-editors in refining ASR hypotheses.
arXiv Detail & Related papers (2024-01-20T16:48:55Z) - ChatRadio-Valuer: A Chat Large Language Model for Generalizable
Radiology Report Generation Based on Multi-institution and Multi-system Data [115.0747462486285]
ChatRadio-Valuer is a tailored model for automatic radiology report generation that learns generalizable representations.
The clinical dataset utilized in this study encompasses a remarkable total of textbf332,673 observations.
ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4 et al.
arXiv Detail & Related papers (2023-10-08T17:23:17Z) - Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z) - Cross-Modal Causal Intervention for Medical Report Generation [109.83549148448469]
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance.
Due to the spurious correlations within image-text data induced by visual and linguistic biases, it is challenging to generate accurate reports reliably describing lesion areas.
We propose a novel Visual-Linguistic Causal Intervention (VLCI) framework for MRG, which consists of a visual deconfounding module (VDM) and a linguistic deconfounding module (LDM)
arXiv Detail & Related papers (2023-03-16T07:23:55Z) - The role of noise in denoising models for anomaly detection in medical
images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of
Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z) - Confidence-Guided Radiology Report Generation [24.714303916431078]
We propose a novel method to quantify both the visual uncertainty and the textual uncertainty for the task of radiology report generation.
Our experimental results have demonstrated that our proposed method for model uncertainty characterization and estimation can provide more reliable confidence scores for radiology report generation.
arXiv Detail & Related papers (2021-06-21T07:02:12Z) - Interpreting Rate-Distortion of Variational Autoencoder and Using Model
Uncertainty for Anomaly Detection [5.491655566898372]
We build a scalable machine learning system for unsupervised anomaly detection via representation learning.
We revisit VAE from the perspective of information theory to provide some theoretical foundations on using the reconstruction error.
We show empirically the competitive performance of our approach on benchmark datasets.
arXiv Detail & Related papers (2020-05-05T00:03:48Z) - Generalization Bounds and Representation Learning for Estimation of
Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.