BI-RADS BERT & Using Section Tokenization to Understand Radiology Reports
- URL: http://arxiv.org/abs/2110.07552v1
- Date: Thu, 14 Oct 2021 17:25:49 GMT
- Authors: Grey Kuling, Dr. Belinda Curpen, and Anne L. Martel
- Abstract summary: Domain specific contextual word embeddings have been shown to achieve impressive accuracy at such natural language processing tasks in medicine.
BERT model pre-trained on breast radiology reports combined with section tokenization resulted in an overall accuracy of 95.9% in field extraction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Radiology reports are the main form of communication between radiologists and
other clinicians, and contain important information for patient care. However,
in order to use this information for research, it is necessary to convert the
raw text into structured data suitable for analysis. Domain specific contextual
word embeddings have been shown to achieve impressive accuracy at such natural
language processing tasks in medicine. In this work we pre-trained a contextual
embedding BERT model using breast radiology reports and developed a classifier
that incorporated the embedding with auxiliary global textual features in order
to perform a section tokenization task. This model achieved a 98% accuracy at
segregating free text reports into sections of information outlined in the
Breast Imaging Reporting and Data System (BI-RADS) lexicon, a significant
improvement over the Classic BERT model without auxiliary information. We then
evaluated whether using section tokenization improved the downstream extraction
of the following fields: modality/procedure, previous cancer, menopausal
status, purpose of exam, breast density and background parenchymal enhancement.
Using the BERT model pre-trained on breast radiology reports combined with
section tokenization resulted in an overall accuracy of 95.9% in field
extraction. This is a 17 percentage point improvement over the overall accuracy
of 78.9% achieved by models without section tokenization and with Classic
BERT embeddings. Our work shows the strength of using BERT in radiology report
analysis and the advantages of section tokenization in identifying key features
of patient factors recorded in breast radiology reports.
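The paper's section tokenization step classifies each sentence of a free-text report into a BI-RADS-style section by combining a BERT contextual embedding with auxiliary global textual features such as sentence position. The sketch below illustrates that two-signal design only; a simple keyword lookup stands in for the BERT encoder, and the section labels and keyword lists are illustrative assumptions, not the paper's actual label set or model.

```python
# Hypothetical sketch of per-sentence section tokenization.
# A keyword lookup stands in for the BERT contextual embedding; the
# relative sentence position plays the role of an auxiliary global feature.

SECTION_KEYWORDS = {  # illustrative BI-RADS-style labels, not the paper's
    "Clinical Indication": ["history", "screening", "indication"],
    "Findings": ["mass", "calcification", "density", "enhancement"],
    "Impression": ["impression", "bi-rads", "recommend"],
}

def classify_sentence(sentence, position, n_sentences):
    """Assign a section label from keywords plus a positional prior."""
    text = sentence.lower()
    for section, keywords in SECTION_KEYWORDS.items():
        if any(k in text for k in keywords):
            return section
    # Auxiliary global feature: where the sentence sits in the report.
    # Early sentences default to the indication, late ones to the impression.
    rel = position / max(n_sentences - 1, 1)
    if rel < 0.3:
        return "Clinical Indication"
    return "Impression" if rel > 0.7 else "Findings"

def tokenize_sections(report_sentences):
    """Group consecutive sentences of a report by predicted section label."""
    n = len(report_sentences)
    sections = {}
    for i, sentence in enumerate(report_sentences):
        label = classify_sentence(sentence, i, n)
        sections.setdefault(label, []).append(sentence)
    return sections

report = [
    "Screening mammogram, no prior history of breast cancer.",
    "There is a spiculated mass in the upper outer quadrant.",
    "Impression: suspicious finding, biopsy recommended.",
]
print(tokenize_sections(report))
```

Once sentences are grouped by section, a downstream field extractor (e.g., for breast density or menopausal status) can restrict its search to the relevant section rather than the whole report, which is the mechanism behind the accuracy gain reported above.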
Related papers
- RadBARTsum: Domain Specific Adaption of Denoising Sequence-to-Sequence Models for Abstractive Radiology Report Summarization [1.8450534779202723]
This study proposes RadBARTsum, a domain-specific and facilitated adaptation of the BART model for abstractive radiology report summarization.
The approach involves two main steps: 1) re-training the BART model on a large corpus of radiology reports using a novel entity masking strategy to improve biomedical domain knowledge learning, and 2) fine-tuning the model for the summarization task using the Findings and Background sections to predict the Impression section.
arXiv Detail & Related papers (2024-06-05T08:43:11Z)
- Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation [10.46031380503486]
We introduce a novel method, Structural Entities extraction and patient Indications incorporation (SEI), for chest X-ray report generation.
We employ a structural entities extraction (SEE) approach to eliminate presentation-style vocabulary in reports.
We propose a cross-modal fusion network to integrate information from X-ray images, similar historical cases, and patient-specific indications.
arXiv Detail & Related papers (2024-05-23T01:29:47Z)
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
- Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing [53.797797404164946]
We designed an algorithm to process clinical PDF documents and extract only clinically relevant text.
The algorithm consists of several steps: initial text extraction from the PDF, followed by classification into categories such as body text, left notes, and footers.
Medical performance was evaluated by examining the extraction of medical concepts of interest from the text in their respective sections.
arXiv Detail & Related papers (2023-05-23T08:38:33Z)
- Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining.
We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data.
Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z)
- Knowledge Graph Construction and Its Application in Automatic Radiology Report Generation from Radiologist's Dictation [22.894248859405767]
This paper focuses on applications of NLP techniques like Information Extraction (IE) and domain-specific Knowledge Graph (KG) to automatically generate radiology reports from radiologist's dictation.
We develop an information extraction pipeline that combines rule-based, pattern-based, and dictionary-based techniques with lexical-semantic features to extract entities and relations.
We generate pathological descriptions evaluated using semantic similarity metrics, which shows 97% similarity with gold standard pathological descriptions.
arXiv Detail & Related papers (2022-06-13T16:46:54Z)
- Supervised Machine Learning Algorithm for Detecting Consistency between Reported Findings and the Conclusions of Mammography Reports [66.89977257992568]
Mammography reports document the diagnosis of patients' conditions.
Many reports contain non-standard terms (non-BI-RADS descriptors) and incomplete statements.
Our aim was to develop a tool to detect such discrepancies by comparing the reported conclusions to those that would be expected based on the reported radiology findings.
arXiv Detail & Related papers (2022-02-28T08:59:04Z)
- Radiology Report Generation with a Learned Knowledge Base and Multi-modal Alignment [27.111857943935725]
We present an automatic, multi-modal approach for report generation from chest x-ray.
Our approach features two distinct modules: (i) Learned knowledge base and (ii) Multi-modal alignment.
With the aid of both modules, our approach clearly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-12-30T10:43:56Z)
- Event-based clinical findings extraction from radiology reports with pre-trained language model [0.22940141855172028]
We present a new corpus of radiology reports annotated with clinical findings.
The gold standard corpus contained a total of 500 annotated computed tomography (CT) reports.
We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT.
arXiv Detail & Related papers (2021-12-27T05:03:10Z)
- An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text [72.62848911347466]
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research.
Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains.
In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
arXiv Detail & Related papers (2020-11-12T17:14:32Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.