Radiology Text Analysis System (RadText): Architecture and Evaluation
- URL: http://arxiv.org/abs/2204.09599v1
- Date: Sat, 19 Mar 2022 17:16:12 GMT
- Title: Radiology Text Analysis System (RadText): Architecture and Evaluation
- Authors: Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan
Peng
- Abstract summary: RadText is an open-source radiology text analysis system developed in Python.
It offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence splitting, and word tokenization.
It supports raw text processing and local processing, enabling better usability and improved data privacy.
- Score: 21.051601364891418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing radiology reports is a time-consuming and error-prone task, which
raises the need for an efficient automated radiology report analysis system to
alleviate the workloads of radiologists and encourage precise diagnosis. In
this work, we present RadText, an open-source radiology text analysis system
developed in Python. RadText offers an easy-to-use text analysis pipeline,
including de-identification, section segmentation, sentence splitting and word
tokenization, named entity recognition, parsing, and negation detection.
RadText features a flexible modular design, provides a hybrid text processing
schema, and supports raw text processing and local processing, which enables
better usability and improved data privacy. RadText adopts BioC as the unified
interface, and also standardizes the input/output into a structured
representation compatible with Observational Medical Outcomes Partnership
(OMOP) Common Data Model (CDM). This allows for a more systematic approach to
observational research across multiple, disparate data sources. We evaluated
RadText on the MIMIC-CXR dataset, with five new disease labels we annotated for
this work. RadText demonstrates highly accurate classification performances,
with an average precision of, a recall of 0.94, and an F-1 score of 0.92. We
have made our code, documentation, examples, and the test set available at
https://github.com/bionlplab/radtext .
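The pipeline described above is a composition of small, swappable processing steps. Below is a minimal, self-contained Python sketch of that kind of pipeline (de-identification, section segmentation, sentence splitting, dictionary-based entity recognition, cue-based negation detection, and an OMOP-style structured output row). Every function name, the finding dictionary, and the negation cues here are hypothetical illustrations; this is not the radtext API, whose actual interface is documented in the GitHub repository linked above.

```python
import re

# Hypothetical illustration of the kind of modular pipeline the abstract
# describes. It is NOT the radtext API; see
# https://github.com/bionlplab/radtext for the real interface.

SECTION_HEADERS = ("FINDINGS:", "IMPRESSION:")          # assumed section names
FINDING_TERMS = {"effusion", "pneumothorax", "atelectasis", "edema", "fracture"}
NEGATION_CUES = ("no ", "without ", "negative for ")    # assumed cue list


def deidentify(text: str) -> str:
    # Mask bracketed placeholders such as [**Name**] used in MIMIC-style reports.
    return re.sub(r"\[\*\*.*?\*\*\]", "[DEID]", text)


def split_sections(text: str) -> dict:
    # Rough section segmentation on known headers.
    sections, current = {}, "PREAMBLE"
    for line in text.splitlines():
        header = line.strip().upper()
        if header in SECTION_HEADERS:
            current = header.rstrip(":")
            sections[current] = []
        elif line.strip():
            sections.setdefault(current, []).append(line.strip())
    return {name: " ".join(lines) for name, lines in sections.items()}


def split_sentences(section_text: str) -> list:
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", section_text) if s.strip()]


def find_entities(sentence: str) -> list:
    # Dictionary lookup as a stand-in for named entity recognition.
    return [term for term in FINDING_TERMS if term in sentence.lower()]


def is_negated(sentence: str, term: str) -> bool:
    # Crude cue-based negation check in the spirit of rule-based negation detection.
    prefix = sentence.lower().split(term)[0]
    return any(cue in prefix for cue in NEGATION_CUES)


def process_report(report: str) -> list:
    # Emit OMOP-CDM-flavoured rows (note_nlp-like dicts) for each finding mention.
    rows = []
    for section, body in split_sections(deidentify(report)).items():
        for sentence in split_sentences(body):
            for term in find_entities(sentence):
                rows.append({
                    "section": section,
                    "lexical_variant": term,
                    "term_exists": not is_negated(sentence, term),
                    "snippet": sentence,
                })
    return rows


if __name__ == "__main__":
    demo = "FINDINGS:\nNo pleural effusion.\nIMPRESSION:\nMild pulmonary edema."
    for row in process_report(demo):
        print(row)
```

Each stage here is an independent function, which is the flavor of the "flexible modular design" the abstract refers to: individual components (e.g., the NER or negation step) can be replaced without touching the rest of the pipeline.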
Related papers
- Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation [31.370503681645804]
We present a novel two-stage framework designed to extract high-quality factual statements from free-text radiology reports.
Our framework also includes a new embedding-based metric (CXRFE) for evaluating chest X-ray text generation systems.
arXiv Detail & Related papers (2024-07-02T04:39:19Z) - Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performance across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z) - GPT-generated Text Detection: Benchmark Dataset and Tensor-based
Detection Method [4.802604527842989]
We present GPT Reddit dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset.
The dataset consists of context-prompt pairs based on Reddit with human-generated and ChatGPT-generated responses.
To showcase the dataset's utility, we benchmark several detection methods on it, demonstrating their efficacy in distinguishing between human and ChatGPT-generated responses.
arXiv Detail & Related papers (2024-03-12T05:15:21Z) - Attribute Structuring Improves LLM-Based Evaluation of Clinical Text
Summaries [62.32403630651586]
Large language models (LLMs) have shown the potential to generate accurate clinical text summaries, but still struggle with issues regarding grounding and evaluation.
Here, we explore a general mitigation framework using Attribute Structuring (AS), which structures the summary evaluation process.
AS consistently improves the correspondence between human annotations and automated metrics in clinical text summarization.
arXiv Detail & Related papers (2024-03-01T21:59:03Z) - RaDialog: A Large Vision-Language Model for Radiology Report Generation
and Conversational Assistance [53.20640629352422]
Conversational AI tools can generate and discuss clinically correct radiology reports for a given medical image.
RaDialog is the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog.
Our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions.
arXiv Detail & Related papers (2023-11-30T16:28:40Z) - Rad-ReStruct: A Novel VQA Benchmark and Method for Structured Radiology
Reporting [45.76458992133422]
We introduce Rad-ReStruct, a new benchmark dataset that provides fine-grained, hierarchically ordered annotations in the form of structured reports for X-Ray images.
We propose hi-VQA, a novel method that considers prior context in the form of previously asked questions and answers for populating a structured radiology report.
Our experiments show that hi-VQA achieves performance competitive with the state of the art on the medical VQA benchmark VQARad, while performing best among methods without domain-specific vision-language pretraining.
arXiv Detail & Related papers (2023-07-11T19:47:05Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size required for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base detectors and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z) - Knowledge Graph Construction and Its Application in Automatic Radiology
Report Generation from Radiologist's Dictation [22.894248859405767]
This paper focuses on applications of NLP techniques like Information Extraction (IE) and domain-specific Knowledge Graph (KG) to automatically generate radiology reports from radiologist's dictation.
We develop an information extraction pipeline that combines rule-based, pattern-based, and dictionary-based techniques with lexical-semantic features to extract entities and relations.
The generated pathological descriptions, evaluated using semantic similarity metrics, show 97% similarity with gold-standard pathological descriptions.
arXiv Detail & Related papers (2022-06-13T16:46:54Z) - Text Mining to Identify and Extract Novel Disease Treatments From
Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.