Radiology Text Analysis System (RadText): Architecture and Evaluation
- URL: http://arxiv.org/abs/2204.09599v1
- Date: Sat, 19 Mar 2022 17:16:12 GMT
- Title: Radiology Text Analysis System (RadText): Architecture and Evaluation
- Authors: Song Wang, Mingquan Lin, Ying Ding, George Shih, Zhiyong Lu, Yifan
Peng
- Abstract summary: RadText is an open-source radiology text analysis system developed in Python.
It offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence splitting, and word tokenization.
It supports raw text processing and local processing, enabling better usability and improved data privacy.
- Score: 21.051601364891418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing radiology reports is a time-consuming and error-prone task, which
raises the need for an efficient automated radiology report analysis system to
alleviate the workloads of radiologists and encourage precise diagnosis. In
this work, we present RadText, an open-source radiology text analysis system
developed in Python. RadText offers an easy-to-use text analysis pipeline,
including de-identification, section segmentation, sentence splitting and word
tokenization, named entity recognition, parsing, and negation detection.
RadText features a flexible modular design, provides a hybrid text processing
schema, and supports raw text processing and local processing, which enables
better usability and improved data privacy. RadText adopts BioC as the unified
interface, and also standardizes the input/output into a structured
representation compatible with Observational Medical Outcomes Partnership
(OMOP) Common Data Model (CDM). This allows for a more systematic approach to
observational research across multiple, disparate data sources. We evaluated
RadText on the MIMIC-CXR dataset, with five new disease labels we annotated for
this work. RadText demonstrates highly accurate classification performances,
with an average precision of, a recall of 0.94, and an F-1 score of 0.92. We
have made our code, documentation, examples, and the test set available at
https://github.com/bionlplab/radtext .
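The pipeline described above is a composition of small, swappable processing steps. Below is a minimal, self-contained Python sketch of that kind of pipeline (de-identification, section segmentation, sentence splitting, dictionary-based entity recognition, cue-based negation detection, and an OMOP-style structured output row). Every function name, the finding dictionary, and the negation cues here are hypothetical illustrations; this is not the radtext API, whose actual interface is documented in the GitHub repository linked above.

```python
import re

# Hypothetical illustration of the kind of modular pipeline the abstract
# describes. It is NOT the radtext API; see
# https://github.com/bionlplab/radtext for the real interface.

SECTION_HEADERS = ("FINDINGS:", "IMPRESSION:")          # assumed section names
FINDING_TERMS = {"effusion", "pneumothorax", "atelectasis", "edema", "fracture"}
NEGATION_CUES = ("no ", "without ", "negative for ")    # assumed cue list


def deidentify(text: str) -> str:
    # Mask bracketed placeholders such as [**Name**] used in MIMIC-style reports.
    return re.sub(r"\[\*\*.*?\*\*\]", "[DEID]", text)


def split_sections(text: str) -> dict:
    # Rough section segmentation on known headers.
    sections, current = {}, "PREAMBLE"
    for line in text.splitlines():
        header = line.strip().upper()
        if header in SECTION_HEADERS:
            current = header.rstrip(":")
            sections[current] = []
        elif line.strip():
            sections.setdefault(current, []).append(line.strip())
    return {name: " ".join(lines) for name, lines in sections.items()}


def split_sentences(section_text: str) -> list:
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", section_text) if s.strip()]


def find_entities(sentence: str) -> list:
    # Dictionary lookup as a stand-in for named entity recognition.
    return [term for term in FINDING_TERMS if term in sentence.lower()]


def is_negated(sentence: str, term: str) -> bool:
    # Crude cue-based negation check in the spirit of rule-based negation detection.
    prefix = sentence.lower().split(term)[0]
    return any(cue in prefix for cue in NEGATION_CUES)


def process_report(report: str) -> list:
    # Emit OMOP-CDM-flavoured rows (note_nlp-like dicts) for each finding mention.
    rows = []
    for section, body in split_sections(deidentify(report)).items():
        for sentence in split_sentences(body):
            for term in find_entities(sentence):
                rows.append({
                    "section": section,
                    "lexical_variant": term,
                    "term_exists": not is_negated(sentence, term),
                    "snippet": sentence,
                })
    return rows


if __name__ == "__main__":
    demo = "FINDINGS:\nNo pleural effusion.\nIMPRESSION:\nMild pulmonary edema."
    for row in process_report(demo):
        print(row)
```

Each stage here is an independent function, which is the flavor of the "flexible modular design" the abstract refers to: individual components (e.g., the NER or negation step) can be replaced without touching the rest of the pipeline.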
Related papers
- Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation [31.370503681645804]
We present a novel two-stage framework designed to extract high-quality factual statements from free-text radiology reports.
Our framework also includes a new embedding-based metric (CXRFE) for evaluating chest X-ray text generation systems.
arXiv Detail & Related papers (2024-07-02T04:39:19Z) - Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performance across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z) - GPT-generated Text Detection: Benchmark Dataset and Tensor-based
Detection Method [4.802604527842989]
We present GPT Reddit dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset.
The dataset consists of context-prompt pairs based on Reddit with human-generated and ChatGPT-generated responses.
To showcase the dataset's utility, we benchmark several detection methods on it, demonstrating their efficacy in distinguishing between human and ChatGPT-generated responses.
arXiv Detail & Related papers (2024-03-12T05:15:21Z) - Attribute Structuring Improves LLM-Based Evaluation of Clinical Text
Summaries [62.32403630651586]
Large language models (LLMs) have shown the potential to generate accurate clinical text summaries, but still struggle with issues regarding grounding and evaluation.
Here, we explore a general mitigation framework using Attribute Structuring (AS), which structures the summary evaluation process.
AS consistently improves the correspondence between human annotations and automated metrics in clinical text summarization.
arXiv Detail & Related papers (2024-03-01T21:59:03Z) - RaDialog: A Large Vision-Language Model for Radiology Report Generation
and Conversational Assistance [53.20640629352422]
Conversational AI tools can generate and discuss clinically correct radiology reports for a given medical image.
RaDialog is the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog.
Our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions.
arXiv Detail & Related papers (2023-11-30T16:28:40Z) - Rad-ReStruct: A Novel VQA Benchmark and Method for Structured Radiology
Reporting [45.76458992133422]
We introduce Rad-ReStruct, a new benchmark dataset that provides fine-grained, hierarchically ordered annotations in the form of structured reports for X-Ray images.
We propose hi-VQA, a novel method that considers prior context in the form of previously asked questions and answers for populating a structured radiology report.
Our experiments show that hi-VQA achieves performance competitive with the state of the art on the medical VQA benchmark VQARad, while performing best among methods without domain-specific vision-language pretraining.
arXiv Detail & Related papers (2023-07-11T19:47:05Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size required for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base detectors and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z) - Knowledge Graph Construction and Its Application in Automatic Radiology
Report Generation from Radiologist's Dictation [22.894248859405767]
This paper focuses on applications of NLP techniques like Information Extraction (IE) and domain-specific Knowledge Graph (KG) to automatically generate radiology reports from radiologist's dictation.
We develop an information extraction pipeline that combines rule-based, pattern-based, and dictionary-based techniques with lexical-semantic features to extract entities and relations.
The generated pathological descriptions, evaluated using semantic similarity metrics, show 97% similarity with gold-standard pathological descriptions.
arXiv Detail & Related papers (2022-06-13T16:46:54Z) - Text Mining to Identify and Extract Novel Disease Treatments From
Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.