Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray
Report Generation
- URL: http://arxiv.org/abs/2303.17579v2
- Date: Tue, 2 May 2023 21:03:40 GMT
- Title: Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray
Report Generation
- Authors: Jaehwan Jeong, Katherine Tian, Andrew Li, Sina Hartung, Fardad
Behzadi, Juan Calle, David Osayande, Michael Pohlen, Subathra Adithan, Pranav
Rajpurkar
- Abstract summary: Contrastive X-Ray REport Match (X-REM) is a novel retrieval-based radiology report generation module.
X-REM uses an image-text matching score to measure the similarity of a chest X-ray image and radiology report for report retrieval.
- Score: 3.6664023341224827
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated generation of clinically accurate radiology reports can improve
patient care. Previous report generation methods that rely on image captioning
models often generate incoherent and incorrect text due to their lack of
relevant domain knowledge, while retrieval-based attempts frequently retrieve
reports that are irrelevant to the input image. In this work, we propose
Contrastive X-Ray REport Match (X-REM), a novel retrieval-based radiology
report generation module that uses an image-text matching score to measure the
similarity of a chest X-ray image and radiology report for report retrieval. We
observe that computing the image-text matching score with a language-image
model can effectively capture the fine-grained interaction between image and
text that is often lost when using cosine similarity. X-REM outperforms
multiple prior radiology report generation modules in terms of both natural
language and clinical metrics. Human evaluation of the generated reports
suggests that X-REM increased the number of zero-error reports and decreased
the average error severity compared to the baseline retrieval approach. Our
code is available at: https://github.com/rajpurkarlab/X-REM
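
The abstract's key contrast is between scoring image-report pairs with a cross-modal image-text matching (ITM) head and scoring them with cosine similarity of pooled embeddings. The sketch below is a minimal, illustrative version of that retrieval step, not the released X-REM code; `ToyMatcher` and its `itm_score`/`cosine_score` methods are hypothetical stand-ins for a vision-language model fine-tuned on chest X-ray report pairs.

```python
# Minimal sketch of retrieval-based report generation with an image-text
# matching (ITM) score, contrasted with pooled-embedding cosine similarity.
# All module and variable names are illustrative placeholders, not X-REM's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMatcher(nn.Module):
    """Stand-in for a vision-language model with an ITM head."""

    def __init__(self, dim=256):
        super().__init__()
        self.image_proj = nn.Linear(dim, dim)   # projects pooled image features
        self.text_proj = nn.Linear(dim, dim)    # projects pooled report features
        # Cross-modal fusion plus a binary match/no-match head (the ITM score).
        self.fusion = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.itm_head = nn.Linear(dim, 2)

    def cosine_score(self, img_feat, txt_feat):
        """Similarity of pooled embeddings only; fine-grained detail is lost."""
        z_i = F.normalize(self.image_proj(img_feat.mean(dim=1)), dim=-1)
        z_t = F.normalize(self.text_proj(txt_feat.mean(dim=1)), dim=-1)
        return (z_i * z_t).sum(dim=-1)

    def itm_score(self, img_feat, txt_feat):
        """Cross-attend report tokens over image patches, then classify match."""
        fused, _ = self.fusion(query=txt_feat, key=img_feat, value=img_feat)
        logits = self.itm_head(fused[:, 0])   # first fused token as summary
        return logits.softmax(dim=-1)[:, 1]   # probability of "match"


def retrieve_report(model, img_feat, corpus_feats, reports):
    """Return the corpus report whose ITM score with the image is highest."""
    scores = torch.stack(
        [model.itm_score(img_feat, t.unsqueeze(0)).squeeze(0) for t in corpus_feats]
    )
    return reports[int(scores.argmax())]


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyMatcher()
    img = torch.randn(1, 49, 256)                       # 49 image patch features
    corpus = [torch.randn(32, 256) for _ in range(3)]   # 3 candidate reports
    reports = ["No acute cardiopulmonary process.",
               "Right lower lobe opacity, possible pneumonia.",
               "Mild cardiomegaly, no pleural effusion."]
    print(retrieve_report(model, img, corpus, reports))
    # For comparison: pooled-embedding cosine similarity for the same candidates.
    cos = torch.stack([model.cosine_score(img, t.unsqueeze(0)).squeeze(0) for t in corpus])
    print(cos.tolist())
```

The design point the abstract makes maps onto the two methods: `itm_score` lets every report token attend over every image patch before the match decision, so fine-grained image-text agreement can influence the score, whereas `cosine_score` collapses each modality to a single pooled vector first and discards that interaction.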
Related papers
- Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation [42.13004422063442]
Acute ischemic stroke (AIS) requires time-critical management, as a delay of hours in intervention can lead to irreversible disability.
Since diffusion-weighted imaging (DWI), a magnetic resonance imaging (MRI) technique, plays a crucial role in the detection of AIS, automated prediction of AIS from DWI has been a research topic of clinical importance.
While text radiology reports contain the most relevant clinical information from the image findings, the difficulty of mapping across different modalities has limited the factuality of conventional direct DWI-to-report generation methods.
arXiv Detail & Related papers (2024-11-23T08:18:55Z)
- MedCycle: Unpaired Medical Report Generation via Cycle-Consistency [11.190146577567548]
We introduce an innovative approach that eliminates the need for consistent labeling schemas.
This approach is based on cycle-consistent mapping functions that transform image embeddings into report embeddings.
It outperforms state-of-the-art results in unpaired chest X-ray report generation, demonstrating improvements in both language and clinical metrics.
arXiv Detail & Related papers (2024-03-20T09:40:11Z)
- Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation [91.63262242041695]
We propose a novel Adaptive patch-word Matching (AdaMatch) model to correlate chest X-ray (CXR) image regions with words in medical reports.
AdaMatch exploits the fine-grained relation between adaptive patches and words to provide explanations of specific image regions with corresponding words.
To provide explicit explainability for the CXR-report generation task, we propose an AdaMatch-based bidirectional large language model for cyclic CXR-report generation.
arXiv Detail & Related papers (2023-12-13T11:47:28Z)
- RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance [53.20640629352422]
Conversational AI tools can generate and discuss clinically correct radiology reports for a given medical image.
RaDialog is the first thoroughly evaluated and publicly available large vision-language model for radiology report generation and interactive dialog.
Our method achieves state-of-the-art clinical correctness in report generation and shows impressive abilities in interactive tasks such as correcting reports and answering questions.
arXiv Detail & Related papers (2023-11-30T16:28:40Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting [5.596515201054671]
We propose a two-step approach for radiology report generation.
First, we extract the content from an image; then, we verbalize the extracted content into a report that matches the style of a specific radiologist.
arXiv Detail & Related papers (2023-10-26T23:06:38Z)
- Replace and Report: NLP Assisted Radiology Report Generation [31.309987297324845]
We propose a template-based approach to generate radiology reports from radiographs.
This is the first attempt to generate chest X-ray radiology reports by first creating short sentences for abnormal findings and then substituting them into a normal report template.
arXiv Detail & Related papers (2023-06-19T10:04:42Z)
- Writing by Memorizing: Hierarchical Retrieval-based Medical Report Generation [26.134055930805523]
We propose MedWriter, which incorporates a novel hierarchical retrieval mechanism to automatically extract both report-level and sentence-level templates.
MedWriter first employs the Visual-Language Retrieval (VLR) module to retrieve the reports most relevant to the given images.
To guarantee logical coherence between sentences, the Language-Language Retrieval (LLR) module is introduced to retrieve relevant sentences.
Finally, a language decoder fuses image features with features from the retrieved reports and sentences to generate meaningful medical reports (a minimal sketch of this two-stage retrieval pattern appears after this list).
arXiv Detail & Related papers (2021-05-25T07:47:23Z)
- Chest X-ray Report Generation through Fine-Grained Label Learning [46.352966049776875]
We present a domain-aware automatic chest X-ray radiology report generation algorithm that learns fine-grained descriptions of findings from images.
We also develop an automatic labeling algorithm for assigning such descriptors to images and build a novel deep learning network that recognizes both coarse and fine-grained descriptions of findings.
arXiv Detail & Related papers (2020-07-27T19:50:56Z)
- Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z)
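
Several entries above, most directly MedWriter's VLR/LLR modules, describe a two-stage retrieval pipeline: first retrieve candidate reports for the image, then retrieve individual sentences from which the final report is composed. The sketch below is a minimal illustration of that pattern under simplified assumptions, not any paper's implementation; the embeddings are stubbed with random tensors and the function names (`retrieve_reports`, `retrieve_sentences`) are hypothetical.

```python
# Minimal sketch of hierarchical (report-level, then sentence-level) retrieval
# in the spirit of MedWriter's VLR/LLR modules. Encoders and the final decoder
# are stubbed out; all names are illustrative, not the paper's implementation.
import torch
import torch.nn.functional as F


def retrieve_reports(image_emb, report_embs, k=2):
    """VLR-style step: rank candidate reports by similarity to the image."""
    sims = F.cosine_similarity(image_emb.unsqueeze(0), report_embs, dim=-1)
    return sims.topk(k).indices.tolist()


def retrieve_sentences(draft_emb, sentence_embs, k=3):
    """LLR-style step: pick sentences most similar to the current draft context."""
    sims = F.cosine_similarity(draft_emb.unsqueeze(0), sentence_embs, dim=-1)
    return sims.topk(min(k, sentence_embs.size(0))).indices.tolist()


if __name__ == "__main__":
    torch.manual_seed(0)
    dim = 128
    image_emb = torch.randn(dim)
    reports = [
        ["Heart size is normal.", "No focal consolidation."],
        ["Mild cardiomegaly.", "Small left pleural effusion."],
        ["Lines and tubes unchanged.", "No pneumothorax."],
    ]
    # Pretend these embeddings come from trained image/text encoders.
    report_embs = torch.randn(len(reports), dim)
    top_reports = retrieve_reports(image_emb, report_embs, k=2)

    candidate_sents = [s for i in top_reports for s in reports[i]]
    sentence_embs = torch.randn(len(candidate_sents), dim)
    draft_emb = torch.randn(dim)
    picked = retrieve_sentences(draft_emb, sentence_embs, k=3)

    # A language decoder would fuse image features with these retrieved
    # sentences; here we just print what would be passed to it.
    print([candidate_sents[i] for i in picked])
```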