TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation
- URL: http://arxiv.org/abs/2403.13343v1
- Date: Wed, 20 Mar 2024 07:00:03 GMT
- Title: TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation
- Authors: Santosh Sanjeev, Fadillah Adamsyah Maani, Arsen Abzhanov, Vijay Ram Papineni, Ibrahim Almakky, Bartłomiej W. Papież, Mohammad Yaqub,
- Abstract summary: TiBiX: Leveraging Temporal information for Bidirectional X-ray and Report Generation.
We propose TiBiX: Leveraging Temporal information for Bidirectional X-ray and Report Generation.
- Score: 0.7381551917607596
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the emergence of vision language models in the medical imaging domain, numerous studies have focused on two dominant research activities: (1) report generation from Chest X-rays (CXR), and (2) synthetic scan generation from text or reports. Despite some research incorporating multi-view CXRs into the generative process, prior patient scans and reports have been generally disregarded. This can inadvertently lead to the leaving out of important medical information, thus affecting generation quality. To address this, we propose TiBiX: Leveraging Temporal information for Bidirectional X-ray and Report Generation. Considering previous scans, our approach facilitates bidirectional generation, primarily addressing two challenging problems: (1) generating the current image from the previous image and current report and (2) generating the current report based on both the previous and current images. Moreover, we extract and release a curated temporal benchmark dataset derived from the MIMIC-CXR dataset, which focuses on temporal data. Our comprehensive experiments and ablation studies explore the merits of incorporating prior CXRs and achieve state-of-the-art (SOTA) results on the report generation task. Furthermore, we attain on-par performance with SOTA image generation efforts, thus serving as a new baseline in longitudinal bidirectional CXR-to-report generation. The code is available at https://github.com/BioMedIA-MBZUAI/TiBiX.
Related papers
- CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset [14.911363203907008]
X-ray image-based medical report generation can significantly reduce diagnostic burdens and patient wait times.
We conduct a comprehensive benchmarking of existing mainstream X-ray report generation models and large language models (LLMs) on the CheXpert Plus dataset.
We propose a large model for the X-ray image report generation using a multi-stage pre-training strategy, including self-supervised autoregressive generation and Xray-report contrastive learning.
arXiv Detail & Related papers (2024-10-01T04:07:01Z) - MAIRA-2: Grounded Radiology Report Generation [39.7576903743788]
Radiology reporting is a complex task that requires detailed image understanding, integration of multiple inputs, and precise language generation.
Here, we extend report generation to include the localisation of individual findings on the image - a task we call grounded report generation.
We introduce MAIRA-2, a large multimodal model combining a radiology-specific image encoder with a LLM, and trained for the new task of grounded report generation on chest X-rays.
arXiv Detail & Related papers (2024-06-06T19:12:41Z) - ORGAN: Observation-Guided Radiology Report Generation via Tree Reasoning [9.316999438459794]
We propose an observation-guided radiology report generation framework (ORGAN)
It first produces an observation plan and then feeds both the plan and radiographs for report generation.
Our framework outperforms previous state-of-the-art methods regarding text quality and clinical efficacy.
arXiv Detail & Related papers (2023-06-10T15:36:04Z) - Boosting Radiology Report Generation by Infusing Comparison Prior [7.054671146863795]
Recent transformer-based models have made significant strides in generating radiology reports from chest X-ray images.
These models often lack prior knowledge, resulting in the generation of synthetic reports that mistakenly reference non-existent prior exams.
We propose a novel approach that leverages a rule-based labeler to extract comparison prior information from radiology reports.
arXiv Detail & Related papers (2023-05-08T09:12:44Z) - Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report
Generation [92.73584302508907]
We propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning.
In detail, the fundamental structure of our graph is pre-constructed from general knowledge.
Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z) - Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two language models, the Show-Attend-Tell and the GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, the Open-I, MIMIC-CXR, and the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z) - AlignTransformer: Hierarchical Alignment of Visual Regions and Disease
Tags for Medical Report Generation [50.21065317817769]
We propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that the AlignTransformer can achieve results competitive with state-of-the-art methods on the two datasets.
arXiv Detail & Related papers (2022-03-18T13:43:53Z) - Contrastive Attention for Automatic Chest X-ray Report Generation [124.60087367316531]
In most cases, the normal regions dominate the entire chest X-ray image, and the corresponding descriptions of these normal regions dominate the final report.
We propose Contrastive Attention (CA) model, which compares the current input image with normal images to distill the contrastive information.
We achieve the state-of-the-art results on the two public datasets.
arXiv Detail & Related papers (2021-06-13T11:20:31Z) - Exploring and Distilling Posterior and Prior Knowledge for Radiology
Report Generation [55.00308939833555]
The PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE) and Multi-domain Knowledge Distiller (MKD)
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z) - Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report
Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.