A Comparison of Object Detection and Phrase Grounding Models in Chest X-ray Abnormality Localization using Eye-tracking Data
- URL: http://arxiv.org/abs/2503.01037v1
- Date: Sun, 02 Mar 2025 21:54:47 GMT
- Title: A Comparison of Object Detection and Phrase Grounding Models in Chest X-ray Abnormality Localization using Eye-tracking Data
- Authors: Elham Ghelichkhan, Tolga Tasdizen
- Abstract summary: This paper investigates how text enhances abnormality localization in chest X-rays by comparing the performance and explainability of these two tasks. To establish an explainability baseline, we propose an automatic pipeline that generates image regions for report sentences from radiologists' eye-tracking data. The better performance (mIoU = 0.36 vs. 0.20) and explainability (containment ratio 0.48 vs. 0.26) of the phrase grounding model indicate the effectiveness of text in enhancing chest X-ray abnormality localization.
- Score: 1.4272411349249625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chest diseases rank among the most prevalent and dangerous global health issues. Object detection and phrase grounding deep learning models interpret complex radiology data to assist healthcare professionals in diagnosis. Object detection localizes abnormalities from a fixed set of class labels, while phrase grounding localizes abnormalities described by free-text phrases. This paper investigates how text enhances abnormality localization in chest X-rays by comparing the performance and explainability of these two tasks. To establish an explainability baseline, we propose an automatic pipeline that generates image regions for report sentences from radiologists' eye-tracking data. The better performance (mIoU = 0.36 vs. 0.20) and explainability (containment ratio 0.48 vs. 0.26) of the phrase grounding model indicate the effectiveness of text in enhancing chest X-ray abnormality localization.
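The two reported metrics are standard to compute. Below is a minimal sketch, assuming the usual box-IoU definition and one plausible reading of the containment ratio (the fraction of a predicted box that falls inside the eye-tracking-derived region); the paper's exact formulation may differ.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def containment_ratio(pred_box, gaze_mask):
    """Assumed definition: fraction of the predicted box that falls inside
    the eye-tracking-derived region (a binary mask)."""
    x1, y1, x2, y2 = (int(v) for v in pred_box)
    box_area = max(1, (x2 - x1) * (y2 - y1))
    return gaze_mask[y1:y2, x1:x2].sum() / box_area

# mIoU over a set of (prediction, ground-truth) box pairs
pairs = [((10, 10, 60, 60), (20, 20, 70, 70)), ((5, 5, 30, 30), (8, 6, 28, 32))]
print(f"mIoU = {np.mean([iou(p, g) for p, g in pairs]):.2f}")

mask = np.zeros((100, 100))
mask[20:70, 20:70] = 1  # toy eye-tracking region
print(f"containment = {containment_ratio((10, 10, 60, 60), mask):.2f}")
```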
Related papers
- Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
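A minimal sketch of what such a multi-modal fusion step could look like; the encoders are stubs and all dimensions are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiModalReportEncoder(nn.Module):
    """Fuse image features with structured patient data and clinical-note
    embeddings into one context vector for a report decoder (sketch)."""

    def __init__(self, img_dim=512, vitals_dim=16, note_dim=768, fused_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, fused_dim)
        self.vitals_proj = nn.Linear(vitals_dim, fused_dim)   # vital signs, symptoms
        self.note_proj = nn.Linear(note_dim, fused_dim)       # clinical-note embedding
        self.mix = nn.Linear(3 * fused_dim, fused_dim)

    def forward(self, img_feat, vitals, note_emb):
        fused = torch.cat([self.img_proj(img_feat),
                           self.vitals_proj(vitals),
                           self.note_proj(note_emb)], dim=-1)
        return self.mix(fused)  # context vector consumed by a text decoder

ctx = MultiModalReportEncoder()(
    torch.randn(2, 512), torch.randn(2, 16), torch.randn(2, 768))
print(ctx.shape)  # torch.Size([2, 512])
```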
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
- DeepLOC: Deep Learning-based Bone Pathology Localization and Classification in Wrist X-ray Images [1.45543311565555]
This paper presents a novel approach for bone pathology localization and classification in wrist X-ray images.
The proposed methodology addresses two critical challenges in wrist X-ray analysis: accurate localization of bone pathologies and precise classification of abnormalities.
arXiv Detail & Related papers (2023-08-24T12:06:10Z)
- Discrimination of Radiologists Utilizing Eye-Tracking Technology and Machine Learning: A Case Study [0.9142067094647588]
This study presents a novel discretized feature encoding based on binning fixation data for efficient geometric alignment.
The encoded features of the eye-fixation data are employed by machine learning classifiers to discriminate between faculty and trainee radiologists.
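A minimal sketch of one plausible reading of this encoding: histogram the fixation points on a coarse spatial grid and feed the flattened counts to a classifier. The grid size, normalization, and classifier choice are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def encode_fixations(fixations, grid=(8, 8), img_size=(1024, 1024)):
    """Bin (x, y) fixation points into a coarse grid histogram and
    normalize by the number of fixations (assumed encoding)."""
    hist, _, _ = np.histogram2d(
        [f[1] for f in fixations], [f[0] for f in fixations],
        bins=grid, range=[[0, img_size[1]], [0, img_size[0]]])
    return (hist / max(1, len(fixations))).ravel()

# Toy example: random scanpaths for two reader groups (0 = trainee, 1 = faculty)
rng = np.random.default_rng(0)
X = np.stack([encode_fixations(rng.uniform(0, 1024, size=(50, 2)))
              for _ in range(40)])
y = rng.integers(0, 2, size=40)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("train accuracy:", clf.score(X, y))
```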
arXiv Detail & Related papers (2023-08-04T23:51:47Z)
- Cross-Modal Causal Intervention for Medical Report Generation [109.83549148448469]
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance.
Due to the spurious correlations within image-text data induced by visual and linguistic biases, it is challenging to generate accurate reports reliably describing lesion areas.
We propose a novel Visual-Linguistic Causal Intervention (VLCI) framework for MRG, which consists of a visual deconfounding module (VDM) and a linguistic deconfounding module (LDM).
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
- Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
We combine two models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records.
The proposed model is tested on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
arXiv Detail & Related papers (2022-09-28T10:27:10Z)
- Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays [65.88435151891369]
Radiomics-Guided Transformer (RGT) fuses global image information with local knowledge-guided radiomics information.
RGT consists of an image Transformer branch, a radiomics Transformer branch, and fusion layers that aggregate image and radiomic information.
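A minimal sketch of such a two-branch fusion; the layer counts, dimensions, and attention-based fusion below are illustrative assumptions rather than the published RGT design.

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """One Transformer branch over image patch tokens, one over radiomics
    feature tokens, fused by letting image tokens attend to radiomics
    tokens (sketch)."""

    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, n_heads, batch_first=True),
            num_layers=2)
        self.image_branch = enc()
        self.radiomics_branch = enc()
        self.fuse = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.head = nn.Linear(dim, 14)  # e.g. 14 thoracic disease labels

    def forward(self, img_tokens, rad_tokens):
        g = self.image_branch(img_tokens)      # global image context
        l = self.radiomics_branch(rad_tokens)  # local radiomics context
        fused, _ = self.fuse(g, l, l)          # image queries attend to radiomics
        return self.head(fused.mean(dim=1))    # pooled classification logits

logits = TwoBranchFusion()(torch.randn(2, 196, 256), torch.randn(2, 10, 256))
print(logits.shape)  # torch.Size([2, 14])
```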
arXiv Detail & Related papers (2022-07-10T06:32:56Z)
- Anatomy-Guided Weakly-Supervised Abnormality Localization in Chest X-rays [17.15666977702355]
We propose an Anatomy-Guided chest X-ray Network (AGXNet) to address weak annotation issues.
Our framework consists of a cascade of two networks, one responsible for identifying anatomical abnormalities and the second for detecting pathological observations.
Our results on the MIMIC-CXR dataset demonstrate the effectiveness of AGXNet in disease and anatomical abnormality localization.
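A minimal sketch of such a cascade, with weak image-level supervision obtained by max-pooling location scores; the backbones and channel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CascadeAGX(nn.Module):
    """First CNN scores anatomical abnormality per location; the second
    reuses those maps as spatial guidance for pathology observations
    (sketch in the spirit of a two-stage cascade)."""

    def __init__(self, n_anatomy=8, n_pathology=14):
        super().__init__()
        self.anatomy_net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_anatomy, 1))              # per-pixel anatomy logits
        self.pathology_net = nn.Sequential(
            nn.Conv2d(1 + n_anatomy, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_pathology, 1))            # per-pixel pathology logits

    def forward(self, xray):
        anat = self.anatomy_net(xray)
        guided = torch.cat([xray, anat.sigmoid()], dim=1)  # anatomy-guided input
        path = self.pathology_net(guided)
        # weak supervision: image-level scores via max over spatial locations
        return anat.amax(dim=(2, 3)), path.amax(dim=(2, 3))

a, p = CascadeAGX()(torch.randn(2, 1, 224, 224))
print(a.shape, p.shape)  # torch.Size([2, 8]) torch.Size([2, 14])
```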
arXiv Detail & Related papers (2022-06-25T18:33:27Z)
- Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
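A minimal sketch of the augmentation step only, assuming a pretrained identity-preserving generator is available; the stub generator below is a placeholder, not the paper's GAN.

```python
import torch
import torch.nn as nn

class StubGenerator(nn.Module):
    """Stand-in for a pretrained conditional generator G(image, domain) that
    translates an X-ray to a target domain while preserving patient identity."""
    def forward(self, x, domain):
        return x + 0.05 * torch.randn_like(x)  # pretend domain translation

G = StubGenerator()
real_batch = torch.randn(4, 1, 224, 224)        # real source-domain X-rays
fake_batch = G(real_batch, domain=1)            # generated target-domain X-rays
augmented = torch.cat([real_batch, fake_batch], dim=0)
print(augmented.shape)  # doubled detector training batch: torch.Size([8, 1, 224, 224])
```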
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
- Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation [3.3978173451092437]
Radiology report generation aims at generating descriptive text from radiology images automatically.
A typical setting consists of training encoder-decoder models on image-report pairs with a cross-entropy loss.
We propose a novel weakly supervised contrastive loss for medical report generation.
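A minimal sketch of one plausible weakly supervised contrastive loss, where studies sharing a weak disease label form positive pairs; the pairing rule and temperature are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def weakly_supervised_contrastive_loss(img_emb, txt_emb, weak_labels, tau=0.1):
    """Pull together image/report embeddings of studies sharing at least
    one weak disease label; push all other pairs apart (sketch)."""
    img = F.normalize(img_emb, dim=1)
    txt = F.normalize(txt_emb, dim=1)
    sim = img @ txt.t() / tau                            # (B, B) similarities
    pos = (weak_labels @ weak_labels.t() > 0).float()    # shared-label positives
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)  # log-softmax over reports
    per_image = (pos * log_prob).sum(dim=1) / pos.sum(dim=1).clamp(min=1)
    return -per_image.mean()

loss = weakly_supervised_contrastive_loss(
    torch.randn(4, 128), torch.randn(4, 128),
    (torch.rand(4, 14) > 0.8).float())  # toy weak disease labels
print(loss.item())
```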
arXiv Detail & Related papers (2021-09-25T00:06:23Z)
- Contrastive Attention for Automatic Chest X-ray Report Generation [124.60087367316531]
In most cases, the normal regions dominate the entire chest X-ray image, and the corresponding descriptions of these normal regions dominate the final report.
We propose a Contrastive Attention (CA) model, which compares the current input image with normal images to distill contrastive information.
We achieve state-of-the-art results on two public datasets.
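A minimal sketch of the contrastive idea: attend from the input image's features to a pool of normal-image features and keep the residual, which emphasizes what the input has that normal images lack. This is an illustrative reading, not the published CA model.

```python
import torch
import torch.nn.functional as F

def contrastive_features(x_feat, normal_pool, tau=0.07):
    """Subtract the attended 'closest normal appearance' from the input
    features so abnormal content dominates the residual (sketch)."""
    attn = F.softmax(x_feat @ normal_pool.t() / tau, dim=-1)  # (N, M)
    normal_ctx = attn @ normal_pool                           # per-token normal match
    return x_feat - normal_ctx                                # contrastive residual

residual = contrastive_features(torch.randn(196, 256), torch.randn(50, 256))
print(residual.shape)  # torch.Size([196, 256])
```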
arXiv Detail & Related papers (2021-06-13T11:20:31Z)
- Contralaterally Enhanced Networks for Thoracic Disease Detection [120.60868136876599]
There exist many similar structures in the left and right parts of the chest, such as ribs, lung fields and bronchial tubes.
Such similarities can be used to identify diseases in chest X-rays, drawing on the experience of board-certified radiologists.
We propose a deep end-to-end module to exploit the contralateral context information for enhancing feature representations of disease proposals.
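A minimal sketch of exploiting left-right symmetry: compare a feature map with its horizontal mirror so asymmetries (often pathological) stand out. Fusing by a 1x1 convolution over the concatenation is an illustrative assumption.

```python
import torch
import torch.nn as nn

class ContralateralContext(nn.Module):
    """Fuse a feature map with its horizontally mirrored copy so that
    left-right asymmetries are emphasized (sketch)."""

    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat):                   # feat: (B, C, H, W)
        mirrored = torch.flip(feat, dims=[3])  # flip along the width axis
        return self.fuse(torch.cat([feat, mirrored], dim=1))

out = ContralateralContext()(torch.randn(2, 64, 28, 28))
print(out.shape)  # torch.Size([2, 64, 28, 28])
```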
arXiv Detail & Related papers (2020-10-09T10:15:26Z)
- Learning Invariant Feature Representation to Improve Generalization across Chest X-ray Datasets [55.06983249986729]
We show that a deep learning model that performs well when tested on the dataset it was trained on starts to perform poorly when tested on a dataset from a different source.
By employing an adversarial training strategy, we show that a network can be forced to learn a source-invariant representation.
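Gradient reversal is the standard mechanism for this kind of adversarial feature alignment; whether the paper uses this exact mechanism is an assumption. A minimal sketch:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

features = nn.Linear(512, 128)   # shared feature extractor (illustrative)
domain_head = nn.Linear(128, 2)  # predicts which dataset an image came from

x = torch.randn(8, 512)
feat = features(x)
domain_logits = domain_head(GradReverse.apply(feat))
# Minimizing this loss trains domain_head to identify the source dataset,
# while the reversed gradient pushes `features` toward source-invariance.
loss = nn.functional.cross_entropy(domain_logits, torch.randint(0, 2, (8,)))
loss.backward()
print(f"domain loss = {loss.item():.3f}")
```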
arXiv Detail & Related papers (2020-08-04T07:41:15Z)