Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns
- URL: http://arxiv.org/abs/2404.02370v1
- Date: Wed, 3 Apr 2024 00:09:05 GMT
- Title: Enhancing Human-Computer Interaction in Chest X-ray Analysis using Vision and Language Model with Eye Gaze Patterns
- Authors: Yunsoo Kim, Jinge Wu, Yusuf Abdulle, Yue Gao, Honghan Wu,
- Abstract summary: Vision-Language Models (VLMs) enhanced with radiologists' attention by incorporating eye gaze data alongside textual prompts.
Heatmaps generated from eye gaze data, overlaying them onto medical images to highlight areas of intense radiologist's focus.
Results demonstrate the inclusion of eye gaze information significantly enhances the accuracy of chest X-ray analysis.
- Score: 7.6599164274971026
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Computer Assisted Diagnosis have shown promising performance in medical imaging tasks, particularly in chest X-ray analysis. However, the interaction between these models and radiologists has been primarily limited to input images. This work proposes a novel approach to enhance human-computer interaction in chest X-ray analysis using Vision-Language Models (VLMs) enhanced with radiologists' attention by incorporating eye gaze data alongside textual prompts. Our approach leverages heatmaps generated from eye gaze data, overlaying them onto medical images to highlight areas of intense radiologist's focus during chest X-ray evaluation. We evaluate this methodology in tasks such as visual question answering, chest X-ray report automation, error detection, and differential diagnosis. Our results demonstrate the inclusion of eye gaze information significantly enhances the accuracy of chest X-ray analysis. Also, the impact of eye gaze on fine-tuning was confirmed as it outperformed other medical VLMs in all tasks except visual question answering. This work marks the potential of leveraging both the VLM's capabilities and the radiologist's domain knowledge to improve the capabilities of AI models in medical imaging, paving a novel way for Computer Assisted Diagnosis with a human-centred AI.
Related papers
- D-Rax: Domain-specific Radiologic assistant leveraging multi-modal data and eXpert model predictions [8.50767187405446]
We propose D-Rax -- a domain-specific, conversational, radiologic assistance tool.
We enhance the conversational analysis of chest X-ray (CXR) images to support radiological reporting.
We observe statistically significant improvement in responses when evaluated for both open and close-ended conversations.
arXiv Detail & Related papers (2024-07-02T18:43:10Z) - Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction [10.388541520456714]
Our proposed system aims to predict eye gaze sequences from radiology reports and CXR images.
Our model predicts fixation coordinates and durations critical for medical scanpath prediction, outperforming existing models in the computer vision community.
Based on the radiologist's evaluation, MedGaze can generate human-like gaze sequences with a high focus on relevant regions.
arXiv Detail & Related papers (2024-06-28T06:38:58Z) - Computer-Aided Diagnosis of Thoracic Diseases in Chest X-rays using hybrid CNN-Transformer Architecture [1.0878040851637998]
An automated computer-aided diagnosis system can interpret chest X-rays to augment radiologists by providing actionable insights.
In this study, we applied a novel architecture augmenting the DenseNet121 Convolutional Neural Network (CNN) with multi-head self-attention mechanism.
Experimental results show that augmenting CNN with self-attention has potential in diagnosing different thoracic diseases from chest X-rays.
arXiv Detail & Related papers (2024-04-18T01:46:31Z) - Radiology Report Generation Using Transformers Conditioned with
Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z) - I-AI: A Controllable & Interpretable AI System for Decoding
Radiologists' Intense Focus for Accurate CXR Diagnoses [9.260958560874812]
Interpretable Artificial Intelligence (I-AI) is a novel and unified controllable interpretable pipeline.
Our I-AI addresses three key questions: where a radiologist looks, how long they focus on specific areas, and what findings they diagnose.
arXiv Detail & Related papers (2023-09-24T04:48:44Z) - XrayGPT: Chest Radiographs Summarization using Medical Vision-Language
Models [60.437091462613544]
We introduce XrayGPT, a novel conversational medical vision-language model.
It can analyze and answer open-ended questions about chest radiographs.
We generate 217k interactive and high-quality summaries from free-text radiology reports.
arXiv Detail & Related papers (2023-06-13T17:59:59Z) - Act Like a Radiologist: Radiology Report Generation across Anatomical Regions [50.13206214694885]
X-RGen is a radiologist-minded report generation framework across six anatomical regions.
In X-RGen, we seek to mimic the behaviour of human radiologists, breaking them down into four principal phases.
We enhance the recognition capacity of the image encoder by analysing images and reports across various regions.
arXiv Detail & Related papers (2023-05-26T07:12:35Z) - Computer Vision on X-ray Data in Industrial Production and Security
Applications: A survey [89.45221564651145]
This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications.
It covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets.
arXiv Detail & Related papers (2022-11-10T13:37:36Z) - Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using a conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.