Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis
- URL: http://arxiv.org/abs/2312.06069v2
- Date: Tue, 12 Dec 2023 05:45:49 GMT
- Title: Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis
- Authors: Zihao Zhao, Sheng Wang, Qian Wang, Dinggang Shen
- Abstract summary: We propose eye-tracking as an alternative to text reports for medical images.
By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning.
We introduce the Medical contrastive Gaze Image Pre-training (McGIP) as a plug-and-play module for contrastive learning frameworks.
- Score: 61.089776864520594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Obtaining large-scale radiology reports for medical images can be difficult for various reasons, limiting the effectiveness of contrastive pre-training in the medical image domain and underscoring the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows gaze signals to be collected passively without disturbing radiologists' routine diagnostic process. By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning. When a radiologist has similar gazes for two medical images, it may indicate semantic similarity for diagnosis, and these images should be treated as a positive pair when pre-training a computer-assisted diagnosis (CAD) network through contrastive learning. Accordingly, we introduce Medical contrastive Gaze Image Pre-training (McGIP) as a plug-and-play module for contrastive learning frameworks. McGIP uses radiologists' gaze to guide contrastive pre-training. We evaluate our method on two representative types of medical images and two common types of gaze data. The experimental results demonstrate the practicality of McGIP, indicating its high potential for various clinical scenarios and applications.
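The abstract's core idea — treating two images as a positive pair when a radiologist's gaze patterns on them are similar — can be sketched in a few lines. This is a minimal illustration, not the paper's actual algorithm: the abstract does not specify how gaze similarity is computed, so the cosine similarity over gaze heatmaps and the `threshold` parameter below are assumptions for illustration only.

```python
import numpy as np

def gaze_similarity(heatmap_a, heatmap_b):
    """Cosine similarity between two flattened gaze heatmaps (assumed metric)."""
    a, b = heatmap_a.ravel(), heatmap_b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def build_positive_pairs(heatmaps, threshold=0.8):
    """Mark image index pairs whose gaze heatmaps are similar as positives.

    The returned pairs would then be fed to a contrastive loss (e.g. InfoNCE)
    in place of the usual augmentation-based positives.
    """
    pairs = []
    for i in range(len(heatmaps)):
        for j in range(i + 1, len(heatmaps)):
            if gaze_similarity(heatmaps[i], heatmaps[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

In this sketch, images whose gaze falls on the same regions form positives, while images with disjoint gaze regions (cosine similarity near zero) are left as negatives, mirroring the plug-and-play role McGIP plays for an existing contrastive framework.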
Related papers
- VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics [0.0]
Visual attribution in medical imaging seeks to make evident the diagnostically-relevant components of a medical image.
We here present a novel generative visual attribution technique, one that leverages latent diffusion models in combination with domain-specific large language models.
The resulting system also exhibits a range of latent capabilities including zero-shot localized disease induction.
arXiv Detail & Related papers (2024-01-02T19:51:49Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- KiUT: Knowledge-injected U-Transformer for Radiology Report Generation [10.139767157037829]
Radiology report generation aims to automatically generate a clinically accurate and coherent paragraph from the X-ray image.
We propose a Knowledge-injected U-Transformer (KiUT) to learn multi-level visual representation and adaptively distill the information.
arXiv Detail & Related papers (2023-06-20T07:27:28Z)
- Multimorbidity Content-Based Medical Image Retrieval Using Proxies [37.47987844057842]
We propose a novel multi-label metric learning method that can be used for both classification and content-based image retrieval.
Our model is able to support diagnosis by predicting the presence of diseases and providing evidence for these predictions.
We demonstrate the efficacy of our approach to both classification and content-based image retrieval on two multimorbidity radiology datasets.
arXiv Detail & Related papers (2022-11-22T11:23:53Z)
- Cyclic Generative Adversarial Networks With Congruent Image-Report Generation For Explainable Medical Image Analysis [5.6512908295414]
We present a novel framework for explainable labeling and interpretation of medical images.
The aim of the work is to generate trustworthy and faithful explanations for the outputs of a model diagnosing chest x-ray images.
arXiv Detail & Related papers (2022-11-16T12:41:21Z)
- BI-RADS-Net: An Explainable Multitask Learning Approach for Cancer Diagnosis in Breast Ultrasound Images [69.41441138140895]
This paper introduces BI-RADS-Net, a novel explainable deep learning approach for cancer detection in breast ultrasound images.
The proposed approach incorporates tasks for explaining and classifying breast tumors, by learning feature representations relevant to clinical diagnosis.
Explanations of the predictions (benign or malignant) are provided in terms of morphological features that are used by clinicians for diagnosis and reporting in medical practice.
arXiv Detail & Related papers (2021-10-05T19:14:46Z)
- Automated Knee X-ray Report Generation [12.732469371097347]
We propose to take advantage of past radiological exams and formulate a framework capable of learning the correspondence between the images and reports.
We demonstrate how aggregating the image features of individual exams and using them as conditional inputs when training a language generation model results in auto-generated exam reports.
arXiv Detail & Related papers (2021-05-22T11:59:42Z)
- Act Like a Radiologist: Towards Reliable Multi-view Correspondence Reasoning for Mammogram Mass Detection [49.14070210387509]
We propose an Anatomy-aware Graph convolutional Network (AGN) for mammogram mass detection.
AGN is tailored for mammogram mass detection and endows existing detection methods with multi-view reasoning ability.
Experiments on two standard benchmarks reveal that AGN significantly exceeds the state-of-the-art performance.
arXiv Detail & Related papers (2021-05-21T06:48:34Z)
- Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis [102.40869566439514]
We seek to exploit rich labeled data from relevant domains to aid learning in the target task via Unsupervised Domain Adaptation (UDA).
Unlike most UDA methods that rely on clean labeled data or assume samples are equally transferable, we innovatively propose a Collaborative Unsupervised Domain Adaptation algorithm.
We theoretically analyze the generalization performance of the proposed method, and also empirically evaluate it on both medical and general images.
arXiv Detail & Related papers (2020-07-05T11:49:17Z)
- Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation [107.3538598876467]
We propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns.
ASGK integrates internal visual feature fusion and external medical linguistic information to guide medical knowledge transfer and learning.
arXiv Detail & Related papers (2020-06-06T01:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.