CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling
- URL: http://arxiv.org/abs/2507.12591v1
- Date: Wed, 16 Jul 2025 19:21:05 GMT
- Title: CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling
- Authors: Trong-Thang Pham, Akash Awasthi, Saba Khan, Esteban Duran Marti, Tien-Phat Nguyen, Khoa Vo, Minh Tran, Ngoc Son Nguyen, Cuong Tran Van, Yuki Ikebe, Anh Totti Nguyen, Anh Nguyen, Zhigang Deng, Carol C. Wu, Hien Van Nguyen, Ngan Le,
- Abstract summary: We present the first publicly available eye gaze dataset on CT, called CT-ScanGaze.<n>We then introduce CT-Searcher, a novel 3D scanpath predictor designed specifically to process CT volumes and generate radiologist-like 3D fixation sequences.
- Score: 12.457017701871273
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Understanding radiologists' eye movement during Computed Tomography (CT) reading is crucial for developing effective interpretable computer-aided diagnosis systems. However, CT research in this area has been limited by the lack of publicly available eye-tracking datasets and the three-dimensional complexity of CT volumes. To address these challenges, we present the first publicly available eye gaze dataset on CT, called CT-ScanGaze. Then, we introduce CT-Searcher, a novel 3D scanpath predictor designed specifically to process CT volumes and generate radiologist-like 3D fixation sequences, overcoming the limitations of current scanpath predictors that only handle 2D inputs. Since deep learning models benefit from a pretraining step, we develop a pipeline that converts existing 2D gaze datasets into 3D gaze data to pretrain CT-Searcher. Through both qualitative and quantitative evaluations on CT-ScanGaze, we demonstrate the effectiveness of our approach and provide a comprehensive assessment framework for 3D scanpath prediction in medical imaging.
Related papers
- Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification [0.0]
Multi-label classification of 3D CT scans is a challenging task due to the volumetric nature of the data and the variety of anomalies to be detected.<n>Existing deep learning methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies effectively.<n>We present CT-Scroll, a novel global-local attention model specifically designed to emulate the scrolling behavior of radiologists during the analysis of 3D CT scans.
arXiv Detail & Related papers (2025-03-26T15:47:50Z) - Explaining 3D Computed Tomography Classifiers with Counterfactuals [5.782952470371709]
We extend the Latent Shift counterfactual generation method from 2D applications to explain 3D computed tomography (CT) scans.<n>We implement a slice-based autoencoder and gradient blocking except for specific chunks of slices.<n>Our approach is both memory-efficient and effective for generating interpretable counterfactuals in high-resolution 3D medical imaging.
arXiv Detail & Related papers (2025-02-11T00:44:20Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography [10.110878689623961]
We introduce CT-RATE, the first dataset that pairs 3D medical images with corresponding textual reports.<n>We develop CT-CLIP, a CT-focused contrastive language-image pretraining framework.<n>We create CT-CHAT, a vision-language foundational chat model for 3D chest CT volumes.
arXiv Detail & Related papers (2024-03-26T16:19:56Z) - Deep learning network to correct axial and coronal eye motion in 3D OCT
retinal imaging [65.47834983591957]
We propose deep learning based neural networks to correct axial and coronal motion artifacts in OCT based on a single scan.
The experimental result shows that the proposed method can effectively correct motion artifacts and achieve smaller error than other methods.
arXiv Detail & Related papers (2023-05-27T03:55:19Z) - Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging.
Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image.
This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses.
We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z) - Slice-level Detection of Intracranial Hemorrhage on CT Using Deep
Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train emphslice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis.
We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge.
The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z) - COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset
for Computer-aided COVID-19 Screening from Chest CT Images [82.74877848011798]
We introduce COVIDx CT-3, a large-scale benchmark dataset for detection of COVID-19 cases from chest CT images.
COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries.
We examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding significant geographic and class imbalances.
arXiv Detail & Related papers (2022-06-07T06:35:48Z) - Automated Model Design and Benchmarking of 3D Deep Learning Models for
COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification.
We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
arXiv Detail & Related papers (2021-01-14T03:45:01Z) - XraySyn: Realistic View Synthesis From a Single Radiograph Through CT
Priors [118.27130593216096]
A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane.
To the best of our knowledge, this is the first work on radiograph view synthesis.
We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.
arXiv Detail & Related papers (2020-12-04T05:08:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.