Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification
- URL: http://arxiv.org/abs/2503.20652v4
- Date: Fri, 06 Jun 2025 07:43:45 GMT
- Title: Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification
- Authors: Theo Di Piazza, Carole Lazarus, Olivier Nempont, Loic Boussel
- Abstract summary: Multi-label classification of 3D CT scans is a challenging task due to the volumetric nature of the data and the variety of anomalies to be detected. Existing deep learning methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies effectively. We present CT-Scroll, a novel global-local attention model specifically designed to emulate the scrolling behavior of radiologists during the analysis of 3D CT scans.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid increase in the number of Computed Tomography (CT) scan examinations has created an urgent need for automated tools, such as organ segmentation, anomaly classification, and report generation, to assist radiologists with their growing workload. Multi-label classification of Three-Dimensional (3D) CT scans is a challenging task due to the volumetric nature of the data and the variety of anomalies to be detected. Existing deep learning methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies effectively, while Vision Transformers require extensive pre-training, posing challenges for practical use. Additionally, these existing methods do not explicitly model the radiologist's navigational behavior while scrolling through CT scan slices, which requires both global context understanding and local detail awareness. In this study, we present CT-Scroll, a novel global-local attention model specifically designed to emulate the scrolling behavior of radiologists during the analysis of 3D CT scans. Our approach is evaluated on two public datasets, demonstrating its efficacy through comprehensive experiments and an ablation study that highlights the contribution of each model component.
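The abstract describes a two-stage attention scheme: local attention over neighboring slices (mimicking scroll-through reading) combined with global attention across the whole volume. The paper's implementation is not reproduced here; the following is a minimal PyTorch sketch of that general global-local idea, in which per-slice features are attended within small axial windows and pooled window tokens are attended globally. All names and design details (window size, pooling, head counts) are illustrative assumptions, not the authors' architecture.

```python
# Illustrative sketch of global-local attention over per-slice CT features.
# Assumption: slice_feats come from a 2D encoder applied slice by slice.
import torch
import torch.nn as nn


class GlobalLocalSliceAttention(nn.Module):
    def __init__(self, dim=256, num_heads=8, window_size=8, num_labels=18):
        super().__init__()
        self.window_size = window_size
        # Local attention: slices attend to neighbors inside a small axial window,
        # loosely mimicking a radiologist scrolling through adjacent slices.
        self.local_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Global attention: one pooled token per window attends across the volume.
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_local = nn.LayerNorm(dim)
        self.norm_global = nn.LayerNorm(dim)
        self.classifier = nn.Linear(dim, num_labels)  # multi-label logits

    def forward(self, slice_feats):
        # slice_feats: (batch, num_slices, dim)
        b, s, d = slice_feats.shape
        w = self.window_size
        assert s % w == 0, "pad num_slices to a multiple of window_size"

        # Local stage: self-attention within each window of w consecutive slices.
        local = slice_feats.reshape(b * s // w, w, d)
        local_out, _ = self.local_attn(local, local, local)
        local = self.norm_local(local + local_out).reshape(b, s, d)

        # Global stage: mean-pool each window to a token, attend across windows.
        tokens = local.reshape(b, s // w, w, d).mean(dim=2)   # (b, s/w, d)
        global_out, _ = self.global_attn(tokens, tokens, tokens)
        tokens = self.norm_global(tokens + global_out)

        # Volume-level representation -> multi-label logits (use BCEWithLogitsLoss).
        return self.classifier(tokens.mean(dim=1))


if __name__ == "__main__":
    feats = torch.randn(2, 64, 256)            # 2 volumes, 64 slices, 256-dim features
    logits = GlobalLocalSliceAttention()(feats)
    print(logits.shape)                        # torch.Size([2, 18])
```

The two-stage split keeps attention cost manageable for volumes with hundreds of slices: local attention is quadratic only in the window size, while global attention operates on the much shorter sequence of window tokens.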
Related papers
- Structured Spectral Graph Learning for Anomaly Classification in 3D Chest CT Scans [0.0]
We propose a new graph-based approach that models CT scans as structured graphs, leveraging axial slice-triplet nodes processed through spectral-domain convolution to enhance anomaly classification performance. Our method exhibits strong cross-dataset generalization and competitive performance while remaining robust to z-axis translation.
arXiv Detail & Related papers (2025-08-01T19:52:34Z) - CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling [12.457017701871273]
We present the first publicly available eye gaze dataset on CT, called CT-ScanGaze. We then introduce CT-Searcher, a novel 3D scanpath predictor designed specifically to process CT volumes and generate radiologist-like 3D fixation sequences.
arXiv Detail & Related papers (2025-07-16T19:21:05Z) - Rethinking Whole-Body CT Image Interpretation: An Abnormality-Centric Approach [57.86418347491272]
We propose a comprehensive hierarchical classification system with 404 representative abnormal findings across all body regions. We contribute a dataset containing over 14.5K CT images from multiple planes and all human body regions, and meticulously provide grounding annotations for over 19K abnormalities. We propose OminiAbnorm-CT, which can automatically ground and describe abnormal findings on multi-plane and whole-body CT images based on text queries.
arXiv Detail & Related papers (2025-06-03T17:57:34Z) - CT-Agent: A Multimodal-LLM Agent for 3D CT Radiology Question Answering [23.158482226185217]
A visual question answering (VQA) system that can answer radiologists' questions about anatomical regions on a CT scan is urgently needed. Existing VQA systems cannot adequately handle the CT radiology question answering (CTQA) task because: (1) anatomic complexity makes CT images difficult to understand; and (2) spatial relationships across hundreds of slices are difficult to capture. This paper proposes CT-Agent, a multimodal agentic framework for CTQA.
arXiv Detail & Related papers (2025-05-22T04:59:20Z) - A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT [67.34586036959793]
There is no fully annotated CT dataset with all anatomies delineated for training. We propose a novel continual learning-driven CT model that can segment complete anatomies. Our single unified CT segmentation model, CL-Net, can highly accurately segment a clinically comprehensive set of 235 fine-grained whole-body anatomies.
arXiv Detail & Related papers (2025-03-16T23:55:02Z) - 3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography [8.896955286474991]
We introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning.
Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations.
Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks.
arXiv Detail & Related papers (2025-02-04T23:42:18Z) - A Fast, Scalable, and Robust Deep Learning-based Iterative Reconstruction Framework for Accelerated Industrial Cone-beam X-ray Computed Tomography [5.104810959579395]
Cone-beam X-ray Computed Tomography (XCT) with large detectors and corresponding large-scale 3D reconstruction plays a pivotal role in micron-scale characterization of materials and parts across various industries. We present a novel deep neural network-based iterative algorithm that integrates an artifact-reduction-trained CNN as a prior model with automated regularization parameter selection.
arXiv Detail & Related papers (2025-01-21T19:34:01Z) - Abnormality-Driven Representation Learning for Radiology Imaging [0.8321462983924758]
We introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans.
We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models.
arXiv Detail & Related papers (2024-11-25T13:53:26Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - Multi-View Vertebra Localization and Identification from CT Images [57.56509107412658]
We propose a multi-view approach for vertebra localization and identification from CT images.
We convert the 3D problem into a 2D localization and identification task on different views.
Our method naturally learns multi-view global information.
arXiv Detail & Related papers (2023-07-24T14:43:07Z) - Slice-level Detection of Intracranial Hemorrhage on CT Using Deep Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train slice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis.
We obtain a single model among the top 4% of best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge.
The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - A unified 3D framework for Organs at Risk Localization and Segmentation for Radiation Therapy Planning [56.52933974838905]
Current medical workflows require manual delineation of organs-at-risk (OAR).
In this work, we aim to introduce a unified 3D pipeline for OAR localization-segmentation.
Our proposed framework fully enables the exploitation of 3D context information inherent in medical imaging.
arXiv Detail & Related papers (2022-03-01T17:08:41Z) - COVID-19 identification from volumetric chest CT scans using a progressively resized 3D-CNN incorporating segmentation, augmentation, and class-rebalancing [4.446085353384894]
COVID-19 is a pandemic disease that has spread rapidly worldwide.
Computer-aided screening tools with greater sensitivity are imperative for disease diagnosis and prognosis.
This article proposes a 3D Convolutional Neural Network (CNN)-based classification approach.
arXiv Detail & Related papers (2021-02-11T18:16:18Z) - Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for 3D DL models for 3D chest CT scan classification.
We also exploit the Class Activation Mapping (CAM) technique on our models to provide interpretability of the results; a minimal CAM sketch follows this list.
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
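The last entry above uses Class Activation Mapping (CAM) for interpretability. The sketch below shows the standard CAM computation for a 3D CNN that ends in global average pooling followed by a linear classifier, under the assumption that the final convolutional feature maps and classifier weights are accessible; function and variable names are illustrative, not taken from that paper.

```python
# Minimal sketch of Class Activation Mapping (CAM) for a 3D CNN classifier that
# ends in global average pooling + a linear layer. Names are illustrative.
import torch
import torch.nn.functional as F


def class_activation_map_3d(feature_maps, fc_weight, class_idx, out_shape):
    """
    feature_maps: (C, D, H, W) activations from the last conv layer for one scan.
    fc_weight:    (num_classes, C) weight matrix of the final linear classifier.
    class_idx:    class whose activation map is requested.
    out_shape:    (D, H, W) of the original CT volume, for upsampling.
    """
    # Weighted sum of channel activations using the class's classifier weights.
    cam = torch.einsum("c,cdhw->dhw", fc_weight[class_idx], feature_maps)
    cam = F.relu(cam)                                  # keep positive evidence only
    cam = cam - cam.min()
    cam = cam / (cam.max() + 1e-8)                     # normalize to [0, 1]
    # Upsample to the input resolution so the map can be overlaid on the CT volume.
    cam = F.interpolate(cam[None, None], size=out_shape,
                        mode="trilinear", align_corners=False)
    return cam[0, 0]                                   # (D, H, W)


if __name__ == "__main__":
    feats = torch.randn(64, 8, 16, 16)       # toy final-layer activations
    weights = torch.randn(2, 64)             # toy binary classifier weights
    heatmap = class_activation_map_3d(feats, weights, class_idx=1,
                                      out_shape=(64, 128, 128))
    print(heatmap.shape)                     # torch.Size([64, 128, 128])
```

The resulting per-voxel map highlights regions that contributed most to the chosen class score and can be overlaid slice by slice on the original scan.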
This list is automatically generated from the titles and abstracts of the papers in this site.