Superpowering Open-Vocabulary Object Detectors for X-ray Vision
- URL: http://arxiv.org/abs/2503.17071v1
- Date: Fri, 21 Mar 2025 11:54:16 GMT
- Title: Superpowering Open-Vocabulary Object Detectors for X-ray Vision
- Authors: Pablo Garcia-Fernandez, Lorenzo Vaquero, Mingxuan Liu, Feng Xue, Daniel Cores, Nicu Sebe, Manuel Mucientes, Elisa Ricci
- Abstract summary: Open-vocabulary object detection (OvOD) is set to revolutionize security screening by enabling systems to recognize any item in X-ray scans. We propose RAXO, a framework that repurposes off-the-shelf RGB OvOD detectors for robust X-ray detection. RAXO builds high-quality X-ray class descriptors using a dual-source retrieval strategy.
- Score: 53.07098133237041
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Open-vocabulary object detection (OvOD) is set to revolutionize security screening by enabling systems to recognize any item in X-ray scans. However, developing effective OvOD models for X-ray imaging presents unique challenges due to data scarcity and the modality gap that prevents direct adoption of RGB-based solutions. To overcome these limitations, we propose RAXO, a training-free framework that repurposes off-the-shelf RGB OvOD detectors for robust X-ray detection. RAXO builds high-quality X-ray class descriptors using a dual-source retrieval strategy. It gathers relevant RGB images from the web and enriches them via a novel X-ray material transfer mechanism, eliminating the need for labeled databases. These visual descriptors replace text-based classification in OvOD, leveraging intra-modal feature distances for robust detection. Extensive experiments demonstrate that RAXO consistently improves OvOD performance, providing an average mAP increase of up to 17.0 points over base detectors. To further support research in this emerging field, we also introduce DET-COMPASS, a new benchmark featuring bounding box annotations for over 300 object categories, enabling large-scale evaluation of OvOD in X-ray. Code and dataset available at: https://github.com/PAGF188/RAXO.
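The abstract's core idea, classifying detected regions against visual class descriptors rather than text embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`l2_normalize`, `build_descriptor`, `classify_region`), the use of mean-pooled exemplar features as a descriptor, and cosine similarity as the intra-modal distance are all assumptions made for illustration, and the feature vectors are toy values.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def build_descriptor(exemplars: Sequence[Vector]) -> List[float]:
    """Hypothetical descriptor: the normalized mean of normalized exemplar features."""
    normalized = [l2_normalize(e) for e in exemplars]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def classify_region(
    region_feature: Vector, descriptors: Dict[str, List[float]]
) -> Tuple[str, Dict[str, float]]:
    """Assign the class whose visual descriptor is most similar to the region feature."""
    query = l2_normalize(region_feature)
    scores = {
        name: sum(a * b for a, b in zip(query, desc))
        for name, desc in descriptors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy 3-dimensional features standing in for detector embeddings (assumed values).
descriptors = {
    "knife": build_descriptor([[0.9, 0.1, 0.0], [1.0, 0.2, 0.1]]),
    "laptop": build_descriptor([[0.1, 0.9, 0.3], [0.0, 1.0, 0.2]]),
}
label, scores = classify_region([0.8, 0.2, 0.05], descriptors)
print(label, scores)
```

In this sketch the query and the descriptors come from the same (visual) feature space, which is the property the abstract refers to as intra-modal; how RAXO actually retrieves exemplars and applies material transfer is not covered here.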
Related papers
- Fan-Beam CT Reconstruction for Unaligned Sparse-View X-ray Baggage Dataset [0.0]
We present a calibration and reconstruction method using an unaligned sparse multi-view X-ray baggage dataset. Our approach integrates multi-spectral neural attenuation field reconstruction with Linear pushbroom (LPB) camera model pose optimization.
arXiv Detail & Related papers (2024-12-04T05:16:54Z) - BGM: Background Mixup for X-ray Prohibited Items Detection [75.58709178012502]
This paper introduces a novel data augmentation approach tailored for prohibited item detection, leveraging unique characteristics inherent to X-ray imagery. Our method is motivated by observations of physical properties including: 1) X-ray Transmission Imagery: Unlike reflected light images, transmitted X-ray pixels represent composite information from multiple materials along the imaging path. We propose a simple yet effective X-ray image augmentation technique, Background Mixup (BGM), for prohibited item detection in security screening contexts.
arXiv Detail & Related papers (2024-11-30T12:26:55Z) - Enhancing Prohibited Item Detection through X-ray-Specific Augmentation and Contextual Feature Integration [81.11400642272976]
X-ray prohibited item detection faces challenges due to the long-tail distribution and unique characteristics of X-ray imaging. Traditional data augmentation strategies, such as copy-paste and mixup, are ineffective at improving the detection of rare items. We propose the X-ray Imaging-driven Detection Network (XIDNet) to address these challenges.
arXiv Detail & Related papers (2024-11-27T06:13:56Z) - Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP [6.934570446284497]
We introduce the distillation-based open-vocabulary object detection task into the X-ray security inspection domain.
It aims to detect novel prohibited item categories beyond base categories on which the detector is trained.
We propose an X-ray feature adapter and apply it to CLIP within the OVOD framework to develop the OVXD model.
arXiv Detail & Related papers (2024-06-16T14:42:52Z) - AO-DETR: Anti-Overlapping DETR for X-Ray Prohibited Items Detection [6.603436370737025]
We propose an Anti-Overlapping DETR (AO-DETR) based on one of the state-of-the-art general object detectors, DINO.
To address the feature coupling issue caused by overlapping phenomena, we introduce the Category-Specific One-to-One Assignment (CSA) strategy.
To address the edge blurring problem caused by overlapping phenomena, we propose the Look Forwardly scheme.
arXiv Detail & Related papers (2024-03-07T08:30:17Z) - Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training to improve the detection performance.
arXiv Detail & Related papers (2021-10-25T14:15:57Z) - On the impact of using X-ray energy response imagery for object detection via Convolutional Neural Networks [17.639472693362926]
We study the impact of variant X-ray imagery, i.e. X-ray energy response (high, low) and effective-z compared to geometries.
We evaluate CNN architectures to explore the transferability of models trained with such 'raw' variant imagery.
arXiv Detail & Related papers (2021-08-27T21:28:28Z) - Tensor Pooling Driven Instance Segmentation Framework for Baggage Threat Recognition [39.40595024569702]
We propose a novel multi-scale contour instance segmentation framework to identify cluttered contraband data in baggage X-ray scans.
The proposed framework is rigorously validated on three public datasets, dubbed GDXray, SIXray, and OPIXray.
To the best of our knowledge, this is the first contour instance segmentation framework that leverages multi-scale information to recognize cluttered and concealed contraband data.
arXiv Detail & Related papers (2021-08-22T00:04:58Z) - Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z) - Occluded Prohibited Items Detection: an X-ray Security Inspection Benchmark and De-occlusion Attention Module [50.75589128518707]
We contribute the first high-quality object detection dataset for security inspection, named OPIXray.
OPIXray focuses on the widely occurring prohibited item "cutter", annotated manually by professional inspectors from an international airport.
We propose the De-occlusion Attention Module (DOAM), a plug-and-play module that can be easily inserted into most popular detectors to improve their performance.
arXiv Detail & Related papers (2020-04-18T16:10:55Z)