Operationalizing Convolutional Neural Network Architectures for
Prohibited Object Detection in X-Ray Imagery
- URL: http://arxiv.org/abs/2110.04906v1
- Date: Sun, 10 Oct 2021 21:20:04 GMT
- Title: Operationalizing Convolutional Neural Network Architectures for
Prohibited Object Detection in X-Ray Imagery
- Authors: Thomas W. Webb, Neelanjan Bhowmik, Yona Falinie A. Gaus, Toby P.
Breckon
- Abstract summary: We explore the viability of two recent end-to-end object detection CNN architectures, Cascade R-CNN and FreeAnchor, for prohibited item detection.
With fewer parameters and less training time, FreeAnchor achieves the highest detection inference speed of ~13 fps (3.9 ms per image).
The CNN models display substantial resilience to the lossy compression, resulting in only a 1.1% decrease in mAP at the JPEG compression level of 50.
- Score: 15.694880385913534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in deep Convolutional Neural Networks (CNN) have brought
insight into the automation of X-ray security screening for aviation security
and beyond. Here, we explore the viability of two recent end-to-end object
detection CNN architectures, Cascade R-CNN and FreeAnchor, for prohibited item
detection by balancing processing time and the impact of image data compression
from an operational viewpoint. Overall, we achieve maximal detection
performance using a FreeAnchor architecture with a ResNet50 backbone, obtaining
mean Average Precision (mAP) of 87.7 and 85.8 on the OPIXray and SIXray
benchmark datasets, showing superior performance over prior work on both. With
fewer parameters and less training time, FreeAnchor achieves the highest
detection inference speed of ~13 fps (3.9 ms per image). Furthermore, we
evaluate the impact of lossy image compression upon detector performance. The
CNN models display substantial resilience to the lossy compression, resulting
in only a 1.1% decrease in mAP at the JPEG compression level of 50.
Additionally, a thorough evaluation of data augmentation techniques is
provided, including adaptations of the MixUp and CutMix strategies as well as other
standard transformations, further improving the detection accuracy.
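
As a rough illustration of the compression experiment described in the abstract, the sketch below (not taken from the paper) round-trips an X-ray scan through lossy JPEG at quality 50, the same level at which the authors report only a 1.1% mAP drop, and measures the pixel-level error this introduces before the image reaches a detector. It assumes Pillow and NumPy; the file path and the detector-scoring note are placeholders, and the paper's actual detectors, datasets, and mAP evaluation code are not reproduced here.

```python
# Minimal sketch, assuming Pillow and NumPy are available.
# 'baggage.png' is a stand-in path for an X-ray scan from OPIXray/SIXray.
import io

import numpy as np
from PIL import Image


def jpeg_recompress(image: Image.Image, quality: int = 50) -> Image.Image:
    """Round-trip an image through lossy JPEG at the given quality factor."""
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return Image.open(buffer).copy()


def pixel_error(original: Image.Image, compressed: Image.Image) -> float:
    """Mean absolute pixel difference introduced by the compression."""
    a = np.asarray(original.convert("RGB"), dtype=np.float32)
    b = np.asarray(compressed.convert("RGB"), dtype=np.float32)
    return float(np.mean(np.abs(a - b)))


if __name__ == "__main__":
    original = Image.open("baggage.png")  # hypothetical input scan
    degraded = jpeg_recompress(original, quality=50)
    print(f"Mean |pixel delta| at JPEG quality 50: "
          f"{pixel_error(original, degraded):.2f}")
    # In the study itself, detections on the degraded image would be scored
    # against ground-truth boxes (e.g. COCO-style mAP) and compared with the
    # detections obtained on the uncompressed original.
```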
Related papers
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image
  Compression [58.618625678054826] (arXiv, 2024-01-25)
  This study presents an enhanced neural compression method designed for optimal visual fidelity.
  We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
  Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
  Enhancement [77.0360085530701] (arXiv, 2023-12-12)
  Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
  Previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features.
  Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
- Learning Heavily-Degraded Prior for Underwater Object Detection
  [59.5084433933765] (arXiv, 2023-08-24)
  This paper seeks transferable prior knowledge from detector-friendly images.
  It is based on the statistical observation that the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps.
  Our method, with higher speed and fewer parameters, still performs better than transformer-based detectors.
- Attention-based Feature Compression for CNN Inference Offloading in Edge
  Computing [93.67044879636093] (arXiv, 2022-11-24)
  This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
  We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end-device.
  Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
- From Environmental Sound Representation to Robustness of 2D CNN Models
  Against Adversarial Attacks [82.21746840893658] (arXiv, 2022-04-14)
  This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
  We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
- EResFD: Rediscovery of the Effectiveness of Standard Convolution for
  Lightweight Face Detection [13.357235715178584] (arXiv, 2022-04-04)
  We re-examine the effectiveness of the standard convolutional block as a lightweight backbone architecture for face detection.
  We show that heavily channel-pruned standard convolution layers can achieve better accuracy and inference speed.
  Our proposed detector, EResFD, obtains 80.4% mAP on the WIDER FACE Hard subset while taking only 37.7 ms for VGA image inference on CPU.
- NeighCNN: A CNN based SAR Speckle Reduction using Feature preserving Loss
  Function [1.7188280334580193] (arXiv, 2021-08-26)
  NeighCNN is a deep learning-based speckle reduction algorithm that handles multiplicative noise.
  Various synthetic as well as real SAR images are used for testing the NeighCNN architecture.
- Efficient CNN-LSTM based Image Captioning using Neural Network Compression
  [0.0] (arXiv, 2020-12-17)
  We present an unconventional end-to-end compression pipeline for a CNN-LSTM based image captioning model.
  We then examine the effects of different compression architectures on the model and design a compression architecture that achieves a 73.1% reduction in model size.
- Boosting High-Level Vision with Joint Compression Artifacts Reduction and
  Super-Resolution [10.960291115491504] (arXiv, 2020-10-18)
  We generate an artifact-free high-resolution image from a low-resolution one compressed with an arbitrary quality factor.
  A context-aware joint CAR and SR neural network (CAJNN) integrates both local and non-local features to solve CAR and SR in one stage.
  A deep reconstruction network is adopted to predict high-quality and high-resolution images.
- On the Impact of Lossy Image and Video Compression on the Performance of
  Deep Convolutional Neural Network Architectures [17.349420462716886] (arXiv, 2020-07-28)
  This study investigates the impact of commonplace image and video compression techniques on the performance of deep learning architectures.
  We examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation.
  Results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied.
- Perceptually Optimizing Deep Image Compression [53.705543593594285] (arXiv, 2020-07-03)
  Mean squared error (MSE) and $\ell_p$ norms have largely dominated the measurement of loss in neural networks.
  We propose a different proxy approach to optimize image analysis networks against quantitative perceptual models.
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.