Related papers: Confidence Aware SSD Ensemble with Weighted Boxes Fusion for Weapon Detection

Confidence Aware SSD Ensemble with Weighted Boxes Fusion for Weapon Detection

URL: http://arxiv.org/abs/2509.23697v1
Date: Sun, 28 Sep 2025 07:08:48 GMT
Title: Confidence Aware SSD Ensemble with Weighted Boxes Fusion for Weapon Detection
Authors: Atharva Jadhav, Arush Karekar, Manas Divekar, Shachi Natu,
Abstract summary: The safety and security of public spaces is of vital importance, driving the need for sophisticated surveillance systems capable of accurately detecting weapons.<n>While single-model detectors are advanced, they often lack robustness in challenging conditions.<n>This paper presents the hypothesis that ensemble of Single Shot Multibox Detector (SSD) models with diverse feature extraction backbones can significantly enhance detection robustness.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The safety and security of public spaces is of vital importance, driving the need for sophisticated surveillance systems capable of accurately detecting weapons, which are often hampered by issues like partial occlusion, varying lighting, and cluttered backgrounds. While single-model detectors are advanced, they often lack robustness in these challenging conditions. This paper presents the hypothesis that ensemble of Single Shot Multibox Detector (SSD) models with diverse feature extraction backbones can significantly enhance detection robustness. To leverage diverse feature representations, individual SSD models were trained using a selection of backbone networks: VGG16, ResNet50, EfficientNet, and MobileNetV3. The study is conducted on a dataset consisting of images of three distinct weapon classes: guns, heavy weapons and knives. The predictions from these models are combined using the Weighted Boxes Fusion (WBF) method, an ensemble technique designed to optimize bounding box accuracy. Our key finding is that the fusion strategy is as critical as the ensemble's diversity, a WBF approach using a 'max' confidence scoring strategy achieved a mean Average Precision (mAP) of 0.838. This represents a 2.948% relative improvement over the best-performing single model and consistently outperforms other fusion heuristics. This research offers a robust approach to enhancing real-time weapon detection capabilities in surveillance applications by demonstrating that confidence-aware fusion is a key mechanism for improving accuracy metrics of ensembles.

Related papers

Cross-Domain Malware Detection via Probability-Level Fusion of Lightweight Gradient Boosting Models [0.0]
This paper presents a novel framework for malware detection that employs probability-level fusion across three distinct datasets.<n>Our method trains individual LightGBM classifiers on each dataset, selects top predictive features to ensure efficiency, and fuses their prediction probabilities using optimized weights determined via grid search.<n>Experiments demonstrate that our fusion approach achieves a macro F1-score of 0.823 on a cross-domain validation set, significantly outperforming individual models and providing superior generalizations.
arXiv Detail & Related papers (2025-08-30T12:18:13Z)
Adversarial Robustness for Unified Multi-Modal Encoders via Efficient Calibration [12.763688592842717]
We present the first comprehensive study of adversarial vulnerability in unified multi-modal encoders.<n>Non-visual inputs, such as audio and point clouds, are especially fragile.<n>Our method improves adversarial robustness by up to 47.3 percent at epsilon = 4/255.
arXiv Detail & Related papers (2025-05-17T08:26:04Z)
Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection [0.0]
We propose ReliFusion, a LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space.<n>ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, the Reliability module, and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module.<n>Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
arXiv Detail & Related papers (2025-02-03T22:07:14Z)
An Optimal Cascade Feature-Level Spatiotemporal Fusion Strategy for Anomaly Detection in CAN Bus [2.8151714475955263]
Intelligent transportation systems (ITS) play a pivotal role in modern infrastructure but face security risks due to the broadcast-based nature of the in-vehicle Controller Area Network (CAN) buses.<n>The proposed framework detects all attack types with 100% accuracy, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2025-01-31T00:36:08Z)
Robust Modality-incomplete Anomaly Detection: A Modality-instructive Framework with Benchmark [69.02666229531322]
We introduce a pioneering study that investigates Modality-Incomplete Industrial Anomaly Detection (MIIAD)<n>We find that most existing MIAD methods perform poorly on the MIIAD Bench, leading to significant performance degradation.<n>We propose a novel two-stage Robust modAlity-aware fusing and Detecting framewoRk, abbreviated as RADAR.
arXiv Detail & Related papers (2024-10-02T16:47:55Z)
Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening. It provides a measure of confidence for each modality and elegantly integrates the multi-modality information. Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient ensembling method for self-attention networks.<n>The method not only outperforms state-of-the-art implicit techniques like BatchEnsemble, but even matches or exceeds the accuracy of an Explicit Ensemble.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection [33.0406308223244]
We propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks. Our approach employs a two-stage optimization-based strategy that first thoroughly evaluates vulnerable image areas under adversarial attacks.
arXiv Detail & Related papers (2023-04-28T03:39:00Z)
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection [89.26380781863665]
Fusing LiDAR and camera information is essential for achieving accurate and reliable 3D object detection in autonomous driving systems. Recent approaches aim at exploring the semantic densities of camera features through lifting points in 2D camera images into 3D space for fusion. We propose a novel framework that focuses on the multi-scale progressive interaction of the multi-granularity LiDAR and camera features.
arXiv Detail & Related papers (2022-09-07T12:29:29Z)
Multimodal Object Detection via Bayesian Fusion [59.31437166291557]
We study multimodal object detection with RGB and thermal cameras, since the latter can provide much stronger object signatures under poor illumination. Our key contribution is a non-learned late-fusion method that fuses together bounding box detections from different modalities. We apply our approach to benchmarks containing both aligned (KAIST) and unaligned (FLIR) multimodal sensor data.
arXiv Detail & Related papers (2021-04-07T04:03:20Z)
ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD) ASFD is based on a combination of neural architecture search techniques as well as a new loss design. Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks were recently suggested to face the odds between accuracy (on clean natural images) and robustness (on adversarially perturbed images) This paper studies multi-exit networks associated with input-adaptive inference, showing their strong promise in achieving a "sweet point" in cooptimizing model accuracy, robustness and efficiency.
arXiv Detail & Related papers (2020-02-24T00:40:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.