MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO
- URL: http://arxiv.org/abs/2402.07894v2
- Date: Sun, 23 Jun 2024 16:11:19 GMT
- Title: MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO
- Authors: Shubhabrata Mukherjee, Cory Beard, Zhu Li,
- Abstract summary: We introduce YOLO Phantom, one of the smallest YOLO models ever conceived.
YOLO Phantom achieves comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size.
Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint.
- Score: 10.183459286746196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-light conditions and occluded scenarios impede object detection in real-world Internet of Things (IoT) applications like autonomous vehicles and security systems. While advanced machine learning models strive for accuracy, their computational demands clash with the limitations of resource-constrained devices, hampering real-time performance. In our current research, we tackle this challenge, by introducing ``YOLO Phantom", one of the smallest YOLO models ever conceived. YOLO Phantom utilizes the novel Phantom Convolution block, achieving comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size by 43\%, resulting in a significant 19\% reduction in Giga Floating-Point Operations (GFLOPs). YOLO Phantom leverages transfer learning on our multimodal RGB-infrared dataset to address low-light and occlusion issues, equipping it with robust vision under adverse conditions. Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint for efficient real-time object detection. Benchmarks reveal a substantial boost of 17\% and 14\% in frames per second (FPS) for thermal and RGB detection, respectively, compared to the baseline YOLOv8n model. For community contribution, both the code and the multimodal dataset are available on GitHub.
Related papers
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z) - YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z) - SATAY: A Streaming Architecture Toolflow for Accelerating YOLO Models on
FPGA Devices [48.47320494918925]
This work tackles the challenges of deploying stateof-the-art object detection models onto FPGA devices for ultralow latency applications.
We employ a streaming architecture design for our YOLO accelerators, implementing the complete model on-chip in a deeply pipelined fashion.
We introduce novel hardware components to support the operations of YOLO models in a dataflow manner, and off-chip memory buffering to address the limited on-chip memory resources.
arXiv Detail & Related papers (2023-09-04T13:15:01Z) - YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time
Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z) - YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems [0.0873811641236639]
We present YOLOBench, a benchmark comprised of 550+ YOLO-based object detection models on 4 different datasets and 4 different embedded hardware platforms.
We collect accuracy and latency numbers for a variety of YOLO-based one-stage detectors at different model scales by performing a fair, controlled comparison of these detectors with a fixed training environment.
We evaluate training-free accuracy estimators used in neural architecture search on YOLOBench and demonstrate that, while most state-of-the-art zero-cost accuracy estimators are outperformed by a simple baseline like MAC count, some of them can be effectively used to
arXiv Detail & Related papers (2023-07-26T01:51:10Z) - PE-YOLO: Pyramid Enhancement Network for Dark Object Detection [9.949687351946038]
We propose a pyramid enhanced network (PENet) and joint it with YOLOv3 to build a dark object detection framework named PE-YOLO.
PE-YOLO adopts an end-to-end joint training approach and only uses normal detection loss to simplify the training process.
Results: PE-YOLO achieves 78.0% in mAP and 53.6 in FPS, respectively, which can adapt to object detection under different low-light conditions.
arXiv Detail & Related papers (2023-07-20T15:25:55Z) - YOLO-Drone:Airborne real-time detection of dense small objects from
high-altitude perspective [8.864582442699023]
A real-time object detection algorithm (YOLO-Drone) is proposed and applied to two new UAV platforms and a specific light source.
YOLO-Drone exhibits high real-time inference speed of 53 FPS and a maximum mAP of 34.04%.
Notably, YOLO-Drone achieves high performance under the silicon-based golden LEDs, with a mAP of up to 87.71%.
arXiv Detail & Related papers (2023-04-14T05:21:47Z) - EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model can reach the accuracy of 50.6% AP50:95 and 69.8% AP50 in MS 2017 dataset, 26.4% AP50:95 and 44.8% AP50 in VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z) - A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.