Developing a Compressed Object Detection Model based on YOLOv4 for
Deployment on Embedded GPU Platform of Autonomous System
- URL: http://arxiv.org/abs/2108.00392v1
- Date: Sun, 1 Aug 2021 08:19:51 GMT
- Title: Developing a Compressed Object Detection Model based on YOLOv4 for
Deployment on Embedded GPU Platform of Autonomous System
- Authors: Issac Sim, Ju-Hyung Lim, Young-Wan Jang, JiHwan You, SeonTaek Oh, and
Young-Keun Kim
- Abstract summary: CNN-based object detection models are quite accurate but require a high-performance GPU to run in real time.
It is preferable to make the detection network as light as possible while preserving detection accuracy.
This paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The latest CNN-based object detection models are quite accurate but
require a high-performance GPU to run in real time, and they remain too heavy
in memory footprint and too slow for an embedded system with limited memory.
Since object detection for an autonomous system runs on an embedded processor,
it is preferable to compress the detection network as much as possible while
preserving detection accuracy. Several popular lightweight detection models
exist, but their accuracy is too low for safe driving applications. Therefore,
this paper proposes a new object detection model, referred to as YOffleNet,
which is compressed at a high ratio while minimizing the accuracy loss for
real-time, safe driving applications on an autonomous system. The backbone
architecture is based on YOLOv4, but the network is compressed greatly by
replacing the computation-heavy CSP DenseNet with the lighter modules of
ShuffleNet. Experiments on the KITTI dataset show that the proposed YOffleNet
is 4.7 times smaller than YOLOv4-s and runs as fast as 46 FPS on an embedded
GPU system (NVIDIA Jetson AGX Xavier). Despite the high compression ratio, the
accuracy drops only slightly, to 85.8% mAP, which is only 2.6% lower than
YOLOv4-s. Thus, the proposed network shows high potential for deployment on
the embedded systems of autonomous platforms for real-time, accurate object
detection.
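The core of the compression is architectural: the computation-heavy CSP DenseNet blocks in the YOLOv4 backbone are replaced with ShuffleNet-style units built from channel splits, depthwise convolutions, and channel shuffles. The PyTorch sketch below illustrates such a unit under that reading; the module layout and channel sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so the two branches exchange information."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class ShuffleUnit(nn.Module):
    """ShuffleNetV2-style unit (stride 1): split channels, process half cheaply, shuffle."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0
        c = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),
            # Depthwise 3x3 convolution: the main source of the FLOP savings.
            nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False), nn.BatchNorm2d(c),
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=1)  # identity half + processed half
        return channel_shuffle(torch.cat((a, self.branch(b)), dim=1), groups=2)
```

Because half of the channels pass through unchanged and the other half see only 1x1 and depthwise convolutions, a unit like this costs far fewer FLOPs and parameters than a dense CSP block of the same width, which is what makes the reported 4.7x compression plausible.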
Related papers
- MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO [10.183459286746196]
We introduce YOLO Phantom, one of the smallest YOLO models ever conceived.
YOLO Phantom achieves comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size.
Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint.
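The summary does not define the "Phantom Convolution" itself; the name suggests a ghost-module-style block that computes a few feature maps with an ordinary convolution and derives the rest with cheap depthwise operations. The sketch below shows only that general idea, as an assumption about the design; all names and ratios are hypothetical.

```python
import torch
import torch.nn as nn

class PhantomStyleConv(nn.Module):
    """Hypothetical ghost/phantom-style convolution: half the output maps are
    'real' (ordinary 1x1 conv), the other half are cheap depthwise derivatives."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        assert out_ch % 2 == 0
        primary = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary, 1, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            # Depthwise conv: one cheap output map derived per primary map.
            nn.Conv2d(primary, primary, 3, padding=1, groups=primary, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```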
arXiv Detail & Related papers (2024-02-12T18:56:53Z) - EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS >= 30) on the edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z) - Rethinking Voxelization and Classification for 3D Object Detection [68.8204255655161]
The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network.
We present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer.
In addition, we propose a lightweight detection sub-head model that classifies predicted objects and filters out falsely detected ones.
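Dynamic voxelization, as opposed to the "hard" variant, assigns every point to a voxel without preallocating a fixed max-points-per-voxel buffer, so no points are dropped. The PyTorch sketch below shows the general technique, not the authors' implementation; all parameter names are assumed.

```python
import torch

def dynamic_voxelize(points: torch.Tensor, voxel_size: float, origin: tuple):
    """points: (N, F) tensor whose first three columns are x, y, z.
    Returns unique voxel coordinates and mean-pooled per-voxel features."""
    lo = torch.tensor(origin, dtype=points.dtype, device=points.device)
    coords = ((points[:, :3] - lo) / voxel_size).long()  # integer voxel indices
    # Group points by voxel id; every point keeps its assignment (nothing dropped).
    uniq, inv = torch.unique(coords, dim=0, return_inverse=True)
    feats = torch.zeros(uniq.size(0), points.size(1),
                        dtype=points.dtype, device=points.device)
    feats.index_add_(0, inv, points)  # per-voxel feature sums
    counts = torch.bincount(inv, minlength=uniq.size(0)).clamp(min=1)
    return uniq, feats / counts.unsqueeze(1)  # per-voxel means

# Example: 10,000 random points in a 70 m x 80 m x 4 m range, 0.2 m voxels.
pts = torch.rand(10_000, 4) * torch.tensor([70.0, 80.0, 4.0, 1.0])
voxels, voxel_feats = dynamic_voxelize(pts, 0.2, (0.0, 0.0, 0.0))
```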
arXiv Detail & Related papers (2023-01-10T16:22:04Z) - Using Detection, Tracking and Prediction in Visual SLAM to Achieve
Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build object-level semantic maps for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.
We evaluate RDS-SLAM on the TUM RGB-D dataset; experimental results show that it runs at 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z) - Edge YOLO: Real-Time Intelligent Object Detection System Based on
Edge-Cloud Cooperation in Autonomous Vehicles [5.295478084029605]
We propose an object detection (OD) system based on edge-cloud cooperation and reconstructive convolutional neural networks.
This system effectively avoids excessive dependence on computing power and on unevenly distributed cloud computing resources.
We experimentally demonstrate the reliability and efficiency of Edge YOLO on the COCO 2017 and KITTI datasets.
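The summary describes the edge-cloud split only at the system level. One common realization, offered here purely as a hypothetical sketch, is to run the lightweight detector on the vehicle and escalate frames with low-confidence detections to a heavier cloud model; the endpoint, threshold, and model interface below are all made up for illustration.

```python
import requests  # hypothetical cloud endpoint, not from the paper

CONF_THRESHOLD = 0.6                      # illustrative escalation threshold
CLOUD_URL = "https://example.com/detect"  # placeholder endpoint

def detect(frame_bytes: bytes, edge_model):
    """Run the on-vehicle detector; defer uncertain frames to the cloud model."""
    dets = edge_model(frame_bytes)  # assumed: list of (box, label, score)
    if dets and min(score for _, _, score in dets) >= CONF_THRESHOLD:
        return dets  # edge result is confident enough
    # Low confidence (or nothing found): ask the heavier cloud-side model.
    resp = requests.post(CLOUD_URL, files={"frame": frame_bytes}, timeout=1.0)
    return resp.json()["detections"]
```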
arXiv Detail & Related papers (2022-05-30T09:16:35Z) - A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
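The two skip-connection styles YOLO-S combines behave differently at the tensor level: an additive bypass keeps the channel count fixed, while concatenation stacks channels. A minimal PyTorch illustration (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 64, 3, padding=1)
x = torch.randn(1, 64, 40, 40)

y_bypass = conv(x) + x                     # additive bypass: channels stay at 64
y_concat = torch.cat([conv(x), x], dim=1)  # concatenation: channels stack to 128

print(y_bypass.shape)   # torch.Size([1, 64, 40, 40])
print(y_concat.shape)   # torch.Size([1, 128, 40, 40])
```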
arXiv Detail & Related papers (2022-04-05T16:29:49Z) - EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
The EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel-reusing technique that shares the weights of candidate operations on one edge and consolidates them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
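The kernel-reusing idea can be read as a supernet edge where all candidate kernel sizes share one parameter tensor: smaller kernels are center crops of the largest one, and the softmax-weighted candidates collapse into a single convolution at execution time. The sketch below shows that reading in PyTorch; it is an interpretation of the one-sentence summary, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKernelEdge(nn.Module):
    """Candidate 1x1/3x3/5x5 convs on one edge share a single 5x5 weight;
    architecture logits alpha mix the sub-kernels so only ONE conv runs."""
    def __init__(self, ch: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(ch, ch, 5, 5) * 0.01)
        self.alpha = nn.Parameter(torch.zeros(3))  # one logit per kernel size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w5 = self.weight
        w3 = F.pad(w5[:, :, 1:4, 1:4], (1, 1, 1, 1))  # 3x3 center crop, re-padded
        w1 = F.pad(w5[:, :, 2:3, 2:3], (2, 2, 2, 2))  # 1x1 center crop, re-padded
        a = torch.softmax(self.alpha, dim=0)
        w = a[0] * w1 + a[1] * w3 + a[2] * w5         # consolidated single kernel
        return F.conv2d(x, w, padding=2)
```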
arXiv Detail & Related papers (2022-03-21T05:56:12Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models and statistical models against the roofline model and a refined roofline model.
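Of the baselines named above, the roofline model is simple enough to state directly: a layer's execution time is bounded by compute or by memory traffic, whichever dominates; ANNETTE's benchmark-fitted stacked models then refine such estimates. The hardware numbers below are illustrative (roughly Jetson-AGX-Xavier-class), not taken from the paper.

```python
def roofline_time_s(flops: float, bytes_moved: float,
                    peak_flops: float = 11e12,  # ~11 TFLOPS fp16 (illustrative)
                    bandwidth: float = 137e9) -> float:
    """Roofline estimate: the layer is compute-bound or memory-bound,
    whichever side takes longer."""
    return max(flops / peak_flops, bytes_moved / bandwidth)

# Example: 3x3 conv, 256 -> 256 channels, on a 64x64 feature map, fp16 tensors.
flops = 2 * 256 * 256 * 3 * 3 * 64 * 64                        # 2 x MACs
bytes_moved = 2 * (256 * 64 * 64 * 2) + 256 * 256 * 3 * 3 * 2  # in + out + weights
print(f"{roofline_time_s(flops, bytes_moved) * 1e3:.3f} ms")   # compute-bound here
```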
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - BEVDetNet: Bird's Eye View LiDAR Point Cloud based Real-time 3D Object
Detection for Autonomous Driving [6.389322215324224]
We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box predictions and orientation prediction.
The proposed architecture can be trivially extended to include semantic segmentation classes like road without any additional computation.
The model is 5x faster than other top-accuracy models, with a minimal accuracy degradation of 2% in Average Precision at IoU=0.5 on the KITTI dataset.
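Key-point-based center detection is typically decoded CenterNet-style: predict a per-class BEV heatmap, keep only local maxima (a max-pool doubles as cheap NMS), and read off the top-k peaks. The sketch below shows that generic decoding step, not BEVDetNet's exact head.

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap: torch.Tensor, k: int = 50):
    """heatmap: (1, C, H, W) per-class center scores in [0, 1].
    Returns scores and (class, y, x) indices of the k strongest peaks."""
    peaks = F.max_pool2d(heatmap, 3, stride=1, padding=1)
    heatmap = heatmap * (peaks == heatmap).float()  # suppress non-maxima
    scores, idx = heatmap.flatten(1).topk(k)        # search over all C*H*W cells
    _, C, H, W = peaks.shape
    cls = torch.div(idx, H * W, rounding_mode="floor")
    y = torch.div(idx % (H * W), W, rounding_mode="floor")
    x = idx % W
    return scores, cls, y, x
```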
arXiv Detail & Related papers (2021-04-21T22:06:39Z) - FRDet: Balanced and Lightweight Object Detector based on Fire-Residual
Modules for Embedded Processor of Autonomous Driving [0.0]
We propose a lightweight one-stage object detector that is balanced to satisfy all the constraints of accuracy, model size, and real-time processing.
Our network aims to maximize model compression while achieving or surpassing YOLOv3-level accuracy.
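The summary names Fire-Residual modules but does not define them; a natural reading is a SqueezeNet Fire module (1x1 squeeze, then parallel 1x1/3x3 expands) wrapped in a residual shortcut. The sketch below follows that reading; FRDet's exact variant may differ.

```python
import torch
import torch.nn as nn

class FireResidual(nn.Module):
    """SqueezeNet-style Fire module with a residual shortcut (assumed layout)."""
    def __init__(self, channels: int, squeeze: int):
        super().__init__()
        assert channels % 2 == 0
        half = channels // 2
        self.squeeze = nn.Sequential(
            nn.Conv2d(channels, squeeze, 1), nn.ReLU(inplace=True))
        self.expand1 = nn.Conv2d(squeeze, half, 1)             # cheap 1x1 expand
        self.expand3 = nn.Conv2d(squeeze, half, 3, padding=1)  # 3x3 expand
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.squeeze(x)
        out = torch.cat([self.expand1(s), self.expand3(s)], dim=1)
        return self.act(out + x)  # shortcut keeps the module residual
```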
arXiv Detail & Related papers (2020-11-16T16:15:43Z)