Developing a Compressed Object Detection Model based on YOLOv4 for
Deployment on Embedded GPU Platform of Autonomous System
- URL: http://arxiv.org/abs/2108.00392v1
- Date: Sun, 1 Aug 2021 08:19:51 GMT
- Title: Developing a Compressed Object Detection Model based on YOLOv4 for
Deployment on Embedded GPU Platform of Autonomous System
- Authors: Issac Sim, Ju-Hyung Lim, Young-Wan Jang, JiHwan You, SeonTaek Oh, and
Young-Keun Kim
- Abstract summary: CNN-based object detection models are quite accurate but require a high-performance GPU to run in real time.
It is preferable to make the detection network as light as possible while preserving detection accuracy.
This paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The latest CNN-based object detection models are quite accurate but
require a high-performance GPU to run in real time, and they remain too heavy
in memory footprint and too slow for an embedded system with limited memory.
Since object detection for an autonomous system runs on an embedded processor,
it is preferable to compress the detection network as much as possible while
preserving detection accuracy. Several popular lightweight detection models
exist, but their accuracy is too low for safe driving applications. Therefore,
this paper proposes a new object detection model, referred to as YOffleNet,
which is compressed at a high ratio while minimizing the accuracy loss for
real-time, safe driving applications on an autonomous system. The backbone
architecture is based on YOLOv4, but the network is compressed greatly by
replacing the computation-heavy CSP DenseNet with the lighter modules of
ShuffleNet. Experiments on the KITTI dataset show that the proposed YOffleNet
is 4.7 times smaller than YOLOv4-s and runs as fast as 46 FPS on an embedded
GPU system (NVIDIA Jetson AGX Xavier). Despite the high compression ratio, the
accuracy drops only slightly, to 85.8% mAP, which is only 2.6% lower than
YOLOv4-s. Thus, the proposed network shows high potential for deployment on
the embedded systems of autonomous platforms for real-time, accurate object
detection.
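The core of the compression is architectural: the computation-heavy CSP DenseNet blocks in the YOLOv4 backbone are replaced with ShuffleNet-style units built from channel splits, depthwise convolutions, and channel shuffles. The PyTorch sketch below illustrates such a unit under that reading; the module layout and channel sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so the two branches exchange information."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class ShuffleUnit(nn.Module):
    """ShuffleNetV2-style unit (stride 1): split channels, process half cheaply, shuffle."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0
        c = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),
            # Depthwise 3x3 convolution: the main source of the FLOP savings.
            nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False), nn.BatchNorm2d(c),
            nn.Conv2d(c, c, 1, bias=False), nn.BatchNorm2d(c), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=1)  # identity half + processed half
        return channel_shuffle(torch.cat((a, self.branch(b)), dim=1), groups=2)
```

Because half of the channels pass through unchanged and the other half see only 1x1 and depthwise convolutions, a unit like this costs far fewer FLOPs and parameters than a dense CSP block of the same width, which is what makes the reported 4.7x compression plausible.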
Related papers
- MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO [10.183459286746196]
We introduce YOLO Phantom, one of the smallest YOLO models ever conceived.
YOLO Phantom achieves comparable accuracy to the latest YOLOv8n model while simultaneously reducing both parameters and model size.
Its real-world efficacy is demonstrated on an IoT platform with advanced low-light and RGB cameras, seamlessly connecting to an AWS-based notification endpoint.
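The summary does not define the "Phantom Convolution" itself; the name suggests a ghost-module-style block that computes a few feature maps with an ordinary convolution and derives the rest with cheap depthwise operations. The sketch below shows only that general idea, as an assumption about the design; all names and ratios are hypothetical.

```python
import torch
import torch.nn as nn

class PhantomStyleConv(nn.Module):
    """Hypothetical ghost/phantom-style convolution: half the output maps are
    'real' (ordinary 1x1 conv), the other half are cheap depthwise derivatives."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        assert out_ch % 2 == 0
        primary = out_ch // 2
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, primary, 1, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            # Depthwise conv: one cheap output map derived per primary map.
            nn.Conv2d(primary, primary, 3, padding=1, groups=primary, bias=False),
            nn.BatchNorm2d(primary), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```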
arXiv Detail & Related papers (2024-02-12T18:56:53Z) - EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS >= 30) on the edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z) - Rethinking Voxelization and Classification for 3D Object Detection [68.8204255655161]
The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network.
We present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer.
In addition, we propose a lightweight detection sub-head model that classifies predicted objects and filters out falsely detected ones.
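Dynamic voxelization, as opposed to the "hard" variant, assigns every point to a voxel without preallocating a fixed max-points-per-voxel buffer, so no points are dropped. The PyTorch sketch below shows the general technique, not the authors' implementation; all parameter names are assumed.

```python
import torch

def dynamic_voxelize(points: torch.Tensor, voxel_size: float, origin: tuple):
    """points: (N, F) tensor whose first three columns are x, y, z.
    Returns unique voxel coordinates and mean-pooled per-voxel features."""
    lo = torch.tensor(origin, dtype=points.dtype, device=points.device)
    coords = ((points[:, :3] - lo) / voxel_size).long()  # integer voxel indices
    # Group points by voxel id; every point keeps its assignment (nothing dropped).
    uniq, inv = torch.unique(coords, dim=0, return_inverse=True)
    feats = torch.zeros(uniq.size(0), points.size(1),
                        dtype=points.dtype, device=points.device)
    feats.index_add_(0, inv, points)  # per-voxel feature sums
    counts = torch.bincount(inv, minlength=uniq.size(0)).clamp(min=1)
    return uniq, feats / counts.unsqueeze(1)  # per-voxel means

# Example: 10,000 random points in a 70 m x 80 m x 4 m range, 0.2 m voxels.
pts = torch.rand(10_000, 4) * torch.tensor([70.0, 80.0, 4.0, 1.0])
voxels, voxel_feats = dynamic_voxelize(pts, 0.2, (0.0, 0.0, 0.0))
```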
arXiv Detail & Related papers (2023-01-10T16:22:04Z) - Using Detection, Tracking and Prediction in Visual SLAM to Achieve
Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build object-level semantic maps for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.
We evaluate RDS-SLAM on the TUM RGB-D dataset; experimental results show that it runs at 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z) - Edge YOLO: Real-Time Intelligent Object Detection System Based on
Edge-Cloud Cooperation in Autonomous Vehicles [5.295478084029605]
We propose an object detection (OD) system based on edge-cloud cooperation and reconstructive convolutional neural networks.
This system effectively avoids excessive dependence on computing power and on unevenly distributed cloud computing resources.
We experimentally demonstrate the reliability and efficiency of Edge YOLO on the COCO 2017 and KITTI datasets.
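The summary describes the edge-cloud split only at the system level. One common realization, offered here purely as a hypothetical sketch, is to run the lightweight detector on the vehicle and escalate frames with low-confidence detections to a heavier cloud model; the endpoint, threshold, and model interface below are all made up for illustration.

```python
import requests  # hypothetical cloud endpoint, not from the paper

CONF_THRESHOLD = 0.6                      # illustrative escalation threshold
CLOUD_URL = "https://example.com/detect"  # placeholder endpoint

def detect(frame_bytes: bytes, edge_model):
    """Run the on-vehicle detector; defer uncertain frames to the cloud model."""
    dets = edge_model(frame_bytes)  # assumed: list of (box, label, score)
    if dets and min(score for _, _, score in dets) >= CONF_THRESHOLD:
        return dets  # edge result is confident enough
    # Low confidence (or nothing found): ask the heavier cloud-side model.
    resp = requests.post(CLOUD_URL, files={"frame": frame_bytes}, timeout=1.0)
    return resp.json()["detections"]
```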
arXiv Detail & Related papers (2022-05-30T09:16:35Z) - A lightweight and accurate YOLO-like network for small target detection
in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
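The two skip-connection styles YOLO-S combines behave differently at the tensor level: an additive bypass keeps the channel count fixed, while concatenation stacks channels. A minimal PyTorch illustration (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(64, 64, 3, padding=1)
x = torch.randn(1, 64, 40, 40)

y_bypass = conv(x) + x                     # additive bypass: channels stay at 64
y_concat = torch.cat([conv(x), x], dim=1)  # concatenation: channels stack to 128

print(y_bypass.shape)   # torch.Size([1, 64, 40, 40])
print(y_concat.shape)   # torch.Size([1, 128, 40, 40])
```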
arXiv Detail & Related papers (2022-04-05T16:29:49Z) - EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
The EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel-reusing technique that shares the weights of candidate operations on one edge and consolidates them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
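The kernel-reusing idea can be read as a supernet edge where all candidate kernel sizes share one parameter tensor: smaller kernels are center crops of the largest one, and the softmax-weighted candidates collapse into a single convolution at execution time. The sketch below shows that reading in PyTorch; it is an interpretation of the one-sentence summary, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKernelEdge(nn.Module):
    """Candidate 1x1/3x3/5x5 convs on one edge share a single 5x5 weight;
    architecture logits alpha mix the sub-kernels so only ONE conv runs."""
    def __init__(self, ch: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(ch, ch, 5, 5) * 0.01)
        self.alpha = nn.Parameter(torch.zeros(3))  # one logit per kernel size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w5 = self.weight
        w3 = F.pad(w5[:, :, 1:4, 1:4], (1, 1, 1, 1))  # 3x3 center crop, re-padded
        w1 = F.pad(w5[:, :, 2:3, 2:3], (2, 2, 2, 2))  # 1x1 center crop, re-padded
        a = torch.softmax(self.alpha, dim=0)
        w = a[0] * w1 + a[1] * w3 + a[2] * w5         # consolidated single kernel
        return F.conv2d(x, w, padding=2)
```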
arXiv Detail & Related papers (2022-03-21T05:56:12Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models and statistical models against the roofline model and a refined roofline model.
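Of the baselines named above, the roofline model is simple enough to state directly: a layer's execution time is bounded by compute or by memory traffic, whichever dominates; ANNETTE's benchmark-fitted stacked models then refine such estimates. The hardware numbers below are illustrative (roughly Jetson-AGX-Xavier-class), not taken from the paper.

```python
def roofline_time_s(flops: float, bytes_moved: float,
                    peak_flops: float = 11e12,  # ~11 TFLOPS fp16 (illustrative)
                    bandwidth: float = 137e9) -> float:
    """Roofline estimate: the layer is compute-bound or memory-bound,
    whichever side takes longer."""
    return max(flops / peak_flops, bytes_moved / bandwidth)

# Example: 3x3 conv, 256 -> 256 channels, on a 64x64 feature map, fp16 tensors.
flops = 2 * 256 * 256 * 3 * 3 * 64 * 64                        # 2 x MACs
bytes_moved = 2 * (256 * 64 * 64 * 2) + 256 * 256 * 3 * 3 * 2  # in + out + weights
print(f"{roofline_time_s(flops, bytes_moved) * 1e3:.3f} ms")   # compute-bound here
```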
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - BEVDetNet: Bird's Eye View LiDAR Point Cloud based Real-time 3D Object
Detection for Autonomous Driving [6.389322215324224]
We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box predictions and orientation prediction.
The proposed architecture can be trivially extended to include semantic segmentation classes like road without any additional computation.
The model is 5x faster than other top-accuracy models, with a minimal accuracy degradation of 2% in Average Precision at IoU=0.5 on the KITTI dataset.
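Key-point-based center detection is typically decoded CenterNet-style: predict a per-class BEV heatmap, keep only local maxima (a max-pool doubles as cheap NMS), and read off the top-k peaks. The sketch below shows that generic decoding step, not BEVDetNet's exact head.

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap: torch.Tensor, k: int = 50):
    """heatmap: (1, C, H, W) per-class center scores in [0, 1].
    Returns scores and (class, y, x) indices of the k strongest peaks."""
    peaks = F.max_pool2d(heatmap, 3, stride=1, padding=1)
    heatmap = heatmap * (peaks == heatmap).float()  # suppress non-maxima
    scores, idx = heatmap.flatten(1).topk(k)        # search over all C*H*W cells
    _, C, H, W = peaks.shape
    cls = torch.div(idx, H * W, rounding_mode="floor")
    y = torch.div(idx % (H * W), W, rounding_mode="floor")
    x = idx % W
    return scores, cls, y, x
```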
arXiv Detail & Related papers (2021-04-21T22:06:39Z) - FRDet: Balanced and Lightweight Object Detector based on Fire-Residual
Modules for Embedded Processor of Autonomous Driving [0.0]
We propose a lightweight one-stage object detector that is balanced to satisfy all the constraints of accuracy, model size, and real-time processing.
Our network aims to maximize model compression while achieving or surpassing YOLOv3-level accuracy.
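The summary names Fire-Residual modules but does not define them; a natural reading is a SqueezeNet Fire module (1x1 squeeze, then parallel 1x1/3x3 expands) wrapped in a residual shortcut. The sketch below follows that reading; FRDet's exact variant may differ.

```python
import torch
import torch.nn as nn

class FireResidual(nn.Module):
    """SqueezeNet-style Fire module with a residual shortcut (assumed layout)."""
    def __init__(self, channels: int, squeeze: int):
        super().__init__()
        assert channels % 2 == 0
        half = channels // 2
        self.squeeze = nn.Sequential(
            nn.Conv2d(channels, squeeze, 1), nn.ReLU(inplace=True))
        self.expand1 = nn.Conv2d(squeeze, half, 1)             # cheap 1x1 expand
        self.expand3 = nn.Conv2d(squeeze, half, 3, padding=1)  # 3x3 expand
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.squeeze(x)
        out = torch.cat([self.expand1(s), self.expand3(s)], dim=1)
        return self.act(out + x)  # shortcut keeps the module residual
```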
arXiv Detail & Related papers (2020-11-16T16:15:43Z)