PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices
- URL: http://arxiv.org/abs/2111.00902v1
- Date: Mon, 1 Nov 2021 12:53:17 GMT
- Title: PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices
- Authors: Guanghua Yu, Qinyao Chang, Wenyu Lv, Chang Xu, Cheng Cui, Wei Ji,
Qingqing Dang, Kaipeng Deng, Guanzhong Wang, Yuning Du, Baohua Lai, Qiwen
Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
- Abstract summary: The PP-PicoDet family of real-time object detectors achieves superior performance on object detection for mobile devices.
Its models achieve better trade-offs between accuracy and latency than other popular models.
- Score: 13.62426382827205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving a better trade-off between accuracy and efficiency has been a challenging problem
in object detection. In this work, we are dedicated to studying key
optimizations and neural network architecture choices for object detection to
improve accuracy and efficiency. We investigate the applicability of the
anchor-free strategy on lightweight object detection models. We enhance the
backbone structure and design the lightweight structure of the neck, which
improves the feature extraction ability of the network. We improve label
assignment strategy and loss function to make training more stable and
efficient. Through these optimizations, we create a new family of real-time
object detectors, named PP-PicoDet, which achieves superior performance on
object detection for mobile devices. Our models achieve better trade-offs
between accuracy and latency compared to other popular models. PicoDet-S with
only 0.99M parameters achieves 30.6% mAP, which is an absolute 4.8% improvement
in mAP while reducing mobile CPU inference latency by 55% compared to
YOLOX-Nano, and is an absolute 7.1% improvement in mAP compared to NanoDet. It
reaches 123 FPS (150 FPS using Paddle Lite) on mobile ARM CPU when the input
size is 320. PicoDet-L with only 3.3M parameters achieves 40.9% mAP, which is
an absolute 3.7% improvement in mAP and 44% faster than YOLOv5s. As shown in
Figure 1, our models far outperform the state-of-the-art results for
lightweight object detection. Code and pre-trained models are available at
https://github.com/PaddlePaddle/PaddleDetection.
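The figures quoted in the abstract can be cross-checked with a little arithmetic. The following is a minimal sketch, not code from the paper or from PaddleDetection: the helper names `implied_baseline_map` and `fps_to_latency_ms` are hypothetical, and it assumes that "absolute improvement" means percentage points of mAP and that FPS is the reciprocal of per-image latency.

```python
# Hypothetical helpers for sanity-checking the numbers quoted in the abstract.
# Assumption: an "absolute X% improvement in mAP" is X percentage points.

def implied_baseline_map(model_map: float, absolute_gain: float) -> float:
    """Return the baseline mAP implied by a model's mAP and its absolute gain."""
    return round(model_map - absolute_gain, 1)


def fps_to_latency_ms(fps: float) -> float:
    """Convert frames per second to per-image latency in milliseconds."""
    return 1000.0 / fps


# PicoDet-S: 30.6% mAP, +4.8 points over YOLOX-Nano, +7.1 points over NanoDet.
yolox_nano_map = implied_baseline_map(30.6, 4.8)
nanodet_map = implied_baseline_map(30.6, 7.1)

# PicoDet-L: 40.9% mAP, +3.7 points over YOLOv5s.
yolov5s_map = implied_baseline_map(40.9, 3.7)

# PicoDet-S at input size 320: 123 FPS, or 150 FPS with Paddle Lite.
latency_default_ms = fps_to_latency_ms(123)
latency_paddle_lite_ms = fps_to_latency_ms(150)

print(f"Implied YOLOX-Nano mAP: {yolox_nano_map}")
print(f"Implied NanoDet mAP:    {nanodet_map}")
print(f"Implied YOLOv5s mAP:    {yolov5s_map}")
print(f"Latency at 123 FPS:     {latency_default_ms:.2f} ms")
print(f"Latency at 150 FPS:     {latency_paddle_lite_ms:.2f} ms")
```

The implied baselines are derived only from the abstract's own numbers; the sketch does not attempt to reproduce the 55% latency reduction or the "44% faster" claim, since the abstract does not state the baseline latencies.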
Related papers
- LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection [0.0]
We focus on design choices of neural network architectures for efficient object detection based on FLOP.
We propose several optimizations to enhance the efficiency of YOLO-based models.
This paper contributes to a new scaling paradigm for object detection and YOLO-centric models called LeYOLO.
arXiv Detail & Related papers (2024-06-20T12:08:24Z)
- YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 [19.388112026410045]
YOLO-TLA is an advanced object detection model building on YOLOv5.
We first introduce an additional detection layer for small objects in the neck network pyramid architecture.
This module uses sliding window feature extraction, which effectively minimizes both computational demand and the number of parameters.
arXiv Detail & Related papers (2024-02-22T05:55:17Z)
- Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection [20.886999159134138]
Real-time efficient perception is critical for autonomous navigation and city scale sensing.
In this work, we propose a learnable geometry-guided prior that incorporates rough geometry of the 3D scene.
Our approach improves detection rate by +4.1 $AP_S$ or +39% and in real-time performance by +5.3 $sAP_S$ or +63% for small objects over state-of-the-art (SOTA)
arXiv Detail & Related papers (2023-03-25T00:43:44Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches an accuracy of 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset, 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- Fewer is More: Efficient Object Detection in Large Aerial Images [59.683235514193505]
This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results.
Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets.
We extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively.
arXiv Detail & Related papers (2022-12-26T12:49:47Z)
- ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding such as temporal action detection (TAD) often suffers from the pain of huge demand for computing resources.
We build a unified framework for efficient end-to-end temporal action detection (ETAD).
ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
- EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
- YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs [14.85882314822983]
In order to map deep neural network (DNN) based object detection models to edge devices, one typically needs to compress such models significantly.
In this paper, we propose a novel edge GPU friendly module for multi-scale feature interaction.
We also propose a novel learning backbone adoption inspired by the changing translational information flow across various tasks.
arXiv Detail & Related papers (2021-10-26T14:02:59Z)
- Small Object Detection Based on Modified FSSD and Model Compression [7.387639662781843]
This paper proposes a small object detection algorithm based on FSSD.
In order to reduce the computational cost and storage space, pruning is carried out to achieve model compression.
The average accuracy (mAP) of the algorithm can reach 80.4% on PASCAL VOC and the speed is 59.5 FPS on GTX1080ti.
arXiv Detail & Related papers (2021-08-24T03:20:32Z)
- Non-Parametric Adaptive Network Pruning [125.4414216272874]
We introduce non-parametric modeling to simplify the algorithm design.
Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars.
EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z)
- MobileDets: Searching for Object Detection Architectures for Mobile Accelerators [61.30355783955777]
Inverted bottleneck layers have been the predominant building blocks in state-of-the-art object detection models on mobile devices.
Regular convolutions are a potent component to boost the latency-accuracy trade-off for object detection on accelerators.
We obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.
arXiv Detail & Related papers (2020-04-30T00:21:30Z)