Scaled-YOLOv4: Scaling Cross Stage Partial Network
- URL: http://arxiv.org/abs/2011.08036v2
- Date: Mon, 22 Feb 2021 01:32:18 GMT
- Title: Scaled-YOLOv4: Scaling Cross Stage Partial Network
- Authors: Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao
- Abstract summary: We show that the YOLOv4 object detection neural network, based on the CSP approach, scales both up and down.
We propose a network scaling approach that modifies not only the depth, width, and resolution, but also the structure of the network.
- Score: 14.198747290672854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that the YOLOv4 object detection neural network, based on the CSP
approach, scales both up and down and is applicable to small and large networks
while maintaining optimal speed and accuracy. We propose a network scaling
approach that modifies not only the depth, width, and resolution, but also
the structure of the network. The YOLOv4-large model achieves state-of-the-art results:
55.5% AP (73.4% AP50) for the MS COCO dataset at a speed of ~16 FPS on Tesla
V100, while with test-time augmentation, YOLOv4-large achieves 56.0% AP
(73.3% AP50). To the best of our knowledge, this is currently the highest
accuracy on the COCO dataset among any published work. The YOLOv4-tiny model
achieves 22.0% AP (42.0% AP50) at a speed of 443 FPS on RTX 2080Ti, while with
TensorRT, batch size = 4, and FP16 precision, YOLOv4-tiny achieves 1774
FPS.
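The throughput figures in the abstract are easier to compare once converted to mean time per image. The following is a minimal sketch of that arithmetic; the helper name `ms_per_image` is hypothetical and not from the paper, and the batch-size-4 figure is treated as aggregate throughput, so the derived value is an average per image rather than a single-image latency.

```python
def ms_per_image(fps: float) -> float:
    """Mean time per image in milliseconds at a throughput given in frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures quoted in the abstract (the first is stated as approximate, ~16 FPS).
reported_fps = {
    "YOLOv4-large, Tesla V100": 16.0,
    "YOLOv4-tiny, RTX 2080Ti": 443.0,
    "YOLOv4-tiny, TensorRT, FP16, batch size 4": 1774.0,
}

for name, fps in reported_fps.items():
    print(f"{name}: {ms_per_image(fps):.2f} ms per image")
```

Under these assumptions, ~16 FPS corresponds to roughly 62.5 ms per image, 443 FPS to about 2.26 ms, and 1774 FPS to about 0.56 ms averaged over the batch.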
Related papers
- LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection [0.0]
We focus on design choices of neural network architectures for efficient object detection based on FLOP.
We propose several optimizations to enhance the efficiency of YOLO-based models.
This paper contributes to a new scaling paradigm for object detection and YOLO-centric models called LeYOLO.
arXiv Detail & Related papers (2024-06-20T12:08:24Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors [64.68349896377629]
This work presents a novel autonomous end-to-end system that addresses the monocular relative localization, through deep neural networks (DNNs), of two peer nano-drones.
To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework, including dataset augmentation, quantization, and system optimizations.
Experimental results show that our DNN can precisely localize a 10cm-size target nano-drone by employing only low-resolution monochrome images, up to 2m distance.
arXiv Detail & Related papers (2023-03-03T14:14:08Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- DAMO-YOLO : A Report on Real-Time Object Detection Design [19.06518351354291]
We present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series.
We use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone.
In the design of necks and heads, we follow the rule of "large neck, small head".
arXiv Detail & Related papers (2022-11-23T17:59:12Z)
- YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [14.198747290672854]
YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS.
YOLOv7 has the highest accuracy, 56.8% AP, among all known real-time object detectors running at 30 FPS or higher on a V100 GPU.
We train YOLOv7 only on the MS COCO dataset from scratch without using any other datasets or pre-trained weights.
arXiv Detail & Related papers (2022-07-06T14:01:58Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
- YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs [14.85882314822983]
In order to map deep neural network (DNN) based object detection models to edge devices, one typically needs to compress such models significantly.
In this paper, we propose a novel edge GPU friendly module for multi-scale feature interaction.
We also propose a novel learning backbone adoption inspired by the changing translational information flow across various tasks.
arXiv Detail & Related papers (2021-10-26T14:02:59Z)
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.