YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for
real-time object detectors
- URL: http://arxiv.org/abs/2207.02696v1
- Date: Wed, 6 Jul 2022 14:01:58 GMT
- Title: YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for
real-time object detectors
- Authors: Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao
- Abstract summary: YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS.
YOLOv7 has the highest accuracy, 56.8% AP, among all known real-time object detectors with 30 FPS or higher on GPU V100.
We train YOLOv7 only on the MS COCO dataset from scratch, without using any other datasets or pre-trained weights.
- Score: 14.198747290672854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: YOLOv7 surpasses all known object detectors in both speed and accuracy in the
range from 5 FPS to 160 FPS and has the highest accuracy, 56.8% AP, among all
known real-time object detectors with 30 FPS or higher on GPU V100. The YOLOv7-E6
object detector (56 FPS V100, 55.9% AP) outperforms both the transformer-based
detector SWIN-L Cascade-Mask R-CNN (9.2 FPS A100, 53.9% AP) by 509% in speed
and 2% in accuracy, and the convolution-based detector ConvNeXt-XL Cascade-Mask
R-CNN (8.6 FPS A100, 55.2% AP) by 551% in speed and 0.7% AP in accuracy. YOLOv7
also outperforms YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, DETR, Deformable DETR,
DINO-5scale-R50, ViT-Adapter-B, and many other object detectors in speed and
accuracy. Moreover, we train YOLOv7 only on the MS COCO dataset from scratch,
without using any other datasets or pre-trained weights. Source code is
released at https://github.com/WongKinYiu/yolov7.
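The "509%" and "551%" figures quoted above can be reproduced from the listed FPS numbers, assuming they denote the relative throughput improvement of YOLOv7-E6 over each baseline:

```python
def speedup_pct(fps_new, fps_old):
    """Relative speed improvement, in percent, of fps_new over fps_old."""
    return round(100 * (fps_new - fps_old) / fps_old)

# YOLOv7-E6 (56 FPS) vs. SWIN-L Cascade-Mask R-CNN (9.2 FPS)
print(speedup_pct(56, 9.2))  # -> 509
# YOLOv7-E6 (56 FPS) vs. ConvNeXt-XL Cascade-Mask R-CNN (8.6 FPS)
print(speedup_pct(56, 8.6))  # -> 551
```

Note the baselines were benchmarked on an A100 and YOLOv7-E6 on a V100, so these percentages compare across different GPUs.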
Related papers
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
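For context on the NMS step that YOLO-style detectors traditionally rely on, here is a minimal, illustrative sketch of greedy NMS. It is not taken from any paper above; the corner-format boxes and the 0.5 IoU threshold are assumptions for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# Two heavily overlapping detections plus one distinct one:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]
```

Because this pruning happens after the network runs, it is an extra sequential step at inference time, which is the deployment friction NMS-free designs such as YOLOv10 aim to remove.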
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- DETRs Beat YOLOs on Real-time Object Detection [5.426236055184119]
YOLO series has become the most popular framework for real-time object detection due to its reasonable trade-off between speed and accuracy.
Recently, end-to-end Transformer-based detectors (DETRs) have provided an alternative to eliminating NMS.
In this paper, we propose the Real-Time DEtection TRansformer (RT-DETR), the first real-time end-to-end object detector.
arXiv Detail & Related papers (2023-04-17T08:30:02Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS >= 30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- YOLOv6 v3.0: A Full-Scale Reloading [9.348857966505111]
We refurnish YOLOv6 with numerous novel enhancements on the network architecture and the training scheme.
YOLOv6-N hits 37.5% AP on the COCO dataset at a throughput of 1187 FPS tested with an NVIDIA Tesla T4 GPU.
YOLOv6-S strikes 45.0% AP at 484 FPS, outperforming other mainstream detectors at the same scale.
arXiv Detail & Related papers (2023-01-13T14:46:46Z)
- YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications [16.047499394184985]
YOLOv6-N hits 35.9% AP on the COCO dataset at a throughput of 1234 FPS on an NVIDIA Tesla T4 GPU.
YOLOv6-S strikes 43.5% AP at 495 FPS, outperforming other mainstream detectors at the same scale.
YOLOv6-M/L achieves better accuracy performance (i.e., 49.5%/52.3%) than other detectors with a similar inference speed.
arXiv Detail & Related papers (2022-09-07T07:47:58Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has an 87% smaller parameter count and roughly half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge [57.647371468876116]
We introduce our real-time 2D object detection system for the realistic autonomous driving scenario.
Our detector is built on a newly designed YOLO model, called YOLOX.
On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, surpassing second place by 7.8/6.1 on the detection-only/full track, respectively.
arXiv Detail & Related papers (2021-07-27T06:36:06Z)
- YOLOX: Exceeding YOLO Series in 2021 [25.734980783220976]
We switch the YOLO detector to an anchor-free manner and conduct other advanced detection techniques.
For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, we get 25.3% AP on COCO, surpassing NanoDet by 1.8% AP.
For YOLOX-L with roughly the same amount of parameters as YOLOv4-CSP, YOLOv5-L, we achieve 50.0% AP on COCO at a speed of 68.9 FPS on Tesla V100.
arXiv Detail & Related papers (2021-07-18T12:55:11Z)
- Scaled-YOLOv4: Scaling Cross Stage Partial Network [14.198747290672854]
We show that the YOLOv4 object detection neural network based on the CSP approach, scales both up and down.
We propose a network scaling approach that modifies not only the depth, width, resolution, but also structure of the network.
arXiv Detail & Related papers (2020-11-16T15:42:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.