DAMO-YOLO : A Report on Real-Time Object Detection Design
- URL: http://arxiv.org/abs/2211.15444v4
- Date: Mon, 24 Apr 2023 03:32:15 GMT
- Title: DAMO-YOLO : A Report on Real-Time Object Detection Design
- Authors: Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, Xiuyu Sun
- Abstract summary: We present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series.
We use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone.
In the design of necks and heads, we follow the rule of ``large neck, small head''.
- Score: 19.06518351354291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this report, we present a fast and accurate object detection method dubbed
DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO
series. DAMO-YOLO is extended from YOLO with some new technologies, including
Neural Architecture Search (NAS), efficient Reparameterized Generalized-FPN
(RepGFPN), a lightweight head with AlignedOTA label assignment, and
distillation enhancement. In particular, we use MAE-NAS, a method guided by the
principle of maximum entropy, to search our detection backbone under the
constraints of low latency and high performance, producing ResNet/CSP-like
structures with spatial pyramid pooling and focus modules. In the design of
necks and heads, we follow the rule of ``large neck, small head''. We import
Generalized-FPN with accelerated queen-fusion to build the detector neck and
upgrade its CSPNet with efficient layer aggregation networks (ELAN) and
reparameterization. Then we investigate how detector head size affects
detection performance and find that a heavy neck with only one task projection
layer would yield better results. In addition, AlignedOTA is proposed to solve
the misalignment problem in label assignment. And a distillation schema is
introduced to improve performance to a higher level. Based on these new techs,
we build a suite of models at various scales to meet the needs of different
scenarios. For general industry requirements, we propose DAMO-YOLO-T/S/M/L.
They can achieve 43.6/47.7/50.2/51.9 mAPs on COCO with the latency of
2.78/3.83/5.62/7.95 ms on T4 GPUs respectively. Additionally, for edge devices
with limited computing power, we have also proposed DAMO-YOLO-Ns/Nm/Nl
lightweight models. They can achieve 32.3/38.2/40.5 mAPs on COCO with the
latency of 4.08/5.05/6.69 ms on X86-CPU. Our proposed general and lightweight
models have outperformed other YOLO series models in their respective
application scenarios.
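The latency figures quoted in the abstract can be read as approximate throughput with a little arithmetic. The sketch below is only an illustration: it copies the reported mAP and latency values and converts each latency to frames per second, assuming one image per forward pass and no pre- or post-processing overhead (the abstract does not state the batch size or what the timings include). The names `GENERAL_T4`, `LIGHT_X86`, `frames_per_second`, and `summarize` are made up for this example.

```python
# Figures copied from the abstract: (model, COCO mAP, latency in ms).
GENERAL_T4 = [
    ("DAMO-YOLO-T", 43.6, 2.78),
    ("DAMO-YOLO-S", 47.7, 3.83),
    ("DAMO-YOLO-M", 50.2, 5.62),
    ("DAMO-YOLO-L", 51.9, 7.95),
]
LIGHT_X86 = [
    ("DAMO-YOLO-Ns", 32.3, 4.08),
    ("DAMO-YOLO-Nm", 38.2, 5.05),
    ("DAMO-YOLO-Nl", 40.5, 6.69),
]


def frames_per_second(latency_ms: float) -> float:
    """Convert per-image latency to throughput.

    Assumption (not from the abstract): one image per forward pass
    and no additional overhead.
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def summarize(rows):
    """Return (name, mAP, latency_ms, fps) tuples, with fps rounded to one decimal."""
    return [
        (name, m_ap, latency, round(frames_per_second(latency), 1))
        for name, m_ap, latency in rows
    ]


if __name__ == "__main__":
    for title, rows in (("T4 GPU", GENERAL_T4), ("X86 CPU", LIGHT_X86)):
        print(title)
        for name, m_ap, latency, fps in summarize(rows):
            print(f"  {name:<13} mAP={m_ap:>5}  latency={latency} ms  ~{fps} FPS")
```

The two groups were measured on different hardware (T4 GPU versus X86 CPU), so the derived throughput numbers are comparable only within each group.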
Related papers
- Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection [3.7793767915135295]
We propose a new model named MAF-YOLO in this paper.
It is a novel object detection framework with a versatile neck named Multi-Branch Auxiliary FPN (MAFPN)
Taking the nano version of MAF-YOLO as an example, it achieves 42.4% AP on COCO with only 3.76M learnable parameters and 10.51G FLOPs, outperforming YOLOv8n by about 5.1%.
arXiv Detail & Related papers (2024-07-05T09:35:30Z)
- LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection [0.0]
We focus on design choices of neural network architectures for efficient object detection based on FLOP.
We propose several optimizations to enhance the efficiency of YOLO-based models.
This paper contributes to a new scaling paradigm for object detection and YOLO-centric models called LeYOLO.
arXiv Detail & Related papers (2024-06-20T12:08:24Z)
- Mamba YOLO: SSMs-Based YOLO For Object Detection [9.879086222226617]
Mamba-YOLO is a novel object detection model based on State Space Models.
We show that Mamba-YOLO surpasses the existing YOLO series models in both performance and competitiveness.
arXiv Detail & Related papers (2024-06-09T15:56:19Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- SATAY: A Streaming Architecture Toolflow for Accelerating YOLO Models on FPGA Devices [48.47320494918925]
This work tackles the challenges of deploying state-of-the-art object detection models onto FPGA devices for ultra-low-latency applications.
We employ a streaming architecture design for our YOLO accelerators, implementing the complete model on-chip in a deeply pipelined fashion.
We introduce novel hardware components to support the operations of YOLO models in a dataflow manner, and off-chip memory buffering to address the limited on-chip memory resources.
arXiv Detail & Related papers (2023-09-04T13:15:01Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- EdgeYOLO: An Edge-Real-Time Object Detector [69.41688769991482]
This paper proposes an efficient, low-complexity and anchor-free object detector based on the state-of-the-art YOLO framework.
We develop an enhanced data augmentation method to effectively suppress overfitting during training, and design a hybrid random loss function to improve the detection accuracy of small objects.
Our baseline model reaches 50.6% AP50:95 and 69.8% AP50 on the MS COCO 2017 dataset, and 26.4% AP50:95 and 44.8% AP50 on the VisDrone 2019-DET dataset, and it meets real-time requirements (FPS>=30) on edge-computing device Nvidia
arXiv Detail & Related papers (2023-02-15T06:05:14Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural networks (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delay.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.