YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review
- URL: http://arxiv.org/abs/2501.13400v1
- Date: Thu, 23 Jan 2025 05:57:13 GMT
- Authors: Priyanto Hidayatullah, Nurjannah Syakrani, Muhammad Rizqi Sholahuddin, Trisna Gelar, Refdinal Tubagus
- Abstract summary: This study presents a comprehensive and in-depth architecture comparison of the four most recent YOLO models.
The analysis reveals that while each version of YOLO has improvements in architecture and feature extraction, certain blocks remain unchanged.
- Abstract: In the field of deep learning-based computer vision, YOLO is revolutionary. With respect to deep learning models, YOLO is also the one that is evolving the most rapidly. Unfortunately, not every YOLO model possesses scholarly publications. Moreover, there exists a YOLO model that lacks a publicly accessible official architectural diagram. Naturally, this engenders challenges, such as complicating the understanding of how the model operates in practice. Furthermore, the review articles that are presently available do not delve into the specifics of each model. The objective of this study is to present a comprehensive and in-depth architecture comparison of the four most recent YOLO models, specifically YOLOv8 through YOLO11, thereby enabling readers to quickly grasp not only how each model functions, but also the distinctions between them. To analyze each YOLO version's architecture, we meticulously examined the relevant academic papers and documentation and scrutinized the source code. The analysis reveals that while each version of YOLO has improvements in architecture and feature extraction, certain blocks remain unchanged. The lack of scholarly publications and official diagrams presents challenges for understanding the model's functionality and future enhancement. Future developers are encouraged to provide these resources.
Related papers
- ODVerse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11 [6.553031877558699]
Key questions arise with the increasing frequency of new YOLO versions being released.
What are the core innovations in each YOLO version and how do these changes translate into real-world performance gains?
In this paper, we summarize the key innovations from YOLOv1 to YOLOv11, introduce a comprehensive benchmark called ODverse33, and explore the practical impact of model improvements in real-world, multi-domain applications.
arXiv Detail & Related papers (2025-02-20T06:57:58Z)
- YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems [13.925576406783991]
This review article re-examines the characteristics of the YOLO series from the latest technical point of view.
We take a closer look at how the methods proposed by the YOLO series in the past ten years have affected the development of subsequent technologies.
arXiv Detail & Related papers (2024-08-18T02:11:00Z)
- Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation [74.65906322148997]
We introduce a new object detection method that integrates hypergraph computations to capture the complex high-order correlations among visual features.
Hyper-YOLO significantly outperforms the advanced YOLOv8-N and YOLOv9-T with 12% $\text{AP}^{val}$ and 9% $\text{AP}^{val}$ improvements.
arXiv Detail & Related papers (2024-08-09T01:21:15Z)
- YOLO11 to Its Genesis: A Decadal and Comprehensive Review of The You Only Look Once (YOLO) Series [6.751138557596013]
This review systematically examines the progression of the You Only Look Once (YOLO) object detection algorithms from YOLOv1 to YOLOv11.
The evolution signifies a path towards integrating YOLO with multimodal, context-aware, and Artificial General Intelligence (AGI) systems for the next YOLO decade.
arXiv Detail & Related papers (2024-06-12T06:41:23Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
- Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations.
These descriptions can be used to generate synthetic data using generative models, such as diffusion models.
Our experiments have shown remarkable improvements in accuracy (~21%) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [63.36722419180875]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also serve as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- Model Compression Methods for YOLOv5: A Review [1.2387676601792899]
We focus on pruning and quantization due to their comparative modularity.
This is the first specific review paper that surveys pruning and quantization methods from an implementation point of view on YOLOv5.
Our study is also extendable to newer versions of YOLO as implementing them on resource-limited devices poses the same challenges that persist even today.
arXiv Detail & Related papers (2023-07-21T21:07:56Z)
- A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS [0.0]
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications.
We present a comprehensive analysis of YOLO's evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with Transformers.
arXiv Detail & Related papers (2023-04-02T10:27:34Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and almost half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)