A Comprehensive Review of YOLO Architectures in Computer Vision: From
YOLOv1 to YOLOv8 and YOLO-NAS
- URL: http://arxiv.org/abs/2304.00501v7
- Date: Sun, 4 Feb 2024 22:38:15 GMT
- Title: A Comprehensive Review of YOLO Architectures in Computer Vision: From
YOLOv1 to YOLOv8 and YOLO-NAS
- Authors: Juan Terven and Diana Cordova-Esparza
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: YOLO has become a central real-time object detection system for robotics,
driverless cars, and video monitoring applications. We present a comprehensive
analysis of YOLO's evolution, examining the innovations and contributions in
each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with
Transformers. We start by describing the standard metrics and postprocessing;
then, we discuss the major changes in network architecture and training tricks
for each model. Finally, we summarize the essential lessons from YOLO's
development and provide a perspective on its future, highlighting potential
research directions to enhance real-time object detection systems.
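The survey opens with the standard detection metrics. As a companion to that discussion, here is a minimal sketch of intersection-over-union (IoU), the box-overlap measure underlying average precision (AP); the `(x1, y1, x2, y2)` box format and function name are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # prints 0.3333333333333333
```

A detection typically counts as a true positive when its IoU with a ground-truth box exceeds a threshold: 0.5 in the classic PASCAL VOC metric, while COCO averages AP over IoU thresholds from 0.5 to 0.95.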
Related papers
- YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems [13.925576406783991]
This review article re-examines the characteristics of the YOLO series from the latest technical point of view.
We take a closer look at how the methods proposed by the YOLO series in the past ten years have affected the development of subsequent technologies.
arXiv Detail & Related papers (2024-08-18T02:11:00Z)
- Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation [74.65906322148997]
We introduce a new object detection method that integrates hypergraph computations to capture the complex high-order correlations among visual features.
Hyper-YOLO significantly outperforms the advanced YOLOv8-N and YOLOv9-T with 12% AP$^{val}$ and 9% AP$^{val}$ improvements.
arXiv Detail & Related papers (2024-08-09T01:21:15Z)
- YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision [0.6662800021628277]
This paper traces the evolution of the YOLO (You Only Look Once) object detection algorithm, focusing on YOLOv5, YOLOv8, and YOLOv10.
We analyze the architectural advancements, performance improvements, and suitability for edge deployment across these versions.
arXiv Detail & Related papers (2024-07-03T10:40:20Z)
- Comprehensive Performance Evaluation of YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments [0.9565934024763958]
This study extensively evaluated You Only Look Once (YOLO) object detection algorithms across all configurations (22 in total) of YOLOv8, YOLOv9, YOLOv10, and YOLO11 for green fruit detection in commercial orchards.
The research also validated in-field fruitlet counting using an iPhone and machine vision sensors across four apple varieties: Scifresh, Scilate, Honeycrisp and Cosmic Crisp.
arXiv Detail & Related papers (2024-07-01T17:59:55Z)
- YOLOv10 to Its Genesis: A Decadal and Comprehensive Review of The You Only Look Once (YOLO) Series [6.751138557596013]
This study examines the advancements introduced by YOLO algorithms, beginning with YOLOv10 and progressing through YOLOv9, YOLOv8, and subsequent versions.
The study highlights the transformative impact of YOLO across five critical application areas: automotive safety, healthcare, industrial manufacturing, surveillance, and agriculture.
arXiv Detail & Related papers (2024-06-12T06:41:23Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce a holistic efficiency-accuracy-driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
- YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS.
We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.
Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
- SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules.
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters and nearly half the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
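Several entries above (YOLOv10 in particular) note that reliance on non-maximum suppression (NMS) for post-processing hampers end-to-end deployment. For reference, here is a minimal sketch of class-agnostic NMS; the box format `(x1, y1, x2, y2)` and all function names are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, suppress boxes that
    overlap it above the threshold, and repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

This greedy loop is quadratic in the number of candidate boxes and is a sequential step that is awkward to fuse into an end-to-end graph, which is the deployment friction that NMS-free designs such as YOLOv10 aim to eliminate.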
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.