Related papers: YOLO26: A Comprehensive Architecture Overview and Key Improvements

URL: http://arxiv.org/abs/2602.14582v1
Date: Mon, 16 Feb 2026 09:25:19 GMT
Title: YOLO26: A Comprehensive Architecture Overview and Key Improvements
Authors: Priyanto Hidayatullah, Refdinal Tubagus
Abstract summary: You Only Look Once (YOLO) has been the prominent model for computer vision in deep learning for a decade. This study explores the novel aspects of YOLO26, the most recent version in the YOLO series.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: You Only Look Once (YOLO) has been the prominent model for computer vision in deep learning for a decade. This study explores the novel aspects of YOLO26, the most recent version in the YOLO series. The elimination of Distribution Focal Loss (DFL), implementation of End-to-End NMS-Free Inference, introduction of ProgLoss + Small-Target-Aware Label Assignment (STAL), and use of the MuSGD optimizer are the primary enhancements designed to improve inference speed, which is claimed to achieve a 43% boost in CPU mode. This is designed to allow YOLO26 to attain real-time performance on edge devices or those without GPUs. Additionally, YOLO26 offers improvements in many computer vision tasks, including instance segmentation, pose estimation, and oriented bounding box (OBB) decoding. We aim for this effort to provide more value than just consolidating information already included in the existing technical documentation. Therefore, we performed a rigorous architectural investigation into YOLO26, mostly using the source code available in its GitHub repository and its official documentation. The authentic and detailed operational mechanisms of YOLO26 are inside the source code, which is seldom extracted by others. The YOLO26 architectural diagram is shown as the outcome of the investigation. This study is, to our knowledge, the first one presenting the CNN-based YOLO26 architecture, which is the core of YOLO26. Our objective is to provide a precise architectural comprehension of YOLO26 for researchers and developers aspiring to enhance the YOLO model, ensuring it remains the leading deep learning model in computer vision.

Related papers

YOLO-IOD: Towards Real Time Incremental Object Detection [57.862742461237055]
We introduce YOLO-IOD, a real-time Incremental Object Detection (IOD) framework that is constructed upon the pretrained YOLO-World model. YOLO-IOD encompasses three principal components: 1) Conflict-Aware Pseudo-Label Refinement (CPR), which mitigates the foreground-background confusion. We also introduce Cross-Stage Asymmetric Knowledge Distillation (CAKD), which addresses the misaligned knowledge distillation conflict.
arXiv Detail & Related papers (2025-12-28T15:35:26Z)
YOLOA: Real-Time Affordance Detection via LLM Adapter [96.61111291833544]
Affordance detection aims to jointly address the fundamental "what-where-how" challenge in embodied AI. We introduce YOLO Affordance (YOLOA), a real-time affordance detection model that jointly handles object detection and affordance learning. Experiments on our relabeled ADG-Det and IIT-Heat benchmarks demonstrate that YOLOA achieves state-of-the-art accuracy while maintaining real-time performance.
arXiv Detail & Related papers (2025-12-03T03:53:31Z)
Ultralytics YOLO Evolution: An Overview of YOLO26, YOLO11, YOLOv8 and YOLOv5 Object Detectors for Computer Vision and Pattern Recognition [3.2882817259131403]
This paper presents a comprehensive overview of the Ultralytics YOLO (You Only Look Once) family of object detectors. The review begins with the most recent release, YOLO26 (or YOLOv26), which introduces key innovations including Distribution Focal Loss (DFL) removal. The paper identifies challenges and future directions, including dense-scene limitations, hybrid CNN-Transformer integration, open-vocabulary detection, and edge-aware training approaches.
arXiv Detail & Related papers (2025-10-06T23:28:44Z)
YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection [3.1957907449739764]
This study presents a comprehensive analysis of Ultralytics YOLO26, highlighting its key architectural enhancements and performance benchmarking for real-time object detection. YOLO26, released in September 2025, stands as the newest and most advanced member of the YOLO family, purpose-built to deliver efficiency, accuracy, and deployment readiness on edge and low-power devices.
arXiv Detail & Related papers (2025-09-29T17:58:04Z)
YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review [0.0]
This study presents a comprehensive and in-depth architecture comparison of the four most recent YOLO models. The analysis reveals that while each version of YOLO has improvements in architecture and feature extraction, certain blocks remain unchanged.
arXiv Detail & Related papers (2025-01-23T05:57:13Z)
YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection. The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs. We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [63.36722419180875]
We provide an efficient and performant object detector, termed YOLO-MS. We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets. Our work can also serve as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS [0.0]
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO's evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with Transformers.
arXiv Detail & Related papers (2023-04-02T10:27:34Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.