Related papers: VajraV1 -- The most accurate Real Time Object Detector of the YOLO family

VajraV1 -- The most accurate Real Time Object Detector of the YOLO family

URL: http://arxiv.org/abs/2512.13834v1
Date: Mon, 15 Dec 2025 19:16:15 GMT
Title: VajraV1 -- The most accurate Real Time Object Detector of the YOLO family
Authors: Naman Balbir Singh Makkar,
Abstract summary: This technical report presents the VajraV1 model architecture, which introduces architectural enhancements over existing YOLO-based detectors.<n>VajraV1 combines effective design choices from prior YOLO models to achieve state-of-the-art accuracy.<n>On the COCO validation set, VajraV1-Nano achieves 44.3% mAP, outperforming YOLOv12-N by 3.7% and YOLOv13-N by 2.7%.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent years have seen significant advances in real-time object detection, with the release of YOLOv10, YOLO11, YOLOv12, and YOLOv13 between 2024 and 2025. This technical report presents the VajraV1 model architecture, which introduces architectural enhancements over existing YOLO-based detectors. VajraV1 combines effective design choices from prior YOLO models to achieve state-of-the-art accuracy among real-time object detectors while maintaining competitive inference speed. On the COCO validation set, VajraV1-Nano achieves 44.3% mAP, outperforming YOLOv12-N by 3.7% and YOLOv13-N by 2.7% at latency competitive with YOLOv12-N and YOLOv11-N. VajraV1-Small achieves 50.4% mAP, exceeding YOLOv12-S and YOLOv13-S by 2.4%. VajraV1-Medium achieves 52.7% mAP, outperforming YOLOv12-M by 0.2%. VajraV1-Large achieves 53.7% mAP, surpassing YOLOv13-L by 0.3%. VajraV1-Xlarge achieves 56.2% mAP, outperforming all existing real-time object detectors.

Related papers

YOLO-DS: Fine-Grained Feature Decoupling via Dual-Statistic Synergy Operator for Object Detection [55.58092342624062]
We propose YOLO-DS, a framework built around a novel Dual-Statistic Synergy Operator (DSO)<n>YOLO-DS decouples object features by jointly modeling the channel-wise mean and the peak-to-mean difference.<n>On the MS-COCO benchmark, YOLO-DS consistently outperforms YOLOv8 across five model scales.
arXiv Detail & Related papers (2026-01-26T05:50:32Z)
YOLO-Drone: An Efficient Object Detection Approach Using the GhostHead Network for Drone Images [0.0]
This paper introduces an enhancement to the Head network of the YOLOv11 algorithm, called the GhostHead Network.<n>The model incorporating this improvement is named YOLO-Drone.<n> Experimental results demonstrate that YOLO-Drone achieves significant improvements in key detection accuracy metrics, including Precision, Recall, F1-Score, and mAP (0.5), compared to the original YOLOv11.
arXiv Detail & Related papers (2025-11-14T02:32:57Z)
YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception [58.06752127687312]
We propose YOLOv13, an accurate and lightweight object detector.<n>We propose a Hypergraph-based Adaptive Correlation Enhancement (HyperACE) mechanism.<n>We also propose a Full-Pipeline Aggregation-and-Distribution (FullPAD) paradigm.
arXiv Detail & Related papers (2025-06-21T15:15:03Z)
Improved YOLOv12 with LLM-Generated Synthetic Data for Enhanced Apple Detection and Benchmarking Against YOLOv11 and YOLOv10 [0.4143603294943439]
The YOLOv12n configuration achieved the highest precision at 0.916, the highest recall at 0.969, and the highest mean Average Precision (mAP@50) at 0.978.<n>The technique also offered a cost-effective solution by reducing the need for extensive manual data collection in the agricultural field.
arXiv Detail & Related papers (2025-02-26T20:24:01Z)
YOLO Evolution: A Comprehensive Benchmark and Architectural Review of YOLOv12, YOLO11, and Their Previous Versions [0.0]
This study represents the first comprehensive experimental evaluation of YOLOv3 to the latest version, YOLOv12.<n>The challenges considered include varying object sizes, diverse aspect ratios, and small-sized objects of a single class.<n>Our analysis highlights the distinctive strengths and limitations of each YOLO version.
arXiv Detail & Related papers (2024-10-31T20:45:00Z)
YOLO11 and Vision Transformers based 3D Pose Estimation of Immature Green Fruits in Commercial Apple Orchards for Robotic Thinning [0.4143603294943439]
Method for 3D pose estimation of immature green apples (fruitlets) in commercial orchards was developed.<n>YOLO11(or YOLOv11) object detection and pose estimation algorithm alongside Vision Transformers (ViT) for depth estimation.<n>YOLO11n surpassed all configurations of YOLO11 and YOLOv8 in terms of box precision and pose precision.
arXiv Detail & Related papers (2024-10-21T17:00:03Z)
Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments [0.9565934024763958]
This study systematically performed a real-world evaluation of the performances of YOLOv8, YOLOv9, YOLOv10, YOLO11( or YOLOv11), and YOLOv12 object detection algorithms.<n>YOLOv12l recorded the highest recall rate at 0.90, compared to all other configurations of YOLO models.<n>YOLOv11n achieved highest inference speed of 2.4 ms, outperforming YOLOv8n (4.1 ms), YOLOv9 Gelan-s (11.5 ms), YOLOv10n (5.5 ms), and YOLOv12n
arXiv Detail & Related papers (2024-07-01T17:59:55Z)
YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection. The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs. We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [63.36722419180875]
We provide an efficient and performant object detector, termed YOLO-MS.<n>We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets.<n>Our work can also serve as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
YOLOX: Exceeding YOLO Series in 2021 [25.734980783220976]
We switch the YOLO detector to an anchor-free manner and conduct other advanced detection techniques. For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, we get 25.3% AP on COCO, surpassing NanoDet by 1.8% AP. For YOLOX-L with roughly the same amount of parameters as YOLOv4-CSP, YOLOv5-L, we achieve 50.0% AP on COCO at a speed of 68.9 FPS on Tesla V100.
arXiv Detail & Related papers (2021-07-18T12:55:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.