Related papers: Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

URL: http://arxiv.org/abs/2407.12040v7
Date: Tue, 25 Feb 2025 23:00:05 GMT
Title: Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments
Authors: Ranjan Sapkota, Zhichao Meng, Martin Churuvija, Xiaoqiang Du, Zenghong Ma, Manoj Karkee,
Abstract summary: This study systematically performed a real-world evaluation of the performances of YOLOv8, YOLOv9, YOLOv10, YOLO11( or YOLOv11), and YOLOv12 object detection algorithms.<n>YOLOv12l recorded the highest recall rate at 0.90, compared to all other configurations of YOLO models.<n>YOLOv11n achieved highest inference speed of 2.4 ms, outperforming YOLOv8n (4.1 ms), YOLOv9 Gelan-s (11.5 ms), YOLOv10n (5.5 ms), and YOLOv12n
Score: 0.9565934024763958
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This study systematically performed an extensive real-world evaluation of the performances of all configurations of YOLOv8, YOLOv9, YOLOv10, YOLO11( or YOLOv11), and YOLOv12 object detection algorithms in terms of precision, recall, mean Average Precision at 50\% Intersection over Union (mAP@50), and computational speeds including pre-processing, inference, and post-processing times immature green apple (or fruitlet) detection in commercial orchards. Additionally, this research performed and validated in-field counting of the fruitlets using an iPhone and machine vision sensors. Among the configurations, YOLOv12l recorded the highest recall rate at 0.90, compared to all other configurations of YOLO models. Likewise, YOLOv10x achieved the highest precision score of 0.908, while YOLOv9 Gelan-c attained a precision of 0.903. Analysis of mAP@0.50 revealed that YOLOv9 Gelan-base and YOLOv9 Gelan-e reached peak scores of 0.935, with YOLO11s and YOLOv12l following closely at 0.933 and 0.931, respectively. For counting validation using images captured with an iPhone 14 Pro, the YOLO11n configuration demonstrated outstanding accuracy, recording RMSE values of 4.51 for Honeycrisp, 4.59 for Cosmic Crisp, 4.83 for Scilate, and 4.96 for Scifresh; corresponding MAE values were 4.07, 3.98, 7.73, and 3.85. Similar performance trends were observed with RGB-D sensor data. Moreover, sensor-specific training on Intel Realsense data significantly enhanced model performance. YOLOv11n achieved highest inference speed of 2.4 ms, outperforming YOLOv8n (4.1 ms), YOLOv9 Gelan-s (11.5 ms), YOLOv10n (5.5 ms), and YOLOv12n (4.6 ms), underscoring its suitability for real-time object detection applications. (YOLOv12 architecture, YOLOv11 Architecture, YOLOv12 object detection, YOLOv11 object detecion, YOLOv12 segmentation)

Related papers

YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception [44.76134548023668]
We propose YOLOv13, an accurate and lightweight object detector.<n>We propose a Hypergraph-based Adaptive Correlation Enhancement (HyperACE) mechanism.<n>We also propose a Full-Pipeline Aggregation-and-Distribution (FullPAD) paradigm.
arXiv Detail & Related papers (2025-06-21T15:15:03Z)
YOLOE: Real-Time Seeing Anything [64.35836518093342]
YOLOE integrates detection and segmentation across diverse open prompt mechanisms within a single highly efficient model. YOLOE's exceptional zero-shot performance and transferability with high inference efficiency and low training cost.
arXiv Detail & Related papers (2025-03-10T15:42:59Z)
Improved YOLOv12 with LLM-Generated Synthetic Data for Enhanced Apple Detection and Benchmarking Against YOLOv11 and YOLOv10 [0.4143603294943439]
The YOLOv12n configuration achieved the highest precision at 0.916, the highest recall at 0.969, and the highest mean Average Precision (mAP@50) at 0.978. The technique also offered a cost-effective solution by reducing the need for extensive manual data collection in the agricultural field.
arXiv Detail & Related papers (2025-02-26T20:24:01Z)
YOLOv12: Attention-Centric Real-Time Object Detectors [38.507511985479006]
This paper proposes an attention-centric YOLO framework, YOLOv12, that matches the speed of previous CNN-based ones. YOLOv12 surpasses all popular real-time object detectors in accuracy with competitive speed.
arXiv Detail & Related papers (2025-02-18T04:20:14Z)
Evaluating the Evolution of YOLO (You Only Look Once) Models: A Comprehensive Benchmark Study of YOLO11 and Its Predecessors [0.0]
This study presents a benchmark analysis of various YOLO (You Only Look Once) algorithms, from YOLOv3 to the newest addition, YOLO11. It evaluates their performance on three diverse datasets: Traffic Signs (with varying object sizes), African Wildlife (with diverse aspect ratios and at least one instance of the object per image), and Ships and Vessels (with small-sized objects of a single class)
arXiv Detail & Related papers (2024-10-31T20:45:00Z)
Comparing YOLO11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment [0.4143603294943439]
This study focused on YOLO11 and YOLOv8's instance segmentation capabilities for immature green apples in orchard environments. YOLO11n-seg achieved the highest mask precision across all categories with a notable score of 0.831. YOLO11m-seg consistently outperformed, registering the highest scores for both box and mask segmentation.
arXiv Detail & Related papers (2024-10-24T00:12:20Z)
YOLO11 and Vision Transformers based 3D Pose Estimation of Immature Green Fruits in Commercial Apple Orchards for Robotic Thinning [0.4143603294943439]
Method for 3D pose estimation of immature green apples (fruitlets) in commercial orchards was developed. YOLO11 object detection and pose estimation algorithm alongside Vision Transformers (ViT) for depth estimation. YOLO11n surpassed all configurations of YOLO11 and YOLOv8 in terms of box precision and pose precision.
arXiv Detail & Related papers (2024-10-21T17:00:03Z)
YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection. The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs. We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
YOLO-World: Real-Time Open-Vocabulary Object Detection [87.08732047660058]
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [80.11152626362109]
We provide an efficient and performant object detector, termed YOLO-MS. We train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets. Our work can also be used as a plug-and-play module for other YOLO models.
arXiv Detail & Related papers (2023-08-10T10:12:27Z)
YOLOv6 v3.0: A Full-Scale Reloading [9.348857966505111]
We refurnish YOLOv6 with numerous novel enhancements on the network architecture and the training scheme. YOLOv6-N hits 37.5% AP on the COCO dataset at a throughput of 1187 FPS tested with an NVIDIA Tesla T4 GPU. YOLOv6-S strikes 45.0% AP at 484 FPS, outperforming other mainstream detectors at the same scale.
arXiv Detail & Related papers (2023-01-13T14:46:46Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge [57.647371468876116]
We introduce our real-time 2D object detection system for the realistic autonomous driving scenario. Our detector is built on a newly designed YOLO model, called YOLOX. On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, which surpassed second place by 7.8/6.1 on detection-only track/fully track, respectively.
arXiv Detail & Related papers (2021-07-27T06:36:06Z)
YOLOX: Exceeding YOLO Series in 2021 [25.734980783220976]
We switch the YOLO detector to an anchor-free manner and conduct other advanced detection techniques. For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, we get 25.3% AP on COCO, surpassing NanoDet by 1.8% AP. For YOLOX-L with roughly the same amount of parameters as YOLOv4-CSP, YOLOv5-L, we achieve 50.0% AP on COCO at a speed of 68.9 FPS on Tesla V100.
arXiv Detail & Related papers (2021-07-18T12:55:11Z)
PP-YOLO: An Effective and Efficient Implementation of Object Detector [44.189808709103865]
This paper implements an object detector with relatively balanced effectiveness and efficiency. Considering that YOLOv3 has been widely used in practice, we develop a new object detector based on YOLOv3. Since all experiments in this paper are conducted based on PaddlePaddle, we call it PP-YOLO.
arXiv Detail & Related papers (2020-07-23T16:06:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.