Related papers: Maize Seedling Detection Dataset (MSDD): A Curated High-Resolution RGB Dataset for Seedling Maize Detection and Benchmarking with YOLOv9, YOLO11, YOLOv12 and Faster-RCNN

Maize Seedling Detection Dataset (MSDD): A Curated High-Resolution RGB Dataset for Seedling Maize Detection and Benchmarking with YOLOv9, YOLO11, YOLOv12 and Faster-RCNN

URL: http://arxiv.org/abs/2509.15181v1
Date: Thu, 18 Sep 2025 17:41:59 GMT
Title: Maize Seedling Detection Dataset (MSDD): A Curated High-Resolution RGB Dataset for Seedling Maize Detection and Benchmarking with YOLOv9, YOLO11, YOLOv12 and Faster-RCNN
Authors: Dewi Endah Kharismawati, Toni Kazic,
Abstract summary: Stand counting determines how many plants germinated, guiding timely decisions such as replanting or adjusting inputs.<n>We introduce MSDD, a high-quality aerial image dataset for maize seedling stand counting, with applications in early-season crop monitoring, yield prediction, and in-field management.<n> MSDD contains three classes-single, double, and triple plants-capturing diverse growth stages, planting setups, soil types, lighting conditions, camera angles, and densities, ensuring robustness for real-world use.
Score: 0.28647133890966986
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Accurate maize seedling detection is crucial for precision agriculture, yet curated datasets remain scarce. We introduce MSDD, a high-quality aerial image dataset for maize seedling stand counting, with applications in early-season crop monitoring, yield prediction, and in-field management. Stand counting determines how many plants germinated, guiding timely decisions such as replanting or adjusting inputs. Traditional methods are labor-intensive and error-prone, while computer vision enables efficient, accurate detection. MSDD contains three classes-single, double, and triple plants-capturing diverse growth stages, planting setups, soil types, lighting conditions, camera angles, and densities, ensuring robustness for real-world use. Benchmarking shows detection is most reliable during V4-V6 stages and under nadir views. Among tested models, YOLO11 is fastest, while YOLOv9 yields the highest accuracy for single plants. Single plant detection achieves precision up to 0.984 and recall up to 0.873, but detecting doubles and triples remains difficult due to rarity and irregular appearance, often from planting errors. Class imbalance further reduces accuracy in multi-plant detection. Despite these challenges, YOLO11 maintains efficient inference at 35 ms per image, with an additional 120 ms for saving outputs. MSDD establishes a strong foundation for developing models that enhance stand counting, optimize resource allocation, and support real-time decision-making. This dataset marks a step toward automating agricultural monitoring and advancing precision agriculture.

Related papers

A Comparative Benchmark of Real-time Detectors for Blueberry Detection towards Precision Orchard Management [2.667064587590596]
This study presents a novel comparative benchmark analysis of advanced real-time object detectors.<n>This dataset comprises 661 canopy images collected with smartphones during the 2022-2023 seasons.<n>Among the YOLO models, YOLOv12m achieved the best accuracy with a mAP@50 of 93.3%.
arXiv Detail & Related papers (2025-09-24T21:42:24Z)
YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception [58.06752127687312]
We propose YOLOv13, an accurate and lightweight object detector.<n>We propose a Hypergraph-based Adaptive Correlation Enhancement (HyperACE) mechanism.<n>We also propose a Full-Pipeline Aggregation-and-Distribution (FullPAD) paradigm.
arXiv Detail & Related papers (2025-06-21T15:15:03Z)
WeedVision: Multi-Stage Growth and Classification of Weeds using DETR and RetinaNet for Precision Agriculture [0.0]
This research uses object detection models to identify and classify 16 weed species of economic concern across 174 classes.<n>A robust dataset comprising 203,567 images was developed, meticulously labeled by species and growth stage.<n>RetinaNet demonstrated superior performance, achieving a mean Average Precision (mAP) of 0.907 on the training set and 0.904 on the test set.
arXiv Detail & Related papers (2025-02-16T20:49:22Z)
Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection [0.0]
All available models of YOLOv8, YOLOv9, YOLOv10, and RT-DETR are trained and evaluated with images from a real field situation.<n>The results demonstrate that while all models perform equally well in the metrics evaluated, the YOLOv9 models stand out in terms of their strong recall scores.<n> RT-DETR models, especially RT-DETR-l, excel in precision with reaching 82.44 % on dataset 1 and 81.46 % in dataset 2.
arXiv Detail & Related papers (2025-01-29T02:39:57Z)
Strawberry detection and counting based on YOLOv7 pruning and information based tracking algorithm [2.8246025005347875]
This study proposed an optimal pruning of detection heads of the deep learning model (YOLOv7 and its variants) that could achieve fast and precise strawberry flower, immature fruit, and mature fruit detection. The proposed pruning of detection heads across YOLOv7 variants, notably Pruning-YOLOv7-tiny with detection head 3 and Pruning-YOLOv7-tiny with heads 2 and 3 achieved the best inference speed (163.9 frames per second) and detection accuracy (89.1%)
arXiv Detail & Related papers (2024-07-17T14:41:57Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heights [50.52704854147297]
We present a new vision transformer (ViT) model optimized with a classification (discrete) and a continuous loss function. This model achieves better accuracy than previously used convolutional based approaches (ConvNets) optimized with only a continuous loss function.
arXiv Detail & Related papers (2023-04-22T22:39:03Z)
Generative models-based data labeling for deep networks regression: application to seed maturity estimation from UAV multispectral images [3.6868861317674524]
Monitoring seed maturity is an increasing challenge in agriculture due to climate change and more restrictive practices. Traditional methods are based on limited sampling in the field and analysis in laboratory. We propose a method for estimating parsley seed maturity using multispectral UAV imagery, with a new approach for automatic data labeling.
arXiv Detail & Related papers (2022-08-09T09:06:51Z)
End-to-end deep learning for directly estimating grape yield from ground-based imagery [53.086864957064876]
This study demonstrates the application of proximal imaging combined with deep learning for yield estimation in vineyards. Three model architectures were tested: object detection, CNN regression, and transformer models. The study showed the applicability of proximal imaging and deep learning for prediction of grapevine yield on a large scale.
arXiv Detail & Related papers (2022-08-04T01:34:46Z)
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection. YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation. YOLO-S has an 87% decrease of parameter size and almost one half FLOPs of YOLOv3, making practical the deployment for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery [56.10033255997329]
We propose a novel deep learning method based on a Convolutional Neural Network (CNN) It simultaneously detects and geolocates plantation-rows while counting its plants considering highly-dense plantation configurations. The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
arXiv Detail & Related papers (2020-12-31T18:51:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.