AppleGrowthVision: A large-scale stereo dataset for phenological analysis, fruit detection, and 3D reconstruction in apple orchards
- URL: http://arxiv.org/abs/2505.14029v1
- Date: Tue, 20 May 2025 07:29:22 GMT
- Title: AppleGrowthVision: A large-scale stereo dataset for phenological analysis, fruit detection, and 3D reconstruction in apple orchards
- Authors: Laura-Sophia von Hirschhausen, Jannes S. Magnusson, Mykyta Kovalenko, Fredrik Boye, Tanay Rawat, Peter Eisert, Anna Hilsmann, Sebastian Pretzsch, Sebastian Bosse
- Abstract summary: We present AppleGrowthVision, a large-scale dataset comprising two subsets. The first includes 9,317 high-resolution stereo images collected from a farm in Brandenburg (Germany), covering six agriculturally validated growth stages over a full growth cycle. The second subset consists of 1,125 densely annotated images from the same farm in Brandenburg and one in Pillnitz (Germany), containing a total of 31,084 apple labels. AppleGrowthVision provides stereo-image data with agriculturally validated growth stages, enabling precise phenological analysis and 3D reconstructions.
- Score: 3.9494466926597487
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning has transformed computer vision for precision agriculture, yet apple orchard monitoring remains limited by dataset constraints: the lack of diverse, realistic datasets and the difficulty of annotating dense, heterogeneous scenes. Existing datasets overlook different growth stages and stereo imagery, both essential for realistic 3D modeling of orchards and for tasks like fruit localization, yield estimation, and structural analysis. To address these gaps, we present AppleGrowthVision, a large-scale dataset comprising two subsets. The first includes 9,317 high-resolution stereo images collected from a farm in Brandenburg (Germany), covering six agriculturally validated growth stages over a full growth cycle. The second subset consists of 1,125 densely annotated images from the same farm in Brandenburg and one in Pillnitz (Germany), containing a total of 31,084 apple labels. AppleGrowthVision provides stereo-image data with agriculturally validated growth stages, enabling precise phenological analysis and 3D reconstructions. Extending MinneApple with our data improves YOLOv8 performance by 7.69 % in terms of F1-score, while adding it to MinneApple and MAD boosts Faster R-CNN F1-score by 31.06 %. Additionally, six BBCH stages were predicted with over 95 % accuracy using VGG16, ResNet152, DenseNet201, and MobileNetv2. AppleGrowthVision bridges the gap between agricultural science and computer vision by enabling the development of robust models for fruit detection, growth modeling, and 3D analysis in precision agriculture. Future work includes improving annotation, enhancing 3D reconstruction, and extending multimodal analysis across all growth stages.
Related papers
- Design and Implementation of FourCropNet: A CNN-Based System for Efficient Multi-Crop Disease Detection and Management [3.4161054453684705]
This study proposes FourCropNet, a novel deep learning model designed to detect diseases in multiple crops.
FourCropNet achieved the highest accuracy of 99.7% for Grape, 99.5% for Corn, and 95.3% for the combined dataset.
arXiv Detail & Related papers (2025-03-11T12:00:56Z)
- Enhancing Fruit and Vegetable Detection in Unconstrained Environment with a Novel Dataset [4.498047714838568]
This paper presents an end-to-end pipeline for detecting and localizing fruits and vegetables in real-world scenarios.
We have curated a dataset named FRUVEG67 that includes images of 67 classes of fruits and vegetables captured in unconstrained scenarios.
For detection, we introduce the Fruit and Vegetable Detection Network (FVDNet), an ensemble version of YOLOv7 featuring three distinct grid configurations.
arXiv Detail & Related papers (2024-09-20T08:46:03Z)
- A Dataset and Benchmark for Shape Completion of Fruits for Agricultural Robotics [30.46518628656399]
We propose the first publicly available 3D shape completion dataset for agricultural vision systems.
We provide an RGB-D dataset for estimating the 3D shape of fruits.
arXiv Detail & Related papers (2024-07-18T09:07:23Z)
- Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits, as it enables us to: (1) learn token locations for transformer models; (2) directly regress 3D camera poses of 2D images with respect to NeRF models.
This in turn leads to improved performance in all three tasks of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z)
- Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
arXiv Detail & Related papers (2023-11-03T15:41:15Z)
- HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing [50.4506590177605]
HarvestNet is a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023.
We introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems.
We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food insecure regions.
arXiv Detail & Related papers (2023-08-23T11:03:28Z)
- Panoptic Mapping with Fruit Completion and Pose Estimation for Horticultural Robots [33.21287030243106]
Monitoring plants and fruits at high resolution plays a key role in the future of agriculture.
Accurate 3D information can pave the way to a diverse number of robotic applications in agriculture ranging from autonomous harvesting to precise yield estimation.
We address the problem of jointly estimating complete 3D shapes of fruit and their pose in a 3D multi-resolution map built by a mobile robot.
arXiv Detail & Related papers (2023-03-15T20:41:24Z)
- End-to-end deep learning for directly estimating grape yield from ground-based imagery [53.086864957064876]
This study demonstrates the application of proximal imaging combined with deep learning for yield estimation in vineyards.
Three model architectures were tested: object detection, CNN regression, and transformer models.
The study showed the applicability of proximal imaging and deep learning for prediction of grapevine yield on a large scale.
arXiv Detail & Related papers (2022-08-04T01:34:46Z)
- Geometry-Aware Fruit Grasping Estimation for Robotic Harvesting in Orchards [6.963582954232132]
A geometry-aware network, A3N, is proposed to perform end-to-end instance segmentation and grasping estimation.
We implement a global-to-local scanning strategy, which enables robots to accurately recognise and retrieve fruits in field environments.
Overall, the robotic system achieves a harvesting success rate ranging from 70% to 85% in field harvesting experiments.
arXiv Detail & Related papers (2021-12-08T16:17:26Z)
- A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery [56.10033255997329]
We propose a novel deep learning method based on a Convolutional Neural Network (CNN).
It simultaneously detects and geolocates plantation-rows while counting their plants, even in highly dense plantation configurations.
The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
arXiv Detail & Related papers (2020-12-31T18:51:17Z)
- Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis [110.30849704592592]
We present Agriculture-Vision: a large-scale aerial farmland image dataset for semantic segmentation of agricultural patterns.
Each image consists of RGB and Near-infrared (NIR) channels with resolution as high as 10 cm per pixel.
We annotate nine types of field anomaly patterns that are most important to farmers.
arXiv Detail & Related papers (2020-01-05T20:19:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.