TomatoScanner: phenotyping tomato fruit based on only RGB image
- URL: http://arxiv.org/abs/2503.05568v1
- Date: Fri, 07 Mar 2025 16:47:48 GMT
- Title: TomatoScanner: phenotyping tomato fruit based on only RGB image
- Authors: Xiaobei Zhao, Xiangrong Zeng, Yihang Ma, Pengjin Tang, Xiang Li,
- Abstract summary: In tomato greenhouses, phenotypic measurement helps researchers and farmers monitor crop growth. Several studies have explored computer vision-based methods to replace manual phenotyping. In this paper, we propose a non-contact tomato fruit phenotyping method, titled TomatoScanner, where an RGB image is all you need for input.
- Score: 4.217003794764974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In tomato greenhouses, phenotypic measurement is meaningful for researchers and farmers to monitor crop growth, so that environmental conditions can be precisely controlled in time, leading to better quality and higher yield. Traditional phenotyping relies mainly on manual measurement, which is accurate but inefficient and, more importantly, endangers workers' health and safety. Several studies have explored computer vision-based methods to replace manual phenotyping. However, 2D-based methods need extra calibration, damage the fruit, or can only measure limited and uninformative traits; 3D-based methods need an extra depth camera, which is expensive and unacceptable for most farmers. In this paper, we propose a non-contact tomato fruit phenotyping method, titled TomatoScanner, where an RGB image is all you need for input. First, pixel features are extracted by instance segmentation with our proposed EdgeYOLO, after preprocessing for individual separation and pose correction. Second, depth features are extracted by depth estimation with Depth Pro. Third, pixel and depth features are fused to output phenotype results in real-world units. We establish a self-built Tomato Phenotype Dataset to test TomatoScanner, which achieves excellent phenotyping of width, height, vertical area and volume, with median relative errors of 5.63%, 7.03%, -0.64% and 37.06%, respectively. We propose and add three innovative modules - EdgeAttention, EdgeLoss and EdgeBoost - into EdgeYOLO to enhance segmentation accuracy on the edge portion. Precision and mean Edge Error greatly improve from 0.943 and 5.641% to 0.986 and 2.963%, respectively. Meanwhile, EdgeYOLO remains lightweight and efficient, with a 48.7 M weights size and 76.34 FPS. Codes and datasets: https://github.com/AlexTraveling/TomatoScanner.
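The pixel-depth fusion step in the abstract can be sketched with the standard pinhole-camera relation, real_size = pixel_size × depth / focal_length. This is a minimal illustration, not the authors' code: the focal length and depth values are assumed inputs (in TomatoScanner the depth would come from a monocular model such as Depth Pro), and the area/volume formulas below are simple ellipse/ellipsoid approximations chosen for illustration.

```python
import math

def pixel_to_real(pixel_extent: float, depth_m: float, focal_px: float) -> float:
    """Back-project a pixel extent (px) at a given depth (m) to metres,
    using the pinhole relation real = pixel * depth / focal_length."""
    return pixel_extent * depth_m / focal_px

def fruit_phenotype(width_px: float, height_px: float,
                    depth_m: float, focal_px: float):
    """Fuse pixel features (mask width/height) with depth into real-world traits."""
    width_m = pixel_to_real(width_px, depth_m, focal_px)
    height_m = pixel_to_real(height_px, depth_m, focal_px)
    # Vertical cross-section approximated as an ellipse; volume as an
    # ellipsoid of revolution about the vertical axis (both assumptions).
    area_m2 = math.pi * (width_m / 2) * (height_m / 2)
    volume_m3 = (4 / 3) * math.pi * (width_m / 2) ** 2 * (height_m / 2)
    return width_m, height_m, area_m2, volume_m3

# Example: a 200x220 px fruit mask at 0.8 m depth with a 1000 px focal length.
w, h, a, v = fruit_phenotype(200, 220, 0.8, 1000.0)
```

This makes the dependence on depth explicit: a calibration-free 2D method cannot recover `width_m` without `depth_m`, which is why the pipeline pairs segmentation with monocular depth estimation.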
Related papers
- Wheat3DGS: In-field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting [1.4100451538155885]
We present Wheat3DGS, a novel approach that leverages 3DGS and the Segment Anything Model (SAM) for precise 3D instance segmentation and morphological measurement of hundreds of wheat heads automatically.
We validate the accuracy of wheat head extraction against high-resolution laser scan data, obtaining per-instance mean absolute percentage errors of 15.1%, 18.3%, and 40.2% for length, width, and volume, respectively.
arXiv Detail & Related papers (2025-04-09T15:31:42Z) - Aggrotech: Leveraging Deep Learning for Sustainable Tomato Disease Management [0.0]
We propose a deep learning-based approach for tomato leaf disease detection using two well-established convolutional neural networks (CNNs). The research employs VGG19 and Inception v3 models on the Tomato Villages dataset (4525 images) for tomato leaf disease detection. The models' accuracy of 93.93% with dropout layers demonstrates their usefulness for crop health monitoring.
arXiv Detail & Related papers (2025-01-21T11:25:44Z) - MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps [51.44887282336391]
The key challenge of multi-view indoor 3D object detection is to infer accurate geometry information from images for precise 3D detection.
Previous methods rely on NeRF for geometry reasoning.
We propose MVSDet which utilizes plane sweep for geometry-aware 3D object detection.
arXiv Detail & Related papers (2024-10-28T21:58:41Z) - Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge [6.76163770004542]
Herbage containing clover promotes higher food intake, resulting in higher milk production.
Deep learning algorithms have been proposed to estimate the dry biomass composition directly from images of the grass in the field.
This paper applies filter pruning to reduce the energy requirements of existing deep learning solutions.
We address this challenge by training filter-pruned models using a variance attenuation loss so they can predict the uncertainty of their predictions. When the uncertainty exceeds a threshold, we re-infer using a more accurate unpruned model.
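The cascade described above can be sketched in a few lines. This is a hedged illustration, not the authors' code: the two models are stand-in stubs, and in the real system the pruned model's variance head would be trained with a variance-attenuation (heteroscedastic) loss of the form L = 0.5·exp(-s)·(y - t)² + 0.5·s, where s is the predicted log-variance.

```python
def pruned_model(x: float):
    """Stub pruned regressor returning (prediction, predicted variance).
    Stands in for a filter-pruned network with an uncertainty head."""
    return x * 1.1, abs(x) * 0.05

def unpruned_model(x: float) -> float:
    """Stub unpruned model: more accurate, but more energy-hungry."""
    return x * 1.02

def predict(x: float, var_threshold: float = 0.5) -> float:
    """Run the cheap pruned model first; re-infer with the unpruned
    model only when the predicted uncertainty exceeds the threshold."""
    y, var = pruned_model(x)
    if var > var_threshold:
        return unpruned_model(x)
    return y
```

The design saves energy on the (common) confident inputs and spends it only on the uncertain ones, which is the trade-off the paper targets for edge deployment.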
arXiv Detail & Related papers (2024-04-17T10:26:49Z) - SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z) - Early and Accurate Detection of Tomato Leaf Diseases Using TomFormer [0.3169023552218211]
This paper introduces a transformer-based model called TomFormer for the purpose of tomato leaf disease detection.
We present a novel approach for detecting tomato leaf diseases by employing a fusion model that combines a visual transformer and a convolutional neural network.
arXiv Detail & Related papers (2023-12-26T20:47:23Z) - Tomato Maturity Recognition with Convolutional Transformers [5.220581005698766]
The authors propose a novel method for tomato maturity classification using a convolutional transformer.
A new tomato dataset, KUTomaData, is designed to train deep-learning models for tomato segmentation and classification.
The authors show that the convolutional transformer outperforms state-of-the-art methods for tomato maturity classification.
arXiv Detail & Related papers (2023-07-04T07:33:53Z) - Visual based Tomato Size Measurement System for an Indoor Farming Environment [3.176607626141415]
This paper presents a size measurement method combining a machine learning model and depth images captured from three low-cost RGBD cameras.
The performance of the presented system is evaluated on a lab environment with real tomato fruits and fake leaves.
Our three-camera system was able to achieve a height measurement accuracy of 0.9114 and a width accuracy of 0.9443.
arXiv Detail & Related papers (2023-04-12T22:27:05Z) - Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction [66.10717041384625]
Zolly is the first 3DHMR method focusing on perspective-distorted images.
We propose a new camera model and a novel 2D representation, termed distortion image, which describes the 2D dense distortion scale of the human body.
We extend two real-world datasets tailored for this task, all containing perspective-distorted human images.
arXiv Detail & Related papers (2023-03-24T04:22:41Z) - MonoGraspNet: 6-DoF Grasping with a Single RGB Image [73.96707595661867]
6-DoF robotic grasping is a long-standing but unsolved problem.
Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors.
We propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet.
arXiv Detail & Related papers (2022-09-26T21:29:50Z) - End-to-end deep learning for directly estimating grape yield from ground-based imagery [53.086864957064876]
This study demonstrates the application of proximal imaging combined with deep learning for yield estimation in vineyards.
Three model architectures were tested: object detection, CNN regression, and transformer models.
The study showed the applicability of proximal imaging and deep learning for prediction of grapevine yield on a large scale.
arXiv Detail & Related papers (2022-08-04T01:34:46Z) - Single image deep defocus estimation and its applications [82.93345261434943]
We train a deep neural network to classify image patches into one of the 20 levels of blurriness.
The trained model is used to determine the patch blurriness which is then refined by applying an iterative weighted guided filter.
The result is a defocus map that carries the information of the degree of blurriness for each pixel.
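The patch-to-pixel step described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: the patch classifier is a variance-based stub standing in for the trained 20-level network, and the iterative weighted guided-filter refinement is omitted.

```python
import numpy as np

N_LEVELS = 20  # blurriness classes, as in the paper

def classify_patch(patch: np.ndarray) -> int:
    """Stub classifier: map patch variance to one of 20 blurriness levels.
    Sharper patches have higher variance, hence a lower blur level."""
    v = float(patch.var())
    return max(0, N_LEVELS - 1 - min(int(v), N_LEVELS - 1))

def defocus_map(img: np.ndarray, patch: int = 8) -> np.ndarray:
    """Tile the image, classify each patch, and broadcast the class
    back to every pixel of that patch, yielding a coarse defocus map."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.int32)
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            out[y:y + patch, x:x + patch] = classify_patch(
                img[y:y + patch, x:x + patch])
    return out
```

In the paper the resulting per-patch map is then refined per-pixel with an iterative weighted guided filter, which this sketch leaves out.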
arXiv Detail & Related papers (2021-07-30T06:18:16Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences.