Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection
- URL: http://arxiv.org/abs/2208.14508v3
- Date: Tue, 8 Aug 2023 08:29:12 GMT
- Title: Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection
- Authors: Shenglian Lu (1), Xiaoyu Liu (1), Zixuan He (2), Wenbo Liu (3), Xin
Zhang (3), and Manoj Karkee (2) ((1) Guangxi Normal University, China, (2)
Washington State University, US, (3) Mississippi State University, US)
- Abstract summary: The research was conducted on two grape varieties, Chardonnay and Merlot, from July to September 2019.
The proposed Swin-T-YOLOv5 outperformed all other studied models for grape bunch detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this research, an integrated detection model, Swin-transformer-YOLOv5 or
Swin-T-YOLOv5, was proposed for real-time wine grape bunch detection, inheriting
the advantages of both YOLOv5 and the Swin-transformer. The research was
conducted on two grape varieties, Chardonnay (always white berry skin) and
Merlot (white or white-red mixed berry skin when immature; red when mature),
from July to September 2019. To verify the superiority of
Swin-T-YOLOv5, its performance was compared against several commonly
used/competitive object detectors, including Faster R-CNN, YOLOv3, YOLOv4, and
YOLOv5. All models were assessed under different test conditions, including two
different weather conditions (sunny and cloudy), two different berry maturity
stages (immature and mature), and three different sunlight
directions/intensities (morning, noon, and afternoon) for a comprehensive
comparison. Additionally, the predicted number of grape bunches by
Swin-T-YOLOv5 was further compared with ground truth values, including both
in-field manual counting and manual labeling during the annotation process.
Results showed that the proposed Swin-T-YOLOv5 outperformed all other studied
models for grape bunch detection, with a mean Average Precision (mAP) of up to
97% and an F1-score of 0.89 under cloudy weather. This mAP was approximately
44%, 18%, 14%, and 4% greater than that of Faster R-CNN, YOLOv3, YOLOv4, and
YOLOv5, respectively. Swin-T-YOLOv5 achieved its lowest mAP (90%) and F1-score
(0.82) when detecting immature berries, where its mAP was approximately 40%,
5%, 3%, and 1% greater than that of the same four models, respectively.
Furthermore, Swin-T-YOLOv5 performed better on the Chardonnay variety,
achieving an R² of up to 0.91 and a root mean square error (RMSE) of 2.36 when
comparing predicted bunch counts with ground truth. However, it underperformed
on the Merlot variety, with an R² of only up to 0.70 and an RMSE of 3.30.
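The count-agreement evaluation above reduces to comparing the predicted number of bunches per image against manually obtained ground-truth counts using R² and RMSE. The sketch below illustrates that comparison; it is a minimal example with hypothetical counts, not code or data from the paper, and it computes R² directly against the ground-truth values (the authors may instead have fitted a regression line).

```python
import numpy as np

def count_agreement(predicted, ground_truth):
    """Compute R^2 and RMSE between predicted and ground-truth bunch counts.

    predicted, ground_truth: 1-D sequences of per-image bunch counts.
    """
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)

    residuals = ground_truth - predicted
    rmse = np.sqrt(np.mean(residuals ** 2))

    # Coefficient of determination: 1 - SS_res / SS_tot,
    # taken against the ground-truth counts (an assumption, see lead-in).
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((ground_truth - ground_truth.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse

# Hypothetical per-image counts, only to illustrate the call.
pred = [12, 9, 15, 11]
truth = [13, 9, 14, 12]
r2, rmse = count_agreement(pred, truth)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```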
Related papers
- Robust Fine-tuning of Zero-shot Models via Variance Reduction [56.360865951192324]
When fine-tuning zero-shot models, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD) data.
We propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
arXiv Detail & Related papers (2024-11-11T13:13:39Z)
- Comparing YOLO11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment [0.4143603294943439]
This study focused on YOLO11 and YOLOv8's instance segmentation capabilities for immature green apples in orchard environments.
YOLO11n-seg achieved the highest mask precision across all categories with a notable score of 0.831.
YOLO11m-seg consistently outperformed the other models, registering the highest scores for both box and mask segmentation.
arXiv Detail & Related papers (2024-10-24T00:12:20Z)
- YOLO11 and Vision Transformers based 3D Pose Estimation of Immature Green Fruits in Commercial Apple Orchards for Robotic Thinning [0.4143603294943439]
A method for 3D pose estimation of immature green apples (fruitlets) in commercial orchards was developed.
It combines the YOLO11 object detection and pose estimation algorithm with Vision Transformers (ViT) for depth estimation.
YOLO11n surpassed all configurations of YOLO11 and YOLOv8 in terms of box precision and pose precision.
arXiv Detail & Related papers (2024-10-21T17:00:03Z)
- Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation [58.77994391566484]
We propose W1KP, a human-calibrated measure of variability in a set of images.
Our best perceptual distance outperforms nine baselines by up to 18 points in accuracy.
We analyze 56 linguistic features of real prompts, finding that the prompt's length, CLIP embedding norm, concreteness, and word senses influence variability most.
arXiv Detail & Related papers (2024-06-12T17:59:27Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce a holistic efficiency-accuracy driven model design strategy for YOLOs.
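For context on the NMS bottleneck mentioned in this summary: one-stage detectors such as YOLO emit many overlapping candidate boxes per object and rely on a greedy non-maximum suppression pass that keeps only the highest-scoring box among heavily overlapping ones. Below is a generic NumPy sketch of that post-processing step, written for illustration only; it is not code from YOLOv10 or Swin-T-YOLOv5.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns the indices of the boxes that survive suppression.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the kept box with every remaining box.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes that overlap the kept box too much.
        order = order[1:][iou <= iou_threshold]
    return keep

# Hypothetical detections: two near-duplicates of one object and one distinct box.
boxes = [[10, 10, 50, 60], [12, 11, 52, 58], [100, 80, 140, 130]]
scores = [0.92, 0.85, 0.78]
print(nms(boxes, scores))  # -> [0, 2]
```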
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- Predicting Overtakes in Trucks Using CAN Data [51.28632782308621]
We investigate the detection of truck overtakes from CAN data.
Our analysis covers up to 10 seconds before the overtaking event.
We observe that the prediction scores of the overtake class tend to increase as we approach the overtake trigger.
arXiv Detail & Related papers (2024-04-08T17:58:22Z)
- YOLOv5 vs. YOLOv8 in Marine Fisheries: Balancing Class Detection and Instance Count [0.0]
This paper presents a comparative study of object detection using YOLOv5 and YOLOv8 for three distinct classes: artemia, cyst, and excrement.
YOLOv5 often performed better in detecting Artemia and cysts with excellent precision and accuracy.
However, when it came to detecting excrement, YOLOv5 faced notable challenges and limitations.
arXiv Detail & Related papers (2024-04-01T20:01:04Z)
- Real-time Strawberry Detection Based on Improved YOLOv5s Architecture for Robotic Harvesting in open-field environment [0.0]
This study proposed a YOLOv5-based custom object detection model to detect strawberries in an outdoor environment.
The highest mean average precision of 80.3% was achieved using the proposed architecture.
The model is fast enough for real-time strawberry detection and localization for robotic picking.
arXiv Detail & Related papers (2023-08-08T02:28:48Z)
- Assessing The Performance of YOLOv5 Algorithm for Detecting Volunteer Cotton Plants in Corn Fields at Three Different Growth Stages [5.293431074053198]
The Texas Boll Weevil Eradication Program (TBWEP) employs people to locate and eliminate volunteer cotton (VC) plants growing by the side of roads or fields with rotation crops.
In this paper, we demonstrate the application of computer vision (CV) algorithm based on You Only Look Once version 5 (YOLOv5) for detecting VC plants growing in the middle of corn fields.
arXiv Detail & Related papers (2022-07-31T21:03:40Z)
- A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery [56.10033255997329]
We propose a novel deep learning method based on a Convolutional Neural Network (CNN)
It simultaneously detects and geolocates plantation-rows while counting their plants, considering highly dense plantation configurations.
The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
arXiv Detail & Related papers (2020-12-31T18:51:17Z)
- CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)