Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection
- URL: http://arxiv.org/abs/2208.14508v3
- Date: Tue, 8 Aug 2023 08:29:12 GMT
- Title: Swin-transformer-yolov5 For Real-time Wine Grape Bunch Detection
- Authors: Shenglian Lu (1), Xiaoyu Liu (1), Zixuan He (2), Wenbo Liu (3), Xin
Zhang (3), and Manoj Karkee (2) ((1) Guangxi Normal University, China, (2)
Washington State University, US, (3) Mississippi State University, US)
- Abstract summary: The research was conducted on two grape varieties, Chardonnay and Merlot, from July to September 2019.
The proposed Swin-T-YOLOv5 outperformed all other studied models for grape bunch detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this research, an integrated detection model, Swin-transformer-YOLOv5 or
Swin-T-YOLOv5, was proposed for real-time wine grape bunch detection, inheriting
the advantages of both YOLOv5 and the Swin-transformer. The research was
conducted on two grape varieties, Chardonnay (always white berry skin) and
Merlot (white or white-red mixed berry skin when immature; red when mature),
from July to September 2019. To verify the superiority of
Swin-T-YOLOv5, its performance was compared against several commonly
used/competitive object detectors, including Faster R-CNN, YOLOv3, YOLOv4, and
YOLOv5. All models were assessed under different test conditions, including two
different weather conditions (sunny and cloudy), two different berry maturity
stages (immature and mature), and three different sunlight
directions/intensities (morning, noon, and afternoon) for a comprehensive
comparison. Additionally, the predicted number of grape bunches by
Swin-T-YOLOv5 was further compared with ground truth values, including both
in-field manual counting and manual labeling during the annotation process.
Results showed that the proposed Swin-T-YOLOv5 outperformed all other studied
models for grape bunch detection, with a mean Average Precision (mAP) of up to
97% and an F1-score of 0.89 under cloudy weather. This mAP was approximately
44%, 18%, 14%, and 4% greater than that of Faster R-CNN, YOLOv3, YOLOv4, and
YOLOv5, respectively. Swin-T-YOLOv5 achieved its lowest mAP (90%) and F1-score
(0.82) when detecting immature berries, where its mAP was approximately 40%,
5%, 3%, and 1% greater than that of the same four models, respectively.
Furthermore, Swin-T-YOLOv5 performed better on the Chardonnay variety,
achieving an R² of up to 0.91 and a root mean square error (RMSE) of 2.36 when
comparing predicted bunch counts with ground truth. However, it underperformed
on the Merlot variety, with an R² of only up to 0.70 and an RMSE of 3.30.
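The count-agreement evaluation above reduces to comparing the predicted number of bunches per image against manually obtained ground-truth counts using R² and RMSE. The sketch below illustrates that comparison; it is a minimal example with hypothetical counts, not code or data from the paper, and it computes R² directly against the ground-truth values (the authors may instead have fitted a regression line).

```python
import numpy as np

def count_agreement(predicted, ground_truth):
    """Compute R^2 and RMSE between predicted and ground-truth bunch counts.

    predicted, ground_truth: 1-D sequences of per-image bunch counts.
    """
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)

    residuals = ground_truth - predicted
    rmse = np.sqrt(np.mean(residuals ** 2))

    # Coefficient of determination: 1 - SS_res / SS_tot,
    # taken against the ground-truth counts (an assumption, see lead-in).
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((ground_truth - ground_truth.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse

# Hypothetical per-image counts, only to illustrate the call.
pred = [12, 9, 15, 11]
truth = [13, 9, 14, 12]
r2, rmse = count_agreement(pred, truth)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```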
Related papers
- Robust Fine-tuning of Zero-shot Models via Variance Reduction [56.360865951192324]
When fine-tuning zero-shot models, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD) data.
We propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
arXiv Detail & Related papers (2024-11-11T13:13:39Z)
- Comparing YOLO11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard environment [0.4143603294943439]
This study focused on YOLO11 and YOLOv8's instance segmentation capabilities for immature green apples in orchard environments.
YOLO11n-seg achieved the highest mask precision across all categories with a notable score of 0.831.
YOLO11m-seg consistently outperformed the other models, registering the highest scores for both box and mask segmentation.
arXiv Detail & Related papers (2024-10-24T00:12:20Z)
- YOLO11 and Vision Transformers based 3D Pose Estimation of Immature Green Fruits in Commercial Apple Orchards for Robotic Thinning [0.4143603294943439]
A method for 3D pose estimation of immature green apples (fruitlets) in commercial orchards was developed.
It combines the YOLO11 object detection and pose estimation algorithm with Vision Transformers (ViT) for depth estimation.
YOLO11n surpassed all configurations of YOLO11 and YOLOv8 in terms of box precision and pose precision.
arXiv Detail & Related papers (2024-10-21T17:00:03Z)
- Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation [58.77994391566484]
We propose W1KP, a human-calibrated measure of variability in a set of images.
Our best perceptual distance outperforms nine baselines by up to 18 points in accuracy.
We analyze 56 linguistic features of real prompts, finding that the prompt's length, CLIP embedding norm, concreteness, and word senses influence variability most.
arXiv Detail & Related papers (2024-06-12T17:59:27Z)
- YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection.
The reliance on non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs.
We introduce a holistic efficiency-accuracy driven model design strategy for YOLOs.
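For context on the NMS bottleneck mentioned in this summary: one-stage detectors such as YOLO emit many overlapping candidate boxes per object and rely on a greedy non-maximum suppression pass that keeps only the highest-scoring box among heavily overlapping ones. Below is a generic NumPy sketch of that post-processing step, written for illustration only; it is not code from YOLOv10 or Swin-T-YOLOv5.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Returns the indices of the boxes that survive suppression.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the kept box with every remaining box.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes that overlap the kept box too much.
        order = order[1:][iou <= iou_threshold]
    return keep

# Hypothetical detections: two near-duplicates of one object and one distinct box.
boxes = [[10, 10, 50, 60], [12, 11, 52, 58], [100, 80, 140, 130]]
scores = [0.92, 0.85, 0.78]
print(nms(boxes, scores))  # -> [0, 2]
```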
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
- Predicting Overtakes in Trucks Using CAN Data [51.28632782308621]
We investigate the detection of truck overtakes from CAN data.
Our analysis covers up to 10 seconds before the overtaking event.
We observe that the prediction scores of the overtake class tend to increase as we approach the overtake trigger.
arXiv Detail & Related papers (2024-04-08T17:58:22Z)
- YOLOv5 vs. YOLOv8 in Marine Fisheries: Balancing Class Detection and Instance Count [0.0]
This paper presents a comparative study of object detection using YOLOv5 and YOLOv8 for three distinct classes: artemia, cyst, and excrement.
YOLOv5 often performed better in detecting Artemia and cysts with excellent precision and accuracy.
However, when it came to detecting excrement, YOLOv5 faced notable challenges and limitations.
arXiv Detail & Related papers (2024-04-01T20:01:04Z)
- Real-time Strawberry Detection Based on Improved YOLOv5s Architecture for Robotic Harvesting in open-field environment [0.0]
This study proposed a YOLOv5-based custom object detection model to detect strawberries in an outdoor environment.
The highest mean average precision of 80.3% was achieved using the proposed architecture.
The model is fast enough for real-time strawberry detection and localization for robotic picking.
arXiv Detail & Related papers (2023-08-08T02:28:48Z)
- Assessing The Performance of YOLOv5 Algorithm for Detecting Volunteer Cotton Plants in Corn Fields at Three Different Growth Stages [5.293431074053198]
The Texas Boll Weevil Eradication Program (TBWEP) employs people to locate and eliminate volunteer cotton (VC) plants growing by the side of roads or fields with rotation crops.
In this paper, we demonstrate the application of computer vision (CV) algorithm based on You Only Look Once version 5 (YOLOv5) for detecting VC plants growing in the middle of corn fields.
arXiv Detail & Related papers (2022-07-31T21:03:40Z)
- A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery [56.10033255997329]
We propose a novel deep learning method based on a Convolutional Neural Network (CNN)
It simultaneously detects and geolocates plantation-rows while counting their plants, considering highly dense plantation configurations.
The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.
arXiv Detail & Related papers (2020-12-31T18:51:17Z)
- CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic.
The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands.
We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)