Tomato Maturity Recognition with Convolutional Transformers
- URL: http://arxiv.org/abs/2307.01530v2
- Date: Tue, 2 Jan 2024 13:13:49 GMT
- Title: Tomato Maturity Recognition with Convolutional Transformers
- Authors: Asim Khan, Taimur Hassan, Muhammad Shafay, Israa Fahmy, Naoufel
Werghi, Lakmal Seneviratne and Irfan Hussain
- Abstract summary: The authors propose a novel method for tomato maturity classification using a convolutional transformer.
A new tomato dataset, KUTomaData, is designed to train deep-learning models for tomato segmentation and classification.
The authors show that the convolutional transformer outperforms state-of-the-art methods for tomato maturity classification.
- Score: 5.220581005698766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tomatoes are a major crop worldwide, and accurately classifying their
maturity is important for many agricultural applications, such as harvesting,
grading, and quality control. In this paper, the authors propose a novel method
for tomato maturity classification using a convolutional transformer. The
convolutional transformer is a hybrid architecture that combines the strengths
of convolutional neural networks (CNNs) and transformers. Additionally, this
study introduces a new tomato dataset named KUTomaData, explicitly designed to
train deep-learning models for tomato segmentation and classification.
KUTomaData is a compilation of images sourced from a greenhouse in the UAE,
with approximately 700 images available for training and testing. The dataset
is prepared under various lighting conditions and viewing perspectives and
employs different mobile camera sensors, distinguishing it from existing
datasets. The contributions of this paper are threefold: Firstly, the authors
propose a novel method for tomato maturity classification using a modular
convolutional transformer. Secondly, the authors introduce a new tomato image
dataset that contains images of tomatoes at different maturity levels. Lastly,
the authors show that the convolutional transformer outperforms
state-of-the-art methods for tomato maturity classification. The effectiveness
of the proposed framework in handling cluttered and occluded tomato instances
was evaluated using two additional public datasets, Laboro Tomato and Rob2Pheno
Annotated Tomato, as benchmarks. The evaluation results across these three
datasets demonstrate the exceptional performance of our proposed framework,
surpassing the state-of-the-art by 58.14%, 65.42%, and 66.39% in terms of mean
average precision scores for KUTomaData, Laboro Tomato, and Rob2Pheno Annotated
Tomato, respectively.
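The abstract reports its results as mean average precision (mAP). As a rough illustration of that metric only, the sketch below computes non-interpolated average precision per class and averages it across classes. It assumes each detection has already been matched to ground truth (true or false positive); the class names and numbers are hypothetical, and the paper's actual evaluation protocol (IoU thresholds, interpolation) is not stated in the abstract.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, whether it matched a ground-truth object).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_positives: int) -> float:
    """Non-interpolated AP for one class.

    Averages the precision measured at the rank of each true positive,
    dividing by the number of ground-truth objects so that missed
    objects lower the score.
    """
    if num_positives == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_positives


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """Mean of the per-class AP values."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical maturity classes and detections, for illustration only.
example = {
    "ripe": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "unripe": ([(0.6, True), (0.4, True)], 2),
}
map_score = mean_average_precision(example)
```

Benchmark toolkits usually add interpolation and an IoU-based matching step before this calculation, so scores from this sketch are not directly comparable to the figures quoted above.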
Related papers
- Deep learning-based approach for tomato classification in complex scenes [0.8287206589886881]
We have proposed a tomato ripening monitoring approach based on deep learning in complex scenes.
The objective is to detect mature tomatoes and harvest them in a timely manner.
Experiments use images collected from the internet through searches for tomato ripening states in several languages.
arXiv Detail & Related papers (2024-01-26T18:33:57Z)
- Early and Accurate Detection of Tomato Leaf Diseases Using TomFormer [0.3169023552218211]
This paper introduces a transformer-based model called TomFormer for the purpose of tomato leaf disease detection.
We present a novel approach for detecting tomato leaf diseases by employing a fusion model that combines a visual transformer and a convolutional neural network.
arXiv Detail & Related papers (2023-12-26T20:47:23Z)
- Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
- TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models [3.597418929000278]
TomatoDIFF is a novel diffusion-based model for semantic segmentation of on-plant tomatoes.
Tomatopia is a new, large and challenging dataset of greenhouse tomatoes.
arXiv Detail & Related papers (2023-07-03T14:43:40Z)
- Detection of Tomato Ripening Stages using Yolov3-tiny [0.0]
We use a neural network-based model for tomato classification and detection.
Our experiments showed an F1-score of 90.0% in the localization and classification of ripening stages in a custom dataset.
arXiv Detail & Related papers (2023-02-01T00:57:58Z)
- Fusion of Satellite Images and Weather Data with Transformer Networks for Downy Mildew Disease Detection [3.6868861317674524]
Crop diseases significantly affect the quantity and quality of agricultural production.
In this paper, we propose a new approach to realize data fusion using three transformers.
The architecture is built from three main components, a Vision Transformer and two transformer encoders, which allow the image and weather modalities to be fused.
arXiv Detail & Related papers (2022-09-06T19:55:16Z)
- 3D Vision with Transformers: A Survey [114.86385193388439]
The success of the transformer architecture in natural language processing has triggered attention in the computer vision field.
We present a systematic and thorough review of more than 100 transformer methods for different 3D vision tasks.
We discuss transformer design in 3D vision, which allows it to process data with various 3D representations.
arXiv Detail & Related papers (2022-08-08T17:59:11Z)
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
- 3D Human Pose Estimation with Spatial and Temporal Transformers [59.433208652418976]
We present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos.
Inspired by recent developments in vision transformers, we design a spatial-temporal transformer structure.
We quantitatively and qualitatively evaluate our method on two popular and standard benchmark datasets.
arXiv Detail & Related papers (2021-03-18T18:14:37Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
- Automatic image-based identification and biomass estimation of invertebrates [70.08255822611812]
Time-consuming sorting and identification of taxa pose strong limitations on how many insect samples can be processed.
We propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology.
We use state-of-the-art Resnet-50 and InceptionV3 CNNs for the classification task.
arXiv Detail & Related papers (2020-02-05T21:38:57Z)