Real-time object detection and robotic manipulation for agriculture
using a YOLO-based learning approach
- URL: http://arxiv.org/abs/2401.15785v1
- Date: Sun, 28 Jan 2024 22:30:50 GMT
- Title: Real-time object detection and robotic manipulation for agriculture
using a YOLO-based learning approach
- Authors: Hongyu Zhao, Zezhi Tang, Zhenhong Li, Yi Dong, Yuancheng Si, Mingyang
Lu, George Panoutsos
- Abstract summary: This study presents a new framework that combines two separate convolutional neural network (CNN) architectures.
Crop images in a simulated environment are subjected to random rotations, cropping, and brightness and contrast adjustments to create augmented images for dataset generation.
The proposed method subsequently utilises the acquired image data via a Visual Geometry Group (VGG) model in order to reveal the grasping positions for the robotic manipulation.
- Score: 8.482182765640022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The optimisation of crop harvesting processes for commonly cultivated crops
is of great importance in the pursuit of agricultural industrialisation. Nowadays,
the utilisation of machine vision has enabled the automated identification of
crops, leading to the enhancement of harvesting efficiency, but challenges
still exist. This study presents a new framework that combines two separate
architectures of convolutional neural networks (CNNs) in order to
simultaneously accomplish the tasks of crop detection and harvesting (robotic
manipulation) inside a simulated environment. Crop images in the simulated
environment are subjected to random rotations, cropping, brightness, and
contrast adjustments to create augmented images for dataset generation. The You
Only Look Once (YOLO) algorithmic framework is employed with traditional
rectangular bounding boxes for crop localisation. The proposed method
subsequently utilises the acquired image data via a Visual Geometry Group (VGG)
model in order to reveal
the grasping positions for the robotic manipulation.
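The abstract describes three concrete stages: augmentation-based dataset generation, YOLO-style rectangular-box detection, and a VGG-style model that maps detected crops to grasping positions. The sketch below is a minimal, illustrative rendering of that flow in plain NumPy; the detector and grasp regressor are stubs standing in for the trained YOLO and VGG networks, and every function name is our own placeholder, not the authors' code.

```python
# Minimal, illustrative sketch of the described pipeline (not the authors' code).
# Stage 1: random augmentation; Stage 2: YOLO-style detection (stubbed);
# Stage 3: VGG-style regression from a crop patch to a grasp position (stubbed).
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray) -> np.ndarray:
    """Apply the augmentations named in the abstract: random rotation
    (in 90-degree steps here, for simplicity), random crop, and random
    brightness/contrast adjustment."""
    out = np.rot90(image, rng.integers(0, 4))     # random rotation
    h, w = out.shape[:2]
    ch, cw = int(h * 0.8), int(w * 0.8)           # random 80% crop
    y0 = rng.integers(0, h - ch + 1)
    x0 = rng.integers(0, w - cw + 1)
    out = out[y0:y0 + ch, x0:x0 + cw]
    alpha = rng.uniform(0.8, 1.2)                 # contrast
    beta = rng.uniform(-20, 20)                   # brightness
    return np.clip(alpha * out + beta, 0, 255).astype(np.uint8)

def detect_crop(image: np.ndarray) -> tuple[int, int, int, int]:
    """Placeholder for a trained YOLO detector: returns one rectangular
    bounding box (x, y, w, h). A real system would run the network here."""
    h, w = image.shape[:2]
    return w // 4, h // 4, w // 2, h // 2

def predict_grasp(patch: np.ndarray) -> tuple[float, float]:
    """Placeholder for a trained VGG-style regressor mapping a detected
    patch to a grasp point in normalised patch coordinates."""
    return 0.5, 0.5

image = rng.integers(0, 255, size=(480, 640, 3), dtype=np.uint8)  # stand-in frame
sample = augment(image)                        # dataset generation
x, y, w, h = detect_crop(sample)               # crop localisation
gx, gy = predict_grasp(sample[y:y + h, x:x + w])
print("grasp at pixel:", (x + gx * w, y + gy * h))
```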
Related papers
- Cropper: Vision-Language Model for Image Cropping through In-Context Learning [57.694845787252916]
The goal of image cropping is to identify visually appealing crops within an image.
Recent breakthroughs in large vision-language models (VLMs) have enabled visual in-context learning without explicit training.
We propose an effective approach to leverage VLMs for better image cropping.
arXiv Detail & Related papers (2024-08-14T20:03:03Z)
- Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation [0.10923877073891444]
We introduce a semi-self-supervised domain adaptation technique based on deep convolutional neural networks with a probabilistic diffusion process.
We develop a two-branch convolutional encoder-decoder model architecture that uses both synthesized image-mask pairs and unannotated images.
The proposed model achieved a Dice score of 80.7% on an internal test dataset and a Dice score of 64.8% on an external test set.
arXiv Detail & Related papers (2024-05-12T04:35:49Z)
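The Dice scores quoted in the wheat-head entry above measure overlap between a predicted and a ground-truth segmentation mask. A minimal NumPy computation of the metric, included purely for illustration:

```python
# Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|).
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

a = np.zeros((64, 64), dtype=bool); a[16:48, 16:48] = True
b = np.zeros((64, 64), dtype=bool); b[24:56, 24:56] = True
print(f"Dice: {dice_score(a, b):.3f}")  # partial overlap -> between 0 and 1
```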
- Generating Diverse Agricultural Data for Vision-Based Farming Applications [74.79409721178489]
This model is capable of simulating distinct growth stages of plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions.
Our dataset includes 12,000 images with semantic labels, offering a comprehensive resource for computer vision tasks in precision agriculture.
arXiv Detail & Related papers (2024-03-27T08:42:47Z)
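The entry above varies growth stage, soil, field layout, and lighting procedurally. A toy sketch of how such scene parameters might be sampled per rendered image; all names and value ranges here are hypothetical, not taken from the paper's generator:

```python
# Hypothetical scene-parameter sampler for a synthetic agricultural dataset:
# each sample fixes a growth stage, soil type, field layout, and lighting.
import random
from dataclasses import dataclass

@dataclass
class SceneConfig:
    growth_stage: str          # e.g. seedling .. mature
    soil: str                  # diverse soil conditions
    row_spacing_m: float       # randomized field arrangement
    sun_elevation_deg: float   # varying lighting

def sample_scene(rng: random.Random) -> SceneConfig:
    return SceneConfig(
        growth_stage=rng.choice(["seedling", "vegetative", "flowering", "mature"]),
        soil=rng.choice(["loam", "clay", "sandy", "silty"]),
        row_spacing_m=rng.uniform(0.3, 1.0),
        sun_elevation_deg=rng.uniform(10.0, 80.0),
    )

rng = random.Random(42)
for config in (sample_scene(rng) for _ in range(3)):  # one config per image
    print(config)
```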
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
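Measuring sim-to-real overlap with dimensionality reduction, as the eye-tracking entry above describes, can be sketched by projecting both image sets into a shared low-dimensional space and comparing the distributions. The PCA-via-SVD sketch below uses random stand-in features and a simple centroid distance; the paper's exact reduction and metric may differ.

```python
# PCA (via SVD) to embed real and synthetic images in a shared 2-D space,
# then compare the two distributions with a simple centroid distance.
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 1024))    # stand-in real image features
synth = rng.normal(0.5, 1.0, size=(200, 1024))   # shifted "synthetic" set

X = np.vstack([real, synth])
Xc = X - X.mean(axis=0)                          # center before PCA
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ vt[:2].T                             # top-2 principal components

real_2d, synth_2d = proj[:200], proj[200:]
gap = np.linalg.norm(real_2d.mean(axis=0) - synth_2d.mean(axis=0))
print(f"centroid gap in PCA space: {gap:.2f}")   # smaller -> more overlap
```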
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in a real-time manner.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
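A heavily simplified illustration of the event-space idea in the binarization entry above: integrate signed event polarities into a per-pixel running state and threshold it to emit binary frames at high rate. This is our toy reconstruction under a synthetic event stream, not the paper's method.

```python
# Toy event-to-binary reconstruction: accumulate signed event polarities
# per pixel and threshold the running state to emit binary frames.
import numpy as np

H, W = 32, 32
state = np.zeros((H, W), dtype=np.float32)

rng = np.random.default_rng(1)
# Synthetic event stream: (t, x, y, polarity) with polarity in {-1, +1}.
events = [(t, rng.integers(0, W), rng.integers(0, H), rng.choice([-1, 1]))
          for t in range(5000)]

frames = []
decay = 0.999                 # forget stale events so the state tracks motion
for i, (t, x, y, p) in enumerate(events):
    state *= decay
    state[y, x] += p
    if i % 500 == 0:          # emit a binary frame every 500 events
        frames.append(state > 0.0)

print(f"{len(frames)} binary frames, foreground ratio {frames[-1].mean():.2f}")
```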
- DT/MARS-CycleGAN: Improved Object Detection for MARS Phenotyping Robot [11.869108981066429]
This work proposes a novel Digital-Twin (DT) MARS-CycleGAN model for image augmentation to improve our Modular Agricultural Robotic System's crop object detection.
In addition to the cycle consistency losses in the CycleGAN model, we designed and enforced a new DT-MARS loss in the deep learning model to penalize the inconsistency between real crop images captured by MARS and synthesized images sensed by DT MARS.
arXiv Detail & Related papers (2023-10-19T14:39:34Z)
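The DT/MARS-CycleGAN entry above adds a consistency penalty on top of CycleGAN's cycle losses. A schematic of how such a combined objective might be assembled, with toy NumPy stand-ins for the generators and a hypothetical weighting; the real networks and loss weights are the paper's own:

```python
# Schematic combined objective: cycle-consistency terms plus an extra
# penalty between real robot-captured images and digital-twin renders.
import numpy as np

def l1(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean(np.abs(a - b)))

# Stand-ins for G: sim->real and F: real->sim generators.
G = lambda x: x * 0.9 + 0.05
F = lambda y: (y - 0.05) / 0.9

rng = np.random.default_rng(0)
sim = rng.uniform(0, 1, (8, 8))        # digital-twin (simulated) image
real = rng.uniform(0, 1, (8, 8))       # image captured by the real robot

cycle_loss = l1(F(G(sim)), sim) + l1(G(F(real)), real)   # CycleGAN terms
dt_mars_loss = l1(G(sim), real)        # penalize sim/real inconsistency
lam = 1.0                              # hypothetical weighting
total = cycle_loss + lam * dt_mars_loss
print(f"cycle={cycle_loss:.4f}  dt_mars={dt_mars_loss:.4f}  total={total:.4f}")
```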
- Inside Out: Transforming Images of Lab-Grown Plants for Machine Learning Applications in Agriculture [0.0]
We employ a contrastive unpaired translation (CUT) generative adversarial network (GAN) to translate indoor plant images to appear as field images.
While we train our network to translate an image containing only a single plant, we show that our method is easily extendable to produce multiple-plant field images.
We also use our synthetic multi-plant images to train several YoloV5 nano object detection models to perform the task of plant detection.
arXiv Detail & Related papers (2022-11-05T20:51:45Z)
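The lab-grown-plants entry above extends single-plant translations to multi-plant field images and trains YOLOv5 nano detectors on them. A toy version of that compositing step, pasting placeholder "translated" patches onto a field canvas and emitting YOLO-format labels; the CUT translation itself is stubbed out:

```python
# Toy composite: paste several translated single-plant patches onto a field
# canvas and emit YOLO-format labels ("class cx cy w h", normalised).
import numpy as np

rng = np.random.default_rng(3)
field = np.zeros((640, 640, 3), dtype=np.uint8)   # stand-in field image

def translated_plant() -> np.ndarray:
    """Placeholder for a CUT-translated single-plant image."""
    return rng.integers(0, 255, size=(96, 96, 3), dtype=np.uint8)

labels = []
for _ in range(5):                                # five plants per image
    patch = translated_plant()
    ph, pw = patch.shape[:2]
    y0 = rng.integers(0, field.shape[0] - ph)
    x0 = rng.integers(0, field.shape[1] - pw)
    field[y0:y0 + ph, x0:x0 + pw] = patch
    cx = (x0 + pw / 2) / field.shape[1]           # YOLO normalised box centre
    cy = (y0 + ph / 2) / field.shape[0]
    labels.append(f"0 {cx:.4f} {cy:.4f} "
                  f"{pw / field.shape[1]:.4f} {ph / field.shape[0]:.4f}")
print("\n".join(labels))
```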
- Enlisting 3D Crop Models and GANs for More Data Efficient and Generalizable Fruit Detection [0.0]
We propose a method that generates agricultural images from a synthetic 3D crop model domain into real world crop domains.
The method uses a semantically constrained GAN (generative adversarial network) to preserve the fruit position and geometry.
Incremental training experiments in vineyard grape detection tasks show that the images generated from our method can significantly speed up the domain adaptation process.
arXiv Detail & Related papers (2021-08-30T16:11:59Z)
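The 3D-crop-model entry above constrains its GAN semantically so fruit position and geometry survive translation. One simple way to read that constraint is as a penalty on label drift between input and output, sketched below with stand-in segmenter and generator functions that are entirely our own:

```python
# Schematic semantic constraint: penalize a translator when the fruit mask
# of its output drifts from the fruit mask of the synthetic input.
import numpy as np

def fruit_mask(img: np.ndarray) -> np.ndarray:
    """Stand-in segmenter: 'fruit' = bright pixels."""
    return img > 0.7

def translate(img: np.ndarray, shift: float) -> np.ndarray:
    """Stand-in for the GAN generator (here: a brightness shift)."""
    return np.clip(img + shift, 0.0, 1.0)

rng = np.random.default_rng(0)
synthetic = rng.uniform(0, 1, (64, 64))      # rendered 3D-crop-model image

for shift in (0.0, 0.3):
    out = translate(synthetic, shift)
    # Semantic consistency: fraction of pixels whose fruit label changed.
    penalty = np.mean(fruit_mask(synthetic) != fruit_mask(out))
    print(f"shift={shift:.1f}  semantic penalty={penalty:.3f}")
```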
- Potato Crop Stress Identification in Aerial Images using Deep Learning-based Object Detection [60.83360138070649]
The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks.
The main objective is to demonstrate automated spatial recognition of a healthy versus stressed crop at a plant level.
Experimental validation demonstrated the ability to distinguish healthy and stressed plants in field images, achieving an average Dice coefficient of 0.74.
arXiv Detail & Related papers (2021-06-14T21:57:40Z)
- MOGAN: Morphologic-structure-aware Generative Learning from a Single Image [59.59698650663925]
Recently proposed generative models can complete training based on only a single image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances.
Our approach focuses on internal features including the maintenance of rational structures and variation on appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z)
- Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming [3.4788711710826083]
We propose an alternative to common data augmentation methods, applying it to the problem of crop/weed segmentation in precision farming.
We create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts.
In addition to RGB data, we also take into account near-infrared (NIR) information, generating four-channel multi-spectral synthetic images.
arXiv Detail & Related papers (2020-09-12T08:49:36Z)
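The multi-spectral entry above replaces crop/weed pixels with synthesized counterparts and stacks NIR alongside RGB. A toy version of both steps, with random arrays standing in for the real and synthesized imagery:

```python
# Toy multi-spectral synthesis: paste synthesized pixels where the class
# mask says 'crop' or 'weed', then stack NIR as a fourth channel.
import numpy as np

rng = np.random.default_rng(7)
H, W = 64, 64
rgb = rng.uniform(0, 1, (H, W, 3))
nir = rng.uniform(0, 1, (H, W))
mask = np.zeros((H, W), dtype=np.uint8)          # 0=soil, 1=crop, 2=weed
mask[10:30, 10:30] = 1
mask[40:55, 35:60] = 2

synth_rgb = rng.uniform(0, 1, (H, W, 3))         # stand-in synthesized pixels
synth_nir = rng.uniform(0, 1, (H, W))

replace = mask > 0                               # crop and weed regions only
rgb[replace] = synth_rgb[replace]
nir[replace] = synth_nir[replace]

multispectral = np.dstack([rgb, nir])            # four-channel RGB+NIR image
print(multispectral.shape)                       # (64, 64, 4)
```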
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.