Presenting an extensive lab- and field-image dataset of crops and weeds
for computer vision tasks in agriculture
- URL: http://arxiv.org/abs/2108.05789v1
- Date: Thu, 12 Aug 2021 15:06:32 GMT
- Title: Presenting an extensive lab- and field-image dataset of crops and weeds
for computer vision tasks in agriculture
- Authors: Michael A. Beck, Chen-Yi Liu, Christopher P. Bidinosti, Christopher J.
Henry, Cara M. Godee, Manisha Ajmani
- Abstract summary: We present two large datasets of labelled plant-images that are suited to training machine learning and computer vision models.
The first dataset encompasses, as of the day of writing, over 1.2 million images of indoor-grown crops and weeds common to the Canadian Prairies and many US states.
The second dataset consists of over 540,000 images of plants imaged in farmland.
- Score: 0.9623578875486183
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present two large datasets of labelled plant-images that are suited
to training machine learning and computer vision models. The first
dataset encompasses, as of the day of writing, over 1.2 million images of
indoor-grown crops and weeds common to the Canadian Prairies and many US
states. The second dataset consists of over 540,000 images of plants imaged in
farmland. All indoor plant images are labelled by species and we provide rich
metadata on the level of individual images. This comprehensive database allows
users to filter the datasets by user-defined specifications such as
the crop-type or the age of the plant. Furthermore, the indoor dataset contains
images of plants taken from a wide variety of angles, including profile shots,
top-down shots, and angled perspectives. The images of plants in fields
are all taken from a top-down perspective and usually contain multiple plants per
image. Metadata is also available for these images. In this paper we describe
both datasets' characteristics with respect to plant variety, plant age, and
number of images. We further introduce an open-access sample of the
indoor-dataset that contains 1,000 images of each species covered in our
dataset. These 14,000 images in total were selected such that they form
a representative sample with respect to plant age and individual plants per
species. This sample serves as a quick entry point for new users to the
dataset, allowing them to explore the data on a small scale and find the
parameters of data most useful for their application without having to deal
with hundreds of thousands of individual images.
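
The abstract describes two operations on per-image metadata: filtering by criteria such as crop type or plant age, and drawing a fixed-size sample per species that is representative with respect to plant age. The following is a minimal sketch of both ideas, not the authors' actual selection procedure. The record layout (the `species`, `age_days`, and `file` fields), the function names, and the round-robin strategy are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def filter_records(records, species=None, min_age=None, max_age=None):
    """Keep records matching the given species and age range (hypothetical fields)."""
    kept = []
    for rec in records:
        if species is not None and rec["species"] != species:
            continue
        if min_age is not None and rec["age_days"] < min_age:
            continue
        if max_age is not None and rec["age_days"] > max_age:
            continue
        kept.append(rec)
    return kept


def representative_sample(records, per_species, seed=0):
    """Pick up to `per_species` records per species, spread evenly over plant ages."""
    rng = random.Random(seed)

    # Group records by species, then by age (assumed metadata fields).
    by_species = defaultdict(lambda: defaultdict(list))
    for rec in records:
        by_species[rec["species"]][rec["age_days"]].append(rec)

    sample = []
    for species in sorted(by_species):
        age_groups = by_species[species]
        for group in age_groups.values():
            rng.shuffle(group)  # random choice within each age group

        picked = []
        ages = sorted(age_groups)
        # Round-robin over ages so each age contributes before any age repeats.
        while len(picked) < per_species and any(age_groups[a] for a in ages):
            for age in ages:
                if age_groups[age] and len(picked) < per_species:
                    picked.append(age_groups[age].pop())
        sample.extend(picked)
    return sample


# Synthetic demo metadata: 2 species x 5 ages x 4 images each (made-up values).
demo_records = [
    {"species": s, "age_days": a, "file": f"{s}_{a:02d}_{i}.png"}
    for s in ("canola", "wheat")
    for a in range(1, 6)
    for i in range(4)
]
demo_sample = representative_sample(demo_records, per_species=10)
young_canola = filter_records(demo_records, species="canola", max_age=2)
print(len(demo_sample), len(young_canola))
```

The round-robin pass is one simple way to keep every plant age represented in a small sample. A proportional (stratified) draw would be an equally reasonable choice if the age distribution of the full dataset should be preserved.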
Related papers
- 360 in the Wild: Dataset for Depth Prediction and View Synthesis [66.58513725342125]
We introduce a large-scale 360$^\circ$ video dataset in the wild.
This dataset has been carefully scraped from the Internet and has been captured from various locations worldwide.
Each of the 25K images constituting our dataset is provided with its respective camera's pose and depth map.
arXiv Detail & Related papers (2024-06-27T05:26:38Z)
- Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation [58.09421301921607]
We construct the first large-scale dataset for subject-driven image editing and generation.
Our dataset is 5 times the size of the previous largest dataset, yet our cost is tens of thousands of GPU hours lower.
arXiv Detail & Related papers (2024-06-13T16:40:39Z)
- Generating Diverse Agricultural Data for Vision-Based Farming Applications [74.79409721178489]
This model is capable of simulating distinct growth stages of plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions.
Our dataset includes 12,000 images with semantic labels, offering a comprehensive resource for computer vision tasks in precision agriculture.
arXiv Detail & Related papers (2024-03-27T08:42:47Z)
- Improving Data Efficiency for Plant Cover Prediction with Label Interpolation and Monte-Carlo Cropping [7.993547048820065]
The plant community composition is an essential indicator of environmental changes and is usually analyzed in ecological field studies.
We introduce an approach to interpolate the sparse labels in the collected vegetation plot time series down to the intermediate dense and unlabeled images.
We also introduce a new method we call Monte-Carlo Cropping to deal with high-resolution images efficiently.
arXiv Detail & Related papers (2023-07-17T15:17:39Z)
- Bugs in the Data: How ImageNet Misrepresents Biodiversity [98.98950914663813]
We analyze the 13,450 images from 269 classes that represent wild animals in the ImageNet-1k validation set.
We find that many of the classes are ill-defined or overlapping, and that 12% of the images are incorrectly labeled.
We also find that both the wildlife-related labels and images included in ImageNet-1k present significant geographical and cultural biases.
arXiv Detail & Related papers (2022-08-24T17:55:48Z)
- GrowliFlower: An image time series dataset for GROWth analysis of cauLIFLOWER [2.8247971782279615]
This article presents GrowliFlower, an image-based UAV time series dataset of two monitored cauliflower fields of size 0.39 and 0.60 ha acquired in 2020 and 2021.
The dataset contains RGB and multispectral orthophotos from which about 14,000 individual plant coordinates are derived and provided.
The dataset contains collected phenotypic traits of 740 plants, including the developmental stage as well as plant and cauliflower size.
arXiv Detail & Related papers (2022-04-01T08:56:59Z)
- Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision [4.0382342610484425]
We present a hands-on workflow for the automatized temporal and spatial identification and individualization of crop images from UAVs.
The presented approach improves analysis and interpretation of UAV data in agriculture significantly.
arXiv Detail & Related papers (2022-01-08T21:14:07Z)
- Semi-Supervised Semantic Segmentation in Earth Observation: The MiniFrance Suite, Dataset Analysis and Multi-task Network Study [82.02173199363571]
We introduce a novel large-scale dataset for semi-supervised semantic segmentation in Earth Observation, the MiniFrance suite.
MiniFrance has several unprecedented properties: it is large-scale, containing over 2000 very high resolution aerial images, accounting for more than 200 billion samples (pixels).
We present tools for data representativeness analysis in terms of appearance similarity and a thorough study of MiniFrance data, demonstrating that it is suitable for learning and generalizes well in a semi-supervised setting.
arXiv Detail & Related papers (2020-10-15T15:36:58Z)
- PhraseCut: Language-based Image Segmentation in the Wild [62.643450401286]
We consider the problem of segmenting image regions given a natural language phrase.
Our dataset is collected on top of the Visual Genome dataset.
Our experiments show that the scale and diversity of concepts in our dataset pose significant challenges to the existing state-of-the-art.
arXiv Detail & Related papers (2020-08-03T20:58:53Z)
- An embedded system for the automated generation of labeled plant images to enable machine learning applications in agriculture [1.4598479819593448]
A lack of sufficient training data is often the bottleneck in the development of machine learning (ML) applications.
We have developed an embedded robotic system to automatically generate and label large datasets of plant images.
We generated a dataset of over 34,000 labeled images, with which we trained an ML-model to distinguish grasses from non-grasses.
arXiv Detail & Related papers (2020-06-01T20:01:20Z)
- An Image Labeling Tool and Agricultural Dataset for Deep Learning [4.107998999964667]
We introduce a labeling tool and dataset aimed to facilitate computer vision research in agriculture.
The dataset includes original images collected from commercial greenhouses, images from PlantVillage, and images from Google Images.
In total the dataset contained 10k tomatoes, 7k leaves, 2k stems, and 2k diseased leaf annotations.
arXiv Detail & Related papers (2020-04-06T13:38:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.