Fields of The World: A Machine Learning Benchmark Dataset For Global Agricultural Field Boundary Segmentation
- URL: http://arxiv.org/abs/2409.16252v1
- Date: Tue, 24 Sep 2024 17:20:58 GMT
- Title: Fields of The World: A Machine Learning Benchmark Dataset For Global Agricultural Field Boundary Segmentation
- Authors: Hannah Kerner, Snehal Chaudhari, Aninda Ghosh, Caleb Robinson, Adeel Ahmad, Eddie Choi, Nathan Jacobs, Chris Holmes, Matthias Mohr, Rahul Dodhia, Juan M. Lavista Ferres, Jennifer Marcus,
- Abstract summary: Fields of The World (FTW) is a novel benchmark dataset for agricultural field instance segmentation.
FTW is an order of magnitude larger than previous datasets with 70,462 samples.
We show that models trained on FTW have better zero-shot and fine-tuning performance in held-out countries.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crop field boundaries are foundational datasets for agricultural monitoring and assessments but are expensive to collect manually. Machine learning (ML) methods for automatically extracting field boundaries from remotely sensed images could help realize the demand for these datasets at a global scale. However, current ML methods for field instance segmentation lack sufficient geographic coverage, accuracy, and generalization capabilities. Further, research on improving ML methods is restricted by the lack of labeled datasets representing the diversity of global agricultural fields. We present Fields of The World (FTW) -- a novel ML benchmark dataset for agricultural field instance segmentation spanning 24 countries on four continents (Europe, Africa, Asia, and South America). FTW is an order of magnitude larger than previous datasets with 70,462 samples, each containing instance and semantic segmentation masks paired with multi-date, multi-spectral Sentinel-2 satellite images. We provide results from baseline models for the new FTW benchmark, show that models trained on FTW have better zero-shot and fine-tuning performance in held-out countries than models that aren't pre-trained with diverse datasets, and show positive qualitative zero-shot results of FTW models in a real-world scenario -- running on Sentinel-2 scenes over Ethiopia.
Related papers
- A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain performance increases.
arXiv Detail & Related papers (2024-08-05T23:20:32Z)
- Adaptive Fusion of Multi-view Remote Sensing data for Optimal Sub-field Crop Yield Prediction [24.995959334158986]
We present a novel multi-view learning approach to predict crop yield for different crops (soybean, wheat, rapeseed) and regions (Argentina, Uruguay, and Germany).
Our input data includes multi-spectral optical images from Sentinel-2 satellites and weather data as dynamic features during the crop growing season, complemented by static features like soil properties and topographic information.
To effectively fuse the data, we introduce a Multi-view Gated Fusion (MVGF) model, comprising dedicated view-encoders and a Gated Unit (GU) module.
The MVGF model is trained at sub-field level with 10 m resolution
arXiv Detail & Related papers (2024-01-22T11:01:52Z)
- Country-Scale Cropland Mapping in Data-Scarce Settings Using Deep Learning: A Case Study of Nigeria [0.6827423171182154]
We combine a global cropland dataset and a hand-labeled dataset to train machine learning models for generating a new cropland map for Nigeria in 2020 at 10 m resolution.
We provide the models with pixel-wise time series input data from remote sensing sources such as Sentinel-1 and 2, ERA5 climate data, and DEM data, in addition to binary labels indicating cropland presence.
We find that the existing WorldCover map performs the best with an F1-score of 0.825 and accuracy of 0.870 on the test set, followed by a single-headed LSTM model trained with our hand-labeled training
arXiv Detail & Related papers (2023-12-18T01:23:22Z)
- Optimization Efficient Open-World Visual Region Recognition [55.76437190434433]
RegionSpot integrates position-aware localization knowledge from a localization foundation model with semantic information from a ViL model.
Experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives.
arXiv Detail & Related papers (2023-11-02T16:31:49Z)
- HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing [50.4506590177605]
HarvestNet is a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023.
We introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems.
We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food insecure regions.
arXiv Detail & Related papers (2023-08-23T11:03:28Z)
- What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation [2.7036595757881323]
We build a benchmark for Multi-domain Evaluation of Semantic Segmentation (MESS).
MESS allows a holistic analysis of performance across a wide range of domain-specific datasets.
We evaluate eight recently published models on the proposed MESS benchmark and analyze characteristics for the performance of zero-shot transfer models.
arXiv Detail & Related papers (2023-06-27T14:47:43Z)
- A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning [0.716879432974126]
Sen4AgriNet is a Sentinel-2-based, multi-country time-series benchmark dataset for agricultural monitoring applications.
It is constructed to cover the period 2016-2020 for Catalonia and France, while it can be extended to include additional countries.
It contains 42.5 million parcels, which makes it significantly larger than other available archives.
arXiv Detail & Related papers (2022-04-02T23:14:46Z)
- TIML: Task-Informed Meta-Learning for Agriculture [20.555341678693495]
We build on previous work exploring the use of meta-learning for agricultural contexts in data-sparse regions.
We introduce task-informed meta-learning (TIML), an augmentation to model-agnostic meta-learning which takes advantage of task-specific metadata.
arXiv Detail & Related papers (2022-02-04T13:27:55Z)
- Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data [51.715517570634994]
We present a novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for land cover (LC) classification and analysis.
In this work, we combine three real-world open data sources to obtain 13 channels.
Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar.
arXiv Detail & Related papers (2022-01-26T14:58:51Z)
- MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
- Semi-Supervised Semantic Segmentation in Earth Observation: The MiniFrance Suite, Dataset Analysis and Multi-task Network Study [82.02173199363571]
We introduce a novel large-scale dataset for semi-supervised semantic segmentation in Earth Observation, the MiniFrance suite.
MiniFrance has several unprecedented properties: it is large-scale, containing over 2000 very high resolution aerial images that account for more than 200 billion samples (pixels).
We present tools for data representativeness analysis in terms of appearance similarity and a thorough study of MiniFrance data, demonstrating that it is suitable for learning and generalizes well in a semi-supervised setting.
arXiv Detail & Related papers (2020-10-15T15:36:58Z)