An Open and Large-Scale Dataset for Multi-Modal Climate Change-aware Crop Yield Predictions
- URL: http://arxiv.org/abs/2406.06081v2
- Date: Mon, 17 Jun 2024 07:35:12 GMT
- Title: An Open and Large-Scale Dataset for Multi-Modal Climate Change-aware Crop Yield Predictions
- Authors: Fudong Lin, Kaleb Guillot, Summer Crawford, Yihe Zhang, Xu Yuan, Nian-Feng Tzeng
- Abstract summary: The CropNet dataset is the first terabyte-sized, publicly available, multi-modal dataset specifically targeting climate change-aware crop yield predictions.
The CropNet dataset is composed of three modalities of data, i.e., Sentinel-2 Imagery, WRF-HRRR Computed Dataset, and USDA Crop Dataset, for over 2200 U.S. counties spanning 6 years.
- Score: 20.44172558372343
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Precise crop yield predictions are of national importance for ensuring food security and sustainable agricultural practices. While AI-for-science approaches have exhibited promising achievements in solving many scientific problems such as drug discovery, precipitation nowcasting, etc., the development of deep learning models for predicting crop yields is constantly hindered by the lack of an open and large-scale deep learning-ready dataset with multiple modalities to accommodate sufficient information. To remedy this, we introduce the CropNet dataset, the first terabyte-sized, publicly available, and multi-modal dataset specifically targeting climate change-aware crop yield predictions for the contiguous United States (U.S.) continent at the county level. Our CropNet dataset is composed of three modalities of data, i.e., Sentinel-2 Imagery, WRF-HRRR Computed Dataset, and USDA Crop Dataset, for over 2200 U.S. counties spanning 6 years (2017-2022), expected to facilitate researchers in developing versatile deep learning models for timely and precisely predicting crop yields at the county-level, by accounting for the effects of both short-term growing season weather variations and long-term climate change on crop yields. Besides, we develop the CropNet package, offering three types of APIs, for facilitating researchers in downloading the CropNet data on the fly over the time and region of interest, and flexibly building their deep learning models for accurate crop yield predictions. Extensive experiments have been conducted on our CropNet dataset via employing various types of deep learning solutions, with the results validating the general applicability and the efficacy of the CropNet dataset in climate change-aware crop yield predictions.
Related papers
- A Novel Fusion of Optical and Radar Satellite Data for Crop Phenology Estimation using Machine Learning and Cloud Computing [0.0]
In the era of big Earth observation data ubiquity, attempts have been made to accurately predict crop phenology based on Remote Sensing data.
Here, we estimate phenological developments for eight major crops and 13 phenological stages across Germany at 30m scale using a novel framework.
arXiv Detail & Related papers (2024-08-16T13:44:35Z) - Explainability of Sub-Field Level Crop Yield Prediction using Remote Sensing [6.65506917941232]
We focus on the task of crop yield prediction, specifically for soybean, wheat, and rapeseed crops in Argentina, Uruguay, and Germany.
Our goal is to develop and explain predictive models for these crops, using a large dataset of satellite images, additional data modalities, and crop yield maps.
For model explainability, we utilize feature attribution methods to quantify input feature contributions, identify critical growth stages, analyze yield variability at the field level, and explain less accurate predictions.
arXiv Detail & Related papers (2024-07-11T08:23:46Z) - F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data by introducing a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z) - High-Resolution Satellite Imagery for Modeling the Impact of Aridification on Crop Production [2.5402662954395097]
We introduce a first-of-its-kind dataset, SICKLE, having time-series images at different spatial resolutions from 3 different satellites.
The dataset comprises 2,398 season-wise samples from 388 unique plots distributed across 4 districts of the Delta.
We benchmark the dataset on 3 separate tasks, namely crop type, phenology date (sowing, transplanting, harvesting) and yield prediction, and develop an end-to-end framework for predicting key crop parameters in a real-world setting.
arXiv Detail & Related papers (2022-09-25T14:54:50Z) - Extreme Gradient Boosting for Yield Estimation compared with Deep Learning Approaches [0.0]
We propose a pipeline to process remote sensing images into feature-based representations that allow the employment of Extreme Gradient Boosting (XGBoost) for yield prediction.
A comparative evaluation of soybean yield prediction within the United States shows promising prediction accuracies compared to state-of-the-art yield prediction systems based on Deep Learning.
arXiv Detail & Related papers (2022-08-26T12:48:18Z) - Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data [51.715517570634994]
We present our novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for land cover (LC) classification and analysis.
In this work, we combine three real-world open data sources to obtain 13 channels.
Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar.
arXiv Detail & Related papers (2022-01-26T14:58:51Z) - Estimating crop yields with remote sensing and deep learning [0.2492060267829796]
We present a deep learning model able to perform pre-season and in-season predictions for five different crops.
Our model uses crop calendars, easy-to-obtain remote sensing data and weather forecast information to provide accurate yield estimates.
arXiv Detail & Related papers (2020-07-21T15:09:11Z) - Learning from Data to Optimize Control in Precision Farming [77.34726150561087]
This special issue presents the latest developments in statistical inference, machine learning, and optimal control for precision farming.
Satellite positioning and navigation followed by Internet-of-Things generate vast information that can be used to optimize farming processes in real-time.
arXiv Detail & Related papers (2020-07-07T12:44:17Z)