VITA: Variational Pretraining of Transformers for Climate-Robust Crop Yield Forecasting
- URL: http://arxiv.org/abs/2508.03589v2
- Date: Mon, 06 Oct 2025 05:27:56 GMT
- Title: VITA: Variational Pretraining of Transformers for Climate-Robust Crop Yield Forecasting
- Authors: Adib Hasan, Mardavij Roozbehani, Munther Dahleh
- Abstract summary: Current AI models systematically underperform when yields deviate from historical trends. We introduce VITA, a variational pretraining framework that learns representations from large satellite-based weather datasets. VITA is applied to 763 counties in the U.S. Corn Belt and achieves state-of-the-art performance in predicting corn and soybean yields.
- Score: 1.1470070927586018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate crop yield forecasting is essential for global food security. However, current AI models systematically underperform when yields deviate from historical trends. We attribute this to the lack of rich, physically grounded datasets directly linking atmospheric states to yields. To address this, we introduce VITA (Variational Inference Transformer for Asymmetric data), a variational pretraining framework that learns representations from large satellite-based weather datasets and transfers to the ground-based limited measurements available for yield prediction. VITA is trained using detailed meteorological variables as proxy targets during pretraining and learns to predict latent atmospheric states under a seasonality-aware sinusoidal prior. This allows the model to be fine-tuned using limited weather statistics during deployment. Applied to 763 counties in the U.S. Corn Belt, VITA achieves state-of-the-art performance in predicting corn and soybean yields across all evaluation scenarios, particularly during extreme years, with statistically significant improvements (paired t-test, $p < 0.0001$). Importantly, VITA outperforms prior frameworks like GNN-RNN without soil data, and bigger foundational models (e.g., Chronos-Bolt) with less compute, making it practical for real-world use--especially in data-scarce regions. This work highlights how domain-aware AI design can overcome data limitations and support resilient agricultural forecasting in a changing climate.
Related papers
- Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates [1.1503354666872168]
Accurate short-term forecasting of vegetation dynamics is a key enabler for data-driven decision support in precision agriculture. NDVI forecasting from satellite observations remains challenging due to sparse and irregular sampling caused by cloud coverage. We propose a probabilistic forecasting framework specifically designed for field-level NDVI prediction under clear-sky acquisition constraints.
arXiv Detail & Related papers (2026-02-04T17:48:52Z)
- Out-of-Distribution Generalization in Climate-Aware Yield Prediction with Earth Observation Data [0.0]
We benchmark two state-of-the-art deep learning models, GNN-RNN and MMST-ViT, under realistic out-of-distribution conditions. GNN-RNN demonstrates superior generalization with positive correlations under geographic shifts, while MMST-ViT performs well in-domain but degrades sharply under OOD conditions.
arXiv Detail & Related papers (2025-10-08T03:27:12Z)
- Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model [20.98899316909536]
We introduce Baguan, a novel data-driven model for medium-range weather forecasting built on a Siamese Autoencoder pre-trained in a self-supervised manner. Experimental results show that Baguan outperforms traditional methods, delivering more accurate forecasts.
arXiv Detail & Related papers (2025-05-20T03:29:23Z)
- Data-driven Seasonal Climate Predictions via Variational Inference and Transformers [31.98107454758077]
We train generative models on climate model output for seasonal predictions. We analyse the method's performance in predicting interannual anomalies beyond the climate change-induced trend.
arXiv Detail & Related papers (2025-03-26T11:51:23Z)
- OneForecast: A Universal Framework for Global and Regional Weather Forecasting [67.61381313555091]
We propose a global-regional nested weather forecasting framework (OneForecast) based on graph neural networks. By combining a dynamic system perspective with multi-grid theory, we construct a multi-scale graph structure and densify the target region. We introduce an adaptive messaging mechanism, using dynamic gating units, to deeply integrate node and edge features for more accurate extreme event forecasting.
arXiv Detail & Related papers (2025-02-01T06:49:16Z)
- Deep Learning for Weather Forecasting: A CNN-LSTM Hybrid Model for Predicting Historical Temperature Data [7.559331742876793]
This study introduces a hybrid model combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to predict historical temperature data.
CNNs are utilized for spatial feature extraction, while LSTMs handle temporal dependencies, resulting in significantly improved prediction accuracy and stability.
arXiv Detail & Related papers (2024-10-19T03:38:53Z)
- A Benchmark for AI-based Weather Data Assimilation [10.100157158477145]
We propose DABench, a benchmark constructed by simulated observations, real-world observations, and ERA5 reanalysis.
Our experimental results demonstrate that the end-to-end weather forecasting system, integrating 4DVarFormerV2 and Sformer, can assimilate real-world observations.
The proposed DABench will significantly advance research in AI-based DA, AI-based weather forecasting, and related domains.
arXiv Detail & Related papers (2024-08-21T08:50:19Z)
- MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling [68.69647625472464]
Downscaling, a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions.
Previous downscaling methods lacked tailored designs for meteorology and encountered structural limitations.
We propose a novel model called MambaDS, which enhances the utilization of multivariable correlations and topography information.
arXiv Detail & Related papers (2024-08-20T13:45:49Z)
- Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models [68.23649978697027]
Forecast-PEFT is a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters.
Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks.
Forecast-FT further improves prediction performance, evidencing up to a 9.6% enhancement over conventional baseline methods.
arXiv Detail & Related papers (2024-07-28T19:18:59Z)
- How far are today's time-series models from real-world weather forecasting applications? [22.68937280154092]
WEATHER-5K is a comprehensive collection of observational weather data that better reflects real-world scenarios.
It enables a better training of models and a more accurate assessment of the real-world forecasting capabilities of TSF models.
We provide researchers with a clear assessment of the gap between academic TSF models and real-world weather forecasting applications.
arXiv Detail & Related papers (2024-06-20T15:18:52Z)
- Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling [55.13352174687475]
This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) which generalizes weather forecasts to finer-grained temporal scales beyond training dataset. Specifically, we employ a carefully designed PDE kernel to simulate physical evolution on a small time scale. We also introduce a lead time-aware training framework to promote the generalization of the model at different lead times.
arXiv Detail & Related papers (2024-05-22T16:21:02Z)
- ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast [57.6987191099507]
We introduce Exloss, a novel loss function that performs asymmetric optimization and highlights extreme values to obtain accurate extreme weather forecast.
We also introduce ExBooster, which captures the uncertainty in prediction outcomes by employing multiple random samples.
Our solution can achieve state-of-the-art performance in extreme weather prediction, while maintaining the overall forecast accuracy comparable to the top medium-range forecast models.
arXiv Detail & Related papers (2024-02-02T10:34:13Z)
- FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting [56.73502043159699]
This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09$^{\circ}$ horizontal resolution.
It introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a low-resolution model.
The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES.
arXiv Detail & Related papers (2024-01-28T13:23:25Z)
- Towards an end-to-end artificial intelligence driven global weather forecasting system [57.5191940978886]
We present an AI-based data assimilation model, i.e., Adas, for global weather variables.
We demonstrate that Adas can assimilate global observations to produce high-quality analysis, enabling the system to operate stably over the long term.
We are the first to apply the methods to real-world scenarios, which is more challenging and has considerable practical application potential.
arXiv Detail & Related papers (2023-12-18T09:05:28Z)
- Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data by introducing a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z)
- ClimaX: A foundation model for weather and climate [51.208269971019504]
ClimaX is a deep learning model for weather and climate science.
It can be pre-trained with a self-supervised learning objective on climate datasets.
It can be fine-tuned to address a breadth of climate and weather tasks.
arXiv Detail & Related papers (2023-01-24T23:19:01Z)
- Forecasting large-scale circulation regimes using deformable convolutional neural networks and global spatiotemporal climate data [86.1450118623908]
We investigate a supervised machine learning approach based on deformable convolutional neural networks (deCNNs).
We forecast the North Atlantic-European weather regimes during extended boreal winter for 1 to 15 days into the future.
Due to its wider field of view, we also observe deCNN achieving considerably better performance than regular convolutional neural networks at lead times beyond 5-6 days.
arXiv Detail & Related papers (2022-02-10T11:37:00Z)