Knowledge-Guided Machine Learning Models to Upscale Evapotranspiration in the U.S. Midwest
- URL: http://arxiv.org/abs/2510.11505v1
- Date: Mon, 13 Oct 2025 15:15:40 GMT
- Title: Knowledge-Guided Machine Learning Models to Upscale Evapotranspiration in the U.S. Midwest
- Authors: Aleksei Rozanov, Samikshya Subedi, Vasudha Sharma, Bryan C. Runck
- Abstract summary: Evapotranspiration (ET) plays a critical role in land-atmosphere interactions, yet its accurate quantification across various scales remains a challenge. This study integrates tree-based and knowledge-guided machine learning (ML) techniques with multispectral remote sensing data, gridded meteorology and EC data to upscale ET across the Midwest United States.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Evapotranspiration (ET) plays a critical role in land-atmosphere interactions, yet its accurate quantification across various spatiotemporal scales remains a challenge. In situ measurement approaches, like eddy covariance (EC) or weather station-based ET estimation, allow for measuring ET at a single location. Agricultural uses of ET require estimates for each field over broad areas, making it infeasible to deploy sensing systems at each location. This study integrates tree-based and knowledge-guided machine learning (ML) techniques with multispectral remote sensing data, gridded meteorology and EC data to upscale ET across the Midwest United States. We compare four tree-based models - Random Forest, CatBoost, XGBoost, LightGBM - and a simple feed-forward artificial neural network in combination with features engineered using knowledge-guided ML principles. Models were trained and tested on EC towers located in the Midwest of the United States using k-fold cross validation with k=5 and a site-year, biome-stratified train-test split to avoid data leakage. Results show that LightGBM with knowledge-guided features outperformed other methods with an R2=0.86, MSE=14.99 W m^-2 and MAE=8.82 W m^-2 according to grouped k-fold validation (k=5). Feature importance analysis shows that knowledge-guided features were most important for predicting evapotranspiration. Using the best performing model, we provide a data product at 500 m spatial and one-day temporal resolution for gridded ET for the period of 2019-2024. Intercomparison between the new gridded product and state-level weather station-based ET estimates shows best-in-class correspondence.
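The grouped k-fold validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the greedy fold balancing, and the site-year labels such as "US-A_2019" are assumptions made for illustration, and biome stratification is left out. The point it shows is that all samples sharing a group label land on the same side of each split, which is what prevents leakage between training and test data.

```python
from collections import defaultdict


def grouped_k_fold(groups, k=5):
    """Yield (train_idx, test_idx) pairs in which no group label
    appears on both sides of a split (hypothetical helper)."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    if len(by_group) < k:
        raise ValueError("need at least k distinct groups")
    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(k)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[g])
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for j in range(k) if j != f for i in folds[j])
        yield train_idx, test_idx


def regression_metrics(y_true, y_pred):
    """Return R2, MSE and MAE for two equal-length sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": r2, "mse": ss_res / n, "mae": mae}


# Toy site-year labels (made up); one label per sample.
site_years = (["US-A_2019"] * 3 + ["US-A_2020"] * 2 + ["US-B_2019"] * 4
              + ["US-C_2020"] * 3 + ["US-D_2021"] * 2 + ["US-E_2021"])

for train_idx, test_idx in grouped_k_fold(site_years, k=5):
    train_groups = {site_years[i] for i in train_idx}
    test_groups = {site_years[i] for i in test_idx}
    # No site-year is shared between the training and test sides.
    assert train_groups.isdisjoint(test_groups)
```

In practice a library routine such as scikit-learn's GroupKFold covers the same splitting step; the pure-Python version here only keeps the sketch free of dependencies.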
Related papers
- Comparative Assessment of Multimodal Earth Observation Data for Soil Moisture Estimation [0.9674544640949528]
We present a high-resolution (10 m) soil moisture estimation framework for vegetated areas across Europe. We compare modality combinations with temporal parameterizations, using spatial cross-validation, to ensure geographic generalization. We also evaluate whether foundation model embeddings from IBM-NASA's Prithvi model improve upon traditional hand-crafted spectral features.
arXiv Detail & Related papers (2026-02-20T09:17:12Z) - Detect Anything via Next Point Prediction [51.55967987350882]
Rex-Omni is a 3B-scale MLLM that achieves state-of-the-art object perception performance. On benchmarks like COCO and LVIS, Rex-Omni attains performance comparable to or exceeding regression-based models.
arXiv Detail & Related papers (2025-10-14T17:59:54Z) - ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search [53.40810298627443]
ReGUIDE is a framework for web grounding that enables MLLMs to learn data efficiently through self-generated reasoning and spatial-aware criticism. Our experiments demonstrate that ReGUIDE significantly advances web grounding performance across multiple benchmarks.
arXiv Detail & Related papers (2025-05-21T08:36:18Z) - Flow Matching for Atmospheric Retrieval of Exoplanets: Where Reliability meets Adaptive Noise Levels [38.84835238599221]
Flow matching posterior estimation (FMPE) is a new machine learning approach to atmospheric retrieval.
FMPE trains about 3 times faster than neural posterior estimation (NPE) and yields higher IS efficiencies.
IS successfully corrects inaccurate ML results, identifies model failures via low efficiencies, and provides accurate estimates of the Bayesian evidence.
arXiv Detail & Related papers (2024-10-28T19:28:07Z) - A novel fusion of Sentinel-1 and Sentinel-2 with climate data for crop phenology estimation using Machine Learning [0.0]
We train a Machine Learning (ML) LightGBM model to predict 13 phenological stages for eight major crops across Germany at 20 m scale. At national scale, predicted phenology resulted in a reasonable precision of R2 > 0.43 and a low Mean Absolute Error of 6 days.
arXiv Detail & Related papers (2024-08-16T13:44:35Z) - Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models [36.229082478423585]
LM-Weather is a generic approach to taming pre-trained language models (PLMs).
We introduce a lightweight personalized adapter into PLMs and endow it with weather pattern awareness.
Experiments show LM-Weather outperforms the state-of-the-art results by a large margin across various tasks.
arXiv Detail & Related papers (2024-05-24T15:25:09Z) - Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI [3.4764766275808583]
Cloud formations often obscure optical satellite-based monitoring of the Earth's surface.
We propose a novel synthetic dataset for cloud optical thickness estimation.
We leverage it for obtaining reliable and versatile cloud masks on real data.
arXiv Detail & Related papers (2023-11-23T14:28:28Z) - Uncertainty estimation of machine learning spatial precipitation predictions from satellite data [3.8623569699070353]
Merging satellite and gauge data with machine learning produces high-resolution precipitation datasets.
We address the gap of how to optimally provide such estimates by benchmarking six algorithms.
We propose a suite of machine learning algorithms for estimating uncertainty in spatial data prediction.
arXiv Detail & Related papers (2023-11-13T17:55:28Z) - FLOGA: A machine learning ready dataset, a benchmark and a novel deep learning model for burnt area mapping with Sentinel-2 [41.28284355136163]
Wildfires pose significant threats to human and animal lives, ecosystems, and socio-economic stability.
In this work, we create and introduce a machine-learning ready dataset we name FLOGA (Forest wiLdfire Observations for the Greek Area).
This dataset is unique as it comprises satellite imagery acquired before and after a wildfire event.
We use FLOGA to provide a thorough comparison of multiple Machine Learning and Deep Learning algorithms for the automatic extraction of burnt areas.
arXiv Detail & Related papers (2023-11-06T18:42:05Z) - SSL-SoilNet: A Hybrid Transformer-based Framework with Self-Supervised Learning for Large-scale Soil Organic Carbon Prediction [2.554658234030785]
This study introduces a novel approach that aims to learn the geographical link between multimodal features via self-supervised contrastive learning.
The proposed approach has undergone rigorous testing on two distinct large-scale datasets.
arXiv Detail & Related papers (2023-08-07T13:44:44Z) - Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale [54.15522908057831]
We propose an adapted version of the computationally efficient MLP-Mixer for STTD forecast at scale.
Our results surprisingly show that this simple-yet-effective solution can rival SOTA baselines when tested on several traffic benchmarks.
Our findings contribute to the exploration of simple-yet-effective models for real-world STTD forecasting.
arXiv Detail & Related papers (2023-07-04T05:19:19Z) - Earthformer: Exploring Space-Time Transformers for Earth System Forecasting [27.60569643222878]
We propose Earthformer, a space-time Transformer for Earth system forecasting.
The Transformer is based on a generic, flexible and efficient space-time attention block, named Cuboid Attention.
Experiments on two real-world benchmarks about precipitation nowcasting and El Niño/Southern Oscillation forecasting show Earthformer achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-07-12T20:52:26Z) - Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification [61.44538721707377]
We present Embedding Earth, a self-supervised contrastive pre-training method for leveraging the large availability of satellite imagery.
We observe significant improvements up to 25% absolute mIoU when pre-trained with our proposed method.
We find that learnt features can generalize between disparate regions opening up the possibility of using the proposed pre-training scheme.
arXiv Detail & Related papers (2022-03-11T16:14:14Z) - Mission-Aware Spatio-Temporal Deep Learning Model for UAS Instantaneous Density Prediction [3.59465210252619]
The number of daily sUAS operations in uncontrolled low altitude airspace is expected to reach into the millions in a few years.
A deep learning-based UAS instantaneous density prediction model is presented.
arXiv Detail & Related papers (2020-03-22T02:40:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.