Kriging prior Regression: A Case for Kriging-Based Spatial Features with TabPFN in Soil Mapping
- URL: http://arxiv.org/abs/2509.09408v2
- Date: Fri, 12 Sep 2025 14:31:32 GMT
- Title: Kriging prior Regression: A Case for Kriging-Based Spatial Features with TabPFN in Soil Mapping
- Authors: Jonas Schmidinger, Viacheslav Barkov, Sebastian Vogel, Martin Atzmueller, Gerard B M Heuvelink
- Abstract summary: We propose a hybrid framework that enriches machine learning with spatial context. We call this approach 'kriging prior regression' (KpR) as it follows the inverse logic of regression kriging. KpR with TabPFN demonstrated reliable uncertainty estimates and more accurate predictions in comparison to several other spatial techniques.
- Score: 0.5944508231938734
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Machine learning and geostatistics are two fundamentally different frameworks for predicting and spatially mapping soil properties. Geostatistics leverages the spatial structure of soil properties, while machine learning (ML) captures the relationship between available environmental features and soil properties. We propose a hybrid framework that enriches ML with spatial context through engineering of 'spatial lag' features from ordinary kriging. We call this approach 'kriging prior regression' (KpR), as it follows the inverse logic of regression kriging. To evaluate this approach, we assessed both the point and probabilistic prediction performance of KpR, using the TabPFN model across six field-scale datasets from LimeSoDa. These datasets included soil organic carbon, clay content, and pH, along with features derived from remote sensing and in-situ proximal soil sensing. KpR with TabPFN demonstrated reliable uncertainty estimates and more accurate predictions in comparison to several other spatial techniques (e.g., regression/residual kriging with TabPFN), as well as to established non-spatial machine learning algorithms (e.g., random forest). Most notably, it significantly improved the average R² by around 30% compared to machine learning algorithms without spatial context. This improvement was due to the strong prediction performance of the TabPFN algorithm itself and the complementary spatial information provided by KpR features. TabPFN is particularly effective for prediction tasks with small sample sizes, common in precision agriculture, whereas KpR can compensate for weak relationships between sensing features and soil properties when proximal soil sensing data are limited. Hence, we conclude that KpR with TabPFN is a very robust and versatile modelling framework for digital soil mapping in precision agriculture.
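As a rough illustration of the idea in the abstract, the following Python sketch builds an ordinary-kriging 'spatial lag' feature that could be appended to the sensing features before fitting a tabular model such as TabPFN. It is not the authors' implementation: the exponential variogram with fixed parameters, the leave-one-out scheme for training points (used here to avoid leaking each sample's own target into its feature), and the function names `exp_variogram`, `ordinary_kriging` and `kpr_features` are all assumptions made for this example.

```python
import numpy as np


def exp_variogram(h, nugget=0.0, sill=1.0, range_=30.0):
    """Exponential semivariogram. Parameters are assumed here, not fitted."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-3.0 * h / range_))
    # By definition the semivariance at zero distance is zero.
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(xy_obs, z_obs, xy_new, variogram=exp_variogram):
    """Ordinary kriging predictions at xy_new from observations (xy_obs, z_obs)."""
    n = len(xy_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=-1)
    d_new = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=-1)

    # Kriging system; the extra row/column holds the Lagrange multiplier
    # that forces the weights to sum to one (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d_obs)
    A[n, n] = 0.0

    b = np.ones((n + 1, len(xy_new)))
    b[:n, :] = variogram(d_new).T

    weights = np.linalg.solve(A, b)[:n, :]  # shape (n_obs, n_new)
    return weights.T @ z_obs


def kpr_features(xy_train, z_train, xy_test):
    """Kriging-based feature for training (leave-one-out) and test points.

    The leave-one-out step is an assumption of this sketch: each training
    point is predicted from all other training points only.
    """
    n = len(xy_train)
    loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        loo[i] = ordinary_kriging(
            xy_train[keep], z_train[keep], xy_train[i:i + 1]
        )[0]
    test_feature = ordinary_kriging(xy_train, z_train, xy_test)
    return loo, test_feature
```

The two returned arrays would be stacked onto the existing feature matrices, for example with `np.column_stack([X_train, loo])`, before training the regressor. In practice the variogram would be fitted to the data rather than fixed as it is here.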
Related papers
- Predicting California Bearing Ratio with Ensemble and Neural Network Models: A Case Study from Turkiye [0.0]
The California Bearing Ratio (CBR) is a key geotechnical indicator used to assess the load-bearing capacity of subgrade soils. Traditional tests are often time-consuming, costly, and can be impractical, particularly for large-scale or diverse soil profiles. Recent progress in artificial intelligence, especially machine learning (ML), has enabled data-driven approaches for modeling complex soil behavior with greater speed and precision. This study introduces a comprehensive ML framework for CBR prediction using a dataset of 382 soil samples collected from various geoclimatic regions in Türkiye.
arXiv Detail & Related papers (2025-12-09T08:09:55Z) - Tabular foundation model for GEOAI benchmark problems BM/AirportSoilProperties/2/2025 [2.07098502859192]
This paper presents a novel application of the Tabular Prior-Data Fitted Network (TabPFN) to site characterization problems defined in the GEOAI benchmark BM/AirportSoilProperties/2/2025. We apply TabPFN in a zero-training, few-shot, in-context learning setting and provide it with additional context from the big indirect database (BID). The study demonstrates that TabPFN, as a general-purpose foundation model, achieved superior accuracy and well-calibrated predictive distributions.
arXiv Detail & Related papers (2025-09-03T10:21:18Z) - Aligned Manifold Property and Topology Point Clouds for Learning Molecular Properties [55.2480439325792]
This work introduces AMPTCR, a molecular surface representation that combines local quantum-derived scalar fields and custom topological descriptors within an aligned point cloud format. For molecular weight, results confirm that AMPTCR encodes physically meaningful data, with a validation R² of 0.87. In the bacterial inhibition task, AMPTCR enables both classification and direct regression of E. coli inhibition values.
arXiv Detail & Related papers (2025-07-22T04:35:50Z) - Feature-free regression kriging [4.270650728191168]
This study proposes a Feature-Free Regression Kriging (FFRK) method to construct a regression-based trend surface without requiring external explanatory variables. We conducted experiments on the spatial distribution prediction of three heavy metals in a mining area in Australia. This approach effectively addresses spatial nonstationarity while reducing the cost of acquiring explanatory variables, improving both prediction accuracy and ability.
arXiv Detail & Related papers (2025-07-10T02:34:07Z) - Fractal interpolation in the context of prediction accuracy optimization [44.99833362998488]
This paper focuses on the hypothesis of optimizing time series predictions using fractal techniques.
Prediction results obtained with the LSTM model showed a significant accuracy improvement compared to the raw datasets.
arXiv Detail & Related papers (2024-03-01T09:49:53Z) - Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data; a minimal number of available labeled data points are then assigned to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Subsurface Characterization using Ensemble-based Approaches with Deep Generative Models [2.184775414778289]
Inverse modeling is limited for ill-posed, high-dimensional applications due to computational costs and poor prediction accuracy with sparse datasets.
We combine Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) and Ensemble Smoother with Multiple Data Assimilation (ES-MDA).
WGAN-GP is trained to generate high-dimensional K fields from a low-dimensional latent space and ES-MDA updates the latent variables by assimilating available measurements.
arXiv Detail & Related papers (2023-10-02T01:27:10Z) - Distribution Regression with Sliced Wasserstein Kernels [45.916342378789174]
We propose the first OT-based estimator for distribution regression.
We study the theoretical properties of a kernel ridge regression estimator based on such representation.
arXiv Detail & Related papers (2022-02-08T15:21:56Z) - Importance Weighting Approach in Kernel Bayes' Rule [43.221685127485735]
We study a nonparametric approach to Bayesian computation via feature means, where the expectation of prior features is updated to yield expected posterior features.
All quantities involved in the Bayesian update are learned from observed data, making the method entirely model-free.
Our approach is based on importance weighting, which results in superior numerical stability to the existing approach to KBR.
arXiv Detail & Related papers (2022-02-05T03:06:59Z) - Treeging [0.0]
Treeging combines the flexible mean structure of regression trees with the covariance-based prediction strategy of kriging into the base learner of an ensemble prediction algorithm.
We investigate the predictive accuracy of treeging across a thorough and widely varied battery of spatial and space-time simulation scenarios.
arXiv Detail & Related papers (2021-10-03T17:48:18Z) - Surface Warping Incorporating Machine Learning Assisted Domain Likelihood Estimation: A New Paradigm in Mine Geology Modelling and Automation [68.8204255655161]
A Bayesian warping technique has been proposed to reshape modeled surfaces based on geochemical and spatial constraints imposed by newly acquired blasthole data.
This paper focuses on incorporating machine learning in this warping framework to make the likelihood generalizable.
Its foundation is laid by a Bayesian computation in which the geological domain likelihood given the chemistry, p(g|c), plays a similar role to p(y(c)|g).
arXiv Detail & Related papers (2021-02-15T10:37:52Z) - Using vis-NIRS and Machine Learning methods to diagnose sugarcane soil chemical properties [0.0]
Knowing chemical soil properties can be decisive for crop management and total yield production.
Traditional property estimation approaches are time-consuming and require complex lab setups.
Property estimation from spectral signals (vis-NIRS) has emerged as a low-cost, non-invasive, and non-destructive alternative.
arXiv Detail & Related papers (2020-12-23T21:46:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.