Uncertainty estimation in satellite precipitation spatial prediction by combining distributional regression algorithms
- URL: http://arxiv.org/abs/2407.01623v1
- Date: Sat, 29 Jun 2024 05:58:00 GMT
- Title: Uncertainty estimation in satellite precipitation spatial prediction by combining distributional regression algorithms
- Authors: Georgia Papacharalampous, Hristos Tyralis, Nikolaos Doulamis, Anastasios Doulamis,
- Abstract summary: We introduce the concept of distributional regression for the engineering task of creating precipitation datasets through data merging.
We propose new ensemble learning methods that can be valuable not only for spatial prediction but also for prediction problems in general.
- Score: 3.8623569699070353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To facilitate effective decision-making, gridded satellite precipitation products should include uncertainty estimates. Machine learning has been proposed for issuing such estimates. However, most existing algorithms for this purpose rely on quantile regression. Distributional regression offers distinct advantages over quantile regression, including the ability to model intermittency as well as a stronger ability to extrapolate beyond the training data, which is critical for predicting extreme precipitation. In this work, we introduce the concept of distributional regression for the engineering task of creating precipitation datasets through data merging. Building upon this concept, we propose new ensemble learning methods that can be valuable not only for spatial prediction but also for prediction problems in general. These methods exploit conditional zero-adjusted probability distributions estimated with generalized additive models for location, scale, and shape (GAMLSS), spline-based GAMLSS and distributional regression forests as well as their ensembles (stacking based on quantile regression, and equal-weight averaging). To identify the most effective methods for our specific problem, we compared them to benchmarks using a large, multi-source precipitation dataset. Stacking emerged as the most successful strategy. Three specific stacking methods achieved the best performance based on the quantile scoring rule, although the ranking of these methods varied across quantile levels. This suggests that a task-specific combination of multiple algorithms could yield significant benefits.
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into the least-squares optimization.
An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - Scalable Estimation for Structured Additive Distributional Regression [0.0]
We propose a novel backfitting algorithm, which is based on the ideas of gradient descent and can deal virtually with any amount of data on a conventional laptop.
Performance is evaluated using an extensive simulation study and an exceptionally challenging and unique example of lightning count prediction over Austria.
arXiv Detail & Related papers (2023-01-13T14:59:42Z) - Communication-Efficient Distributed Quantile Regression with Optimal
Statistical Guarantees [2.064612766965483]
We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions.
The difficulties are resolved through a double-smoothing approach that is applied to the local (at each data source) and global objective functions.
Despite the reliance on a delicate combination of local and global smoothing parameters, the quantile regression model is fully parametric.
arXiv Detail & Related papers (2021-10-25T17:09:59Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Distributed Learning of Finite Gaussian Mixtures [21.652015112462]
We study split-and-conquer approaches for the distributed learning of finite Gaussian mixtures.
New estimator is shown to be consistent and retains root-n consistency under some general conditions.
Experiments based on simulated and real-world data show that the proposed split-and-conquer approach has comparable statistical performance with the global estimator.
arXiv Detail & Related papers (2020-10-20T16:17:47Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.