Combinations of distributional regression algorithms with application in uncertainty estimation of corrected satellite precipitation products
- URL: http://arxiv.org/abs/2407.01623v2
- Date: Mon, 06 Jan 2025 18:03:06 GMT
- Title: Combinations of distributional regression algorithms with application in uncertainty estimation of corrected satellite precipitation products
- Authors: Georgia Papacharalampous, Hristos Tyralis, Nikolaos Doulamis, Anastasios Doulamis,
- Abstract summary: We introduce the concept of distributional regression in precipitation dataset creation.
New ensemble learning methods can be valuable not only for spatial prediction but also for other prediction problems.
Stacking was shown to be superior to individual methods at most quantile levels when evaluated with the quantile loss function.
- Score: 3.8623569699070353
- License:
- Abstract: To facilitate effective decision-making, precipitation datasets should include uncertainty estimates. Quantile regression with machine learning has been proposed for issuing such estimates. Distributional regression offers distinct advantages over quantile regression, including the ability to model intermittency as well as a stronger ability to extrapolate beyond the training data, which is critical for predicting extreme precipitation. Therefore, here, we introduce the concept of distributional regression in precipitation dataset creation, specifically for the spatial prediction task of correcting satellite precipitation products. Building upon this concept, we formulated new ensemble learning methods that can be valuable not only for spatial prediction but also for other prediction problems. These methods exploit conditional zero-adjusted probability distributions estimated with generalized additive models for location, scale and shape (GAMLSS), spline-based GAMLSS and distributional regression forests as well as their ensembles (stacking based on quantile regression and equal-weight averaging). To identify the most effective methods for our specific problem, we compared them to benchmarks using a large, multi-source precipitation dataset. Stacking was shown to be superior to individual methods at most quantile levels when evaluated with the quantile loss function. Moreover, while the relative ranking of the methods varied across different quantile levels, stacking methods, and to a lesser extent mean combiners, exhibited lower variance in their performance across different quantiles compared to individual methods that occasionally ranked extremely low. Overall, a task-specific combination of multiple distributional regression algorithms could yield significant benefits in terms of stability.
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into the least-squares optimization.
An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z) - Bayesian Quantile Regression with Subset Selection: A Decision Analysis Perspective [0.0]
Quantile regression is a powerful tool for inferring how covariates affect specific percentiles of the response distribution.
Existing methods estimate conditional quantiles separately for each quantile of interest or estimate the entire conditional distribution using semi- or non-parametric models.
We pose the fundamental problems of linear quantile estimation, uncertainty quantification, and subset selection from a Bayesian decision analysis perspective.
arXiv Detail & Related papers (2023-11-03T17:19:31Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Communication-Efficient Distributed Quantile Regression with Optimal
Statistical Guarantees [2.064612766965483]
We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions.
The difficulties are resolved through a double-smoothing approach that is applied to the local (at each data source) and global objective functions.
Despite the reliance on a delicate combination of local and global smoothing parameters, the quantile regression model is fully parametric.
arXiv Detail & Related papers (2021-10-25T17:09:59Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Distributed Learning of Finite Gaussian Mixtures [21.652015112462]
We study split-and-conquer approaches for the distributed learning of finite Gaussian mixtures.
New estimator is shown to be consistent and retains root-n consistency under some general conditions.
Experiments based on simulated and real-world data show that the proposed split-and-conquer approach has comparable statistical performance with the global estimator.
arXiv Detail & Related papers (2020-10-20T16:17:47Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
arXiv Detail & Related papers (2020-02-21T00:27:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.