Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature
- URL: http://arxiv.org/abs/2503.06079v1
- Date: Sat, 08 Mar 2025 06:01:10 GMT
- Title: Fixing the Pitfalls of Probabilistic Time-Series Forecasting Evaluation by Kernel Quadrature
- Authors: Masaki Adachi, Masahiro Fujisawa, Michael A. Osborne
- Abstract summary: The most widely used metric, the continuous ranked probability score (CRPS), is a strictly proper scoring function. We found that popular CRPS estimators exhibit inherent estimation biases. We introduce a kernel quadrature approach that leverages an unbiased CRPS estimator and employs cubature construction for scalable computation.
- Score: 19.742371911023774
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite the significance of probabilistic time-series forecasting models, their evaluation metrics often involve intractable integrations. The most widely used metric, the continuous ranked probability score (CRPS), is a strictly proper scoring function; however, its computation requires approximation. We found that popular CRPS estimators--specifically, the quantile-based estimator implemented in the widely used GluonTS library and the probability-weighted moment approximation--both exhibit inherent estimation biases. These biases lead to crude approximations, resulting in improper rankings of forecasting model performance when CRPS values are close. To address this issue, we introduced a kernel quadrature approach that leverages an unbiased CRPS estimator and employs cubature construction for scalable computation. Empirically, our approach consistently outperforms the two widely used CRPS estimators.
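To make the bias concrete, here is a minimal sketch (our own illustration with hypothetical function names, not the paper's kernel-quadrature implementation) contrasting the common energy-form sample CRPS estimator, which averages the spread term over all m^2 sample pairs, with the unbiased "fair" form that averages only over the m(m-1) distinct pairs:
```python
import numpy as np

def crps_nrg(samples: np.ndarray, y: float) -> float:
    """Energy-form CRPS estimate: E|X - y| - 0.5 E|X - X'|.
    Averaging over all m^2 pairs (including i == j, which contribute 0)
    shrinks the spread term, so this estimator is biased for finite m."""
    m = len(samples)
    term1 = np.mean(np.abs(samples - y))
    term2 = np.abs(samples[:, None] - samples[None, :]).sum() / (2 * m**2)
    return term1 - term2

def crps_fair(samples: np.ndarray, y: float) -> float:
    """Unbiased ('fair') CRPS estimate: the spread term averages only
    over the m(m-1) distinct pairs, removing the finite-sample bias."""
    m = len(samples)
    term1 = np.mean(np.abs(samples - y))
    term2 = np.abs(samples[:, None] - samples[None, :]).sum() / (2 * m * (m - 1))
    return term1 - term2

rng = np.random.default_rng(0)
forecast = rng.normal(0.0, 1.0, size=50)   # 50 forecast samples
print(crps_nrg(forecast, 0.3), crps_fair(forecast, 0.3))
```
For small m the energy form underestimates the spread term by a factor of (m-1)/m, so it systematically overestimates the CRPS; this is exactly the kind of bias that can flip model rankings when CRPS values are close.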
Related papers
- In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remains competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z) - Existence of unbiased resilient estimators in discrete quantum systems [0.0]
Bhattacharyya bounds offer a more robust estimation framework with respect to prior accuracy.
We show that when the number of constraints exceeds the number of measurement outcomes, an estimator with finite variance typically does not exist.
arXiv Detail & Related papers (2024-02-23T10:12:35Z) - Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification [59.81904428056924]
We introduce SMASH: a Score MAtching-based pseudolikelihood estimator for learning marked spatio-temporal point processes (STPPs) with uncertainty quantification.
Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score-matching.
The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification.
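SMASH itself targets neural marked STPPs; as a toy illustration of the normalization-free idea (our own example, not the paper's model), one can fit a Gaussian by minimizing the empirical Hyvarinen score-matching objective, which never evaluates a normalizing constant:
```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(2.0, 1.5, size=2000)

def hyvarinen_objective(mu, sigma, x):
    """Empirical Hyvarinen loss for a Gaussian model:
    J = E[0.5 * s(x)^2 + s'(x)], with score s(x) = -(x - mu) / sigma^2.
    Only the (unnormalized) log-density's derivatives are needed."""
    score = -(x - mu) / sigma**2
    score_grad = -1.0 / sigma**2
    return np.mean(0.5 * score**2 + score_grad)

# crude grid search; a real model would use gradient-based optimization
mus = np.linspace(0, 4, 81)
sigmas = np.linspace(0.5, 3, 51)
losses = [(hyvarinen_objective(m, s, data), m, s) for m in mus for s in sigmas]
best_loss, best_mu, best_sigma = min(losses)
print(best_mu, best_sigma)   # close to the true (2.0, 1.5)
```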
arXiv Detail & Related papers (2023-10-25T02:37:51Z) - Bayesian Cramér-Rao Bound Estimation with Score-Based Models [3.4480437706804503]
The Bayesian Cramér-Rao bound (CRB) provides a lower bound on the mean square error of any Bayesian estimator under mild regularity conditions.
This work introduces a new data-driven estimator for the CRB using score matching.
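As a small numerical illustration (ours, not the paper's score-matching estimator), the Gaussian location model makes the bound easy to verify by Monte Carlo, since there the Bayesian CRB is attained by the posterior mean:
```python
import numpy as np

rng = np.random.default_rng(2)
tau, sigma = 1.0, 0.5          # prior std, observation-noise std
n = 200_000

theta = rng.normal(0.0, tau, size=n)        # theta ~ N(0, tau^2)
x = theta + rng.normal(0.0, sigma, size=n)  # x | theta ~ N(theta, sigma^2)

# Bayesian Fisher information of the joint p(x, theta) for this model:
# J = E[(d/dtheta log p(x|theta))^2] + E[(d/dtheta log p(theta))^2]
#   = 1/sigma^2 + 1/tau^2, so the Bayesian CRB is 1/J.
crb = 1.0 / (1.0 / sigma**2 + 1.0 / tau**2)

# the posterior mean is the MMSE estimator; in the Gaussian case the bound is tight
theta_hat = x * tau**2 / (tau**2 + sigma**2)
mse = np.mean((theta_hat - theta)**2)
print(mse, crb)   # both approximately 0.2
```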
arXiv Detail & Related papers (2023-09-28T00:22:21Z) - Mathematical Properties of Continuous Ranked Probability Score Forecasting [0.0]
We study the rate of convergence in terms of CRPS of distributional regression methods.
We show that the k-nearest neighbor method and the kernel method for distributional regression reach the optimal rate of convergence in dimension $d \geq 2$.
arXiv Detail & Related papers (2022-05-09T15:01:13Z) - Random Noise vs State-of-the-Art Probabilistic Forecasting Methods: A Case Study on CRPS-Sum Discrimination Ability [4.9449660544238085]
We show that the statistical properties of target data affect the discrimination ability of CRPS-Sum.
We highlight that the CRPS-Sum calculation overlooks the performance of the model on each dimension.
We show that a dummy model that resembles random noise can easily achieve a better CRPS-Sum.
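A short sketch (our own, with hypothetical helper names) shows why CRPS-Sum can be blind to per-dimension errors: it sums the series across dimensions before scoring, so errors that cancel in the sum vanish from the metric:
```python
import numpy as np

def crps_fair(samples: np.ndarray, y: float) -> float:
    """Unbiased sample-based CRPS for a scalar observation y."""
    m = len(samples)
    t1 = np.mean(np.abs(samples - y))
    t2 = np.abs(samples[:, None] - samples[None, :]).sum() / (2 * m * (m - 1))
    return t1 - t2

def crps_sum(forecast: np.ndarray, target: np.ndarray) -> float:
    """CRPS-Sum: aggregate across dimensions first, then score the sums.
    forecast has shape (m, d); target has shape (d,)."""
    return crps_fair(forecast.sum(axis=1), target.sum())

rng = np.random.default_rng(3)
target = np.array([1.0, -1.0])                # the target sums to 0
dummy = rng.normal(0.0, 0.1, size=(100, 2))   # wrong on every dimension
dummy[:, 1] = -dummy[:, 0]                    # but each sample sums to exactly 0
print(crps_sum(dummy, target))                # 0.0: the per-dim errors cancel
print(crps_fair(dummy[:, 0], target[0]))      # large: dimension 0 is badly forecast
```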
arXiv Detail & Related papers (2022-01-21T12:36:58Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is applications where multiple estimates of the same unknown are averaged for improved performance.
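A minimal sketch of the bias-constraint idea (our simplification; the paper trains deep networks under such constraints) adds a squared-empirical-bias penalty to the MSE, which penalizes a lower-variance but biased estimator:
```python
import numpy as np

def bias_constrained_loss(pred, truth, lam=10.0):
    """MSE penalized by squared empirical bias: the penalty drives the
    estimator towards zero mean error, mimicking a bias constraint."""
    mse = np.mean((pred - truth)**2)
    bias = np.mean(pred - truth)
    return mse + lam * bias**2

rng = np.random.default_rng(4)
truth = rng.normal(0, 1, 1000)
unbiased = truth + rng.normal(0, 0.5, 1000)        # zero bias, more noise
biased = truth + 0.4 + rng.normal(0, 0.2, 1000)    # offset, less noise
# the biased estimator has the lower raw MSE but a far higher constrained loss
print(bias_constrained_loss(unbiased, truth), bias_constrained_loss(biased, truth))
```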
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and is implemented via the pool-adjacent-violators (PAV) algorithm.
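A rough sketch of the CORP recipe (our reconstruction, not the authors' code) using scikit-learn's PAV-based IsotonicRegression: regress the binary outcomes isotonically on the forecast probabilities and read the fitted step function as the reliability curve:
```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(5)
n = 5000
p_true = rng.uniform(0, 1, n)                 # true event probabilities
y = rng.binomial(1, p_true)                   # binary outcomes
p_fcst = np.clip(p_true + 0.15, 0, 1)         # deliberately miscalibrated forecasts

# PAV fit of outcomes on forecast probabilities; the fitted step function
# is the recalibration curve that a CORP reliability diagram plots
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
cep = iso.fit_transform(p_fcst, y)            # conditional event probabilities

# a well calibrated forecaster would have cep close to p_fcst everywhere
print(np.round(np.mean(np.abs(cep - p_fcst)), 3))   # roughly 0.15 miscalibration
```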
arXiv Detail & Related papers (2020-08-07T08:22:26Z) - Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
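As one concrete member of this family (a basic ridge-regularized Stein-type kernel score estimator in 1-D; a sketch under our own assumptions, not the paper's curl-free iterative variants):
```python
import numpy as np

def stein_score_estimate(x, bandwidth=1.0, eta=0.1):
    """Ridge-regularized kernel (Stein) score estimator in 1-D:
    s_hat = -(K + eta I)^{-1} b, with b_i = sum_j K_ij (x_i - x_j) / h^2
    and K an RBF kernel. Returns score estimates at the sample points."""
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2 * bandwidth**2))
    b = (K * diff / bandwidth**2).sum(axis=1)
    return -np.linalg.solve(K + eta * np.eye(len(x)), b)

rng = np.random.default_rng(6)
x = rng.normal(0, 1, 200)
s_hat = stein_score_estimate(x)
print(np.corrcoef(s_hat, -x)[0, 1])   # close to 1: the true N(0,1) score is -x
```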
arXiv Detail & Related papers (2020-05-20T15:01:03Z) - Distributional robustness of K-class estimators and the PULSE [4.56877715768796]
We prove that the classical K-class estimator satisfies such optimality with respect to distributional robustness by establishing a connection between K-class estimators and anchor regression.
We show that the PULSE can be computed efficiently as a data-driven K-class estimator.
There are several settings, including weak-instrument settings, where it outperforms other estimators.
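For reference, a minimal numpy sketch of the classical K-class estimator (our illustration; the PULSE's data-driven choice of the parameter is not reproduced here):
```python
import numpy as np

def k_class(X, y, Z, kappa):
    """K-class estimator beta(kappa) solving
    (X'(I - kappa*M_Z)X) beta = X'(I - kappa*M_Z)y, with M_Z = I - P_Z
    the residual-maker of the instruments Z.
    kappa = 0 gives OLS; kappa = 1 gives two-stage least squares (2SLS)."""
    # use I - kappa*M_Z = (1 - kappa) I + kappa P_Z to avoid n x n matrices
    ZtZ_inv_Zt = np.linalg.solve(Z.T @ Z, Z.T)
    PX, Py = Z @ (ZtZ_inv_Zt @ X), Z @ (ZtZ_inv_Zt @ y)
    A = (1 - kappa) * X.T @ X + kappa * X.T @ PX
    b = (1 - kappa) * X.T @ y + kappa * X.T @ Py
    return np.linalg.solve(A, b)

rng = np.random.default_rng(7)
n = 5000
z = rng.normal(size=(n, 1))                              # instrument
u = rng.normal(size=n)                                   # hidden confounder
x = (z[:, 0] + u + 0.5 * rng.normal(size=n)).reshape(-1, 1)
y = 2.0 * x[:, 0] + u + 0.5 * rng.normal(size=n)
# OLS is biased upward by the confounder; 2SLS recovers approx 2.0
print(k_class(x, y, z, 0.0), k_class(x, y, z, 1.0))
```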
arXiv Detail & Related papers (2020-05-07T09:39:07Z) - Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
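A compact sketch of a doubly-robust AIPW estimator of the ACE with two-fold cross-fitting (our own simplified version with linear and logistic nuisance models; the study benchmarks more flexible learners):
```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold

def crossfit_aipw(X, a, y, n_splits=2):
    """Doubly-robust AIPW estimate of the ACE with cross-fitting: nuisance
    models are fit on one fold and evaluated on the other, so each
    observation's influence value uses out-of-fold nuisance predictions."""
    psi = np.zeros(len(y))
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        ps = LogisticRegression(max_iter=1000).fit(X[train], a[train])
        m1 = LinearRegression().fit(X[train][a[train] == 1], y[train][a[train] == 1])
        m0 = LinearRegression().fit(X[train][a[train] == 0], y[train][a[train] == 0])
        e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
        mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
        psi[test] = (mu1 - mu0
                     + a[test] * (y[test] - mu1) / e
                     - (1 - a[test]) * (y[test] - mu0) / (1 - e))
    return psi.mean()

rng = np.random.default_rng(8)
n = 4000
X = rng.normal(size=(n, 3))
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))      # confounded treatment
y = 1.5 * a + X[:, 0] + rng.normal(size=n)           # true ACE = 1.5
print(crossfit_aipw(X, a, y))                        # approx 1.5
```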
arXiv Detail & Related papers (2020-04-21T23:09:55Z)