Comparing Sequential Forecasters
- URL: http://arxiv.org/abs/2110.00115v6
- Date: Thu, 9 Nov 2023 05:09:52 GMT
- Title: Comparing Sequential Forecasters
- Authors: Yo Joong Choe and Aaditya Ramdas
- Abstract summary: Consider two forecasters, each making a single prediction for a sequence of events over time.
How might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts and outcomes were generated?
We present novel sequential inference procedures for estimating the time-varying difference in forecast scores.
We empirically validate our approaches by comparing real-world baseball and weather forecasters.
- Score: 35.38264087676121
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Consider two forecasters, each making a single prediction for a sequence of
events over time. We ask a relatively basic question: how might we compare
these forecasters, either online or post-hoc, while avoiding unverifiable
assumptions on how the forecasts and outcomes were generated? In this paper, we
present a rigorous answer to this question by designing novel sequential
inference procedures for estimating the time-varying difference in forecast
scores. To do this, we employ confidence sequences (CS), which are sequences of
confidence intervals that can be continuously monitored and are valid at
arbitrary data-dependent stopping times ("anytime-valid"). The widths of our
CSs are adaptive to the underlying variance of the score differences.
Underlying their construction is a game-theoretic statistical framework, in
which we further identify e-processes and p-processes for sequentially testing
a weak null hypothesis -- whether one forecaster outperforms another on average
(rather than always). Our methods do not make distributional assumptions on the
forecasts or outcomes; our main theorems apply to any bounded scores, and we
later provide alternative methods for unbounded scores. We empirically validate
our approaches by comparing real-world baseball and weather forecasters.
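The idea of a confidence sequence for a running average score difference can be illustrated with a minimal Python sketch. This is not the construction from the paper: it swaps in a much cruder Azuma-Hoeffding bound combined with a union bound over time, so its widths do not adapt to the variance. The function names, the choice of the Brier score, and the simulated forecasters are assumptions made purely for illustration.

```python
import math
import random


def brier(p, y):
    """Brier score of probability forecast p for binary outcome y (lower is better)."""
    return (p - y) ** 2


def hoeffding_cs(deltas, alpha=0.05):
    """Time-uniform intervals for the running average of score differences in [-1, 1].

    Assumption: a simple union-bound construction, not the paper's method.
    Step t spends alpha_t = 6 * alpha / (pi^2 * t^2); these sum to alpha over
    all t, so all intervals hold simultaneously with probability >= 1 - alpha.
    """
    total, intervals = 0.0, []
    for t, d in enumerate(deltas, start=1):
        total += d
        alpha_t = 6.0 * alpha / (math.pi ** 2 * t ** 2)
        # Azuma-Hoeffding radius for increments with conditional range 2.
        radius = math.sqrt(2.0 * math.log(2.0 / alpha_t) / t)
        mean = total / t
        intervals.append((max(-1.0, mean - radius), min(1.0, mean + radius)))
    return intervals


def simulate(n=5000, seed=0):
    """Hypothetical setup: outcomes are Bernoulli(0.7); A forecasts 0.7, B forecasts 0.2."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n):
        y = 1 if rng.random() < 0.7 else 0
        # Positive difference means forecaster A scored better on this round.
        deltas.append(brier(0.2, y) - brier(0.7, y))
    return hoeffding_cs(deltas)


if __name__ == "__main__":
    lower, upper = simulate()[-1]
    print(f"final interval for the average score difference: [{lower:.3f}, {upper:.3f}]")
```

Because the target (the average of conditional expected score differences) may change over time, the sketch deliberately does not intersect successive intervals. In the simulated setup the expected difference is 0.25 on every round, so the final interval should sit well above zero, favoring forecaster A.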
Related papers
- Distribution-Free Conformal Joint Prediction Regions for Neural Marked Temporal Point Processes [4.324839843326325]
We develop more reliable methods for uncertainty in neural TPP models via the framework of conformal prediction.
A primary objective is to generate a distribution-free joint prediction region for an event's arrival time and mark, with a finite-sample marginal coverage guarantee.
arXiv Detail & Related papers (2024-01-09T15:28:29Z)
- Early-Exit Neural Networks with Nested Prediction Sets [26.618810100134862]
Early-exit neural networks (EENNs) enable adaptive and efficient inference by providing predictions at multiple stages during the forward pass.
Standard uncertainty quantification techniques, such as Bayesian methods and conformal prediction, are not suitable for EENNs.
We investigate anytime-valid confidence sequences (AVCSs).
These sequences are inherently nested and thus well-suited for an EENN's sequential predictions.
arXiv Detail & Related papers (2023-11-10T08:38:18Z)
- SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process [76.98721879039559]
We propose SMURF-THP, a score-based method for learning a Transformer Hawkes process and quantifying prediction uncertainty.
Specifically, SMURF-THP learns the score function of events' arrival time based on a score-matching objective.
We conduct extensive experiments in both event type prediction and uncertainty quantification of arrival time.
arXiv Detail & Related papers (2023-10-25T03:33:45Z)
- Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification [59.81904428056924]
We introduce SMASH: a Score MAtching estimator for learning marked spatio-temporal point processes (STPPs) with uncertainty quantification.
Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score-matching.
The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification.
arXiv Detail & Related papers (2023-10-25T02:37:51Z)
- Streaming Motion Forecasting for Autonomous Driving [71.7468645504988]
We introduce a benchmark that queries future trajectories on streaming data, which we refer to as "streaming forecasting".
Our benchmark inherently captures the disappearance and re-appearance of agents, which is a safety-critical problem yet overlooked by snapshot-based benchmarks.
We propose a plug-and-play meta-algorithm called "Predictive Streamer" that can adapt any snapshot-based forecaster into a streaming forecaster.
arXiv Detail & Related papers (2023-10-02T17:13:16Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Sequential Predictive Conformal Inference for Time Series [16.38369532102931]
We present a new distribution-free conformal prediction algorithm for sequential data (e.g., time series).
We specifically account for the fact that time series data are non-exchangeable, so many existing conformal prediction algorithms are not applicable.
arXiv Detail & Related papers (2022-12-07T05:07:27Z)
- How to Evaluate Uncertainty Estimates in Machine Learning for Regression? [1.4610038284393165]
We show that both approaches to evaluating the quality of uncertainty estimates have serious flaws.
Firstly, neither approach can disentangle the separate components that jointly create the predictive uncertainty.
Thirdly, the current approach to test prediction intervals directly has additional flaws.
arXiv Detail & Related papers (2021-06-07T07:47:46Z)
- Quantifying Uncertainty in Deep Spatiotemporal Forecasting [67.77102283276409]
We describe two types of forecasting problems: regular grid-based and graph-based.
We analyze UQ methods from both the Bayesian and the frequentist points of view, casting them in a unified framework via statistical decision theory.
Through extensive experiments on real-world road network traffic, epidemics, and air quality forecasting tasks, we reveal the statistical and computational trade-offs for different UQ methods.
arXiv Detail & Related papers (2021-05-25T14:35:46Z)
- Sequential Aggregation of Probabilistic Forecasts -- Application to Wind Speed Ensemble Forecasts [0.0]
This article adapts the theory of prediction with expert advice to the case of probabilistic forecasts issued as step-wise cumulative distribution functions (CDFs).
The second goal of this study is to explore the use of two forecast performance criteria: the continuous ranked probability score (CRPS) and the Jolliffe-Primo test.
arXiv Detail & Related papers (2020-05-07T15:07:43Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)