ModelRadar: Aspect-based Forecast Evaluation
- URL: http://arxiv.org/abs/2504.00059v1
- Date: Mon, 31 Mar 2025 11:50:45 GMT
- Title: ModelRadar: Aspect-based Forecast Evaluation
- Authors: Vitor Cerqueira, Luis Roque, Carlos Soares,
- Abstract summary: Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score.<n>This limitation is especially problematic for time series forecasting, where multiple layers of averaging--across time steps, horizons, and multiple time series in a dataset--can mask relevant performance variations.<n>We propose ModelRadar, a framework for evaluating univariate time series forecasting models across multiple aspects, such as stationarity, presence of anomalies, or forecasting horizons.
- Score: 0.393259574660092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate evaluation of forecasting models is essential for ensuring reliable predictions. Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score, using metrics such as SMAPE. While convenient, averaging performance over all samples dilutes relevant information about model behavior under varying conditions. This limitation is especially problematic for time series forecasting, where multiple layers of averaging--across time steps, horizons, and multiple time series in a dataset--can mask relevant performance variations. We address this limitation by proposing ModelRadar, a framework for evaluating univariate time series forecasting models across multiple aspects, such as stationarity, presence of anomalies, or forecasting horizons. We demonstrate the advantages of this framework by comparing 24 forecasting methods, including classical approaches and different machine learning algorithms. NHITS, a state-of-the-art neural network architecture, performs best overall but its superiority varies with forecasting conditions. For instance, concerning the forecasting horizon, we found that NHITS (and also other neural networks) only outperforms classical approaches for multi-step ahead forecasting. Another relevant insight is that classical approaches such as ETS or Theta are notably more robust in the presence of anomalies. These and other findings highlight the importance of aspect-based model evaluation for both practitioners and researchers. ModelRadar is available as a Python package.
Related papers
- ChronosX: Adapting Pretrained Time Series Models with Exogenous Variables [30.679739751673655]
This paper introduces a new method to incorporate covariates into pretrained time series forecasting models.<n>Our proposed approach incorporates covariate information into pretrained forecasting models through modular blocks.<n>In evaluations on both synthetic and real datasets, our approach effectively incorporates covariate information into pretrained models, outperforming existing baselines.
arXiv Detail & Related papers (2025-03-15T12:34:19Z) - On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z) - Forecasting with Deep Learning: Beyond Average of Average of Average Performance [0.393259574660092]
Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score.
We propose a novel framework for evaluating models from multiple perspectives.
We show the advantages of this framework by comparing a state-of-the-art deep learning approach with classical forecasting techniques.
arXiv Detail & Related papers (2024-06-24T12:28:22Z) - Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - Lag-Llama: Towards Foundation Models for Probabilistic Time Series
Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z) - Counterfactual Explanations for Time Series Forecasting [14.03870816983583]
We formulate the novel problem of counterfactual generation for time series forecasting, and propose an algorithm, called ForecastCF.
ForecastCF solves the problem by applying gradient-based perturbations to the original time series.
Our results show that ForecastCF outperforms the baseline in terms of counterfactual validity and data manifold closeness.
arXiv Detail & Related papers (2023-10-12T08:51:59Z) - Evaluation of Time-Series Forecasting Models for Chickenpox Cases
Estimation in Hungary [0.0]
We use time-series forecasting techniques to model and predict the future incidence of chickenpox.
We implement and simulate multiple models and data preprocessing techniques on a Hungary-collected dataset.
arXiv Detail & Related papers (2022-09-28T14:27:07Z) - Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z) - Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case
Study of the COVID-19 Epidemic Curves [0.0]
We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time-series.
We propose using late data fusion, using a stacked ensemble of two forecasting models and two meta-features that prove their predictive power during a preliminary forecasting stage.
arXiv Detail & Related papers (2021-08-19T14:44:46Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
'Backfill' phenomenon and its effect on model performance has been barely studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - Learning Interpretable Deep State Space Model for Probabilistic Time
Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future based on its history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z) - A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the time-line.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
arXiv Detail & Related papers (2020-02-21T23:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.