Related papers: Measuring Time Series Forecast Stability for Demand Planning

URL: http://arxiv.org/abs/2508.10063v1
Date: Wed, 13 Aug 2025 04:21:37 GMT
Title: Measuring Time Series Forecast Stability for Demand Planning
Authors: Steven Klee, Yuntian Xia
Abstract summary: Time series forecasting is a critical first step in generating demand plans for supply chains. In production systems, demand planners often value consistency and stability over incremental accuracy improvements. We show that ensemble models improve stability without significantly deteriorating (or even improving) forecast accuracy.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series forecasting is a critical first step in generating demand plans for supply chains. Experiments on time series models typically focus on demonstrating improvements in forecast accuracy over existing/baseline solutions, quantified according to some accuracy metric. There is no doubt that forecast accuracy is important; however in production systems, demand planners often value consistency and stability over incremental accuracy improvements. Assuming that the inputs have not changed significantly, forecasts that vary drastically from one planning cycle to the next require high amounts of human intervention, which frustrates demand planners and can even cause them to lose trust in ML forecasting models. We study model-induced stochasticity, which quantifies the variance of a set of forecasts produced by a single model when the set of inputs is fixed. Models with lower variance are more stable. Recently the forecasting community has seen significant advances in forecast accuracy through the development of deep machine learning models for time series forecasting. We perform a case study measuring the stability and accuracy of state-of-the-art forecasting models (Chronos, DeepAR, PatchTST, Temporal Fusion Transformer, TiDE, and the AutoGluon best quality ensemble) on public data sets from the M5 competition and Favorita grocery sales. We show that ensemble models improve stability without significantly deteriorating (or even improving) forecast accuracy. While these results may not be surprising, the main point of this paper is to propose the need for further study of forecast stability for models that are being deployed in production systems.
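The notion of model-induced stochasticity described in the abstract can be illustrated with a minimal sketch. It assumes that one model has been re-run several times on identical inputs (for example with different random seeds) and that each run yields a point forecast over the same horizon. The function name `forecast_variance` and the choice of averaging the per-step population variance are assumptions made for illustration; the paper's exact metric is not given on this page.

```python
from statistics import mean, pvariance


def forecast_variance(runs):
    """Average per-step variance across repeated forecasts.

    runs: list of forecasts, one per re-run of the same model on the
    same inputs; each forecast is a list of floats of equal length.
    Lower values mean the model is more stable under this (assumed) measure.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure variability")
    horizon = len(runs[0])
    if any(len(run) != horizon for run in runs):
        raise ValueError("all runs must cover the same forecast horizon")
    # zip(*runs) groups the values that different runs predicted for the
    # same horizon step; pvariance is the population variance of each group.
    per_step = [pvariance(step_values) for step_values in zip(*runs)]
    return mean(per_step)


# Hypothetical forecasts from three re-runs of one model on fixed inputs.
runs = [
    [100.0, 110.0, 120.0],
    [102.0, 108.0, 121.0],
    [98.0, 112.0, 119.0],
]
print(forecast_variance(runs))
```

Under this sketch, a model whose re-runs all agree scores 0.0. An ensemble that averages several such runs would be expected to score lower than its individual members, which is in line with the stability claim in the abstract.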

Related papers

Back to the Future: Look-ahead Augmentation and Parallel Self-Refinement for Time Series Forecasting [10.615433089293228]
Back to the Future is a simple yet effective framework that enhances forecasting stability through look-ahead augmentation and self-corrective refinement. Despite its simplicity, our approach consistently improves long-horizon accuracy and mitigates the instability of linear forecasting models. These results suggest that leveraging model-generated forecasts as augmentation can be a simple yet powerful way to enhance long-term prediction, even without complex architectures.
arXiv Detail & Related papers (2026-02-02T14:23:31Z)
Beyond Accuracy: A Stability-Aware Metric for Multi-Horizon Forecasting [0.0]
We introduce the forecast accuracy and coherence score (forecast AC score for short) for measuring the quality of probabilistic multi-horizon forecasts. Results demonstrate substantial improvements over traditional maximum likelihood estimation.
arXiv Detail & Related papers (2026-01-15T21:26:57Z)
SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting [8.141505251306622]
Diffusion models have recently shown promise in time series forecasting. They often fail to achieve state-of-the-art point estimation performance. We propose SimDiff, a single-stage, end-to-end framework for point estimation.
arXiv Detail & Related papers (2025-11-24T16:09:55Z)
Accuracy Law for the Future of Deep Time Series Forecasting [65.46625911002202]
Time series forecasting inherently faces a non-zero error lower bound due to its partially observable and uncertain nature. This paper focuses on a fundamental question: how to estimate the performance upper bound of deep time series forecasting. Based on rigorous statistical tests of over 2,800 newly trained deep forecasters, we discover a significant exponential relationship between the minimum forecasting error of deep models and the complexity of window-wise series patterns.
arXiv Detail & Related papers (2025-10-03T05:18:47Z)
FinZero: Launching Multi-modal Financial Time Series Forecast with Large Reasoning Model [27.20045729222667]
FinZero is a multimodal pre-trained model finetuned by UARPO to perform reasoning, prediction, and analytical understanding on the FVLDB financial time series. After fine-tuning with UARPO, FinZero achieves an approximate 13.48% improvement in prediction accuracy over GPT-4o in the high-confidence group.
arXiv Detail & Related papers (2025-09-10T16:32:41Z)
Time Series Forecastability Measures [4.136441456697068]
This paper proposes using two metrics to quantify the forecastability of time series prior to model development. The spectral predictability score evaluates the strength and regularity of frequency components in the time series. The Lyapunov exponents quantify the chaos and stability of the system generating the data.
arXiv Detail & Related papers (2025-07-17T22:23:51Z)
Enforcing tail calibration when training probabilistic forecast models [0.0]
We study how the loss function used to train probabilistic forecast models can be adapted to improve the reliability of forecasts made for extreme events. We demonstrate that state-of-the-art models do not issue calibrated forecasts for extreme wind speeds, and that the calibration of forecasts for extreme events can be improved by suitable adaptations to the loss function during model training.
arXiv Detail & Related papers (2025-06-16T16:51:06Z)
Uncertainty-aware segmentation for rainfall prediction post processing [0.7646713951724011]
We explore uncertainty-aware deep learning models for post-processing daily cumulative quantitative precipitation forecasts. Our study compares different state-of-the-art models, and we propose a variant of the well-known SDE-Net. Our results show that all deep learning models significantly outperform the average baseline NWP solution.
arXiv Detail & Related papers (2024-08-28T16:31:40Z)
Predictive Churn with the Set of Good Models [61.00058053669447]
This paper explores connections between two seemingly unrelated concepts of predictive inconsistency. The first, known as predictive multiplicity, occurs when models that perform similarly produce conflicting predictions for individual samples. The second concept, predictive churn, examines the differences in individual predictions before and after model updates.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions. We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
arXiv Detail & Related papers (2023-10-17T20:30:16Z)
Mlinear: Rethink the Linear Model for Time-series Forecasting [9.841293660201261]
Mlinear is a simple yet effective method based mainly on linear layers. We introduce a new loss function that significantly outperforms the widely used mean squared error (MSE) on multiple datasets. Our method significantly outperforms PatchTST with a ratio of 21:3 at 336 sequence length input and 29:10 at 512 sequence length input.
arXiv Detail & Related papers (2023-05-08T15:54:18Z)
Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting, including multiple models, supporting several datasets. We devise two types of uncertainty in the problem to increase performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z)
When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions. We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
arXiv Detail & Related papers (2022-06-16T06:13:53Z)
When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions. Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations. We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)
Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of a series' future values based on its history. We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks. We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.