Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services
- URL: http://arxiv.org/abs/2507.01067v1
- Date: Mon, 30 Jun 2025 23:59:12 GMT
- Title: Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services
- Authors: Keun Soo Yim,
- Abstract summary: We optimize a state-of-the-art foundational model to forecast sporadic or spiky production outages of high-performance machine learning services.<n>The analysis helps us understand how each of the evaluated models performs for the sporadic or spiky events.<n>We use the models with optimal parameters to estimate a year-long outage statistics of a particular root cause with less than 6% value errors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting models have diverse real world applications (e.g., from electricity metrics to software workload). Latest foundational models trained for time series forecasting show strengths (e.g., for long sequences and in zero-shot settings). However, foundational model was not yet used for forecasting rare, spiky events, i.e., a challenging target because those are a corner case of extreme events. In this paper, we optimize a state-of-the-art foundational model to forecast sporadic or spiky production outages of high-performance machine learning services powering billions of client devices. We evaluate the forecasting errors of the foundational model compared with classical stochastic forecasting models (e.g., moving average and autoregressive). The analysis helps us understand how each of the evaluated models performs for the sporadic or spiky events. For example, it identifies the key patterns in the target data that are well tracked by the foundational model vs. each of the stochastic models. We use the models with optimal parameters to estimate a year-long outage statistics of a particular root cause with less than 6% value errors.
Related papers
- Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting [64.45587649141842]
Time-series forecasting plays a critical role in many real-world applications.<n>No single model consistently outperforms others across different test samples, but instead (ii) each model excels in specific cases.<n>We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models.
arXiv Detail & Related papers (2025-05-24T00:45:07Z) - ModelRadar: Aspect-based Forecast Evaluation [0.393259574660092]
Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score.<n>This limitation is especially problematic for time series forecasting, where multiple layers of averaging--across time steps, horizons, and multiple time series in a dataset--can mask relevant performance variations.<n>We propose ModelRadar, a framework for evaluating univariate time series forecasting models across multiple aspects, such as stationarity, presence of anomalies, or forecasting horizons.
arXiv Detail & Related papers (2025-03-31T11:50:45Z) - ChronosX: Adapting Pretrained Time Series Models with Exogenous Variables [30.679739751673655]
This paper introduces a new method to incorporate covariates into pretrained time series forecasting models.<n>Our proposed approach incorporates covariate information into pretrained forecasting models through modular blocks.<n>In evaluations on both synthetic and real datasets, our approach effectively incorporates covariate information into pretrained models, outperforming existing baselines.
arXiv Detail & Related papers (2025-03-15T12:34:19Z) - On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z) - GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation [90.53485251837235]
Time series foundation models excel in zero-shot forecasting, handling diverse tasks without explicit training.
GIFT-Eval is a pioneering benchmark aimed at promoting evaluation across diverse datasets.
GIFT-Eval encompasses 23 datasets over 144,000 time series and 177 million data points.
arXiv Detail & Related papers (2024-10-14T11:29:38Z) - Predictive Churn with the Set of Good Models [61.00058053669447]
This paper explores connections between two seemingly unrelated concepts of predictive inconsistency.<n>The first, known as predictive multiplicity, occurs when models that perform similarly produce conflicting predictions for individual samples.<n>The second concept, predictive churn, examines the differences in individual predictions before and after model updates.
arXiv Detail & Related papers (2024-02-12T16:15:25Z) - Lag-Llama: Towards Foundation Models for Probabilistic Time Series
Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Give access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - Bayesian Regression Approach for Building and Stacking Predictive Models
in Time Series Analytics [0.0]
The paper describes the use of Bayesian regression for building time series models and stacking different predictive models for time series.
It makes it possible to estimate an uncertainty of time series prediction and calculate value at risk characteristics.
arXiv Detail & Related papers (2022-01-06T12:58:23Z) - Model-based micro-data reinforcement learning: what are the crucial
model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models.
We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin.
We also found that deterministic models are on par, in fact they consistently (although non-significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z) - Back2Future: Leveraging Backfill Dynamics for Improving Real-time
Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
'Backfill' phenomenon and its effect on model performance has been barely studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z) - A Worrying Analysis of Probabilistic Time-series Models for Sales
Forecasting [10.690379201437015]
Probabilistic time-series models become popular in the forecasting field as they help to make optimal decisions under uncertainty.
We analyze the performance of three prominent probabilistic time-series models for sales forecasting.
arXiv Detail & Related papers (2020-11-21T03:31:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.