Related papers: Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles

URL: http://arxiv.org/abs/2508.16641v1
Date: Mon, 18 Aug 2025 04:06:26 GMT
Title: Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles
Authors: Dhruv D. Modi, Rong Pan
Abstract summary: Time series foundation models (TSFMs) have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation. This paper investigates a suite of statistical and ensemble-based enhancement techniques to improve robustness and accuracy.
Score: 7.787518725874443
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series foundation models (TSFMs) such as Lag-Llama, TimeGPT, Chronos, MOMENT, UniTS, and TimesFM have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation. Despite these advantages, their predictions still suffer from variance, domain-specific bias, and limited uncertainty quantification when deployed on real operational data. This paper investigates a suite of statistical and ensemble-based enhancement techniques, including bootstrap-based bagging, regression-based stacking, prediction interval construction, statistical residual modeling, and iterative error feedback, to improve robustness and accuracy. Using the Belgium Electricity Short-Term Load Forecasting dataset as a case study, we demonstrate that the proposed hybrids consistently outperform standalone foundation models across multiple horizons. Regression-based ensembles achieve the lowest mean squared error; bootstrap aggregation markedly reduces long-context errors; residual modeling corrects systematic bias; and the resulting prediction intervals achieve near nominal coverage with widths shrinking as context length increases. The results indicate that integrating statistical reasoning with modern foundation models yields measurable gains in accuracy, reliability, and interpretability for real-world time series applications.
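As a rough illustration of two of the techniques named in the abstract (bootstrap-based bagging and prediction interval construction), the sketch below resamples the context window in blocks, forecasts from each resample, and summarizes the replicates with a mean and percentile bounds. It is not the authors' implementation: `naive_forecaster` is a hypothetical stand-in for a foundation model, and the block length, replicate count, and percentile-based interval are all assumptions made for the example.

```python
import numpy as np


def naive_forecaster(context, horizon):
    """Hypothetical stand-in for a TSFM: repeats the context mean."""
    return np.full(horizon, context.mean())


def bagged_forecast(context, horizon, n_boot=200, block=24, alpha=0.1, seed=0):
    """Moving-block bootstrap bagging with percentile intervals (a sketch).

    Returns (point, lower, upper), each an array of length `horizon`.
    """
    rng = np.random.default_rng(seed)
    n = len(context)
    n_blocks = int(np.ceil(n / block))
    preds = np.empty((n_boot, horizon))
    for b in range(n_boot):
        # Draw contiguous blocks so short-range temporal structure is kept.
        starts = rng.integers(0, n - block + 1, size=n_blocks)
        resampled = np.concatenate([context[s:s + block] for s in starts])[:n]
        preds[b] = naive_forecaster(resampled, horizon)
    point = preds.mean(axis=0)  # bagged point forecast
    lower = np.quantile(preds, alpha / 2, axis=0)
    upper = np.quantile(preds, 1 - alpha / 2, axis=0)
    return point, lower, upper


# Synthetic hourly-like series with a daily cycle (made-up data).
t = np.arange(240)
noise = np.random.default_rng(1).normal(0.0, 0.1, size=t.size)
context = 10.0 + np.sin(2 * np.pi * t / 24) + noise

point, lower, upper = bagged_forecast(context, horizon=12)
print(point[:3], lower[:3], upper[:3])
```

The bounds here reflect only the spread across bootstrap replicates, so they would typically need to be combined with a residual noise estimate before being read as prediction intervals for the observations themselves.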

Related papers

SimDiff: Simpler Yet Better Diffusion Model for Time Series Point Forecasting [8.141505251306622]
Diffusion models have recently shown promise in time series forecasting. They often fail to achieve state-of-the-art point estimation performance. We propose SimDiff, a single-stage, end-to-end framework for point estimation.
arXiv Detail & Related papers (2025-11-24T16:09:55Z)
A Unified Frequency Domain Decomposition Framework for Interpretable and Robust Time Series Forecasting [81.73338008264115]
Current approaches for time series forecasting, whether in the time or frequency domain, predominantly use deep learning models based on linear layers or transformers. We propose FIRE, a unified frequency domain decomposition framework that provides a mathematical abstraction for diverse types of time series. FIRE consistently outperforms state-of-the-art models on long-term forecasting benchmarks.
arXiv Detail & Related papers (2025-10-11T09:59:25Z)
Estimating Time Series Foundation Model Transferability via In-Context Learning [74.65355820906355]
Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training. Fine-tuning remains critical for boosting performance in domains with limited public data. We introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem.
arXiv Detail & Related papers (2025-09-28T07:07:13Z)
Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
BayesTTA: Continual-Temporal Test-Time Adaptation for Vision-Language Models via Gaussian Discriminant Analysis [41.09181390655176]
Vision-language models (VLMs) such as CLIP achieve strong zero-shot recognition but degrade significantly under temporally evolving distribution shifts common in real-world scenarios. We formalize this practical problem as Continual-Temporal Test-Time Adaptation (CT-TTA), where test distributions evolve gradually over time. We propose BayesTTA, a Bayesian adaptation framework that enforces temporally consistent predictions and dynamically aligns visual representations.
arXiv Detail & Related papers (2025-07-11T14:02:54Z)
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching [9.465542901469815]
Conditional Guided Flow Matching (CGFM) is a model-agnostic framework that extends flow matching by integrating outputs from an auxiliary predictive model. CGFM incorporates historical data as both conditions and guidance, uses two-sided conditional paths, and employs affine paths to expand the path space. Experiments across datasets and baselines show CGFM consistently outperforms state-of-the-art models, advancing forecasting.
arXiv Detail & Related papers (2025-07-09T18:03:31Z)
BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models [47.66064662912721]
We introduce a novel pre-training corpus designed to enhance data diversity through a balanced sampling strategy. BLAST incorporates 321 billion observations from publicly available datasets and employs a comprehensive suite of statistical metrics to characterize time series patterns. Our findings highlight the pivotal role of data diversity in improving both training efficiency and model performance for the universal forecasting task.
arXiv Detail & Related papers (2025-05-23T13:20:47Z)
Series-to-Series Diffusion Bridge Model [8.590453584544386]
We present a comprehensive framework that encompasses most existing diffusion-based methods. We propose a novel diffusion-based time series forecasting model, the Series-to-Series Diffusion Bridge Model (S2DBM). Experimental results demonstrate that S2DBM delivers superior performance in point-to-point forecasting.
arXiv Detail & Related papers (2024-11-07T07:37:34Z)
When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting [69.30930115236228]
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions. We propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.
arXiv Detail & Related papers (2023-10-17T20:30:16Z)
TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. We propose a versatile method that estimates joint distributions using an attention-based decoder. We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
Simultaneously Reconciled Quantile Forecasting of Hierarchically Related Time Series [11.004159006784977]
We propose a flexible nonlinear model that optimizes a quantile regression loss coupled with suitable regularization terms to maintain consistency of forecasts across hierarchies. The theoretical framework introduced herein can be applied to any forecasting model with an underlying differentiable loss function.
arXiv Detail & Related papers (2021-02-25T00:59:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.