Foundation Models for Demand Forecasting via Dual-Strategy Ensembling
- URL: http://arxiv.org/abs/2507.22053v1
- Date: Tue, 29 Jul 2025 17:56:38 GMT
- Title: Foundation Models for Demand Forecasting via Dual-Strategy Ensembling
- Authors: Wei Yang, Defu Cao, Yan Liu,
- Abstract summary: We propose a unified ensemble framework that enhances the performance of foundation models for sales forecasting in real-world supply chains.<n>Our method combines two complementary strategies: (1) Hierarchical Ensemble (HE), which partitions training and inference by semantic levels to capture localized patterns; and (2) Architectural Ensemble (AE), which integrates predictions from diverse model backbones to mitigate bias and improve stability.
- Score: 11.926658499983446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate demand forecasting is critical for supply chain optimization, yet remains difficult in practice due to hierarchical complexity, domain shifts, and evolving external factors. While recent foundation models offer strong potential for time series forecasting, they often suffer from architectural rigidity and limited robustness under distributional change. In this paper, we propose a unified ensemble framework that enhances the performance of foundation models for sales forecasting in real-world supply chains. Our method combines two complementary strategies: (1) Hierarchical Ensemble (HE), which partitions training and inference by semantic levels (e.g., store, category, department) to capture localized patterns; and (2) Architectural Ensemble (AE), which integrates predictions from diverse model backbones to mitigate bias and improve stability. We conduct extensive experiments on the M5 benchmark and three external sales datasets, covering both in-domain and zero-shot forecasting. Results show that our approach consistently outperforms strong baselines, improves accuracy across hierarchical levels, and provides a simple yet effective mechanism for boosting generalization in complex forecasting environments.
Related papers
- Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting [52.6508222408558]
We introduce Elucidated Rolling Diffusion Models (ERDM)<n>ERDM is the first framework to unify a rolling forecast structure with the principled, performant design of Elucidated Diffusion Models (EDM)<n>On 2D Navier-Stokes simulations and ERA5 global weather forecasting at 1.5circ resolution, ERDM consistently outperforms key diffusion-based baselines.
arXiv Detail & Related papers (2025-06-24T21:44:31Z) - Hierarchical Time Series Forecasting Via Latent Mean Encoding [44.99833362998488]
Coherently forecasting the behaviour of a target variable across both coarse and fine temporal scales is crucial for profit-optimized decision-making in several business applications.<n>We propose a new hierarchical architecture that tackles this problem by leveraging modules that specialize in forecasting the different temporal aggregation levels of interest.
arXiv Detail & Related papers (2025-06-24T13:54:47Z) - Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains [50.66049136093248]
We develop a time-aware structural causal model (SCM) that incorporates dynamic causal factors and the causal mechanism drifts.<n>We show that our method can yield the optimal causal predictor for each time domain.<n>Results on both synthetic and real-world datasets exhibit that SYNC can achieve superior temporal generalization performance.
arXiv Detail & Related papers (2025-06-21T14:05:37Z) - Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization [49.567092222782435]
We introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods.<n>We create a multi-objective optimization framework that balances predictive performance with explanation.<n>Our research provides a foundation for future investigations into whether models from the trade-off zone-balancing performance loss and XAI consistency-exhibit greater robustness.
arXiv Detail & Related papers (2025-05-12T13:19:14Z) - Unified Enhancement of the Generalization and Robustness of Language Models via Bi-Stage Optimization [2.502393972789905]
We propose a bi-stage optimization framework to uniformly enhance both the generalization and robustness of LMs.<n>We show that our method significantly improves the generalization and robustness of LMs compared to other existing methods.
arXiv Detail & Related papers (2025-03-19T13:50:36Z) - A Comprehensive Forecasting Framework based on Multi-Stage Hierarchical Forecasting Reconciliation and Adjustment [16.859089765648356]
We propose a novel framework to address the challenges of preserving seasonality, ensuring coherence, and improving accuracy.<n>The proposed framework has been deployed and leveraged by Walmart's ads, sales and operations teams to track future demands.
arXiv Detail & Related papers (2024-12-19T10:33:19Z) - $\clubsuit$ CLOVER $\clubsuit$: Probabilistic Forecasting with Coherent Learning Objective Reparameterization [42.215158938066054]
We augment an MQForecaster neural network architecture with a modified multivariate Gaussian factor model that achieves coherence by construction.<n>We call our method the Coherent Learning Objective Reparametrization Neural Network (CLOVER)<n>In comparison to state-of-the-art coherent forecasting methods, CLOVER achieves significant improvements in scaled CRPS forecast accuracy, with average gains of 15%.
arXiv Detail & Related papers (2023-07-19T07:31:37Z) - Exploring Optimal Substructure for Out-of-distribution Generalization
via Feature-targeted Model Pruning [23.938392334438582]
We propose a novel Spurious Feature-targeted model Pruning framework, dubbed SFP, to automatically explore invariant substructures.
SFP can significantly outperform both structure-based and non-structure OOD generalization SOTAs, with accuracy improvement up to 4.72% and 23.35%, respectively.
arXiv Detail & Related papers (2022-12-19T13:51:06Z) - An Empirical Study on Distribution Shift Robustness From the Perspective
of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z) - Posterior Differential Regularization with f-divergence for Improving
Model Robustness [95.05725916287376]
We focus on methods that regularize the model posterior difference between clean and noisy inputs.
We generalize the posterior differential regularization to the family of $f$-divergences.
Our experiments show that regularizing the posterior differential with $f$-divergence can result in well-improved model robustness.
arXiv Detail & Related papers (2020-10-23T19:58:01Z) - Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.