Bridging Forecast Accuracy and Inventory KPIs: A Simulation-Based Software Framework
- URL: http://arxiv.org/abs/2601.21844v2
- Date: Sat, 31 Jan 2026 12:21:26 GMT
- Title: Bridging Forecast Accuracy and Inventory KPIs: A Simulation-Based Software Framework
- Authors: So Fukuhara, Abdallah Alabdallah, Nuwan Gunasekara, Slawomir Nowaczyk,
- Abstract summary: We propose a decision-centric simulation framework that enables systematic evaluation of forecasting models in a realistic inventory management setting. We show that improvements in accuracy metrics do not necessarily lead to better KPIs, and that models with similar error profiles can induce different cost-service trade-offs. Overall, the framework links demand forecasting and inventory management, shifting evaluation from predictive accuracy toward operational relevance.
- Score: 4.089848545480847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient management of spare parts inventory is crucial in the automotive aftermarket, where demand is highly intermittent and uncertainty drives substantial cost and service risks. Forecasting is therefore central, but the quality of forecasting models should be judged not by statistical accuracy (e.g., MAE, RMSE) but rather by their impact on key operational performance indicators (KPIs), such as total cost and service level. Yet most existing work evaluates models exclusively using accuracy metrics, and the relationship between these metrics and KPIs remains poorly understood. To address this gap, we propose a decision-centric simulation software framework that enables systematic evaluation of forecasting models in a realistic inventory management setting. The framework comprises: (i) a synthetic demand generator tailored to spare-parts demand characteristics, (ii) a flexible forecasting module that can host arbitrary predictive models, and (iii) an inventory control simulator that consumes the forecasts and computes operational KPIs. This closed-loop setup enables researchers to evaluate models not only in terms of statistical error but also in terms of downstream inventory implications. Using a wide range of simulation scenarios, we show that improvements in accuracy metrics do not necessarily lead to better KPIs, and that models with similar error profiles can induce different cost-service trade-offs. We analyze these discrepancies to characterize how forecast performance affects inventory outcomes and derive guidance for model selection. Overall, the framework links demand forecasting and inventory management, shifting evaluation from predictive accuracy toward operational relevance in the automotive aftermarket and related domains. An open-source implementation of the software is available at https://github.com/caisr-hh/TruckParts-Demand-Inventory-Simulator/releases/tag/IDA_2026.
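The three-part loop described in the abstract (demand generator, pluggable forecaster, inventory simulator that reports KPIs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every function name, the Bernoulli-style demand model, the zero-lead-time order-up-to policy, and the cost parameters are assumptions chosen only to keep the example small. It reports one accuracy metric (MAE) next to two KPIs (total cost and fill rate) so that the two views can be compared for any forecaster.

```python
import random

def generate_intermittent_demand(n_periods, p_demand=0.3, max_size=7, seed=0):
    """Hypothetical generator: demand occurs with probability p_demand,
    and its size is drawn uniformly from 1..max_size; otherwise demand is 0."""
    rng = random.Random(seed)
    return [rng.randint(1, max_size) if rng.random() < p_demand else 0
            for _ in range(n_periods)]

def moving_average_forecast(history, window=6):
    """Mean of the last `window` observations (0.0 if there is no history)."""
    recent = history[-window:]
    return sum(recent) / len(recent) if recent else 0.0

def zero_forecast(history):
    """Always predicts zero demand."""
    return 0.0

def simulate(demand, forecaster, warmup=12, safety_factor=1.5,
             holding_cost=1.0, shortage_cost=10.0):
    """Assumed policy: each period, stock is raised (with instant delivery)
    to safety_factor * forecast; unmet demand is lost and penalized."""
    abs_errors = []
    total_cost = 0.0
    served = 0.0
    requested = 0.0
    on_hand = 0.0
    for t in range(warmup, len(demand)):
        forecast = forecaster(demand[:t])          # forecast uses past data only
        target = safety_factor * forecast
        on_hand = max(on_hand, target)             # order up to the target level
        d = demand[t]
        sold = min(on_hand, d)
        on_hand -= sold
        total_cost += holding_cost * on_hand + shortage_cost * (d - sold)
        abs_errors.append(abs(d - forecast))
        served += sold
        requested += d
    return {
        "mae": sum(abs_errors) / len(abs_errors),
        "total_cost": total_cost,
        "fill_rate": served / requested if requested else 1.0,
    }

if __name__ == "__main__":
    demand = generate_intermittent_demand(200)
    for name, model in [("zero", zero_forecast),
                        ("moving_average", moving_average_forecast)]:
        print(name, simulate(demand, model))
```

In this toy setting the always-zero forecaster never stocks anything, so its fill rate is zero whenever demand occurs, regardless of how its MAE compares with other models. That is one simple way an accuracy metric and an inventory KPI can point in different directions; the paper's own scenarios and policies are richer than this sketch.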
Related papers
- Beyond Demand Estimation: Consumer Surplus Evaluation via Cumulative Propensity Weights [14.103811043596666]
We introduce an estimator that avoids explicit estimation and numerical integration of the demand function. We extend this framework to an inequality-aware surplus measure, allowing regulators and firms to quantify the profit-equity trade-off.
arXiv Detail & Related papers (2026-01-03T01:41:40Z) - The Forecast Critic: Leveraging Large Language Models for Poor Forecast Identification [74.64864354503204]
We propose The Forecast Critic, a system that leverages Large Language Models (LLMs) for automated forecast monitoring. We evaluate the ability of LLMs to assess time series forecast quality. We present three experiments, including on both synthetic and real-world forecasting data.
arXiv Detail & Related papers (2025-12-12T21:59:53Z) - Hierarchical Evaluation Function: A Multi-Metric Approach for Optimizing Demand Forecasting Models [0.479839492673697]
The Hierarchical Evaluation Function (HEF) is proposed as a multi-metric framework for hyperparameter optimization. HEF integrates explanatory power (R2), sensitivity to extreme errors (RMSE), and average accuracy (MAE). The performance of HEF was assessed using four widely recognized benchmark datasets in the forecasting domain.
arXiv Detail & Related papers (2025-08-18T16:25:49Z) - Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting [0.0]
When forecasts underestimate the level of sales, firms experience lost sales, shortages, and damage to their reputation in the relevant market. This study provides an exhaustive assessment of the forecasting models applied to a high-resolution brick-and-mortar retail dataset.
arXiv Detail & Related papers (2025-06-06T10:08:17Z) - On Large-scale Evaluation of Embedding Models for Knowledge Graph Completion [1.2703808802607108]
Knowledge graph embedding (KGE) models are extensively studied for knowledge graph completion. Standard evaluation metrics rely on the closed-world assumption, which penalizes models for correctly predicting missing triples. This paper conducts a comprehensive evaluation of four representative KGE models on large-scale datasets FB-CVT-REV and FB+CVT-REV.
arXiv Detail & Related papers (2025-04-11T20:49:02Z) - DUPRE: Data Utility Prediction for Efficient Data Valuation [49.60564885180563]
Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility and retraining the ML model for multiple data subsets. Our framework, DUPRE, takes an alternative yet complementary approach that reduces the cost per subset evaluation by predicting data utilities instead of evaluating them by model retraining. Specifically, given the evaluated data utilities of some data subsets, DUPRE fits a Gaussian process (GP) regression model to predict the utility of every other data subset.
arXiv Detail & Related papers (2025-02-22T08:53:39Z) - Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality. We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z) - A Probabilistic Perspective on Unlearning and Alignment for Large Language Models [48.96686419141881]
We introduce the first formal probabilistic evaluation framework for Large Language Models (LLMs). Namely, we propose novel metrics with high probability guarantees concerning the output distribution of a model. Our metrics are application-independent and allow practitioners to make more reliable estimates about model capabilities before deployment.
arXiv Detail & Related papers (2024-10-04T15:44:23Z) - F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - Predictability Analysis of Regression Problems via Conditional Entropy Estimations [1.8913544072080544]
Conditional entropy estimators are developed to assess predictability in regression problems.
Experiments on synthesized and real-world datasets demonstrate the robustness and utility of these estimators.
arXiv Detail & Related papers (2024-06-06T07:59:19Z) - Backorder Prediction in Inventory Management: Classification Techniques and Cost Considerations [0.0]
This article introduces an advanced analytical approach for predicting backorders in inventory management.
A backorder refers to an order that cannot be immediately fulfilled due to stock depletion.
The study suggests that a combination of modeling approaches, including ensemble techniques and VAE, can effectively address imbalanced datasets in inventory management.
arXiv Detail & Related papers (2023-09-25T02:50:20Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Blending MPC & Value Function Approximation for Efficient Reinforcement Learning [42.429730406277315]
Model-Predictive Control (MPC) is a powerful tool for controlling complex, real-world systems.
We present a framework for improving on MPC with model-free reinforcement learning (RL).
We show that our approach can obtain performance comparable with MPC with access to true dynamics.
arXiv Detail & Related papers (2020-12-10T11:32:01Z) - A New Metric for Lumpy and Intermittent Demand Forecasts: Stock-keeping-oriented Prediction Error Costs [0.0]
In this paper, we propose a novel metric for evaluating product demand forecasts.
The metric is based on simulated and real demand time series from the automotive aftermarket.
arXiv Detail & Related papers (2020-04-22T12:50:24Z)