Non-Stationary Inventory Control with Lead Times
- URL: http://arxiv.org/abs/2602.05799v1
- Date: Thu, 05 Feb 2026 15:53:37 GMT
- Title: Non-Stationary Inventory Control with Lead Times
- Authors: Nele H. Amiri, Sean R. Sinclair, Maximiliano Udenio
- Abstract summary: We study non-stationary single-item, periodic-review inventory control problems. We analyze how demand non-stationarity affects learning performance across inventory models.
- Score: 0.4927882324444362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study non-stationary single-item, periodic-review inventory control problems in which the demand distribution is unknown and may change over time. We analyze how demand non-stationarity affects learning performance across inventory models, including systems with demand backlogging or lost-sales, both with and without lead times. For each setting, we propose an adaptive online algorithm that optimizes over the class of base-stock policies and establish performance guarantees in terms of dynamic regret relative to the optimal base-stock policy at each time step. Our results reveal a sharp separation across inventory models. In backlogging systems and lost-sales models with zero lead time, we show that it is possible to adapt to demand changes without incurring additional performance loss in stationary environments, even without prior knowledge of the demand distributions or the number of demand shifts. In contrast, for lost-sales systems with positive lead times, we establish weaker guarantees that reflect fundamental limitations imposed by delayed replenishment in combination with censored feedback. Our algorithms leverage the convexity and one-sided feedback structure of inventory costs to enable counterfactual policy evaluation despite demand censoring. We complement the theoretical analysis with simulation results showing that our methods significantly outperform existing benchmarks.
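The base-stock policy class and the censored (lost-sales) feedback described in the abstract can be sketched in a few lines. The simulator below is an illustrative assumption, not code from the paper: the holding cost `h`, lost-sales penalty `p`, and the zero-lead-time, order-up-to dynamics are hypothetical choices used only to show why the learner observes sales rather than demand.

```python
import random

def simulate_base_stock(base_stock_level, demands):
    """Simulate a lost-sales, zero-lead-time system under a base-stock policy.

    Each period the inventory is raised to `base_stock_level`, demand arrives,
    and only the censored sales quantity min(demand, inventory) is observed.
    Returns the per-period observed sales and the total cost under
    illustrative holding/lost-sales penalties.
    """
    h, p = 1.0, 4.0  # hypothetical per-unit holding and lost-sales costs
    sales_obs, total_cost = [], 0.0
    for d in demands:
        inventory = base_stock_level       # order up to the base-stock level
        sales = min(d, inventory)          # censored feedback: lost demand is unseen
        leftover = inventory - sales
        lost = d - sales
        total_cost += h * leftover + p * lost
        sales_obs.append(sales)
    return sales_obs, total_cost

random.seed(0)
demands = [random.randint(0, 10) for _ in range(100)]
sales, cost = simulate_base_stock(7, demands)
print(all(s <= 7 for s in sales))  # observed sales never exceed the base-stock level
```

Because costs are convex in the base-stock level and the feedback is one-sided (sales reveal demand exactly whenever demand falls below the level), an online learner can evaluate nearby base-stock levels counterfactually from the same sales trace, which is the structure the paper's algorithms exploit.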
Related papers
- What is the Value of Censored Data? An Exact Analysis for the Data-driven Newsvendor [1.5469452301122175]
We study the setting where demand is censored at the inventory level and only sales are observed. Our results show that policies based on this sales-as-demand data can suffer persistent performance loss as sales data accumulates.
arXiv Detail & Related papers (2026-02-18T20:13:02Z) - Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization [53.82400605816587]
Action Quality Assessment (AQA) quantifies human actions in videos, supporting applications in sports scoring, rehabilitation, and skill evaluation. A major challenge lies in the non-stationary nature of quality distributions in real-world scenarios. We introduce Continual AQA (CAQA), which equips AQA models with continual learning capabilities to handle evolving distributions.
arXiv Detail & Related papers (2025-10-08T10:09:47Z) - Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand [7.289672463326423]
We study the offline sequential feature-based pricing and inventory control problem. Our goal is to leverage the offline dataset to estimate the optimal pricing and inventory control policy.
arXiv Detail & Related papers (2025-04-14T02:57:51Z) - Spatial Supply Repositioning with Censored Demand Data [10.797160099834306]
We consider a network inventory system motivated by one-way, on-demand vehicle sharing services. Finding an optimal policy in such a general inventory network is analytically and computationally challenging. Our work highlights the critical role of inventory in the viability of shared mobility businesses.
arXiv Detail & Related papers (2025-01-31T15:16:02Z) - Contextual Bandits for Evaluating and Improving Inventory Control Policies [2.2530496464901106]
We introduce the concept of an equilibrium policy, a desirable property of a policy that intuitively means that, in hindsight, changing only a small fraction of actions does not result in materially more reward.
We provide a light-weight contextual bandit-based algorithm to evaluate and occasionally tweak policies, and show that this method achieves favorable guarantees, both theoretically and in empirical studies.
arXiv Detail & Related papers (2023-10-24T18:00:40Z) - Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization [0.8602553195689513]
We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses.
We propose MaxCOSD, an online algorithm that has provable guarantees even for problems with non-i.i.d. demands and stateful dynamics.
arXiv Detail & Related papers (2023-07-12T10:00:22Z) - Approaching sales forecasting using recurrent neural networks and transformers [57.43518732385863]
We develop three alternatives to tackle the problem of forecasting the customer sales at day/store/item level using deep learning techniques.
Our empirical results show that good performance can be achieved with a simple sequence-to-sequence architecture and minimal data preprocessing effort.
The proposed solution achieves an RMSLE of around 0.54, which is competitive with more specialized solutions proposed in the Kaggle competition.
arXiv Detail & Related papers (2022-04-16T12:03:52Z) - A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management [8.889304968879163]
Most of the existing literature on supply chain and inventory management considers demand processes with zero or constant lead times.
Motivated by the recently introduced delay-resolved deep Q-learning (DRDQN) algorithm, this paper develops a reinforcement learning based paradigm for handling uncertainty in lead times.
arXiv Detail & Related papers (2022-03-02T05:50:04Z) - False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z) - Stateful Offline Contextual Policy Evaluation and Learning [88.9134799076718]
We study off-policy evaluation and learning from sequential data.
We formalize the relevant causal structure of problems such as dynamic personalized pricing.
We show improved out-of-sample policy performance in this class of relevant problems.
arXiv Detail & Related papers (2021-10-19T16:15:56Z) - MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning [108.79676336281211]
Continuous deployment of new policies for data collection and online learning is either cost ineffective or impractical.
We propose a new algorithmic learning framework called Model-based Uncertainty regularized and Sample Efficient Batch Optimization.
Our framework discovers novel and high quality samples for each deployment to enable efficient data collection.
arXiv Detail & Related papers (2021-02-23T01:30:55Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z) - GenDICE: Generalized Offline Estimation of Stationary Values [108.17309783125398]
We show that effective estimation can still be achieved in important applications.
Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions.
The resulting algorithm, GenDICE, is straightforward and effective.
arXiv Detail & Related papers (2020-02-21T00:27:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.