Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex
Optimization
- URL: http://arxiv.org/abs/2307.06048v1
- Date: Wed, 12 Jul 2023 10:00:22 GMT
- Title: Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex
Optimization
- Authors: Massil Hihat, Stéphane Gaïffas, Guillaume Garrigos, Simon Bussy
- Abstract summary: We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses.
We propose MaxCOSD, an online algorithm that has provable guarantees even for problems with non-i.i.d. demands and stateful dynamics.
- Score: 0.8602553195689513
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study multi-product inventory control problems where a manager makes
sequential replenishment decisions based on partial historical information in
order to minimize its cumulative losses. Our motivation is to consider general
demands, losses and dynamics to go beyond standard models which usually rely on
newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand
assumptions. We propose MaxCOSD, an online algorithm that has provable
guarantees even for problems with non-i.i.d. demands and stateful dynamics,
including for instance perishability. We consider what we call non-degeneracy
assumptions on the demand process, and argue that they are necessary to allow
learning.
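As a point of reference for the "newsvendor-type losses" that the abstract contrasts against, the following is a minimal sketch of the classical single-product newsvendor loss together with a generic projected online subgradient update. It is not MaxCOSD: it assumes the full demand is observed after each round and that no inventory carries over between rounds, which are exactly the simplifications the paper aims to move beyond. The function names, the cost parameters `h` and `p`, the cap `y_max`, and the step-size schedule are all illustrative assumptions, not taken from the paper.

```python
import math


def newsvendor_loss(y, d, h=1.0, p=2.0):
    """Holding cost h per unsold unit plus penalty p per unit of unmet demand."""
    return h * max(y - d, 0.0) + p * max(d - y, 0.0)


def newsvendor_subgradient(y, d, h=1.0, p=2.0):
    """One valid subgradient of the loss with respect to the stock level y."""
    if y > d:
        return h
    if y < d:
        return -p
    return 0.0  # 0 lies in the subdifferential [-p, h] at y == d


def online_subgradient_levels(demands, h=1.0, p=2.0, y_max=10.0, y0=0.0):
    """Projected online subgradient descent on the interval [0, y_max].

    Assumption (hypothetical setup): stateless rounds, fully observed demand.
    Returns the stock levels played each round and the cumulative loss.
    """
    y = min(max(y0, 0.0), y_max)
    levels, total = [], 0.0
    grad_bound = max(h, p)  # bound on the subgradient magnitude
    for t, d in enumerate(demands, start=1):
        levels.append(y)
        total += newsvendor_loss(y, d, h, p)
        g = newsvendor_subgradient(y, d, h, p)
        eta = y_max / (grad_bound * math.sqrt(t))  # decaying step size
        y = min(max(y - eta * g, 0.0), y_max)  # project back onto [0, y_max]
    return levels, total


if __name__ == "__main__":
    levels, total = online_subgradient_levels([4.0, 4.0, 4.0])
    print(levels, total)
```

In this simplified setting the update only needs the sign of the mismatch between stock and demand. Handling carried-over or perishable stock and partially observed demand, as described in the abstract, requires additional machinery that this sketch does not attempt.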
Related papers
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z) - Learning with Posterior Sampling for Revenue Management under Time-varying Demand [36.22276574805786]
We discuss the revenue management problem to maximize revenue by pricing items or services.
One challenge in this problem is that the demand distribution is unknown and varies over time in real applications such as airline and retail industries.
arXiv Detail & Related papers (2024-05-08T09:28:26Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Learning an Inventory Control Policy with General Inventory Arrival Dynamics [2.3715198714015893]
This paper addresses the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics.
To the best of our knowledge this is the first work to handle either arbitrary arrival dynamics or an arbitrary downstream post-processing of order quantities.
arXiv Detail & Related papers (2023-10-26T05:49:13Z) - From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms.
We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation.
Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
arXiv Detail & Related papers (2023-10-22T16:07:06Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Control of Dual-Sourcing Inventory Systems using Recurrent Neural Networks [0.0]
We show that proposed neural network controllers (NNCs) are able to learn near-optimal policies of commonly used instances within a few minutes of CPU time.
Our research opens up new ways of efficiently managing complex, high-dimensional inventory dynamics.
arXiv Detail & Related papers (2022-01-16T19:44:06Z) - Stateful Offline Contextual Policy Evaluation and Learning [88.9134799076718]
We study off-policy evaluation and learning from sequential data.
We formalize the relevant causal structure of problems such as dynamic personalized pricing.
We show improved out-of-sample policy performance in this class of relevant problems.
arXiv Detail & Related papers (2021-10-19T16:15:56Z) - Regularized Online Allocation Problems: Fairness and Beyond [7.433931244705934]
We introduce the regularized online allocation problem, a variant that includes a non-linear regularizer acting on the total resource consumption.
In this problem, requests repeatedly arrive over time and, for each request, a decision maker needs to take an action that generates a reward and consumes resources.
The objective is to simultaneously maximize additively separable rewards and the value of a non-separable regularizer subject to the resource constraints.
arXiv Detail & Related papers (2020-07-01T14:24:58Z) - MOPO: Model-based Offline Policy Optimization [183.6449600580806]
Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a large batch of previously collected data.
We show that an existing model-based RL algorithm already produces significant gains in the offline setting.
We propose to modify the existing model-based RL methods by applying them with rewards artificially penalized by the uncertainty of the dynamics.
arXiv Detail & Related papers (2020-05-27T08:46:41Z) - Uncertainty Quantification for Demand Prediction in Contextual Dynamic Pricing [20.828160401904697]
We study the problem of constructing accurate confidence intervals for the demand function.
We develop a debiased approach and provide the normality guarantee of the debiased estimator.
arXiv Detail & Related papers (2020-03-16T04:21:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.