The Data-Driven Censored Newsvendor Problem
- URL: http://arxiv.org/abs/2412.01763v2
- Date: Wed, 18 Dec 2024 22:34:36 GMT
- Title: The Data-Driven Censored Newsvendor Problem
- Authors: Chamsi Hssaine, Sean R. Sinclair,
- Abstract summary: We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs.<n>Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this problem.<n>We propose a natural robust algorithm that adapts to the historical level of demand censoring.
- Score: 0.552480439325792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs based only on offline censored sales data, rather than historical demand realizations. Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this problem. To isolate this impact, we adopt a distributionally robust optimization framework, evaluating policies according to their worst-case regret over an ambiguity set of distributions. This set is defined by the largest historical order quantity (the observable boundary of the dataset), and contains all distributions matching the true demand distribution up to this boundary, while allowing them to be arbitrary afterwards. We demonstrate a spectrum of achievability under demand censoring by deriving a natural necessary and sufficient condition under which vanishing regret is an achievable goal. In regimes in which it is not, we exactly characterize the information loss due to censoring: an insurmountable lower bound on the performance of any policy, even when the decision-maker has access to infinitely many demand samples. We then leverage these sharp characterizations to propose a natural robust algorithm that adapts to the historical level of demand censoring. We derive finite-sample guarantees for this algorithm across all possible censoring regimes and show its near-optimality with matching lower bounds (up to polylogarithmic factors). We moreover demonstrate its robust performance via extensive numerical experiments on both synthetic and real-world datasets.
Related papers
- Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand [7.289672463326423]
We study the offline sequential feature-based pricing and inventory control problem.
Our goal is to leverage the offline dataset to estimate the optimal pricing and inventory control policy.
arXiv Detail & Related papers (2025-04-14T02:57:51Z) - Controllable Generation via Locally Constrained Resampling [77.48624621592523]
We propose a tractable probabilistic approach that performs Bayesian conditioning to draw samples subject to a constraint.
Our approach considers the entire sequence, leading to a more globally optimal constrained generation than current greedy methods.
We show that our approach is able to steer the model's outputs away from toxic generations, outperforming similar approaches to detoxification.
arXiv Detail & Related papers (2024-10-17T00:49:53Z) - Private Optimal Inventory Policy Learning for Feature-based Newsvendor with Unknown Demand [13.594765018457904]
This paper introduces a novel approach to estimate a privacy-preserving optimal inventory policy within the f-differential privacy framework.
We develop a clipped noisy gradient descent algorithm based on convolution smoothing for optimal inventory estimation.
Our numerical experiments demonstrate that the proposed new method can achieve desirable privacy protection with a marginal increase in cost.
arXiv Detail & Related papers (2024-04-23T19:15:43Z) - Generalization Error Bounds for Learning under Censored Feedback [15.367801388932145]
Generalization error bounds from learning theory provide statistical guarantees on how well an algorithm will perform on previously unseen data.
We characterize the impacts of data non-IIDness due to censored feedback on such bounds.
We show that existing generalization error bounds fail to correctly capture the model's generalization guarantees.
arXiv Detail & Related papers (2024-04-14T13:17:32Z) - Is Offline Decision Making Possible with Only Few Samples? Reliable
Decisions in Data-Starved Bandits via Trust Region Enhancement [25.68354404229254]
We show that even in a data-starved setting it may still be possible to find a policy competitive with the optimal one.
This paves the way to reliable decision-making in settings where critical decisions must be made by relying only on a handful of samples.
arXiv Detail & Related papers (2024-02-24T03:41:09Z) - Group Fairness with Uncertainty in Sensitive Attributes [34.608332397776245]
A fair predictive model is crucial to mitigate biased decisions against minority groups in high-stakes applications.
We propose a bootstrap-based algorithm that achieves the target level of fairness despite the uncertainty in sensitive attributes.
Our algorithm is applicable to both discrete and continuous sensitive attributes and is effective in real-world classification and regression tasks.
arXiv Detail & Related papers (2023-02-16T04:33:00Z) - Effective Dimension in Bandit Problems under Censorship [22.269565708490468]
We study both multi-armed and contextual bandit problems in censored environments.
Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments.
arXiv Detail & Related papers (2023-02-14T09:03:35Z) - Optimal Treatment Regimes for Proximal Causal Learning [7.672587258250301]
We propose a novel optimal individualized treatment regime based on outcome and treatment confounding bridges.
We show that the value function of this new optimal treatment regime is superior to that of existing ones in the literature.
arXiv Detail & Related papers (2022-12-19T14:29:25Z) - Mitigating Algorithmic Bias with Limited Annotations [65.060639928772]
When sensitive attributes are not disclosed or available, it is needed to manually annotate a small part of the training data to mitigate bias.
We propose Active Penalization Of Discrimination (APOD), an interactive framework to guide the limited annotations towards maximally eliminating the effect of algorithmic bias.
APOD shows comparable performance to fully annotated bias mitigation, which demonstrates that APOD could benefit real-world applications when sensitive information is limited.
arXiv Detail & Related papers (2022-07-20T16:31:19Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
$alpha$-fraction of agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Sparse Feature Selection Makes Batch Reinforcement Learning More Sample
Efficient [62.24615324523435]
This paper provides a statistical analysis of high-dimensional batch Reinforcement Learning (RL) using sparse linear function approximation.
When there is a large number of candidate features, our result sheds light on the fact that sparsity-aware methods can make batch RL more sample efficient.
arXiv Detail & Related papers (2020-11-08T16:48:02Z) - What are the Statistical Limits of Offline RL with Linear Function
Approximation? [70.33301077240763]
offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of sequential decision making strategies.
This work focuses on the basic question of what are necessary representational and distributional conditions that permit provable sample-efficient offline reinforcement learning.
arXiv Detail & Related papers (2020-10-22T17:32:13Z) - Privacy Preserving Recalibration under Domain Shift [119.21243107946555]
We introduce a framework that abstracts out the properties of recalibration problems under differential privacy constraints.
We also design a novel recalibration algorithm, accuracy temperature scaling, that outperforms prior work on private datasets.
arXiv Detail & Related papers (2020-08-21T18:43:37Z) - Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement
Learning [70.01650994156797]
Off- evaluation of sequential decision policies from observational data is necessary in batch reinforcement learning such as education healthcare.
We develop an approach that estimates the bounds of a given policy.
We prove convergence to the sharp bounds as we collect more confounded data.
arXiv Detail & Related papers (2020-02-11T16:18:14Z) - The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator.
We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.