Related papers: Robust Assortment Optimization from Observational Data

Robust Assortment Optimization from Observational Data

URL: http://arxiv.org/abs/2602.10696v1
Date: Wed, 11 Feb 2026 09:57:16 GMT
Title: Robust Assortment Optimization from Observational Data
Authors: Miao Lu, Yuxuan Han, Han Zhong, Zhengyuan Zhou, Jose Blanchet,
Abstract summary: We propose a framework for data-driven assortment optimization that accounts for potential distributional shifts in customer choice behavior.<n>Our approach models potential preference shift from a nominal choice model that generates data and seeks to maximize worst-case expected revenue.
Score: 32.720761309403436
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Assortment optimization is a fundamental challenge in modern retail and recommendation systems, where the goal is to select a subset of products that maximizes expected revenue under complex customer choice behaviors. While recent advances in data-driven methods have leveraged historical data to learn and optimize assortments, these approaches typically rely on strong assumptions -- namely, the stability of customer preferences and the correctness of the underlying choice models. However, such assumptions frequently break in real-world scenarios due to preference shifts and model misspecification, leading to poor generalization and revenue loss. Motivated by this limitation, we propose a robust framework for data-driven assortment optimization that accounts for potential distributional shifts in customer choice behavior. Our approach models potential preference shift from a nominal choice model that generates data and seeks to maximize worst-case expected revenue. We first establish the computational tractability of robust assortment planning when the nominal model is known, then advance to the data-driven setting, where we design statistically optimal algorithms that minimize the data requirements while maintaining robustness. Our theoretical analysis provides both upper bounds and matching lower bounds on the sample complexity, offering theoretical guarantees for robust generalization. Notably, we uncover and identify the notion of ``robust item-wise coverage'' as the minimal data requirement to enable sample-efficient robust assortment learning. Our work bridges the gap between robustness and statistical efficiency in assortment learning, contributing new insights and tools for reliable assortment optimization under uncertainty.

Related papers

Autocorrelated Optimize-via-Estimate: Predict-then-Optimize versus Finite-sample Optimal [2.0228793142608588]
Models that directly optimize for out-of-sample performance in the finite-sample regime have emerged as a promising alternative to traditional estimate-then-optimize approaches.<n>We compare their performance in the context of autocorrelated uncertainties, specifically, under a Vector Autoregressive Moving Average VARMA(p,q) process.
arXiv Detail & Related papers (2026-02-02T09:49:51Z)
Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models [52.48582333951919]
We propose a dynamic framework designed to enhance alignment reliability by maximizing the Signal-to-Noise Ratio of policy updates.<n>SAGE (Stability-Aware Gradient Efficiency) integrates a coarse-grained curriculum mechanism that refreshes candidate pools based on model competence.<n> Experiments on multiple mathematical reasoning benchmarks demonstrate that SAGE significantly accelerates convergence and outperforms static baselines.
arXiv Detail & Related papers (2026-02-01T12:56:10Z)
PASTA: A Unified Framework for Offline Assortment Learning [22.89627688628713]
We study a broad class of assortment optimization problems in an offline and data-driven setting.<n>In such problems, a firm lacks prior knowledge of the underlying choice model, and aims to determine an optimal assortment based on historical customer choice data.<n>We introduce a novel Pessimistic Optimization framework that leverages the principle of pessimism to achieve optimal expected revenue under general choice models.
arXiv Detail & Related papers (2025-10-02T05:47:56Z)
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts [51.74394601039711]
Preference alignment methods are increasingly critical for steering large language models to generate outputs consistent with human values.<n>We propose a novel distribution-aware optimization framework that improves preference alignment despite such shifts.
arXiv Detail & Related papers (2025-04-08T09:14:38Z)
Learning an Optimal Assortment Policy under Observational Data [32.720761309403436]
We study the fundamental problem of offline assortment optimization under the Multinomial Logit (MNL) model.<n>In this paper, we consider the offline learning paradigm and investigate the minimal data requirements for efficient offline assortment optimization.
arXiv Detail & Related papers (2025-02-10T18:54:41Z)
Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models [79.84205827056907]
We present Self-Steering Optimization ($SSO$), an algorithm that autonomously generates high-quality preference data.<n>$SSO$ employs a specialized optimization objective to build a data generator from the policy model itself, which is used to produce accurate and on-policy data.<n>Our evaluation shows that $SSO$ consistently outperforms baselines in human preference alignment and reward optimization.
arXiv Detail & Related papers (2024-10-22T16:04:03Z)
An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences. We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration. Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involve optimizing a data-driven risk criterion. We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences. For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
arXiv Detail & Related papers (2024-01-28T21:19:15Z)
Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization [16.57676001669012]
In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias against the true performance.<n>Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore expensive.<n>We develop a general bias correction approach that directly approximates the first-order bias and does not require solving any additional optimization problems.
arXiv Detail & Related papers (2023-06-16T07:07:58Z)
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent. Accurate models of expertise in executing a task has applications in safety-sensitive applications such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution. This approach poses a number of implementation and optimization challenges. We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.