PASTA: Pessimistic Assortment Optimization
- URL: http://arxiv.org/abs/2302.03821v1
- Date: Wed, 8 Feb 2023 01:11:51 GMT
- Title: PASTA: Pessimistic Assortment Optimization
- Authors: Juncheng Dong, Weibin Mo, Zhengling Qi, Cong Shi, Ethan X. Fang, Vahid
Tarokh
- Abstract summary: We consider a class of assortment optimization problems in an offline data-driven setting.
We propose an algorithm referred to as Pessimistic ASsortment opTimizAtion (PASTA) based on the principle of pessimism.
- Score: 25.51792135903357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a class of assortment optimization problems in an offline
data-driven setting. A firm does not know the underlying customer choice model
but has access to an offline dataset consisting of the historically offered
assortment set, customer choice, and revenue. The objective is to use the
offline dataset to find an optimal assortment. Due to the combinatorial nature
of assortment optimization, the problem of insufficient data coverage is likely
to occur in the offline dataset. Therefore, designing a provably efficient
offline learning algorithm becomes a significant challenge. To this end, we
propose an algorithm referred to as Pessimistic ASsortment opTimizAtion (PASTA
for short), designed on the principle of pessimism, which can correctly
identify the optimal assortment while requiring the offline data to cover only
the optimal assortment under general settings. In particular, we establish a regret
bound for the offline assortment optimization problem under the celebrated
multinomial logit model. We also propose an efficient computational procedure
to solve our pessimistic assortment optimization problem. Numerical studies
demonstrate the superiority of the proposed method over the existing baseline
method.
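To make the principle of pessimism concrete, below is a minimal Python sketch; it is not PASTA itself. It scores assortments by MNL expected revenue computed from lower-confidence-bound utility estimates, so items with little offline coverage are penalized. The 1/sqrt(count) penalty, the brute-force search, and all numbers are illustrative assumptions; the paper specifies PASTA's actual penalty construction and an efficient computational procedure.
```python
# Illustrative sketch only: brute-force "pessimistic" assortment search under
# the MNL choice model. The penalty form and exhaustive search are assumptions
# for exposition, not the paper's method.
import itertools
import numpy as np

def mnl_expected_revenue(assortment, utilities, revenues):
    """Under MNL, a customer offered assortment S buys item i in S with
    probability v_i / (1 + sum_{j in S} v_j); the "1" is the no-purchase
    option. Expected revenue: sum_{i in S} r_i v_i / (1 + sum_{j in S} v_j)."""
    idx = list(assortment)
    v, r = utilities[idx], revenues[idx]
    return float(np.sum(r * v) / (1.0 + np.sum(v)))

def pessimistic_assortment(v_hat, penalty, revenues, max_size):
    """Score every assortment with lower-confidence-bound utilities
    max(v_hat - penalty, 0): items poorly covered by the offline data get
    large penalties and are effectively excluded from consideration."""
    v_lcb = np.maximum(v_hat - penalty, 0.0)
    n = len(v_hat)
    best_set, best_val = (), 0.0
    for k in range(1, max_size + 1):
        for S in itertools.combinations(range(n), k):
            val = mnl_expected_revenue(S, v_lcb, revenues)
            if val > best_val:
                best_set, best_val = S, val
    return best_set, best_val

# Hypothetical numbers: estimated utilities, per-item offer counts in the
# offline data, and a 1/sqrt(count)-style penalty (an assumed functional form).
rng = np.random.default_rng(0)
v_hat = rng.uniform(0.2, 1.0, size=6)      # estimated MNL utilities
counts = rng.integers(5, 200, size=6)      # how often each item was offered
penalty = 1.0 / np.sqrt(counts)            # wider penalty = less coverage
revenues = rng.uniform(1.0, 10.0, size=6)  # per-item revenue
print(pessimistic_assortment(v_hat, penalty, revenues, max_size=3))
```
Shrinking a poorly covered item's utility toward zero drives its choice probability, and hence its revenue contribution, toward zero, which is how this sketch biases the search toward assortments the offline data can actually vouch for.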
Related papers
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to represent potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Offline Model-Based Optimization via Policy-Guided Gradient Search [30.87992788876113]
We introduce a new learning-to-search perspective for offline optimization by reformulating it as an offline reinforcement learning problem.
Our proposed policy-guided search approach explicitly learns the best policy for a given surrogate model created from the offline data.
arXiv Detail & Related papers (2024-05-08T18:27:37Z)
- Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization [38.57171985309975]
We develop efficient algorithms for regret minimization in assortment selection with Plackett-Luce (PL) based user choices.
Our methods are practical, provably optimal, and devoid of the aforementioned limitations of the existing methods.
arXiv Detail & Related papers (2024-02-29T07:17:04Z)
- Towards Efficient Exact Optimization of Language Model Alignment [93.39181634597877]
Direct preference optimization (DPO) was proposed to directly optimize the policy from preference data.
We show that DPO, when derived from the optimal solution of the problem, leads to a compromised mean-seeking approximation of the optimal solution in practice.
We propose efficient exact optimization (EXO) of the alignment objective.
arXiv Detail & Related papers (2024-02-01T18:51:54Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- Online Joint Assortment-Inventory Optimization under MNL Choices [14.530542487845732]
We study an online joint assortment-inventory optimization problem, in which we assume that the choice behavior of each customer follows the Multinomial Logit (MNL) choice model.
We propose a novel algorithm that can effectively balance the exploration and exploitation in the online decision-making of assortment and inventory.
arXiv Detail & Related papers (2023-04-04T09:25:34Z)
- Data-Driven Offline Decision-Making via Invariant Representation Learning [97.49309949598505]
Offline data-driven decision-making involves synthesizing optimized decisions with no active interaction.
A key challenge is distributional shift: when we optimize with respect to the input into a model trained from offline data, it is easy to produce an out-of-distribution (OOD) input that appears erroneously good.
In this paper, we formulate offline data-driven decision-making as domain adaptation, where the goal is to make accurate predictions for the value of optimized decisions.
arXiv Detail & Related papers (2022-11-21T11:01:37Z)
- Optimal Data Selection: An Online Distributed View [61.31708750038692]
We develop algorithms for the online and distributed version of the problem.
In learning tasks on ImageNet and MNIST, we show that our selection methods outperform random selection by 5-20%.
arXiv Detail & Related papers (2022-01-25T18:56:16Z)
- Is Pessimism Provably Efficient for Offline RL? [104.00628430454479]
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on a dataset collected a priori.
We propose a pessimistic variant of the value iteration algorithm (PEVI), which incorporates an uncertainty quantifier as the penalty function.
arXiv Detail & Related papers (2020-12-30T09:06:57Z)
- DeepCO: Offline Combinatorial Optimization Framework Utilizing Deep Learning [1.2183405753834562]
We propose DeepCO, an offline optimization framework utilizing deep learning.
We also design an offline variation of Travelling Salesman Problem (TSP) to model warehouse operation sequence optimization problem.
With only limited historical data, the proposed distribution-regularized optimization outperforms the existing baseline method in offline experiments, reducing route length by 5.7% on average.
arXiv Detail & Related papers (2020-07-20T04:17:30Z)
- Optimizing Revenue while showing Relevant Assortments at Scale [1.0200170217746136]
Real-time assortment optimization has become essential in e-commerce operations.
We design fast and flexible algorithms that find the optimal assortment in difficult regimes.
Empirical validations using a real-world dataset show that our algorithms are competitive even when the number of items is $\sim 10^5$ ($10\times$ larger instances than previously studied).
arXiv Detail & Related papers (2020-03-06T20:16:49Z)