Online Learning and Optimization for Revenue Management Problems with
Add-on Discounts
- URL: http://arxiv.org/abs/2005.00947v1
- Date: Sat, 2 May 2020 23:54:17 GMT
- Title: Online Learning and Optimization for Revenue Management Problems with
Add-on Discounts
- Authors: David Simchi-Levi, Rui Sun, Huanan Zhang
- Abstract summary: We formulate this problem as an optimization problem to determine the prices of different products and the selection of products with add-on discounts.
We propose an efficient FPTAS algorithm that can solve the problem approximately to any desired accuracy.
We show that our learning algorithm can converge to the optimal algorithm that has access to the true demand functions.
- Score: 14.844382070740524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study in this paper a revenue management problem with add-on discounts.
The problem is motivated by the practice in the video game industry, where a
retailer offers discounts on selected supportive products (e.g. video games) to
customers who have also purchased the core products (e.g. video game consoles).
We formulate this problem as an optimization problem to determine the prices of
different products and the selection of products with add-on discounts. To
overcome the computational challenge of this optimization problem, we propose
an efficient FPTAS algorithm that can solve the problem approximately to any
desired accuracy. Moreover, we consider the revenue management problem in the
setting where the retailer has no prior knowledge of the demand functions of
different products. To resolve this problem, we propose a UCB-based learning
algorithm that uses the FPTAS optimization algorithm as a subroutine. We show
that our learning algorithm can converge to the optimal algorithm that has
access to the true demand functions, and we prove that the convergence rate is
tight up to a certain logarithmic term. In addition, we conduct numerical
experiments with the real-world transaction data we collect from a popular
video gaming brand's online store on Tmall.com. The experiment results
illustrate our learning algorithm's robust performance and fast convergence in
various scenarios. We also compare our algorithm with the optimal policy that
does not use any add-on discount, and the results show the advantages of using
the add-on discount strategy in practice.
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning economic interests of different stakeholders in the online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z) - Learning with Posterior Sampling for Revenue Management under Time-varying Demand [36.22276574805786]
We discuss the revenue management problem to maximize revenue by pricing items or services.
One challenge in this problem is that the demand distribution is unknown and varies over time in real applications such as airline and retail industries.
arXiv Detail & Related papers (2024-05-08T09:28:26Z) - Online Conversion with Switching Costs: Robust and Learning-Augmented Algorithms [11.029788598491077]
We study online conversion with switching costs, a family of online problems that capture emerging problems at the intersection of energy and sustainability.
We introduce competitive (robust) threshold-based algorithms for both the deterministic and deterministic variants of this problem.
We then propose learning-augmented algorithms that take advantage of black-box advice to achieve significantly better average-case performance.
arXiv Detail & Related papers (2023-10-31T16:34:49Z) - Landscape Surrogate: Learning Decision Losses for Mathematical
Optimization Under Partial Information [48.784330281177446]
Recent works in learning-integrated optimization have shown promise in settings where the optimization is only partially observed or where general-purposes perform poorly without expert tuning.
We propose using a smooth and learnable Landscape Surrogate as a replacement for $fcirc mathbfg$.
This surrogate, learnable by neural networks, can be computed faster than the $mathbfg$ solver, provides dense and smooth gradients during training, can generalize to unseen optimization problems, and is efficiently learned via alternating optimization.
arXiv Detail & Related papers (2023-07-18T04:29:16Z) - Buying Information for Stochastic Optimization [16.272461309965284]
We study how to buy information for optimization and formulate this question as an online learning problem.
We show that this ratio is tight as the problem is equivalent to a robust generalization of the ski-rental problem, which we call super-martingale stopping.
We provide an $8$-competitive algorithm running in time that chooses actions and decides when to buy information about the underlying request.
arXiv Detail & Related papers (2023-06-06T11:50:09Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - MNL-Bandits under Inventory and Limited Switches Constraints [38.960764902819434]
We develop an efficient UCB-like algorithm to optimize the assortments while learning customers' choices from data.
We prove that our algorithm can achieve a sub-linear regret bound $tildeOleft(T1-alpha/2right)$ if $O(Talpha)$ switches are allowed.
arXiv Detail & Related papers (2022-04-22T16:02:27Z) - Efficient and Optimal Algorithms for Contextual Dueling Bandits under
Realizability [59.81339109121384]
We study the $K$ contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes emphpreference-based feedback suggesting that one decision was better than the other.
We provide a new algorithm that achieves the optimal regret rate for a new notion of best response regret, which is a strictly stronger performance measure than those considered in prior works.
arXiv Detail & Related papers (2021-11-24T07:14:57Z) - Online Apprenticeship Learning [58.45089581278177]
In Apprenticeship Learning (AL), we are given a Markov Decision Process (MDP) without access to the cost function.
The goal is to find a policy that matches the expert's performance on some predefined set of cost functions.
We show that the OAL problem can be effectively solved by combining two mirror descent based no-regret algorithms.
arXiv Detail & Related papers (2021-02-13T12:57:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.