Dynamic Pricing with Volume Discounts in Online Settings
- URL: http://arxiv.org/abs/2211.09612v1
- Date: Thu, 17 Nov 2022 16:01:06 GMT
- Title: Dynamic Pricing with Volume Discounts in Online Settings
- Authors: Marco Mussi, Gianmarco Genalti, Alessandro Nuara, Francesco Trovò,
Marcello Restelli and Nicola Gatti
- Abstract summary: This paper focuses on pricing in e-commerce when the objective function is profit maximization and only transaction data are available.
Our work aims to find a pricing strategy that allows defining optimal prices at different volume thresholds to serve different classes of users.
We design a two-phase online learning algorithm, namely PVD-B, capable of exploiting the data incrementally in an online fashion.
- Score: 102.00782184214326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: According to the main international reports, more pervasive industrial and
business-process automation, thanks to machine learning and advanced analytic
tools, will unlock more than 14 trillion USD worldwide annually by 2030. In the
specific case of pricing problems (the class of problems we
investigate in this paper), the estimated unlocked value will be about 0.5
trillion USD per year. In particular, this paper focuses on pricing in
e-commerce when the objective function is profit maximization and only
transaction data are available. This setting is one of the most common in
real-world applications. Our work aims to find a pricing strategy that allows
defining optimal prices at different volume thresholds to serve different
classes of users. Furthermore, we face the major challenge, common in
real-world settings, of dealing with limited available data. We design a
two-phase online learning algorithm, namely PVD-B, capable of exploiting the
data incrementally in an online fashion. The algorithm first estimates the
demand curve and retrieves the optimal average price, and subsequently it
offers discounts to differentiate the prices for each volume threshold. We ran
a real-world 4-month-long A/B testing experiment in collaboration with an
Italian e-commerce company, in which our algorithm PVD-B (the A
configuration) was compared with human pricing specialists (the B
configuration). At the end of the experiment, our algorithm produced a total
turnover of about 300 KEuros, outperforming the B configuration performance by
about 55%. The Italian company we collaborated with decided to adopt our
algorithm for more than 1,200 products since January 2022.
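
The two-phase idea described in the abstract (estimate the demand curve and an optimal average price, then differentiate prices by volume threshold) can be illustrated with a small sketch. This is not the PVD-B algorithm from the paper: the linear demand model, the least-squares fit, the linear discount rule, and all function and parameter names below are assumptions made purely for illustration.

```python
from statistics import mean


def fit_linear_demand(prices, quantities):
    """Least-squares fit of an ASSUMED linear demand q = a - b * p."""
    p_bar, q_bar = mean(prices), mean(quantities)
    sxx = sum((p - p_bar) ** 2 for p in prices)
    sxy = sum((p - p_bar) * (q - q_bar) for p, q in zip(prices, quantities))
    b = -sxy / sxx           # demand falls with price, so b should be positive
    a = q_bar + b * p_bar    # intercept of the fitted line
    return a, b


def optimal_average_price(a, b, unit_cost):
    """Price maximizing profit (p - c) * (a - b * p) under the linear model."""
    if b <= 0:
        raise ValueError("fitted demand must decrease with price")
    return (a + b * unit_cost) / (2 * b)


def volume_price_schedule(avg_price, unit_cost, thresholds, max_discount=0.10):
    """Hypothetical phase 2: discount grows linearly with the threshold rank.

    The smallest threshold pays the average price and the largest gets
    max_discount; no price is allowed to drop below the unit cost.
    """
    ordered = sorted(thresholds)
    steps = max(len(ordered) - 1, 1)
    schedule = {}
    for rank, threshold in enumerate(ordered):
        discount = max_discount * rank / steps
        schedule[threshold] = max(unit_cost, round(avg_price * (1 - discount), 2))
    return schedule


# Toy transaction data (made up): observed prices and units sold.
a, b = fit_linear_demand([8, 10, 12, 14], [120, 100, 80, 60])
price = optimal_average_price(a, b, unit_cost=4)
print(price, volume_price_schedule(price, 4, thresholds=[1, 5, 10]))
```

In an online setting the fit would be refreshed as new transactions arrive; the sketch omits exploration and uncertainty handling, which matter when data are limited.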
Related papers
- $f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization [91.43730624072226]
$f$-PO is a novel framework that generalizes and extends existing approaches.
We conduct experiments on state-of-the-art language models using benchmark datasets.
arXiv Detail & Related papers (2024-10-29T02:11:45Z) - A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z) - Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce utility, which is a function predefined by each user and describes the trade-off between cost and performance of BO.
We validate our algorithm on various LC datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z) - Learning-augmented Online Algorithm for Two-level Ski-rental Problem [8.381344684943212]
We study the two-level ski-rental problem, where a user needs to fulfill a sequence of demands for multiple items by choosing one of three payment options.
We develop a learning-augmented algorithm (LADTSR) by integrating Machine Learning predictions into the robust online algorithm.
arXiv Detail & Related papers (2024-02-09T16:10:54Z) - Data Market Design through Deep Learning [16.505791601397185]
We introduce the application of deep learning for the design of revenue-optimal data markets.
Our experiments demonstrate that this new deep learning framework can almost precisely replicate all known solutions from theory.
arXiv Detail & Related papers (2023-10-31T00:21:09Z) - Dynamic Pricing and Learning with Bayesian Persuasion [18.59029578133633]
We consider a novel dynamic pricing and learning setting where, in addition to setting prices of products, the seller also ex-ante commits to 'advertising schemes'.
We use the popular Bayesian persuasion framework to model the effect of these signals on the buyers' valuation and purchase responses.
We design an online algorithm that can use past purchase responses to adaptively learn the optimal pricing and advertising strategy.
arXiv Detail & Related papers (2023-04-27T17:52:06Z) - Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation [107.54516740713969]
We study human-in-the-loop reinforcement learning (RL) with trajectory preferences.
Instead of receiving a numeric reward at each step, the agent only receives preferences over trajectory pairs from a human overseer.
We propose the first optimistic model-based algorithm for PbRL with general function approximation.
arXiv Detail & Related papers (2022-05-23T09:03:24Z) - Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce [4.143179903857126]
We analyze using three classic MAB algorithms, epsilon-greedy, Thompson sampling (TS), and upper confidence bound 1 (UCB1) for dynamic content recommendations.
We compare the accumulative rewards of the three MAB algorithms with more than 1,000 trials using actual historical A/B test datasets.
We develop a batch-updated MAB algorithm to overcome the delayed reward issue in e-commerce.
arXiv Detail & Related papers (2021-07-30T21:03:38Z) - Online Apprenticeship Learning [58.45089581278177]
In Apprenticeship Learning (AL), we are given a Markov Decision Process (MDP) without access to the cost function.
The goal is to find a policy that matches the expert's performance on some predefined set of cost functions.
We show that the OAL problem can be effectively solved by combining two mirror descent based no-regret algorithms.
arXiv Detail & Related papers (2021-02-13T12:57:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.