Distribution-free Contextual Dynamic Pricing
- URL: http://arxiv.org/abs/2109.07340v1
- Date: Wed, 15 Sep 2021 14:52:44 GMT
- Title: Distribution-free Contextual Dynamic Pricing
- Authors: Yiyun Luo and Will Wei Sun and Yufeng Liu
- Abstract summary: Contextual dynamic pricing aims to set personalized prices based on sequential interactions with customers.
In this paper, we consider contextual dynamic pricing with unknown random noise in the valuation model.
Our distribution-free pricing policy learns both the contextual function and the market noise simultaneously.
- Score: 5.773269033551628
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextual dynamic pricing aims to set personalized prices based on
sequential interactions with customers. At each time period, a customer who is
interested in purchasing a product comes to the platform. The customer's
valuation for the product is a linear function of contexts, including product
and customer features, plus some random market noise. The seller does not
observe the customer's true valuation, but instead needs to learn the valuation
by leveraging contextual information and historical binary purchase feedback.
Existing models typically assume full or partial knowledge of the random noise
distribution. In this paper, we consider contextual dynamic pricing with
unknown random noise in the valuation model. Our distribution-free pricing
policy learns both the contextual function and the market noise simultaneously.
A key ingredient of our method is a novel perturbed linear bandit framework,
where a modified linear upper confidence bound algorithm is proposed to balance
the exploration of market noise and the exploitation of the current knowledge
for better pricing. We establish the regret upper bound and a matching lower
bound of our policy in the perturbed linear bandit framework and prove a
sub-linear regret bound in the considered pricing problem. Finally, we
demonstrate the superior performance of our policy on simulations and a
real-life auto-loan dataset.
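The valuation model and binary purchase feedback described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' policy: the names `simulate_round`, `average_revenue`, `naive_price`, the parameter vector `theta`, the uniform context and noise distributions, and the fixed margin of 0.25 are all assumptions made for illustration. The pricing rule here is a naive baseline that already knows `theta`, whereas the paper's policy must learn both the linear function and the noise distribution.

```python
import random


def simulate_round(theta, context, price, noise_sampler, rng):
    """Simulate one pricing round and return (purchased, revenue).

    The customer's valuation is <theta, context> + noise. The seller never
    sees the valuation itself, only the binary purchase decision.
    """
    valuation = sum(t * x for t, x in zip(theta, context)) + noise_sampler(rng)
    purchased = valuation >= price
    return purchased, (price if purchased else 0.0)


def average_revenue(theta, price_fn, noise_sampler, rounds=5000, seed=0):
    """Average per-round revenue of a pricing rule over simulated customers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        # Assumption: contexts are drawn uniformly from [0, 1]^d.
        context = [rng.uniform(0.0, 1.0) for _ in theta]
        price = price_fn(context)
        _, revenue = simulate_round(theta, context, price, noise_sampler, rng)
        total += revenue
    return total / rounds


# Hypothetical ground truth, unknown to a real seller.
theta = [1.0, 2.0]


def uniform_noise(rng):
    """Assumed market noise: uniform on [-0.5, 0.5]."""
    return rng.uniform(-0.5, 0.5)


def naive_price(context):
    """Naive baseline: the linear part of the valuation minus a fixed margin."""
    linear_part = sum(t * x for t, x in zip(theta, context))
    return max(0.0, linear_part - 0.25)


print(average_revenue(theta, naive_price, uniform_noise))
```

Swapping `uniform_noise` for a different sampler changes which margin earns the most revenue, which is why a policy that does not assume the noise distribution has to explore it.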
Related papers
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraint violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
- Pricing with Contextual Elasticity and Heteroscedastic Valuation [23.96777734246062]
We study an online contextual dynamic pricing problem, where customers decide whether to purchase a product based on its features and price.
We introduce a novel approach to modeling a customer's expected demand by incorporating feature-based price elasticity.
Our results shed light on the relationship between contextual elasticity and heteroscedastic valuation, providing insights for effective and practical pricing strategies.
arXiv Detail & Related papers (2023-12-26T11:07:37Z)
- Contextual Dynamic Pricing with Strategic Buyers [93.97401997137564]
We study the contextual dynamic pricing problem with strategic buyers.
The seller does not observe the buyer's true features, only features manipulated according to the buyer's strategic behavior.
We propose a strategic dynamic pricing policy that incorporates the buyers' strategic behavior into the online learning to maximize the seller's cumulative revenue.
arXiv Detail & Related papers (2023-07-08T23:06:42Z)
- Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
arXiv Detail & Related papers (2023-03-28T00:23:23Z)
- A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
The seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z)
- Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z)
- Convex Loss Functions for Contextual Pricing with Observational Posted-Price Data [2.538209532048867]
We study an off-policy contextual pricing problem where the seller has access to samples of prices which customers were previously offered.
This is in contrast to the well-studied setting in which samples of the customer's valuation (willingness to pay) are observed.
In our setting, the observed data is influenced by the historic pricing policy, and we do not know how customers would have responded to alternative prices.
arXiv Detail & Related papers (2022-02-16T22:35:39Z)
- Loss Functions for Discrete Contextual Pricing with Observational Data [8.661128420558349]
We study a pricing setting where each customer is offered a contextualized price based on customer and/or product features.
We observe only whether each customer purchased the product at the prescribed price, not the customer's true valuation.
arXiv Detail & Related papers (2021-11-18T20:12:57Z)
- Stateful Offline Contextual Policy Evaluation and Learning [88.9134799076718]
We study off-policy evaluation and learning from sequential data.
We formalize the relevant causal structure of problems such as dynamic personalized pricing.
We show improved out-of-sample policy performance in this class of relevant problems.
arXiv Detail & Related papers (2021-10-19T16:15:56Z)
- Model Distillation for Revenue Optimization: Interpretable Personalized Pricing [8.07517029746865]
We present a customized, prescriptive tree-based algorithm that distills knowledge from a complex black-box machine learning algorithm.
It segments customers with similar valuations and prescribes prices in such a way that maximizes revenue while maintaining interpretability.
arXiv Detail & Related papers (2020-07-03T18:33:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.