Personalized Pricing with Invalid Instrumental Variables:
Identification, Estimation, and Policy Learning
- URL: http://arxiv.org/abs/2302.12670v1
- Date: Fri, 24 Feb 2023 14:50:47 GMT
- Title: Personalized Pricing with Invalid Instrumental Variables:
Identification, Estimation, and Policy Learning
- Authors: Rui Miao, Zhengling Qi, Cong Shi, Lin Lin
- Abstract summary: This work studies offline personalized pricing under endogeneity using an instrumental variable approach.
We propose a new policy learning method for Personalized pRicing using Invalid iNsTrumental variables.
- Score: 5.372349090093469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pricing based on individual customer characteristics is widely used to
maximize sellers' revenues. This work studies offline personalized pricing
under endogeneity using an instrumental variable approach. Standard
instrumental variable methods in causal inference/econometrics either focus on
a discrete treatment space or require the exclusion restriction, which bars
instruments from having a direct effect on the outcome; this limits their
applicability in personalized pricing. In this paper, we propose a new policy
learning method for Personalized pRicing using Invalid iNsTrumental variables
(PRINT) for continuous treatments, where the instruments are allowed to have
direct effects on the outcome. Specifically,
relying on the structural models of revenue and price, we establish the
identifiability condition of an optimal pricing strategy under endogeneity with
the help of invalid instrumental variables. Based on this new identification,
which leads to solving conditional moment restrictions with generalized
residual functions, we construct an adversarial min-max estimator and learn an
optimal pricing strategy. Furthermore, we establish an asymptotic regret bound
to find an optimal pricing strategy. Finally, we demonstrate the effectiveness
of the proposed method via extensive simulation studies as well as a real data
application from a US online auto loan company.
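The adversarial min-max idea mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the PRINT estimator: it assumes a valid instrument, a scalar linear effect, and a linear class of test functions, none of which come from the paper. Under those assumptions the conditional moment restriction E[Y - theta * P | Z] = 0 is estimated by minimizing over theta the value of max_f E_n[f(Z)(Y - theta * P)] - 0.5 * E_n[f(Z)^2], which has a closed form. All names (project, inner_max, minmax_theta, simulate) and the simulated data are hypothetical.

```python
import numpy as np


def project(phi, v):
    """Least-squares projection of v onto the column span of phi."""
    coef, *_ = np.linalg.lstsq(phi, v, rcond=None)
    return phi @ coef


def inner_max(theta, price, revenue, phi):
    """Value of max_f E_n[f(Z) * r] - 0.5 * E_n[f(Z)^2] over linear f = phi @ beta,
    where r = revenue - theta * price is the residual of the moment restriction."""
    resid = revenue - theta * price
    f_star = project(phi, resid)  # maximizer of the concave quadratic in beta
    return float(np.mean(f_star * resid) - 0.5 * np.mean(f_star ** 2))


def minmax_theta(price, revenue, phi):
    """Minimizer over theta of inner_max; closed form for a linear test class."""
    p_hat = project(phi, price)
    return float(p_hat @ revenue / (p_hat @ price))


def simulate(n=20000, theta=-1.5, seed=0):
    """Toy data (an assumption, not the paper's design): u confounds price and revenue."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument, valid in this toy setting
    u = rng.normal(size=n)  # unobserved confounder
    price = z + u + rng.normal(size=n)
    revenue = theta * price + 2.0 * u + rng.normal(size=n)
    phi = np.column_stack([np.ones(n), z])  # features of the test functions
    return price, revenue, phi


price, revenue, phi = simulate()
theta_hat = minmax_theta(price, revenue, phi)
theta_ols = float(price @ revenue / (price @ price))  # ignores endogeneity
print(f"min-max estimate: {theta_hat:.3f}, naive least squares: {theta_ols:.3f}")
```

With a linear test class the min-max solution coincides with two-stage least squares. Richer function classes (for example neural networks) would require iterative optimization instead of a closed form.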
Related papers
- A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing [20.06425698412548]
This paper studies offline dynamic pricing without data coverage assumption.
We establish a partial identification bound for the demand parameter whose associated price is unobserved.
We incorporate pessimistic and opportunistic strategies within the proposed partial identification framework to derive the estimated policy.
arXiv Detail & Related papers (2024-11-12T19:09:41Z)
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraint violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
- $i$REPO: $i$mplicit Reward Pairwise Difference based Empirical Preference Optimization [12.266207199002604]
Large Language Models (LLMs) can sometimes produce outputs that deviate from human expectations.
We propose a novel framework named $i$REPO, which utilizes implicit Reward pairwise difference regression for Empirical Preference Optimization.
We show that $i$REPO effectively achieves self-alignment using soft-label, self-generated responses and the logit of empirical AI annotators.
arXiv Detail & Related papers (2024-05-24T05:42:11Z)
- Policy Gradient with Active Importance Sampling [55.112959067035916]
Policy gradient (PG) methods significantly benefit from importance sampling (IS), enabling the effective reuse of previously collected samples.
However, IS is employed in RL as a passive tool for re-weighting historical samples.
We look for the best behavioral policy from which to collect samples to reduce the policy gradient variance.
arXiv Detail & Related papers (2024-05-09T09:08:09Z)
- Metalearners for Ranking Treatment Effects [1.469168639465869]
We show how learning to rank can maximize the area under a policy's incremental profit curve.
arXiv Detail & Related papers (2024-05-03T15:31:18Z)
- Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
arXiv Detail & Related papers (2023-03-28T00:23:23Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experiment results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
arXiv Detail & Related papers (2023-03-11T11:42:26Z)
- Balanced Off-Policy Evaluation for Personalized Pricing [3.296526804364952]
We consider a personalized pricing problem in which we have data consisting of feature information, historical pricing decisions, and binary realized demand.
The goal is to perform off-policy evaluation for a new personalized pricing policy that maps features to prices.
Building on the balanced policy evaluation framework of Kallus, we propose a new approach tailored to pricing applications.
arXiv Detail & Related papers (2023-02-24T16:44:46Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions [74.00030431081751]
We formalize the notion of user-specific cost functions and introduce a new method for identifying actionable recourses for users.
Our method satisfies up to 25.89 percentage points more users compared to strong baseline methods.
arXiv Detail & Related papers (2021-11-01T19:49:35Z)
- Model Distillation for Revenue Optimization: Interpretable Personalized Pricing [8.07517029746865]
We present a customized, prescriptive tree-based algorithm that distills knowledge from a complex black-box machine learning algorithm.
It segments customers with similar valuations and prescribes prices in such a way that maximizes revenue while maintaining interpretability.
arXiv Detail & Related papers (2020-07-03T18:33:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.