Utility Fairness in Contextual Dynamic Pricing with Demand Learning
- URL: http://arxiv.org/abs/2311.16528v1
- Date: Tue, 28 Nov 2023 05:19:23 GMT
- Title: Utility Fairness in Contextual Dynamic Pricing with Demand Learning
- Authors: Xi Chen, David Simchi-Levi, Yining Wang
- Abstract summary: This paper introduces a novel contextual bandit algorithm for personalized pricing under utility fairness constraints.
Our approach, which incorporates dynamic pricing and demand learning, addresses the critical challenge of fairness in pricing strategies.
- Score: 23.26236046836737
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel contextual bandit algorithm for personalized
pricing under utility fairness constraints in scenarios with uncertain demand,
achieving an optimal regret upper bound. Our approach, which incorporates
dynamic pricing and demand learning, addresses the critical challenge of
fairness in pricing strategies. We first delve into the static full-information
setting to formulate an optimal pricing policy as a constrained optimization
problem. Here, we propose an approximation algorithm that efficiently computes
a near-optimal policy.
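To make the static formulation concrete, here is a minimal sketch, assuming a logistic demand model and a surplus-parity fairness constraint across two customer groups; the demand model, utility proxy, and fairness notion below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical static setup: two customer groups with known contexts and an
# assumed logistic demand model P(buy | x, p) = sigmoid(theta @ x - beta * p).
theta, beta = np.array([1.0, 0.5]), 0.8
contexts = {"A": np.array([1.0, 2.0]), "B": np.array([1.0, 0.5])}
P_MAX, TOL = 10.0, 0.05

def demand(x, p):
    return 1.0 / (1.0 + np.exp(-(theta @ x - beta * p)))

def revenue(prices):
    return sum(p * demand(contexts[g], p) for g, p in zip(contexts, prices))

def utility(g, p):
    # Illustrative consumer-utility proxy: purchase probability times headroom.
    return demand(contexts[g], p) * (P_MAX - p)

# Utility-fairness constraint: group utilities must (approximately) match.
cons = [{"type": "ineq",
         "fun": lambda pr: TOL**2 - (utility("A", pr[0]) - utility("B", pr[1]))**2}]

res = minimize(lambda pr: -revenue(pr), x0=np.array([5.0, 5.0]),
               bounds=[(0.0, P_MAX)] * 2, constraints=cons)
print("fair prices:", np.round(res.x, 2), "revenue:", round(-res.fun, 3))
```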
We also use mathematical analysis and computational studies to characterize
the structures of optimal contextual pricing policies subject to fairness
constraints, deriving simplified policies that lay the foundation for more
in-depth research and extensions.
Further, we extend our study to dynamic pricing problems with demand
learning, establishing a non-standard regret lower bound that highlights the
complexity added by fairness constraints. Our research offers a comprehensive
analysis of the cost of fairness and its impact on the balance between utility
and revenue maximization. This work represents a step towards integrating
ethical considerations into algorithmic efficiency in data-driven dynamic
pricing.
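For context, the regret bounds mentioned above are measured against a clairvoyant that knows the demand model and plays the best fairness-feasible price each round; a standard formulation of this benchmark (assumed here, not quoted from the paper) is:

```latex
\mathrm{Reg}(T) = \sum_{t=1}^{T} \Big( p_t^{*}\, d(x_t, p_t^{*}) - \mathbb{E}\big[ p_t\, d(x_t, p_t) \big] \Big)
```

where p_t^* is the revenue-maximizing price satisfying the fairness constraint for context x_t, p_t is the algorithm's posted price, and d(x, p) is the expected demand at price p under context x.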
Related papers
- Statistical Analysis of Policy Space Compression Problem [54.1754937830779]
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems.
Reducing the policy space through policy compression emerges as a powerful, reward-free approach to accelerate the learning process.
This technique condenses the policy space into a smaller, representative set while maintaining most of the original effectiveness.
arXiv Detail & Related papers (2024-11-15T02:46:55Z)
- Deep Generative Demand Learning for Newsvendor and Pricing [7.594251468240168]
We consider data-driven inventory and pricing decisions in the feature-based newsvendor problem.
We propose a novel approach leveraging conditional deep generative models (cDGMs) to address these challenges.
We provide theoretical guarantees for our approach, including the consistency of profit estimation and convergence of our decisions to the optimal solution.
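A minimal sketch of the profit-estimation idea, assuming a generator that samples demand scenarios conditional on features; the paper's cDGM architecture is not specified here, so a lognormal sampler stands in for it:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_demand(x, n=10_000):
    """Placeholder for a conditional generative model: draws demand
    scenarios given feature vector x (lognormal stand-in)."""
    return rng.lognormal(mean=x @ np.array([0.5, 0.3]), sigma=0.4, size=n)

def estimated_profit(q, p, c, x):
    """Monte Carlo newsvendor profit: sell min(D, q) at price p,
    pay unit cost c for the q units ordered."""
    d = sample_demand(x)
    return np.mean(p * np.minimum(d, q) - c * q)

x = np.array([1.0, 2.0])
qs = np.linspace(0.5, 10.0, 40)
best_q = qs[np.argmax([estimated_profit(q, p=3.0, c=1.0, x=x) for q in qs])]
print("profit-maximizing order quantity (estimate):", best_q)
```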
arXiv Detail & Related papers (2024-11-13T14:17:26Z)
- A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing [20.06425698412548]
This paper studies offline dynamic pricing without data coverage assumptions.
We establish a partial identification bound for the demand parameter whose associated price is unobserved.
We incorporate pessimistic and opportunistic strategies within the proposed partial identification framework to derive the estimated policy.
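To make the pessimism idea concrete, here is a minimal sketch, assuming a one-dimensional linear demand d(p) = a - b*p whose slope b is only partially identified to an interval; the pessimistic price maximizes worst-case revenue over that interval. This illustrates the general principle, not the paper's estimator.

```python
import numpy as np

# Assumed partial identification: intercept a is known; slope b is only
# known to lie in [b_lo, b_hi] because some prices were never observed.
a, b_lo, b_hi = 10.0, 0.8, 1.6

def worst_case_revenue(p):
    # Revenue p * (a - b * p) is decreasing in b for p > 0,
    # so the adversarial slope is b_hi.
    return p * (a - b_hi * p)

prices = np.linspace(0.0, a / b_lo, 1000)
p_pess = prices[np.argmax(worst_case_revenue(prices))]
print("pessimistic price:", p_pess)  # analytic optimum: a / (2 * b_hi)
```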
arXiv Detail & Related papers (2024-11-12T19:09:41Z)
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach on synthetic settings randomly generated from real-world data, comparing performance in terms of constraint violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
- Pricing with Contextual Elasticity and Heteroscedastic Valuation [23.96777734246062]
We study an online contextual dynamic pricing problem, where customers decide whether to purchase a product based on its features and price.
We introduce a novel approach to modeling a customer's expected demand by incorporating feature-based price elasticity.
Our results shed light on the relationship between contextual elasticity and heteroscedastic valuation, providing insights for effective and practical pricing strategies.
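A minimal sketch of demand with feature-based price elasticity, assuming a log-linear form in which the elasticity itself depends on customer features; the paper's exact parameterization may differ.

```python
import numpy as np

# Assumed model: log expected demand = x @ theta - (x @ gamma) * log(price),
# so the price elasticity -(x @ gamma) varies with the feature vector x.
theta = np.array([2.0, 0.3])
gamma = np.array([0.5, 0.4])   # elasticity coefficients (hypothetical)

def expected_demand(x, price):
    elasticity = x @ gamma
    return np.exp(x @ theta) * price ** (-elasticity)

x = np.array([1.0, 1.5])
for p in (2.0, 4.0, 8.0):
    print(p, round(expected_demand(x, p), 3))
```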
arXiv Detail & Related papers (2023-12-26T11:07:37Z)
- Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis not only demonstrates the optimality of the proposed policy but also shows that incorporating available structural information is essential for policy planning.
arXiv Detail & Related papers (2023-03-28T00:23:23Z)
- Reinforcement Learning with Stepwise Fairness Constraints [50.538878453547966]
We introduce the study of reinforcement learning with stepwise fairness constraints.
We provide learning algorithms with strong theoretical guarantees with respect to policy optimality and fairness violation.
arXiv Detail & Related papers (2022-11-08T04:06:23Z)
- Online Regularization towards Always-Valid High-Dimensional Dynamic Pricing [19.11333865618553]
We propose a novel approach for designing dynamic pricing policies based on regularized online statistical learning, with theoretical guarantees.
Our online regularization scheme equips the optimistic online regularized maximum likelihood pricing (OORMLP) policy with three major advantages.
In theory, the OORMLP algorithm exploits the sparsity structure of high-dimensional models and secures logarithmic regret in the decision horizon.
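A batch caricature of the regularize-then-price step, assuming a sparse linear demand model fit with a lasso penalty; the actual OORMLP procedure is online, optimistic, and comes with always-valid guarantees that this sketch does not capture.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)

# Simulated history: high-dimensional contexts, only 3 coordinates matter.
n, d = 500, 50
X = rng.normal(size=(n, d))
prices = rng.uniform(1.0, 5.0, size=n)
theta_true = np.zeros(d)
theta_true[:3] = [1.5, -0.8, 0.6]
demand = X @ theta_true - 0.7 * prices + rng.normal(scale=0.1, size=n)

# L1-regularized fit recovers the sparse demand parameter.
features = np.column_stack([X, -prices])
fit = Lasso(alpha=0.01).fit(features, demand)
theta_hat, beta_hat = fit.coef_[:d], fit.coef_[d]

# Price greedily under the fitted linear demand: maximize p * (x@theta - beta*p).
x_new = rng.normal(size=d)
p_star = max((x_new @ theta_hat) / (2 * beta_hat), 0.0)
print("nonzero coefficients:", np.sum(fit.coef_ != 0), "price:", round(p_star, 2))
```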
arXiv Detail & Related papers (2020-07-05T23:52:09Z)
- Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation [49.69139684065241]
Contextual multi-armed bandits (MABs) achieve state-of-the-art performance on a variety of problems.
In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.
arXiv Detail & Related papers (2020-04-02T17:04:52Z)
- Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning [100.73223416589596]
We propose a cost-sensitive portfolio selection method with deep reinforcement learning.
Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations.
A new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning.
arXiv Detail & Related papers (2020-03-06T06:28:17Z)