Related papers: Transfer Learning for Nonparametric Contextual Dynamic Pricing

URL: http://arxiv.org/abs/2501.18836v1
Date: Fri, 31 Jan 2025 01:05:04 GMT
Title: Transfer Learning for Nonparametric Contextual Dynamic Pricing
Authors: Fan Wang, Feiyu Jiang, Zifeng Zhao, Yi Yu
Abstract summary: Dynamic pricing strategies are crucial for firms to maximize revenue by adjusting prices based on market conditions and customer characteristics. One promising approach to overcome this limitation is to leverage information from related products or markets to inform the focal pricing decisions. We propose a novel Transfer Learning for Dynamic Pricing (TLDP) algorithm that can effectively leverage pre-collected data from a source domain to enhance pricing decisions in the target domain.
Score: 17.420508136662257
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dynamic pricing strategies are crucial for firms to maximize revenue by adjusting prices based on market conditions and customer characteristics. However, designing optimal pricing strategies becomes challenging when historical data are limited, as is often the case when launching new products or entering new markets. One promising approach to overcome this limitation is to leverage information from related products or markets to inform the focal pricing decisions. In this paper, we explore transfer learning for nonparametric contextual dynamic pricing under a covariate shift model, where the marginal distributions of covariates differ between source and target domains while the reward functions remain the same. We propose a novel Transfer Learning for Dynamic Pricing (TLDP) algorithm that can effectively leverage pre-collected data from a source domain to enhance pricing decisions in the target domain. The regret upper bound of TLDP is established under a simple Lipschitz condition on the reward function. To establish the optimality of TLDP, we further derive a matching minimax lower bound, which includes the target-only scenario as a special case and is presented for the first time in the literature. Extensive numerical experiments validate our approach, demonstrating its superiority over existing methods and highlighting its practical utility in real-world applications.

Related papers

Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization [82.03139922490796]
Reinforcement learning (RL) has shown significant promise for sequential portfolio optimization tasks, such as stock trading, where the objective is to maximize cumulative returns while minimizing risks using historical data. Traditional RL approaches often produce policies that merely memorize the optimal yet impractical buying and selling behaviors within the fixed dataset. Our approach frames portfolio optimization as a new type of partial-offline RL problem and makes two technical contributions.
arXiv Detail & Related papers (2025-05-19T06:37:25Z)
Deep Generative Demand Learning for Newsvendor and Pricing [7.594251468240168]
We consider data-driven inventory and pricing decisions in the feature-based newsvendor problem. We propose a novel approach leveraging conditional deep generative models (cDGMs) to address these challenges. We provide theoretical guarantees for our approach, including the consistency of profit estimation and convergence of our decisions to the optimal solution.
arXiv Detail & Related papers (2024-11-13T14:17:26Z)
A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing [20.06425698412548]
This paper studies offline dynamic pricing without data coverage assumption. We establish a partial identification bound for the demand parameter whose associated price is unobserved. We incorporate pessimistic and opportunistic strategies within the proposed partial identification framework to derive the estimated policy.
arXiv Detail & Related papers (2024-11-12T19:09:41Z)
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence [60.37934652213881]
Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain. This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation. We present learn from the learnt (LFTL), a novel paradigm for SFADA to leverage the learnt knowledge from the source pretrained model and actively iterated models without extra overhead.
arXiv Detail & Related papers (2024-07-26T17:51:58Z)
A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers. Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective. We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning [70.22819290458581]
Reinforcement learning with human feedback (RLHF) is a widely adopted approach in current large language model pipelines. Our approach introduces two key innovations: (1) on-policy query to avoid OOD and imbalance issues in seed data, and (2) active learning to select the most informative data for preference queries.
arXiv Detail & Related papers (2024-07-02T10:09:19Z)
Utility Fairness in Contextual Dynamic Pricing with Demand Learning [23.26236046836737]
This paper introduces a novel contextual bandit algorithm for personalized pricing under utility fairness constraints. Our approach, which incorporates dynamic pricing and demand learning, addresses the critical challenge of fairness in pricing strategies.
arXiv Detail & Related papers (2023-11-28T05:19:23Z)
Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning [42.303733194571905]
We seek to find and automate an optimal credit card limit adjustment policy by employing reinforcement learning techniques. Our research establishes a conceptual structure for applying a reinforcement learning framework to credit limit adjustment.
arXiv Detail & Related papers (2023-06-27T16:10:36Z)
Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time. We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance. Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
arXiv Detail & Related papers (2023-03-28T00:23:23Z)
Personalized Pricing with Invalid Instrumental Variables: Identification, Estimation, and Policy Learning [5.372349090093469]
This work studies offline personalized pricing under endogeneity using an instrumental variable approach. We propose a new policy learning method for Personalized pRicing using Invalid iNsTrumental variables.
arXiv Detail & Related papers (2023-02-24T14:50:47Z)
RAIN: RegulArization on Input and Network for Black-Box Domain Adaptation [80.03883315743715]
Source-free domain adaptation adapts the source-trained model to the target domain without exposing the source data. This paradigm is still at risk of data leakage due to adversarial attacks on the source model. We propose a novel approach named RAIN (RegulArization on Input and Network) for Black-Box domain adaptation from both input-level and network-level regularization.
arXiv Detail & Related papers (2022-08-22T18:18:47Z)
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift [127.21287240963859]
We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data. For a large class of linear regression instances, transfer learning with $O(N^2)$ source data is as effective as supervised learning with $N$ target data.
arXiv Detail & Related papers (2022-08-03T05:59:49Z)
A Curriculum-style Self-training Approach for Source-Free Semantic Segmentation [91.13472029666312]
We propose a curriculum-style self-training approach for source-free domain adaptive semantic segmentation. Our method yields state-of-the-art performance on source-free semantic segmentation tasks for both synthetic-to-real and adverse conditions.
arXiv Detail & Related papers (2021-06-22T10:21:39Z)
Online Regularization towards Always-Valid High-Dimensional Dynamic Pricing [19.11333865618553]
We propose a novel approach for designing dynamic pricing policies based on regularized online statistical learning with theoretical guarantees. Our proposed online regularization scheme equips the optimistic online regularized maximum likelihood pricing (OORMLP) policy with three major advantages. In theory, the proposed OORMLP algorithm exploits the sparsity structure of high-dimensional models and secures a logarithmic regret in a decision horizon.
arXiv Detail & Related papers (2020-07-05T23:52:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.