Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity
- URL: http://arxiv.org/abs/2503.16737v1
- Date: Thu, 20 Mar 2025 22:51:03 GMT
- Title: Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity
- Authors: Daniele Bracale, Moulinath Banerjee, Cong Shi, Yuekai Sun,
- Abstract summary: We consider price competition among multiple sellers over a selling horizon of $T$ periods.<n>Sellers simultaneously offer their prices and observe their respective demand that is unobservable to competitors.<n>We show that when all sellers employ our policy, their prices converge at a rate of $O(T-1/7)$ to the Nash equilibrium prices that sellers would reach if they were fully informed.
- Score: 24.586053819490985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously offer their prices and subsequently observe their respective demand that is unobservable to competitors. The demand function for each seller depends on all sellers' prices through a private, unknown, and nonlinear relationship. To address this challenge, we propose a semi-parametric least-squares estimation of the nonlinear mean function, which does not require sellers to communicate demand information. We show that when all sellers employ our policy, their prices converge at a rate of $O(T^{-1/7})$ to the Nash equilibrium prices that sellers would reach if they were fully informed. Each seller incurs a regret of $O(T^{5/7})$ relative to a dynamic benchmark policy. A theoretical contribution of our work is proving the existence of equilibrium under shape-constrained demand functions via the concept of $s$-concavity and establishing regret bounds of our proposed policy. Technically, we also establish new concentration results for the least squares estimator under shape constraints. Our findings offer significant insights into dynamic competition-aware pricing and contribute to the broader study of non-parametric learning in strategic decision-making.
Related papers
- Dynamic Pricing with Adversarially-Censored Demands [25.566323930646178]
We study an online dynamic pricing problem where the potential demand at each time period $t=1,2,ldots, T$ is and dependent on the price.<n>A perishable inventory is imposed at the beginning of each time $t$, censoring the potential demand if it exceeds the inventory level.<n>We show that our algorithm achieves $tildeO(sqrtT)$ optimal regret even with adversarial inventory series.
arXiv Detail & Related papers (2025-02-10T05:37:39Z) - Fairness-aware Contextual Dynamic Pricing with Strategic Buyers [4.883313216485195]
We propose a dynamic pricing policy that simultaneously achieves price fairness and discourages strategic behaviors.<n>Our policy achieves an upper bound of $O(sqrt+H(T))$ regret over $T$ time horizons.<n>We also prove an $Omega(sqrtT)$ regret lower bound of any pricing policy under our problem setting.
arXiv Detail & Related papers (2025-01-25T22:30:37Z) - A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z) - On Tractable $Φ$-Equilibria in Non-Concave Games [53.212133025684224]
We study tractable $Phi$-equilibria in non-concave games.
We show that when $Phi$ is finite, there exists an efficient uncoupled learning algorithm that converges to the corresponding $Phi$-equilibria.
arXiv Detail & Related papers (2024-03-13T01:51:30Z) - Dynamic Pricing and Learning with Long-term Reference Effects [16.07344044662994]
We study a simple and novel reference price mechanism where reference price is the average of the past prices offered by the seller.
We show that under this mechanism, a markdown policy is near-optimal irrespective of the parameters of the model.
We then consider a more challenging dynamic pricing and learning problem, where the demand model parameters are apriori unknown.
arXiv Detail & Related papers (2024-02-19T21:36:54Z) - Contextual Dynamic Pricing with Strategic Buyers [93.97401997137564]
We study the contextual dynamic pricing problem with strategic buyers.
Seller does not observe the buyer's true feature, but a manipulated feature according to buyers' strategic behavior.
We propose a strategic dynamic pricing policy that incorporates the buyers' strategic behavior into the online learning to maximize the seller's cumulative revenue.
arXiv Detail & Related papers (2023-07-08T23:06:42Z) - Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
arXiv Detail & Related papers (2023-03-28T00:23:23Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Autoregressive Bandits [58.46584210388307]
We propose a novel online learning setting, Autoregressive Bandits, in which the observed reward is governed by an autoregressive process of order $k$.
We show that, under mild assumptions on the reward process, the optimal policy can be conveniently computed.
We then devise a new optimistic regret minimization algorithm, namely, AutoRegressive Upper Confidence Bound (AR-UCB), that suffers sublinear regret of order $widetildemathcalO left( frac(k+1)3/2sqrtnT (1-G
arXiv Detail & Related papers (2022-12-12T21:37:36Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - Fairness-aware Online Price Discrimination with Nonparametric Demand
Models [13.46602731592102]
This paper studies the problem of dynamic discriminatory pricing under fairness constraints.
We propose an optimal dynamic pricing policy regarding regret, which enforces the strict price fairness constraint.
arXiv Detail & Related papers (2021-11-16T04:31:02Z) - Dynamic Incentive-aware Learning: Robust Pricing in Contextual Auctions [13.234975857626752]
We consider the problem of robust learning of reserve prices against strategic buyers in contextual second-price auctions.
We propose learning policies that are robust to such strategic behavior.
arXiv Detail & Related papers (2020-02-25T19:00:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.