Bilateral Trade: A Regret Minimization Perspective
- URL: http://arxiv.org/abs/2109.12974v1
- Date: Wed, 8 Sep 2021 22:11:48 GMT
- Title: Bilateral Trade: A Regret Minimization Perspective
- Authors: Nicol\`o Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico
Fusco, Stefano Leonardi
- Abstract summary: We cast the bilateral trade problem in a regret minimization framework over $T$ rounds of seller/buyer interactions.
Our main contribution is a complete characterization of the regret regimes for fixed-price mechanisms with different feedback models and private valuations.
- Score: 5.031063690574698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilateral trade, a fundamental topic in economics, models the problem of
intermediating between two strategic agents, a seller and a buyer, willing to
trade a good for which they hold private valuations. In this paper, we cast the
bilateral trade problem in a regret minimization framework over $T$ rounds of
seller/buyer interactions, with no prior knowledge on their private valuations.
Our main contribution is a complete characterization of the regret regimes for
fixed-price mechanisms with different feedback models and private valuations,
using as a benchmark the best fixed-price in hindsight. More precisely, we
prove the following tight bounds on the regret:
- $\Theta(\sqrt{T})$ for full-feedback (i.e., direct revelation mechanisms).
- $\Theta(T^{2/3})$ for realistic feedback (i.e., posted-price mechanisms)
and independent seller/buyer valuations with bounded densities.
- $\Theta(T)$ for realistic feedback and seller/buyer valuations with bounded
densities.
- $\Theta(T)$ for realistic feedback and independent seller/buyer valuations.
- $\Theta(T)$ for the adversarial setting.
Related papers
- Fair Online Bilateral Trade [20.243000364933472]
We present a complete characterization of the regret for fair gain from trade when, after each interaction, the platform only learns whether each trader accepted the current price.
We conclude by providing tight regret bounds when, after each interaction, the platform is allowed to observe the true traders' valuations.
arXiv Detail & Related papers (2024-05-22T18:49:11Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM)
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm namely robust contextual dueling bandit (algo), which is based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - Contextual Dynamic Pricing with Strategic Buyers [93.97401997137564]
We study the contextual dynamic pricing problem with strategic buyers.
Seller does not observe the buyer's true feature, but a manipulated feature according to buyers' strategic behavior.
We propose a strategic dynamic pricing policy that incorporates the buyers' strategic behavior into the online learning to maximize the seller's cumulative revenue.
arXiv Detail & Related papers (2023-07-08T23:06:42Z) - Improving Language Model Negotiation with Self-Play and In-Context
Learning from AI Feedback [97.54519989641388]
We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing.
Only a subset of the language models we consider can self-play and improve the deal price from AI feedback.
arXiv Detail & Related papers (2023-05-17T11:55:32Z) - Borda Regret Minimization for Generalized Linear Dueling Bandits [65.09919504862496]
We study the Borda regret minimization problem for dueling bandits, which aims to identify the item with the highest Borda score.
We propose a rich class of generalized linear dueling bandit models, which cover many existing models.
Our algorithm achieves an $tildeO(d2/3 T2/3)$ regret, which is also optimal.
arXiv Detail & Related papers (2023-03-15T17:59:27Z) - Repeated Bilateral Trade Against a Smoothed Adversary [5.939280057673226]
We study repeated bilateral trade where an adaptive $sigma$-smooth adversary generates the valuations of sellers and buyers.
We provide a complete characterization of the regret regimes for fixed-price mechanisms under different feedback models.
arXiv Detail & Related papers (2023-02-21T16:30:10Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - An $α$-regret analysis of Adversarial Bilateral Trade [10.275531964940425]
We study sequential bilateral trade where sellers and buyers valuations are completely arbitrary.
We consider gain from trade which is harder to approximate than social welfare.
arXiv Detail & Related papers (2022-10-13T08:57:30Z) - Double Auctions with Two-sided Bandit Feedback [11.334374665364214]
Double Auction enables decentralized transfer of goods between multiple buyers and sellers.
We show with confidence bound based bidding, and Average Pricing' there is an efficient price discovery among the participants.
Our paper is the first to provide decentralized learning algorithms in a two-sided market where emphboth sides have uncertain preference that need to be learned.
arXiv Detail & Related papers (2022-08-13T01:03:34Z) - A Regret Analysis of Bilateral Trade [5.031063690574698]
We cast for the first time the bilateral trade problem in a regret minimization framework over rounds of seller/buyer interactions.
Our main contribution is a complete characterization of the regret regimes for fixedprice mechanisms with different models of feedback and private valuations.
arXiv Detail & Related papers (2021-02-16T08:53:17Z) - Adversarial Dueling Bandits [85.14061196945599]
We introduce the problem of regret in Adversarial Dueling Bandits.
The learner has to repeatedly choose a pair of items and observe only a relative binary win-loss' feedback for this pair.
Our main result is an algorithm whose $T$-round regret compared to the emphBorda-winner from a set of $K$ items.
arXiv Detail & Related papers (2020-10-27T19:09:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.