Nearly Tight Regret Bounds for Profit Maximization in Bilateral Trade
- URL: http://arxiv.org/abs/2509.22563v1
- Date: Fri, 26 Sep 2025 16:42:05 GMT
- Title: Nearly Tight Regret Bounds for Profit Maximization in Bilateral Trade
- Authors: Simone Di Gregorio, Paul Dütting, Federico Fusco, Chris Schwiegelshohn,
- Abstract summary: Bilateral trade models the task of intermediating between two strategic agents, a seller and a buyer, willing to trade a good for which they hold private valuations.<n>We study this problem from the perspective of a broker, in a regret minimization framework.<n>We propose a learning algorithm that guarantees a nearly tight $tildeO(sqrtT)$ regret when seller and valuations are drawn.
- Score: 14.182863671689836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bilateral trade models the task of intermediating between two strategic agents, a seller and a buyer, willing to trade a good for which they hold private valuations. We study this problem from the perspective of a broker, in a regret minimization framework. At each time step, a new seller and buyer arrive, and the broker has to propose a mechanism that is incentive-compatible and individually rational, with the goal of maximizing profit. We propose a learning algorithm that guarantees a nearly tight $\tilde{O}(\sqrt{T})$ regret in the stochastic setting when seller and buyer valuations are drawn i.i.d. from a fixed and possibly correlated unknown distribution. We further show that it is impossible to achieve sublinear regret in the non-stationary scenario where valuations are generated upfront by an adversary. Our ambitious benchmark for these results is the best incentive-compatible and individually rational mechanism. This separates us from previous works on efficiency maximization in bilateral trade, where the benchmark is a single number: the best fixed price in hindsight. A particular challenge we face is that uniform convergence for all mechanisms' profits is impossible. We overcome this difficulty via a careful chaining analysis that proves convergence for a provably near-optimal mechanism at (essentially) optimal rate. We further showcase the broader applicability of our techniques by providing nearly optimal results for the joint ads problem.
Related papers
- A Tight Regret Analysis of Non-Parametric Repeated Contextual Brokerage [8.049531918823758]
We study a contextual version of the repeated brokerage problem.<n>In each interaction, two traders with private valuations for an item seek to buy or sell based on the learner's-a broker-proposed price, which is informed by some contextual information.<n>The broker's goal is to maximize the traders' net utility-also known as the gain from trade-by minimizing regret compared to an oracle with perfect knowledge of traders' valuation distributions.
arXiv Detail & Related papers (2025-03-03T08:42:55Z) - Pareto-Optimality, Smoothness, and Stochasticity in Learning-Augmented One-Max-Search [20.536739487240318]
One-max search is a classic problem in online decision-making.<n>We show how to leverage the smoothness obtained to provide an analysis of one-max search in learning-augmented settings.
arXiv Detail & Related papers (2025-02-08T23:25:52Z) - Optimal Multi-Objective Best Arm Identification with Fixed Confidence [62.36929749450298]
We consider a multi-armed bandit setting in which each arm yields an $M$-dimensional vector reward upon selection.<n>The end goal is to identify the best arm of em every objective in the shortest (expected) time subject to an upper bound on the probability of error.<n>We propose an algorithm that uses the novel idea of em surrogate proportions to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step.
arXiv Detail & Related papers (2025-01-23T12:28:09Z) - Selling Joint Ads: A Regret Minimization Perspective [7.288063443108292]
Motivated by online retail, we consider the problem of selling one item (e.g., an ad slot) to two non-excludable buyers (say, a merchant and a brand)
This problem captures, for example, situations where a merchant and a brand bid cooperatively in an auction to advertise a product, and both benefit from the ad being shown.
A mechanism collects bids from the two and decides whether to allocate and which payments the two parties should make.
arXiv Detail & Related papers (2024-09-12T07:59:10Z) - Multi-Agent Imitation Learning: Value is Easy, Regret is Hard [52.31989962031179]
We study a multi-agent imitation learning (MAIL) problem where we take the perspective of a learner attempting to coordinate a group of agents.
Most prior work in MAIL essentially reduces the problem to matching the behavior of the expert within the support of the demonstrations.
While doing so is sufficient to drive the value gap between the learner and the expert to zero under the assumption that agents are non-strategic, it does not guarantee to deviations by strategic agents.
arXiv Detail & Related papers (2024-06-06T16:18:20Z) - Fair Allocation in Dynamic Mechanism Design [57.66441610380448]
We consider a problem where an auctioneer sells an indivisible good to groups of buyers in every round, for a total of $T$ rounds.
The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group.
arXiv Detail & Related papers (2024-05-31T19:26:05Z) - Learning to Maximize Gains From Trade in Small Markets [6.204762177970342]
We study the problem of designing a two-sided market (double auction) under the constraints of (dominant-strategy) incentive compatibility and budget-balance.
Our result is a general impossibility for the case of correlated distributions of values even between one seller and two buyers.
Our second is an efficient learning algorithm for one seller and two buyers in the case of independent distributions.
arXiv Detail & Related papers (2024-01-21T20:57:12Z) - Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [52.75161794035767]
We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously.<n>We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
arXiv Detail & Related papers (2023-12-13T06:54:49Z) - An $α$-regret analysis of Adversarial Bilateral Trade [10.275531964940425]
We study sequential bilateral trade where sellers and buyers valuations are completely arbitrary.
We consider gain from trade which is harder to approximate than social welfare.
arXiv Detail & Related papers (2022-10-13T08:57:30Z) - Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
Agent sequentially selects actions and, from the resulting system trajectory, aims at the best as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit
Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.