Related papers: An $α$-regret analysis of Adversarial Bilateral Trade

An $α$-regret analysis of Adversarial Bilateral Trade

URL: http://arxiv.org/abs/2210.06846v2
Date: Thu, 10 Oct 2024 15:27:18 GMT
Title: An $α$-regret analysis of Adversarial Bilateral Trade
Authors: Yossi Azar, Amos Fiat, Federico Fusco,
Abstract summary: We study sequential bilateral trade where sellers and buyers valuations are completely arbitrary. We consider gain from trade which is harder to approximate than social welfare.
Score: 10.275531964940425
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study sequential bilateral trade where sellers and buyers valuations are completely arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with private valuations for the good and the goal is to design a mechanism that maximizes efficiency (or gain from trade) while being incentive compatible, individually rational and budget balanced. In this paper we consider gain from trade which is harder to approximate than social welfare. We consider a variety of feedback scenarios and distinguish the cases where the mechanism posts one price and when it can post different prices for buyer and seller. We show several surprising results about the separation between the different scenarios. In particular we show that (a) it is impossible to achieve sublinear $\alpha$-regret for any $\alpha<2$, (b) but with full feedback sublinear $2$-regret is achievable (c) with a single price and partial feedback one cannot get sublinear $\alpha$ regret for any constant $\alpha$ (d) nevertheless, posting two prices even with one-bit feedback achieves sublinear $2$-regret, and (e) there is a provable separation in the $2$-regret bounds between full and partial feedback.

Related papers

Online Fair Division for Personalized $2$-Value Instances [51.278096593080456]
We study an online fair division setting, where goods arrive one at a time and there is a fixed set of $n$ agents.<n>Once a good appears, the value each agent has for it is revealed and it must be allocated immediately and irrevocably to one of the agents.<n>We show how to obtain worst case guarantees with respect to well-known fairness notions.
arXiv Detail & Related papers (2025-05-28T09:48:16Z)
Near-optimal Regret Using Policy Optimization in Online MDPs with Aggregate Bandit Feedback [49.84060509296641]
We study online finite-horizon Markov Decision Processes with adversarially changing loss and aggregate bandit feedback (a.k.a full-bandit) Under this type of feedback, the agent observes only the total loss incurred over the entire trajectory, rather than the individual losses at each intermediate step within the trajectory. We introduce the first Policy Optimization algorithms for this setting.
arXiv Detail & Related papers (2025-02-06T12:03:24Z)
Reinforcement Learning with Segment Feedback [56.54271464134885]
We consider a model named RL with segment feedback, which offers a general paradigm filling the gap between per-state-action feedback and trajectory feedback. Under binary feedback, increasing the number of segments $m$ decreases the regret at an exponential rate; in contrast, surprisingly, under sum feedback, increasing $m$ does not reduce the regret significantly.
arXiv Detail & Related papers (2025-02-03T23:08:42Z)
Market Making without Regret [15.588799679661637]
We consider a sequential decision-making setting where, at every round $t$, a market maker posts a bid price $B_t$ and an ask price $A_t$ to an incoming trader. If the trader's valuation is lower than the bid price, or higher than the ask price, then a trade (sell or buy) occurs. We characterize the maker's regret with respect to the best fixed choice of bid and ask pairs.
arXiv Detail & Related papers (2024-11-21T10:13:55Z)
Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation. We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z)
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM) We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary. We propose an algorithm namely robust contextual dueling bandit (algo), which is based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z)
An Online Learning Theory of Brokerage [3.8059763597999012]
We investigate brokerage between traders from an online learning perspective. Unlike other bilateral trade problems already studied, we focus on the case where there are no designated buyer and seller roles. We show that the optimal rate degrades to $sqrtT$ in the first case, and the problem becomes unlearnable in the second.
arXiv Detail & Related papers (2023-10-18T17:01:32Z)
Repeated Bilateral Trade Against a Smoothed Adversary [5.939280057673226]
We study repeated bilateral trade where an adaptive $sigma$-smooth adversary generates the valuations of sellers and buyers. We provide a complete characterization of the regret regimes for fixed-price mechanisms under different feedback models.
arXiv Detail & Related papers (2023-02-21T16:30:10Z)
Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z)
A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions. From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders. Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z)
Minimax-Optimal Multi-Agent RL in Zero-Sum Markov Games With a Generative Model [50.38446482252857]
Two-player zero-sum Markov games are arguably the most basic setting in multi-agent reinforcement learning. We develop a learning algorithm that learns an $varepsilon$-approximate Markov NE policy using $$ widetildeObigg. We derive a refined regret bound for FTRL that makes explicit the role of variance-type quantities.
arXiv Detail & Related papers (2022-08-22T17:24:55Z)
Bilateral Trade: A Regret Minimization Perspective [5.031063690574698]
We cast the bilateral trade problem in a regret minimization framework over $T$ rounds of seller/buyer interactions. Our main contribution is a complete characterization of the regret regimes for fixed-price mechanisms with different feedback models and private valuations.
arXiv Detail & Related papers (2021-09-08T22:11:48Z)
A Regret Analysis of Bilateral Trade [5.031063690574698]
We cast for the first time the bilateral trade problem in a regret minimization framework over rounds of seller/buyer interactions. Our main contribution is a complete characterization of the regret regimes for fixedprice mechanisms with different models of feedback and private valuations.
arXiv Detail & Related papers (2021-02-16T08:53:17Z)
Dropout as a Regularizer of Interaction Effects [76.84531978621143]
Dropout is a regularizer against higher-order interactions. We prove this perspective analytically and empirically. We also find that it is difficult to obtain the same selective pressure against high-order interactions.
arXiv Detail & Related papers (2020-07-02T01:11:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.