Learning and Collusion in Multi-unit Auctions
- URL: http://arxiv.org/abs/2305.17402v2
- Date: Fri, 12 Jan 2024 23:04:46 GMT
- Title: Learning and Collusion in Multi-unit Auctions
- Authors: Simina Br\^anzei and Mahsa Derakhshan and Negin Golrezaei and Yanjun
Han
- Abstract summary: We consider repeated multi-unit auctions with uniform pricing.
We analyze the properties of this auction in both the offline and online settings.
We show that the $(K+1)$-st price format is susceptible to collusion among the bidders.
- Score: 17.727436775513368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider repeated multi-unit auctions with uniform pricing, which are
widely used in practice for allocating goods such as carbon licenses. In each
round, $K$ identical units of a good are sold to a group of buyers that have
valuations with diminishing marginal returns. The buyers submit bids for the
units, and then a price $p$ is set per unit so that all the units are sold. We
consider two variants of the auction, where the price is set to the $K$-th
highest bid and $(K+1)$-st highest bid, respectively.
We analyze the properties of this auction in both the offline and online
settings. In the offline setting, we consider the problem that one player $i$
is facing: given access to a data set that contains the bids submitted by
competitors in past auctions, find a bid vector that maximizes player $i$'s
cumulative utility on the data set. We design a polynomial time algorithm for
this problem, by showing it is equivalent to finding a maximum-weight path on a
carefully constructed directed acyclic graph.
In the online setting, the players run learning algorithms to update their
bids as they participate in the auction over time. Based on our offline
algorithm, we design efficient online learning algorithms for bidding. The
algorithms have sublinear regret, under both full information and bandit
feedback structures. We complement our online learning algorithms with regret
lower bounds.
Finally, we analyze the quality of the equilibria in the worst case through
the lens of the core solution concept in the game among the bidders. We show
that the $(K+1)$-st price format is susceptible to collusion among the bidders;
meanwhile, the $K$-th price format does not have this issue.
Related papers
- Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM)
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm namely robust contextual dueling bandit (algo), which is based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints [0.9694940903078658]
We study the problem of designing online autobidding algorithms to optimize value subject to ROI and budget constraints.
Our main result is an algorithm with full information feedback that guarantees a near-optimal $tilde O(sqrt T)$ regret with respect to the best Lipschitz function.
arXiv Detail & Related papers (2024-04-15T14:31:53Z) - Online Learning in Contextual Second-Price Pay-Per-Click Auctions [47.06746975822902]
We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads.
The learner's goal is to minimize her regret, defined as the gap between her total revenue and that of an oracle strategy.
arXiv Detail & Related papers (2023-10-08T07:04:22Z) - Learning in Repeated Multi-Unit Pay-As-Bid Auctions [3.6294895527930504]
We consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions.
The problem of learning how to bid in pay-as-bid auctions is challenging according to the nature of the action space.
We show that the optimal solution to the offline problem can be obtained using a time dynamic programming scheme.
arXiv Detail & Related papers (2023-07-27T20:49:28Z) - Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform.
We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - No-regret Learning in Repeated First-Price Auctions with Budget
Constraints [5.834615090865286]
We propose an RL-based bidding algorithm against the optimal non-anticipating strategy under stationary competition.
Our algorithm obtains $widetilde O(sqrt T)$-regret if the bids are all revealed at the end of each round.
arXiv Detail & Related papers (2022-05-29T04:32:05Z) - No-Regret Learning in Partially-Informed Auctions [85.67897346422122]
We study a machine learning formulation of auctions with partially-revealed information.
In each round, a new item is drawn from an unknown distribution and the platform publishes a price together with incomplete, "masked" information about the item.
We show that when the distribution over items is known to the buyer and the mask is a SimHash function mapping $mathbbRd$ to $0,1ell$, our algorithm has regret $tilde mathcalO((Tdell)frac12)$.
arXiv Detail & Related papers (2022-02-22T01:15:51Z) - Fast Rate Learning in Stochastic First Price Bidding [0.0]
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known.
Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
arXiv Detail & Related papers (2021-07-05T07:48:52Z) - Online Markov Decision Processes with Aggregate Bandit Feedback [74.85532145498742]
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics.
In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy chosen for the episode, and observes aggregate bandit feedback.
Our main result is a computationally efficient algorithm with $O(sqrtK)$ regret for this setting, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-31T16:49:07Z) - Learning to Bid Optimally and Efficiently in Adversarial First-price
Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret.
We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.