Learning in Repeated Multi-Unit Pay-As-Bid Auctions
- URL: http://arxiv.org/abs/2307.15193v3
- Date: Sat, 09 Nov 2024 19:30:46 GMT
- Title: Learning in Repeated Multi-Unit Pay-As-Bid Auctions
- Authors: Rigel Galgana, Negin Golrezaei,
- Abstract summary: We study the problem of bidding strategies in pay-as-bid (PAB) auctions from the perspective of single bidder.
We show that a utility trick enables a time algorithm to solve the offline problem where competing bids are known in advance.
We also present additional findings on the characterization of PAB equilibria.
- Score: 3.6294895527930504
- License:
- Abstract: Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, Procurement Auctions, and Wholesale Electricity Markets, which all involve the auctioning of homogeneous multiple units, we consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions. In each of these auctions, a large number of (identical) items are to be allocated to the largest submitted bids, where the price of each of the winning bids is equal to the bid itself. In this work, we study the problem of optimizing bidding strategies from the perspective of a single bidder. Effective bidding in pay-as-bid (PAB) auctions is complex due to the combinatorial nature of the action space. We show that a utility decoupling trick enables a polynomial time algorithm to solve the offline problem where competing bids are known in advance. Leveraging this structure, we design efficient algorithms for the online problem under both full information and bandit feedback settings that achieve an upper bound on regret of $O(M \sqrt{T \log T})$ and $O(M T^{\frac{2}{3}} \sqrt{\log T})$ respectively, where $M$ is the number of units demanded by the bidder and $T$ is the total number of auctions. We accompany these results with a regret lower bound of $\Omega(M\sqrt{T})$ for the full information setting and $\Omega (M^{2/3}T^{2/3})$ for the bandit setting. We also present additional findings on the characterization of PAB equilibria. While the Nash equilibria of PAB auctions possess nice properties such as winning bid uniformity and high welfare \& revenue, they are not guaranteed under no regret learning dynamics. Nevertheless, our simulations suggest these properties hold anyways, regardless of Nash equilibrium existence. Compared to its uniform price counterpart, the PAB dynamics converge faster and achieve higher revenue, making PAB appealing whenever revenue holds significant social value.
Related papers
- Procurement Auctions via Approximately Optimal Submodular Optimization [53.93943270902349]
We study procurement auctions, where an auctioneer seeks to acquire services from strategic sellers with private costs.
Our goal is to design computationally efficient auctions that maximize the difference between the quality of the acquired services and the total cost of the sellers.
arXiv Detail & Related papers (2024-11-20T18:06:55Z) - Randomized Truthful Auctions with Learning Agents [10.39657928150242]
We study a setting where agents use no-regret learning to participate in repeated auctions.
We show that when bidders participate in second-price auctions using no-regret bidding algorithms, the runner-up bidder may not converge to bidding truthfully.
We define a notion of em auctioneer regret comparing the revenue generated to the revenue of a second price auction with bids.
arXiv Detail & Related papers (2024-11-14T15:28:40Z) - Fair Allocation in Dynamic Mechanism Design [57.66441610380448]
We consider a problem where an auctioneer sells an indivisible good to groups of buyers in every round, for a total of $T$ rounds.
The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group.
arXiv Detail & Related papers (2024-05-31T19:26:05Z) - No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints [0.9694940903078658]
We study the problem of designing online autobidding algorithms to optimize value subject to ROI and budget constraints.
Our main result is an algorithm with full information feedback that guarantees a near-optimal $tilde O(sqrt T)$ regret with respect to the best Lipschitz function.
arXiv Detail & Related papers (2024-04-15T14:31:53Z) - Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novelgreedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $tin [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z) - Learning and Collusion in Multi-unit Auctions [17.727436775513368]
We consider repeated multi-unit auctions with uniform pricing.
We analyze the properties of this auction in both the offline and online settings.
We show that the $(K+1)$-st price format is susceptible to collusion among the bidders.
arXiv Detail & Related papers (2023-05-27T08:00:49Z) - Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform.
We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - ProportionNet: Balancing Fairness and Revenue for Auction Design with
Deep Learning [55.76903822619047]
We study the design of revenue-maximizing auctions with strong incentive guarantees.
We extend techniques for approximating auctions using deep learning to address concerns of fairness while maintaining high revenue and strong incentive guarantees.
arXiv Detail & Related papers (2020-10-13T13:54:21Z) - Optimal No-regret Learning in Repeated First-price Auctions [38.908235632001116]
We study online learning in repeated first-price auctions.
We develop the first learning algorithm that achieves a near-optimal $widetildeO(sqrtT)$ regret bound.
arXiv Detail & Related papers (2020-03-22T03:32:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.