Learning in Repeated Multi-Unit Pay-As-Bid Auctions
- URL: http://arxiv.org/abs/2307.15193v2
- Date: Mon, 15 Jul 2024 19:58:56 GMT
- Title: Learning in Repeated Multi-Unit Pay-As-Bid Auctions
- Authors: Rigel Galgana, Negin Golrezaei,
- Abstract summary: We consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions.
The problem of learning how to bid in pay-as-bid auctions is challenging according to the nature of the action space.
We show that the optimal solution to the offline problem can be obtained using a time dynamic programming scheme.
- Score: 3.6294895527930504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, and Procurement Auctions, which all involve the auctioning of homogeneous multiple units, we consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions. In each of these auctions, a large number of (identical) items are to be allocated to the largest submitted bids, where the price of each of the winning bids is equal to the bid itself. The problem of learning how to bid in pay-as-bid auctions is challenging due to the combinatorial nature of the action space. We overcome this challenge by focusing on the offline setting, where the bidder optimizes their vector of bids while only having access to the past submitted bids by other bidders. We show that the optimal solution to the offline problem can be obtained using a polynomial time dynamic programming (DP) scheme. We leverage the structure of the DP scheme to design online learning algorithms with polynomial time and space complexity under full information and bandit feedback settings. We achieve an upper bound on regret of $O(M\sqrt{T\log |\mathcal{B}|})$ and $O(M\sqrt{|\mathcal{B}|T\log |\mathcal{B}|})$ respectively, where $M$ is the number of units demanded by the bidder, $T$ is the total number of auctions, and $|\mathcal{B}|$ is the size of the discretized bid space. We accompany these results with a regret lower bound, which match the linear dependency in $M$. Our numerical results suggest that when all agents behave according to our proposed no regret learning algorithms, the resulting market dynamics mainly converge to a welfare maximizing equilibrium where bidders submit uniform bids. Lastly, our experiments demonstrate that the pay-as-bid auction consistently generates significantly higher revenue compared to its popular alternative, the uniform price auction.
Related papers
- Fair Allocation in Dynamic Mechanism Design [57.66441610380448]
We consider a problem where an auctioneer sells an indivisible good to groups of buyers in every round, for a total of $T$ rounds.
The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group.
arXiv Detail & Related papers (2024-05-31T19:26:05Z) - Learning and Collusion in Multi-unit Auctions [17.727436775513368]
We consider repeated multi-unit auctions with uniform pricing.
We analyze the properties of this auction in both the offline and online settings.
We show that the $(K+1)$-st price format is susceptible to collusion among the bidders.
arXiv Detail & Related papers (2023-05-27T08:00:49Z) - Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform.
We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - No-regret Learning in Repeated First-Price Auctions with Budget
Constraints [5.834615090865286]
We propose an RL-based bidding algorithm against the optimal non-anticipating strategy under stationary competition.
Our algorithm obtains $widetilde O(sqrt T)$-regret if the bids are all revealed at the end of each round.
arXiv Detail & Related papers (2022-05-29T04:32:05Z) - Fast Rate Learning in Stochastic First Price Bidding [0.0]
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known.
Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
arXiv Detail & Related papers (2021-07-05T07:48:52Z) - ProportionNet: Balancing Fairness and Revenue for Auction Design with
Deep Learning [55.76903822619047]
We study the design of revenue-maximizing auctions with strong incentive guarantees.
We extend techniques for approximating auctions using deep learning to address concerns of fairness while maintaining high revenue and strong incentive guarantees.
arXiv Detail & Related papers (2020-10-13T13:54:21Z) - Learning to Bid Optimally and Efficiently in Adversarial First-price
Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret.
We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z) - Optimal No-regret Learning in Repeated First-price Auctions [38.908235632001116]
We study online learning in repeated first-price auctions.
We develop the first learning algorithm that achieves a near-optimal $widetildeO(sqrtT)$ regret bound.
arXiv Detail & Related papers (2020-03-22T03:32:09Z) - Online Joint Bid/Daily Budget Optimization of Internet Advertising
Campaigns [115.96295568115251]
We study the problem of automating the online joint bid/daily budget optimization of pay-per-click advertising campaigns over multiple channels.
For every campaign, we capture the dependency of the number of clicks on the bid and daily budget by Gaussian Processes.
We design four algorithms and show that they suffer from a regret that is upper bounded with high probability as O(sqrtT)
We present the results of the adoption of our algorithms in a real-world application with a daily average spent of 1,000 Euros for more than one year.
arXiv Detail & Related papers (2020-03-03T11:07:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.