Fast Rate Learning in Stochastic First Price Bidding
- URL: http://arxiv.org/abs/2107.01835v1
- Date: Mon, 5 Jul 2021 07:48:52 GMT
- Title: Fast Rate Learning in Stochastic First Price Bidding
- Authors: Juliette Achddou (PSL, DI-ENS, VALDA ), Olivier Capp\'e (LTCI, VALDA
), Aur\'elien Garivier (UMPA-ENSL)
- Abstract summary: First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known.
Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: First-price auctions have largely replaced traditional bidding approaches
based on Vickrey auctions in programmatic advertising. As far as learning is
concerned, first-price auctions are more challenging because the optimal
bidding strategy does not only depend on the value of the item but also
requires some knowledge of the other bids. They have already given rise to
several works in sequential learning, many of which consider models for which
the value of the buyer or the opponents' maximal bid is chosen in an
adversarial manner. Even in the simplest settings, this gives rise to
algorithms whose regret grows as $\sqrt{T}$ with respect to the time horizon
$T$. Focusing on the case where the buyer plays against a stationary stochastic
environment, we show how to achieve significantly lower regret: when the
opponents' maximal bid distribution is known we provide an algorithm whose
regret can be as low as $\log^2(T)$; in the case where the distribution must be
learnt sequentially, a generalization of this algorithm can achieve $T^{1/3+
\epsilon}$ regret, for any $\epsilon>0$. To obtain these results, we introduce
two novel ideas that can be of interest in their own right. First, by
transposing results obtained in the posted price setting, we provide conditions
under which the first-price biding utility is locally quadratic around its
optimum. Second, we leverage the observation that, on small sub-intervals, the
concentration of the variations of the empirical distribution function may be
controlled more accurately than by using the classical
Dvoretzky-Kiefer-Wolfowitz inequality. Numerical simulations confirm that our
algorithms converge much faster than alternatives proposed in the literature
for various bid distributions, including for bids collected on an actual
programmatic advertising platform.
Related papers
- No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints [0.9694940903078658]
We study the problem of designing online autobidding algorithms to optimize value subject to ROI and budget constraints.
Our main result is an algorithm with full information feedback that guarantees a near-optimal $tilde O(sqrt T)$ regret with respect to the best Lipschitz function.
arXiv Detail & Related papers (2024-04-15T14:31:53Z) - Online Learning in Contextual Second-Price Pay-Per-Click Auctions [47.06746975822902]
We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads.
The learner's goal is to minimize her regret, defined as the gap between her total revenue and that of an oracle strategy.
arXiv Detail & Related papers (2023-10-08T07:04:22Z) - Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform.
We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z) - Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions [42.002983450368134]
We study the question of how to bid in first-price auctions.
Unlike in second-price auctions, bidding one's private value truthfully is no longer optimal.
We consider two types of hints: one where a single point-prediction is available, and the other where a hint interval is available.
arXiv Detail & Related papers (2022-11-05T19:20:53Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - Efficient Algorithms for Stochastic Repeated Second-price Auctions [0.0]
We develop efficient sequential bidding strategies for repeated auctions.
We provide the first parametric lower bound for this problem.
We propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm.
arXiv Detail & Related papers (2020-11-10T12:45:02Z) - A novel auction system for selecting advertisements in Real-Time bidding [68.8204255655161]
Real-Time Bidding is a new Internet advertising system that has become very popular in recent years.
We propose an alternative betting system with a new approach that not only considers the economic aspect but also other relevant factors for the functioning of the advertising system.
arXiv Detail & Related papers (2020-10-22T18:36:41Z) - Learning to Bid Optimally and Efficiently in Adversarial First-price
Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret.
We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z) - Optimal No-regret Learning in Repeated First-price Auctions [38.908235632001116]
We study online learning in repeated first-price auctions.
We develop the first learning algorithm that achieves a near-optimal $widetildeO(sqrtT)$ regret bound.
arXiv Detail & Related papers (2020-03-22T03:32:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.