Related papers: Strategically-Robust Learning Algorithms for Bidding in First-Price Auctions

Strategically-Robust Learning Algorithms for Bidding in First-Price Auctions

URL: http://arxiv.org/abs/2402.07363v2
Date: Sun, 7 Jul 2024 06:07:00 GMT
Title: Strategically-Robust Learning Algorithms for Bidding in First-Price Auctions
Authors: Rachitesh Kumar, Jon Schneider, Balasubramanian Sivan,
Abstract summary: Learning to bid in repeated first-price auctions is a fundamental problem at the interface of game theory and machine learning. We propose a novel concave formulation for pure-strategy bidding in first-price auctions, and use it to analyze natural Gradient-Ascent-based algorithms for this problem.
Score: 11.988955088595858
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning to bid in repeated first-price auctions is a fundamental problem at the interface of game theory and machine learning, which has seen a recent surge in interest due to the transition of display advertising to first-price auctions. In this work, we propose a novel concave formulation for pure-strategy bidding in first-price auctions, and use it to analyze natural Gradient-Ascent-based algorithms for this problem. Importantly, our analysis goes beyond regret, which was the typical focus of past work, and also accounts for the strategic backdrop of online-advertising markets where bidding algorithms are deployed -- we provide the first guarantees of strategic-robustness and incentive-compatibility for Gradient Ascent. Concretely, we show that our algorithms achieve $O(\sqrt{T})$ regret when the highest competing bids are generated adversarially, and show that no online algorithm can do better. We further prove that the regret reduces to $O(\log T)$ when the competition is stationary and stochastic, which drastically improves upon the previous best of $O(\sqrt{T})$. Moving beyond regret, we show that a strategic seller cannot exploit our algorithms to extract more revenue on average than is possible under the optimal mechanism. Finally, we prove that our algorithm is also incentive compatible -- it is a (nearly) dominant strategy for the buyer to report her values truthfully to the algorithm as a whole. Altogether, these guarantees make our algorithms the first to simultaneously achieve both optimal regret and strategic-robustness.

Related papers

Learning to Lead: Incentivizing Strategic Agents in the Dark [50.93875404941184]
We study an online learning version of the generalized principal-agent model.<n>We develop the first provably sample-efficient algorithm for this challenging setting.<n>We establish a near optimal $tildeO(sqrtT) $ regret bound for learning the principal's optimal policy.
arXiv Detail & Related papers (2025-06-10T04:25:04Z)
Computational Intractability of Strategizing against Online Learners [7.225039292056237]
We show that a standard no-regret algorithm can compute a near-optimal strategy against a learner. This establishes a fundamental computational barrier to finding optimal strategies in general game-theoretic settings.
arXiv Detail & Related papers (2025-03-06T08:28:50Z)
Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information [57.287431079644705]
We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers. We provide learning algorithms for the leader which achieve $O(T1/2)$ regret under bandit feedback.
arXiv Detail & Related papers (2025-01-31T22:40:57Z)
Deep Reinforcement Learning for Online Optimal Execution Strategies [49.1574468325115]
This paper tackles the challenge of learning non-Markovian optimal execution strategies in dynamic financial markets. We introduce a novel actor-critic algorithm based on Deep Deterministic Policy Gradient (DDPG) We show that our algorithm successfully approximates the optimal execution strategy.
arXiv Detail & Related papers (2024-10-17T12:38:08Z)
Learning Safe Strategies for Value Maximizing Buyers in Uniform Price Auctions [4.089889918897877]
We study the bidding problem in repeated uniform price multi-unit auctions from the perspective of a value-maximizing buyer.<n>We introduce the notion of safe bidding strategies as those that satisfy the RoI constraints irrespective of competing bids.<n>We show that these strategies satisfy a mild no-overbidding condition, depend only on the valuation curve of the bidder, and the bidder can focus on a finite subset without loss of generality.
arXiv Detail & Related papers (2024-06-06T01:29:47Z)
Online Learning in Contextual Second-Price Pay-Per-Click Auctions [47.06746975822902]
We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads. The learner's goal is to minimize her regret, defined as the gap between her total revenue and that of an oracle strategy.
arXiv Detail & Related papers (2023-10-08T07:04:22Z)
Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z)
No-regret Learning in Repeated First-Price Auctions with Budget Constraints [5.834615090865286]
We propose an RL-based bidding algorithm against the optimal non-anticipating strategy under stationary competition. Our algorithm obtains $widetilde O(sqrt T)$-regret if the bids are all revealed at the end of each round.
arXiv Detail & Related papers (2022-05-29T04:32:05Z)
Fast Rate Learning in Stochastic First Price Bidding [0.0]
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising. We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known. Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
arXiv Detail & Related papers (2021-07-05T07:48:52Z)
Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases. We first rigorously analyze a deterministic variant of the online threat model. We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
Provably Efficient Algorithms for Multi-Objective Competitive RL [54.22598924633369]
We study multi-objective reinforcement learning (RL) where an agent's reward is represented as a vector. In settings where an agent competes against opponents, its performance is measured by the distance of its average return vector to a target set. We develop statistically and computationally efficient algorithms to approach the associated target set.
arXiv Detail & Related papers (2021-02-05T14:26:00Z)
Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
adversary-resilient distributed optimization, in which. machines can independently compute gradients, and cooperate. Our algorithm is based on a new concentration technique, and its sample complexity. It is very practical: it improves upon the performance of all prior methods when no. setting machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
Efficient Algorithms for Stochastic Repeated Second-price Auctions [0.0]
We develop efficient sequential bidding strategies for repeated auctions. We provide the first parametric lower bound for this problem. We propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm.
arXiv Detail & Related papers (2020-11-10T12:45:02Z)
Learning to Bid Optimally and Efficiently in Adversarial First-price Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret. We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.