Efficient Algorithms for Stochastic Repeated Second-price Auctions
- URL: http://arxiv.org/abs/2011.05072v2
- Date: Fri, 26 Feb 2021 08:04:20 GMT
- Title: Efficient Algorithms for Stochastic Repeated Second-price Auctions
- Authors: Juliette Achddou (VALDA), Olivier Capp\'e (VALDA), Aur\'elien Garivier
(UMPA-ENSL)
- Abstract summary: We develop efficient sequential bidding strategies for repeated auctions.
We provide the first parametric lower bound for this problem.
We propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing efficient sequential bidding strategies for repeated auctions is
an important practical challenge in various marketing tasks. In this setting,
the bidding agent obtains information, on both the value of the item at sale
and the behavior of the other bidders, only when she wins the auction. Standard
bandit theory does not apply to this problem due to the presence of
action-dependent censoring. In this work, we consider second-price auctions and
propose novel, efficient UCB-like algorithms for this task. These algorithms
are analyzed in the stochastic setting, assuming regularity of the distribution
of the opponents' bids. We provide regret upper bounds that quantify the
improvement over the baseline algorithm proposed in the literature. The
improvement is particularly significant in cases when the value of the
auctioned item is low, yielding a spectacular reduction in the order of the
worst-case regret. We further provide the first parametric lower bound for this
problem that applies to generic UCB-like strategies. As an alternative, we
propose more explainable strategies which are reminiscent of the Explore Then
Commit bandit algorithm. We provide a critical analysis of this class of
strategies, showing both important advantages and limitations. In particular,
we provide a minimax lower bound and propose a nearly minimax-optimal instance
of this class.
Related papers
- Procurement Auctions via Approximately Optimal Submodular Optimization [53.93943270902349]
We study procurement auctions, where an auctioneer seeks to acquire services from strategic sellers with private costs.
Our goal is to design computationally efficient auctions that maximize the difference between the quality of the acquired services and the total cost of the sellers.
arXiv Detail & Related papers (2024-11-20T18:06:55Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM)
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm namely robust contextual dueling bandit (algo), which is based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - Strategically-Robust Learning Algorithms for Bidding in First-Price Auctions [11.988955088595858]
Learning to bid in repeated first-price auctions is a fundamental problem at the interface of game theory and machine learning.
We propose a novel concave formulation for pure-strategy bidding in first-price auctions, and use it to analyze natural Gradient-Ascent-based algorithms for this problem.
arXiv Detail & Related papers (2024-02-12T01:33:33Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - Dynamic Budget Throttling in Repeated Second-Price Auctions [11.823210231891517]
throttling is a popular choice, managing an advertiser's total expenditure by selecting only a subset of auctions to participate in.
This paper provides a theoretical panorama of a single advertiser's dynamic budget throttling process in repeated second-price auctions.
Our results bridge the gaps in theoretical research of throttling in repeated auctions and reveal the ability of this popular budget-smoothing strategy.
arXiv Detail & Related papers (2022-07-11T08:12:02Z) - Fast Rate Learning in Stochastic First Price Bidding [0.0]
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known.
Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
arXiv Detail & Related papers (2021-07-05T07:48:52Z) - Learning over no-Preferred and Preferred Sequence of items for Robust
Recommendation [66.8722561224499]
We propose a theoretically founded sequential strategy for training large-scale Recommender Systems (RS) over implicit feedback.
We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach.
arXiv Detail & Related papers (2020-12-12T22:10:15Z) - Learning to Bid Optimally and Efficiently in Adversarial First-price
Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret.
We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z) - Adaptive Discretization for Adversarial Lipschitz Bandits [85.39106976861702]
Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces.
A central theme here is the adaptive discretization of the action space, which gradually zooms in'' on the more promising regions.
We provide the first algorithm for adaptive discretization in the adversarial version, and derive instance-dependent regret bounds.
arXiv Detail & Related papers (2020-06-22T16:06:25Z) - Optimal No-regret Learning in Repeated First-price Auctions [38.908235632001116]
We study online learning in repeated first-price auctions.
We develop the first learning algorithm that achieves a near-optimal $widetildeO(sqrtT)$ regret bound.
arXiv Detail & Related papers (2020-03-22T03:32:09Z) - Adversarial Attacks on Linear Contextual Bandits [87.08004581867537]
Malicious agents may have incentives to attack the bandit algorithm to induce it to perform a desired behavior.
We show that a malicious agent can force a linear contextual bandit algorithm to pull any desired arm $T - o(T)$ times over a horizon of $T$ steps.
We also investigate the case when a malicious agent is interested in affecting the behavior of the bandit algorithm in a single context.
arXiv Detail & Related papers (2020-02-10T15:04:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.