Learning Utilities and Equilibria in Non-Truthful Auctions
- URL: http://arxiv.org/abs/2007.01722v3
- Date: Mon, 31 Oct 2022 18:04:25 GMT
- Title: Learning Utilities and Equilibria in Non-Truthful Auctions
- Authors: Hu Fu, Tao Lin
- Abstract summary: We show that $\tilde O(n / \epsilon^2)$ samples from the prior with $n$ agents suffice for an algorithm to learn the interim utilities for all bidding strategies.
We also consider a setting where agents must pay a search cost to discover their own types.
- Score: 11.682943354386868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In non-truthful auctions, agents' utility for a strategy depends on the
strategies of the opponents and also the prior distribution over their private
types; the set of Bayes Nash equilibria generally has an intricate dependence
on the prior. Using the First Price Auction as our main demonstrating example,
we show that $\tilde O(n / \epsilon^2)$ samples from the prior with $n$ agents
suffice for an algorithm to learn the interim utilities for all monotone
bidding strategies. As a consequence, this number of samples suffices for
learning all approximate equilibria. We give an almost matching (up to polylog
factors) lower bound on the sample complexity of learning utilities. We also
consider a setting where agents must pay a search cost to discover their own
types. Drawing on a connection between this setting and the first price
auction, discovered recently by Kleinberg et al. (2016), we show that $\tilde
O(n / \epsilon^2)$ samples suffice for utilities and equilibria to be estimated
in a near welfare-optimal descending auction in this setting. En route, we
improve the sample complexity bound, recently obtained by Guo et al. (2021),
for the Pandora's Box problem, which is a classical model for sequential
consumer search.
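The abstract's central object, the interim utility of a bid given opponents' strategies and the prior, can be estimated from prior samples by simple Monte Carlo. The sketch below is illustrative only (the function names, the uniform prior, and the half-value opponent strategy are assumptions, not the paper's construction): it estimates Pr[win] times the winner's surplus in a first-price auction.

```python
import random

def interim_utility(value, bid, opponent_strategy, prior_sampler,
                    n_opponents, n_samples=10000, seed=0):
    """Monte Carlo estimate of interim utility in a first-price auction:
    Pr[win] * (value - bid), where each opponent draws a type from the
    prior and bids according to opponent_strategy."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        opp_bids = [opponent_strategy(prior_sampler(rng))
                    for _ in range(n_opponents)]
        if bid > max(opp_bids):  # ties broken against us, for simplicity
            wins += 1
    return (wins / n_samples) * (value - bid)

# Example: uniform [0,1) prior, two opponents each bidding half their value.
u = interim_utility(value=0.8, bid=0.3,
                    opponent_strategy=lambda v: v / 2,
                    prior_sampler=lambda rng: rng.random(),
                    n_opponents=2)
```

With two opponents bidding $v/2$ under a uniform prior, the win probability of a 0.3 bid is $0.6^2 = 0.36$, so the estimate concentrates near $0.36 \times 0.5 = 0.18$; the paper's result bounds how many such prior samples are needed for all monotone strategies simultaneously.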
Related papers
- Learning to Coordinate Bidders in Non-Truthful Auctions [6.3923058661534276]
We study the complexity of learning Bayes correlated equilibria in non-truthful auctions. We prove that the BCEs can be learned with $\tilde O(n/\varepsilon^2)$ samples from the bidders' value distributions.
arXiv Detail & Related papers (2025-07-03T17:03:14Z) - Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits [49.96531901205305]
We analyze $f$-divergence-regularized offline policy learning. For reverse Kullback-Leibler (KL) divergence, we give the first $\tilde O(\epsilon^{-1})$ sample complexity under single-policy concentrability. We extend our analysis to dueling bandits, and we believe these results take a significant step toward a comprehensive understanding of $f$-divergence-regularized policy learning.
arXiv Detail & Related papers (2025-02-09T22:14:45Z) - Online Combinatorial Allocations and Auctions with Few Samples [9.675397292814122]
This paper investigates the feasibility of achieving O(1)-competitive algorithms under the realistic constraint of having access to only a limited number of samples from the underlying bidder distributions.
Our first main contribution shows that a mere single sample from each bidder distribution is sufficient to yield an O(1)-competitive algorithm for submodular/XOS valuations.
arXiv Detail & Related papers (2024-09-17T11:43:55Z) - Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct.
Our main contribution is in designing a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z) - Improved classical shadows from local symmetries in the Schur basis [4.462208715451194]
We study the sample complexity of the classical shadows task.
We present the first joint measurement protocol for classical shadows whose sample complexity scales with the rank of the unknown state.
arXiv Detail & Related papers (2024-05-15T17:33:10Z) - Improved Regret for Efficient Online Reinforcement Learning with Linear
Function Approximation [69.0695698566235]
We study reinforcement learning with linear function approximation and adversarially changing cost functions.
We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback.
arXiv Detail & Related papers (2023-01-30T17:26:39Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - Fast Rate Learning in Stochastic First Price Bidding [0.0]
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
We show how to achieve significantly lower regret when the opponents' maximal bid distribution is known.
Our algorithms converge much faster than alternatives proposed in the literature for various bid distributions.
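When the distribution of the opponents' maximal bid is known, as in the setting this entry assumes, the utility-maximizing bid simply maximizes $(v-b)\,F(b)$. A minimal grid-search sketch (the function name and discretization are illustrative, not the paper's algorithm):

```python
def best_bid(value, max_bid_cdf, grid_size=1000):
    """Grid search for the bid maximizing expected first-price utility
    (value - b) * F(b), where F is the CDF of the opponents' maximal bid."""
    best_b, best_u = 0.0, 0.0
    for k in range(grid_size + 1):
        b = value * k / grid_size  # bidding above value is never profitable
        u = (value - b) * max_bid_cdf(b)
        if u > best_u:
            best_b, best_u = b, u
    return best_b, best_u

# Example: maximal opponent bid uniform on [0,1], own value 1.0.
# (1 - b) * b is maximized at b = 0.5, giving utility 0.25.
b, u = best_bid(1.0, lambda x: min(x, 1.0))
```

The learning problem the paper studies replaces the known CDF with an empirical one built from observed bids, which is why the convergence rate of the estimate matters.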
arXiv Detail & Related papers (2021-07-05T07:48:52Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal
Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $\tilde O(|S||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z) - A Provably Efficient Sample Collection Strategy for Reinforcement
Learning [123.69175280309226]
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.
We propose to tackle the exploration-exploitation problem following a decoupled approach composed of: 1) An "objective-specific" algorithm that prescribes how many samples to collect at which states, as if it has access to a generative model (i.e., sparse simulator of the environment); 2) An "objective-agnostic" sample collection responsible for generating the prescribed samples as fast as possible.
arXiv Detail & Related papers (2020-07-13T15:17:35Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z) - The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.