Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Unknown Environments
- URL: http://arxiv.org/abs/2506.19038v2
- Date: Thu, 23 Oct 2025 22:45:35 GMT
- Title: Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Unknown Environments
- Authors: Vincent Leon, S. Rasoul Etesami,
- Abstract summary: We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments.<n>We develop a reinforcement learning algorithm for the seller to learn the underlying Markov decision process.<n>We show that the learned mechanism approximately satisfies efficiency, truthfulness, and individual rationality.
- Score: 2.2344764434954256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments, where the underlying market and, thus, the bidders' values vary over time as interactions between the seller and the bidders progress. We model the sequential auctions as an infinite-horizon average-reward Markov decision process (MDP). In each round, the seller determines an allocation and sets a payment for each bidder, while each bidder receives a private reward and submits a sealed bid to the seller. The state, which represents the underlying market, evolves according to an unknown transition kernel and the seller's allocation policy without episodic resets. We first extend the Vickrey-Clarke-Groves (VCG) mechanism to sequential auctions, thereby obtaining a dynamic counterpart that preserves the desired properties: efficiency, truthfulness, and individual rationality. We then focus on the online setting and develop a reinforcement learning algorithm for the seller to learn the underlying MDP and implement a mechanism that closely resembles the dynamic VCG mechanism. We show that the learned mechanism approximately satisfies efficiency, truthfulness, and individual rationality and achieves guaranteed performance in terms of various notions of regret.
Related papers
- Generative Large-Scale Pre-trained Models for Automated Ad Bidding Optimization [5.460538555236247]
We propose GRAD (Generative Reward-driven Ad-bidding with Mixture-of-Experts), a scalable foundation model for auto-bidding.<n>We show that GRAD significantly enhances platform revenue, highlighting its effectiveness in addressing the evolving and diverse requirements of modern advertisers.
arXiv Detail & Related papers (2025-08-04T02:46:18Z) - Situationally-Aware Dynamics Learning [57.698553219660376]
We propose a novel framework for online learning of hidden state representations.<n>Our approach explicitly models the influence of unobserved parameters on both transition dynamics and reward structures.<n>Experiments in both simulation and real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.
arXiv Detail & Related papers (2025-05-26T06:40:11Z) - Online Auction Design Using Distribution-Free Uncertainty Quantification with Applications to E-Commerce [6.241187362917031]
We introduce the Conformal Online Auction Design (COAD), a novel mechanism that maximizes revenue by quantifying uncertainty in bidder values without relying on known distributions.<n>COAD incorporates both bidder and item features, using historical data to design an incentive-compatible mechanism for online auctions.<n>We demonstrate the practical effectiveness of COAD through an application to real-world eBay auction data.
arXiv Detail & Related papers (2024-05-11T15:28:25Z) - Automated Deterministic Auction Design with Objective Decomposition [31.918952529696885]
This paper introduces OD-VVCA, an objective decomposition approach for automated designing Virtual Valuations Combinatorial Auctions (VVCAs)
We utilize a parallelizable dynamic programming algorithm to compute the allocation and revenue outcomes of a VVCA efficiently.
We then decompose the revenue objective function into continuous and piecewise constant discontinuous components, optimizing each using distinct methods.
arXiv Detail & Related papers (2024-02-19T07:45:04Z) - Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [52.75161794035767]
We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously.<n>We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
arXiv Detail & Related papers (2023-12-13T06:54:49Z) - Refined Mechanism Design for Approximately Structured Priors via Active
Regression [50.71772232237571]
We consider the problem of a revenue-maximizing seller with a large number of items for sale to $n$ strategic bidders.
It is well-known that optimal and even approximately-optimal mechanisms for this setting are notoriously difficult to characterize or compute.
arXiv Detail & Related papers (2023-10-11T20:34:17Z) - Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline
Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z) - Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach [123.55983746427572]
We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Grove (VCG) mechanism over multiple rounds of interaction.<n>A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
arXiv Detail & Related papers (2022-02-25T16:17:23Z) - Optimizing Multiple Performance Metrics with Deep GSP Auctions for
E-commerce Advertising [28.343122250701498]
In e-commerce advertising, the ad platform usually relies on auction mechanisms to optimize different performance metrics, such as user experience, advertiser utility, and platform revenue.
We propose a new mechanism called Deep GSP auction, which leverages deep learning to design new rank score functions within the celebrated GSP auction framework.
arXiv Detail & Related papers (2020-12-05T02:51:11Z) - Certifying Strategyproof Auction Networks [53.37051312298459]
We focus on the RegretNet architecture, which can represent auctions with arbitrary numbers of items and participants.
We propose ways to explicitly verify strategyproofness under a particular valuation profile using techniques from the neural network verification literature.
arXiv Detail & Related papers (2020-06-15T20:22:48Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit
Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.