Related papers: Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Sequential Auctions under Unknown Environments

Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Sequential Auctions under Unknown Environments

URL: http://arxiv.org/abs/2506.19038v1
Date: Mon, 23 Jun 2025 18:52:32 GMT
Title: Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Sequential Auctions under Unknown Environments
Authors: Vincent Leon, S. Rasoul Etesami,
Abstract summary: We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments.<n>We develop an online reinforcement learning algorithm for the seller to learn the underlying MDP model.<n>We show that the learned online mechanism achieves efficiency, truthfulness, and rationality with arbitrarily high probability and guaranteed performance in terms of various notions of regret.
Score: 0.9208007322096533
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments, where the underlying market and, thus, the bidders' values vary over time as interactions between the seller and the bidders progress. We model the sequential auctions as an infinite-horizon average-reward Markov decision process (MDP), where the transition kernel and reward functions are unknown to the seller. In each round, the seller determines an allocation and a payment for each bidder. Each bidder receives a private reward and submits a sealed bid to the seller. The state, which represents the underlying market, evolves according to an unknown transition kernel and the seller's allocation policy. Unlike existing works that formulate the problem as a multi-armed bandit model or as an episodic MDP, where the environment resets to an initial state after each round or episode, our paper considers a more realistic and sophisticated setting in which the market continues to evolve without restarting. We first extend the Vickrey-Clarke-Groves (VCG) mechanism, which is known to be efficient, truthful, and individually rational for one-shot static auctions, to sequential auctions, thereby obtaining a dynamic VCG mechanism counterpart that preserves these desired properties. We then focus on the online setting and develop an online reinforcement learning algorithm for the seller to learn the underlying MDP model and implement a mechanism that closely resembles the dynamic VCG mechanism. We show that the learned online mechanism asymptotically converges to a dynamic mechanism that approximately satisfies efficiency, truthfulness, and individual rationality with arbitrarily high probability and achieves guaranteed performance in terms of various notions of regret.

Related papers

Generative Large-Scale Pre-trained Models for Automated Ad Bidding Optimization [5.460538555236247]
We propose GRAD (Generative Reward-driven Ad-bidding with Mixture-of-Experts), a scalable foundation model for auto-bidding.<n>We show that GRAD significantly enhances platform revenue, highlighting its effectiveness in addressing the evolving and diverse requirements of modern advertisers.
arXiv Detail & Related papers (2025-08-04T02:46:18Z)
Situationally-Aware Dynamics Learning [57.698553219660376]
We propose a novel framework for online learning of hidden state representations.<n>Our approach explicitly models the influence of unobserved parameters on both transition dynamics and reward structures.<n>Experiments in both simulation and real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.
arXiv Detail & Related papers (2025-05-26T06:40:11Z)
Online Auction Design Using Distribution-Free Uncertainty Quantification with Applications to E-Commerce [6.241187362917031]
We introduce the Conformal Online Auction Design (COAD), a novel mechanism that maximizes revenue by quantifying uncertainty in bidder values without relying on known distributions.<n>COAD incorporates both bidder and item features, using historical data to design an incentive-compatible mechanism for online auctions.<n>We demonstrate the practical effectiveness of COAD through an application to real-world eBay auction data.
arXiv Detail & Related papers (2024-05-11T15:28:25Z)
Automated Deterministic Auction Design with Objective Decomposition [31.918952529696885]
This paper introduces OD-VVCA, an objective decomposition approach for automated designing Virtual Valuations Combinatorial Auctions (VVCAs) We utilize a parallelizable dynamic programming algorithm to compute the allocation and revenue outcomes of a VVCA efficiently. We then decompose the revenue objective function into continuous and piecewise constant discontinuous components, optimizing each using distinct methods.
arXiv Detail & Related papers (2024-02-19T07:45:04Z)
Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [52.75161794035767]
We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously.<n>We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
arXiv Detail & Related papers (2023-12-13T06:54:49Z)
Refined Mechanism Design for Approximately Structured Priors via Active Regression [50.71772232237571]
We consider the problem of a revenue-maximizing seller with a large number of items for sale to $n$ strategic bidders. It is well-known that optimal and even approximately-optimal mechanisms for this setting are notoriously difficult to characterize or compute.
arXiv Detail & Related papers (2023-10-11T20:34:17Z)
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms. Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach [123.55983746427572]
We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Grove (VCG) mechanism over multiple rounds of interaction.<n>A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
arXiv Detail & Related papers (2022-02-25T16:17:23Z)
Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising [28.343122250701498]
In e-commerce advertising, the ad platform usually relies on auction mechanisms to optimize different performance metrics, such as user experience, advertiser utility, and platform revenue. We propose a new mechanism called Deep GSP auction, which leverages deep learning to design new rank score functions within the celebrated GSP auction framework.
arXiv Detail & Related papers (2020-12-05T02:51:11Z)
Certifying Strategyproof Auction Networks [53.37051312298459]
We focus on the RegretNet architecture, which can represent auctions with arbitrary numbers of items and participants. We propose ways to explicitly verify strategyproofness under a particular valuation profile using techniques from the neural network verification literature.
arXiv Detail & Related papers (2020-06-15T20:22:48Z)
VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values. We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism. Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.