Scalable Neural Incentive Design with Parameterized Mean-Field Approximation
- URL: http://arxiv.org/abs/2510.21442v1
- Date: Fri, 24 Oct 2025 13:18:54 GMT
- Title: Scalable Neural Incentive Design with Parameterized Mean-Field Approximation
- Authors: Nathan Corecco, Batuhan Yardim, Vinzenz Thoma, Zebang Shen, Niao He,
- Abstract summary: We show that when dynamics and rewards are Lipschitz, the finite-$N$ ID objective is approximated by the PMFG at rate $mathscrO(frac1sqrtN)$.<n>We further introduce our Adjoint Mean-Field Incentive Design (AMID) algorithm, which uses explicit differentiation of iterated equilibrium operators to compute gradients efficiently.
- Score: 28.20524168049273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing incentives for a multi-agent system to induce a desirable Nash equilibrium is both a crucial and challenging problem appearing in many decision-making domains, especially for a large number of agents $N$. Under the exchangeability assumption, we formalize this incentive design (ID) problem as a parameterized mean-field game (PMFG), aiming to reduce complexity via an infinite-population limit. We first show that when dynamics and rewards are Lipschitz, the finite-$N$ ID objective is approximated by the PMFG at rate $\mathscr{O}(\frac{1}{\sqrt{N}})$. Moreover, beyond the Lipschitz-continuous setting, we prove the same $\mathscr{O}(\frac{1}{\sqrt{N}})$ decay for the important special case of sequential auctions, despite discontinuities in dynamics, through a tailored auction-specific analysis. Built on our novel approximation results, we further introduce our Adjoint Mean-Field Incentive Design (AMID) algorithm, which uses explicit differentiation of iterated equilibrium operators to compute gradients efficiently. By uniting approximation bounds with optimization guarantees, AMID delivers a powerful, scalable algorithmic tool for many-agent (large $N$) ID. Across diverse auction settings, the proposed AMID method substantially increases revenue over first-price formats and outperforms existing benchmark methods.
Related papers
- A Geometry-Aware Efficient Algorithm for Compositional Entropic Risk Minimization [42.094833715284416]
We study a family of problems termed $textbfcompositional entropic risk minimization.<n>Each data's loss is formulated as a Log-Expectation-Exponential (Log-E-Exp) function.<n>We propose a geometry-aware algorithm, termed $textbfSCENT$, for the dual formulation of entropic risk minimization.
arXiv Detail & Related papers (2026-02-02T22:33:49Z) - EVaR-Optimal Arm Identification in Bandits [7.340828059560291]
We study the fixed-confidence best arm identification problem within the multiarmed bandit (MAB) framework under the Entropic Value-at-Risk criterion.
arXiv Detail & Related papers (2025-10-06T11:49:56Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency [52.60557300927007]
We present a $textbfMA-OSMA$ algorithm to transfer the discrete submodular problem into a continuous optimization.<n>We also introduce a projection-free $textbfMA-OSEA$ algorithm, which effectively utilizes the KL divergence by mixing a uniform distribution.<n>Our algorithms significantly improve the $(frac11+c)$-approximation provided by the state-of-the-art OSG algorithm.
arXiv Detail & Related papers (2025-02-07T15:57:56Z) - Achieving PAC Guarantees in Mechanism Design through Multi-Armed Bandits [8.013444110633223]
We analytically derive a class of optimal solutions to a linear program (LP) for automated mechanism design.<n>These solutions can be expressed using a set of essential variables whose cardinality is exponentially smaller than the total number of variables in the original formulation.<n>We address this by translating the evaluation of this term into a multi-armed bandit (MAB) problem.
arXiv Detail & Related papers (2024-11-30T03:59:36Z) - Sample-Efficient Constrained Reinforcement Learning with General Parameterization [35.22742439337603]
We consider a constrained Markov Decision Problem (CMDP) where the goal of an agent is to maximize the expected discounted sum of rewards over an infinite horizon.
We develop the Primal-Dual Accelerated Natural Policy Gradient (PD-ANPG) algorithm that ensures an $epsilon$ global optimality gap and $epsilon$ constraint violation.
arXiv Detail & Related papers (2024-05-17T08:39:05Z) - Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation [53.17668583030862]
We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation.
We propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP)
We show that LOOP achieves a sublinear $tildemathcalO(mathrmpoly(d, mathrmsp(V*)) sqrtTbeta )$ regret, where $d$ and $beta$ correspond to AGEC and log-covering number of the hypothesis class respectively
arXiv Detail & Related papers (2024-04-19T06:24:22Z) - Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, textsfAdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z) - Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can enjoy the same optimality in the $H$ dependence as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z) - Bayesian Optimization-based Combinatorial Assignment [10.73407470973258]
We study the assignment domain, which includes auctions and course allocation.
The main challenge in this domain is that the bundle space grows exponentially in the number of items.
arXiv Detail & Related papers (2022-08-31T08:47:02Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Alternating Direction Method of Multipliers for Quantization [15.62692130672419]
We study the performance of the Alternating Direction Method of Multipliers for Quantization ($texttADMM-Q$) algorithm.
We develop a few variants of $texttADMM-Q$ that can handle inexact update rules.
We empirically evaluate the efficacy of our proposed approaches.
arXiv Detail & Related papers (2020-09-08T01:58:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.