Online Allocation and Learning in the Presence of Strategic Agents
- URL: http://arxiv.org/abs/2209.12112v1
- Date: Sun, 25 Sep 2022 00:46:53 GMT
- Title: Online Allocation and Learning in the Presence of Strategic Agents
- Authors: Steven Yin, Shipra Agrawal, Assaf Zeevi
- Abstract summary: We study the problem of allocating $T$ sequentially arriving items among $n$ homogeneous agents under the constraint that each agent must receive a pre-specified fraction of all items.
Our main contribution is an online learning based allocation mechanism that is approximately Bayesian incentive compatible.
- Score: 16.124755488878044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of allocating $T$ sequentially arriving items among $n$
homogeneous agents under the constraint that each agent must receive a
pre-specified fraction of all items, with the objective of maximizing the
agents' total valuation of items allocated to them. The agents' valuations for
the item in each round are assumed to be i.i.d. but their distribution is a
priori unknown to the central planner. Therefore, the central planner needs to
implicitly learn these distributions from the observed values in order to pick
a good allocation policy. However, an added challenge here is that the agents
are strategic with incentives to misreport their valuations in order to receive
better allocations. This sets our work apart both from the online auction
design settings which typically assume known valuation distributions and/or
involve payments, and from the online learning settings that do not consider
strategic agents. To that end, our main contribution is an online learning
based allocation mechanism that is approximately Bayesian incentive compatible,
and when all agents are truthful, guarantees a sublinear regret for individual
agents' utility compared to that under the optimal offline allocation policy.
Related papers
- On the Hardness of Decentralized Multi-Agent Policy Evaluation under Byzantine Attacks [12.696705862929337]
We study a fully-decentralized multi-agent policy evaluation problem in the presence of up to $f$ faulty agents.
In particular, we focus on the so-called Byzantine faulty model with model poisoning setting.
arXiv Detail & Related papers (2024-09-19T16:27:08Z) - Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden
Rewards [4.742123770879715]
In practice, incentive providers often cannot observe the reward realizations of incentivized agents.
This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal.
We introduce an estimator whose only input is the history of principal's incentives and agent's choices.
arXiv Detail & Related papers (2023-08-13T08:12:01Z) - Learning to Incentivize Information Acquisition: Proper Scoring Rules
Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
arXiv Detail & Related papers (2023-03-15T13:40:16Z) - DM$^2$: Distributed Multi-Agent Reinforcement Learning for Distribution
Matching [43.58408474941208]
This paper studies the problem of distributed multi-agent learning without resorting to explicit coordination schemes.
Each individual agent matches a target distribution of concurrently sampled trajectories from a joint expert policy.
Experimental validation on the StarCraft domain shows that combining the reward for distribution matching with the environment reward allows agents to outperform a fully distributed baseline.
arXiv Detail & Related papers (2022-06-01T04:57:50Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
$alpha$-fraction of agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Monotonic Improvement Guarantees under Non-stationarity for
Decentralized PPO [66.5384483339413]
We present a new monotonic improvement guarantee for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL)
We show that a trust region constraint can be effectively enforced in a principled way by bounding independent ratios based on the number of agents in training.
arXiv Detail & Related papers (2022-01-31T20:39:48Z) - AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via
Multi-Agent Multi-Task Reinforcement Learning [22.890835786710316]
This paper investigates the problem of age of information (AoI) aware radio resource management for a platooning system.
Multiple autonomous platoons exploit the cellular wireless vehicle-to-everything (C-V2X) communication technology to disseminate the cooperative awareness messages (CAMs) to their followers.
We exploit a distributed resource allocation framework based on multi-agent reinforcement learning (MARL), where each platoon leader (PL) acts as an agent and interacts with the environment to learn its optimal policy.
arXiv Detail & Related papers (2021-05-10T08:39:56Z) - Learning Strategies in Decentralized Matching Markets under Uncertain
Preferences [91.3755431537592]
We study the problem of decision-making in the setting of a scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z) - Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit
Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.