Generalized Principal-Agent Problem with a Learning Agent
- URL: http://arxiv.org/abs/2402.09721v3
- Date: Tue, 11 Jun 2024 05:52:20 GMT
- Title: Generalized Principal-Agent Problem with a Learning Agent
- Authors: Tao Lin, Yiling Chen,
- Abstract summary: Generalized principal-agent problems are economic problems where an agent best responds to a principal's committed strategy.
We show that if the agent uses mean-based learning algorithms, the principal can do significantly better than the non-learning model.
- Score: 6.2458948163267785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized principal-agent problems, including Stackelberg games, contract design, and Bayesian persuasion, are a class of economic problems where an agent best responds to a principal's committed strategy. We study repeated generalized principal-agent problems under the assumption that the principal does not have commitment power and the agent uses algorithms to learn to respond to the principal. We reduce this problem to a one-shot generalized principal-agent problem with an approximately-best-responding agent. Using this reduction, we show that: (1) if the agent uses contextual no-regret learning algorithms, then the principal can guarantee a utility that is at least the principal's optimal utility in the classic non-learning model minus the square root of the agent's regret; (2) if the agent uses contextual no-swap-regret learning algorithms, then the principal cannot obtain any utility more than the optimal utility in the non-learning model plus the agent's swap regret. But (3) if the agent uses mean-based learning algorithms (which can be no-regret but not no-swap-regret), then the principal can do significantly better than the non-learning model. These general results not only refine previous results in Stackelberg games and contract design with learning agents but also lead to new results for Bayesian persuasion with a learning agent.
Related papers
- Learning to Lead: Incentivizing Strategic Agents in the Dark [50.93875404941184]
We study an online learning version of the generalized principal-agent model.<n>We develop the first provably sample-efficient algorithm for this challenging setting.<n>We establish a near optimal $tildeO(sqrtT) $ regret bound for learning the principal's optimal policy.
arXiv Detail & Related papers (2025-06-10T04:25:04Z) - Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals [19.575710928077346]
We study a repeated principal-agent problem over a finite horizon $T$.<n>We show that the problem becomes intractable, leading to linear regret.
arXiv Detail & Related papers (2025-05-29T05:46:01Z) - On the Resilience of Multi-Agent Systems with Malicious Agents [58.79302663733702]
This paper investigates what is the resilience of multi-agent system structures under malicious agents.
We devise two methods, AutoTransform and AutoInject, to transform any agent into a malicious one.
We show that two defense methods, introducing a mechanism for each agent to challenge others' outputs, or an additional agent to review and correct messages, can enhance system resilience.
arXiv Detail & Related papers (2024-08-02T03:25:20Z) - Incentivized Learning in Principal-Agent Bandit Games [62.41639598376539]
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.
The principal can influence the agent's decisions by offering incentives which add up to his rewards.
We present nearly optimal learning algorithms for the principal's regret in both multi-armed and linear contextual settings.
arXiv Detail & Related papers (2024-03-06T16:00:46Z) - Impact of Decentralized Learning on Player Utilities in Stackelberg Games [57.08270857260131]
In many two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned.
We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks result in worst-case linear regret for at least one player.
We develop algorithms to achieve near-optimal $O(T2/3)$ regret for both players with respect to these benchmarks.
arXiv Detail & Related papers (2024-02-29T23:38:28Z) - Repeated Contracting with Multiple Non-Myopic Agents: Policy Regret and
Limited Liability [6.512509337399156]
We study a repeated contracting setting in which a Principal adaptively chooses amongst $k$ Agents at each of $T$ rounds.
The Agents are non-myopic, and so a mechanism for the Principal induces a $T$-round extensive form game amongst the Agents.
arXiv Detail & Related papers (2024-02-27T01:01:59Z) - Contracting with a Learning Agent [32.950708673180436]
We study the study of repeated contracts with a learning agent, focusing on agents who achieve no-regret outcomes.
We achieve an optimal solution to this problem for a canonical contract setting, in which the agent's choice among multiple actions leads to success/failure.
Our results generalize beyond success/failure, to arbitrary non-linear contracts which the principal rescales dynamically.
arXiv Detail & Related papers (2024-01-29T14:53:22Z) - Principal-Agent Reward Shaping in MDPs [50.914110302917756]
Principal-agent problems arise when one party acts on behalf of another, leading to conflicts of interest.
We study a two-player Stack game where the principal and the agent have different reward functions, and the agent chooses an MDP policy for both players.
Our results establish trees and deterministic decision processes with a finite horizon.
arXiv Detail & Related papers (2023-12-30T18:30:44Z) - Learning Optimal Contracts: How to Exploit Small Action Spaces [37.92189925462977]
We study principal-agent problems in which a principal commits to an outcome-dependent payment scheme.
We design an algorithm that learns an approximately-optimal contract with high probability.
It can also be employed to provide a $tildemathcalO(T4/5)$ regret bound in the related online learning setting.
arXiv Detail & Related papers (2023-09-18T14:18:35Z) - Learning to Incentivize Information Acquisition: Proper Scoring Rules
Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
arXiv Detail & Related papers (2023-03-15T13:40:16Z) - Learning in Stackelberg Games with Non-myopic Agents [60.927889817803745]
We study Stackelberg games where a principal repeatedly interacts with a non-myopic long-lived agent, without knowing the agent's payoff function.
We provide a general framework that reduces learning in presence of non-myopic agents to robust bandit optimization in the presence of myopic agents.
arXiv Detail & Related papers (2022-08-19T15:49:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.