Learning Optimal Contracts: How to Exploit Small Action Spaces
- URL: http://arxiv.org/abs/2309.09801v4
- Date: Fri, 7 Jun 2024 08:08:10 GMT
- Title: Learning Optimal Contracts: How to Exploit Small Action Spaces
- Authors: Francesco Bacchiocchi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
- Abstract summary: We study principal-agent problems in which a principal commits to an outcome-dependent payment scheme.
We design an algorithm that learns an approximately-optimal contract with high probability.
It can also be employed to provide a $\tilde{\mathcal{O}}(T^{4/5})$ regret bound in the related online learning setting.
- Score: 37.92189925462977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study principal-agent problems in which a principal commits to an outcome-dependent payment scheme -- called contract -- in order to induce an agent to take a costly, unobservable action leading to favorable outcomes. We consider a generalization of the classical (single-round) version of the problem in which the principal interacts with the agent by committing to contracts over multiple rounds. The principal has no information about the agent, and they have to learn an optimal contract by only observing the outcome realized at each round. We focus on settings in which the size of the agent's action space is small. We design an algorithm that learns an approximately-optimal contract with high probability in a number of rounds polynomial in the size of the outcome space, when the number of actions is constant. Our algorithm solves an open problem by Zhu et al.[2022]. Moreover, it can also be employed to provide a $\tilde{\mathcal{O}}(T^{4/5})$ regret bound in the related online learning setting in which the principal aims at maximizing their cumulative utility, thus considerably improving previously-known regret bounds.
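To make the setting concrete, here is a minimal sketch of the single-round hidden-action model described in the abstract: the agent best-responds to a posted contract, and the principal's utility is expected reward minus expected payment. The distribution matrix, costs, and rewards are illustrative, and this is not the paper's learning algorithm.

```python
import numpy as np

# Hidden-action principal-agent model (illustrative numbers, not from the paper).
# F[i, j] = probability that action i induces outcome j (unknown to the principal).
F = np.array([[0.8, 0.2],
              [0.3, 0.7]])
cost = np.array([0.0, 0.4])      # agent's cost of each action
reward = np.array([0.0, 1.0])    # principal's reward for each outcome

def agent_best_response(contract):
    """Agent picks the action maximizing expected payment minus cost."""
    expected_payment = F @ contract
    return int(np.argmax(expected_payment - cost))

def principal_utility(contract):
    """Principal's expected reward minus expected payment under the
    agent's (hidden) best response."""
    a = agent_best_response(contract)
    return F[a] @ (reward - contract)

# A contract pays per observed outcome; the principal never sees the action.
contract = np.array([0.0, 0.5])
print(agent_best_response(contract), principal_utility(contract))
```

The learning problem in the abstract is then to find a near-optimal `contract` from bandit feedback, observing only the realized outcome in each round.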
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning the economic interests of different stakeholders in online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z)
- Multi-Agent Imitation Learning: Value is Easy, Regret is Hard [52.31989962031179]
We study a multi-agent imitation learning (MAIL) problem where we take the perspective of a learner attempting to coordinate a group of agents.
Most prior work in MAIL essentially reduces the problem to matching the behavior of the expert within the support of the demonstrations.
While doing so is sufficient to drive the value gap between the learner and the expert to zero under the assumption that agents are non-strategic, it does not guarantee robustness to deviations by strategic agents.
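The distinction in this summary can be made concrete: below is a hedged sketch (illustrative payoffs, not from the paper) of a two-player game in which a learner matches the expert's value exactly, yet a strategic agent still profits from a unilateral deviation.

```python
import numpy as np

# Illustrative two-player game: U[i][a, b] = payoff to player i when
# player 0 plays a and player 1 plays b (numbers made up for the example).
U = [np.array([[1.0, 1.0], [1.0, 2.0]]),
     np.array([[1.0, 1.0], [1.0, 3.0]])]

def value(p0, p1, i):
    """Expected payoff to player i under independent mixed policies."""
    return p0 @ U[i] @ p1

def deviation_gain(p0, p1, i):
    """Gain player i gets from a unilateral best response."""
    best = (U[0] @ p1).max() if i == 0 else (p0 @ U[1]).max()
    return best - value(p0, p1, i)

expert = (np.array([1.0, 0.0]), np.array([1.0, 0.0]))   # joint action (0, 0)
learner = (np.array([0.0, 1.0]), np.array([1.0, 0.0]))  # joint action (1, 0)

for i in (0, 1):
    print("value gap:", value(*expert, i) - value(*learner, i),
          "deviation gain:", deviation_gain(*learner, i))
```

Both value gaps are zero, yet player 1 gains 2 by deviating: matching the expert's value does not control regret.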
arXiv Detail & Related papers (2024-06-06T16:18:20Z)
- New Perspectives in Online Contract Design [2.296475290901356]
This work studies the repeated principal-agent problem from an online learning perspective.
The principal's goal is to learn the optimal contract that maximizes her utility through repeated interactions.
arXiv Detail & Related papers (2024-03-11T20:28:23Z)
- Incentivized Learning in Principal-Agent Bandit Games [62.41639598376539]
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.
The principal can influence the agent's decisions by offering incentives that are added to his rewards.
We present nearly optimal learning algorithms for the principal's regret in both multi-armed and linear contextual settings.
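One round of the protocol sketched in this summary could look as follows; the arm means and the incentive values are illustrative assumptions, and this is not the authors' learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

agent_mean = np.array([0.6, 0.2, 0.4])      # agent's mean reward per arm (his private knowledge)
principal_mean = np.array([0.1, 0.9, 0.5])  # principal's mean reward per arm

def play_round(incentive):
    """Principal announces per-arm incentives; the agent pulls the arm
    maximizing his own reward plus the incentive; the principal pays the
    incentive on the pulled arm and collects her reward."""
    arm = int(np.argmax(agent_mean + incentive))
    principal_reward = rng.binomial(1, principal_mean[arm])
    return arm, principal_reward - incentive[arm]

# Offering enough on arm 1 redirects the agent away from his default arm 0.
print(play_round(np.zeros(3)))
print(play_round(np.array([0.0, 0.5, 0.0])))
```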
arXiv Detail & Related papers (2024-03-06T16:00:46Z)
- Are Bounded Contracts Learnable and Approximately Optimal? [8.121834515103243]
This paper considers the hidden-action model of the principal-agent problem, in which a principal incentivizes an agent to work on a project using a contract.
We investigate whether contracts with bounded payments are learnable and approximately optimal.
arXiv Detail & Related papers (2024-02-22T12:19:19Z)
- Contracting with a Learning Agent [32.950708673180436]
We initiate the study of repeated contracts with a learning agent, focusing on agents who achieve no-regret outcomes.
We obtain an optimal solution to this problem for a canonical contract setting, in which the agent's choice among multiple actions leads to success/failure.
Our results generalize beyond success/failure, to arbitrary non-linear contracts which the principal rescales dynamically.
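A hedged sketch of the success/failure setting with a no-regret agent: the agent runs multiplicative weights (one standard no-regret algorithm, not necessarily the one the paper assumes) against a fixed linear contract that pays a share `alpha` of the reward on success; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

success_prob = np.array([0.2, 0.5, 0.9])  # success probability of each action
cost = np.array([0.0, 0.1, 0.5])          # agent's cost per action
alpha, T, eta = 0.6, 5000, 0.05           # share paid on success, horizon, step size

weights = np.ones(3)
principal_total = 0.0
for _ in range(T):
    p = weights / weights.sum()
    a = rng.choice(3, p=p)
    success = rng.random() < success_prob[a]
    principal_total += (1.0 - alpha) * success
    # Multiplicative-weights update on the agent's expected utilities
    # (full-information feedback, for simplicity).
    agent_utility = alpha * success_prob - cost
    weights *= np.exp(eta * agent_utility)

print(weights / weights.sum())   # concentrates on the agent's best response
print(principal_total / T)       # principal's average per-round utility
```

The paper studies how the principal should adjust the contract over time against such an agent; this sketch only fixes the contract to show the agent-side dynamics.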
arXiv Detail & Related papers (2024-01-29T14:53:22Z)
- Deep Contract Design via Discontinuous Networks [23.293185030103544]
We introduce a novel representation: the Discontinuous ReLU (DeLU) network, which models the principal's utility as a discontinuous piecewise affine function of the design of a contract.
DeLU networks implicitly learn closed-form expressions for the incentive compatibility constraints of the agent and the utility objective of the principal.
We provide empirical results that demonstrate success in approximating the principal's utility function with a small number of training samples and scaling to find approximately optimal contracts on problems with a large number of actions and outcomes.
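One way to obtain a discontinuous piecewise-affine function from standard parts is to multiply a ReLU network (continuous piecewise affine) by a hard 0/1 gate computed from a second network; this sketch illustrates that flavor and is an assumption, not the paper's exact DeLU architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two small random one-hidden-layer ReLU nets: one for the affine pieces,
# one for the gate (weights are random placeholders).
W1, b1, v1 = rng.normal(size=(8, 1)), rng.normal(size=8), rng.normal(size=8)
W2, b2, v2 = rng.normal(size=(8, 1)), rng.normal(size=8), rng.normal(size=8)

def relu_net(x, W, b, v):
    """One-hidden-layer ReLU net: continuous piecewise affine in x."""
    return np.maximum(W @ x + b, 0.0) @ v

def delu_like(x):
    """Piecewise-affine values switched on/off by a hard gate, so the
    overall function is piecewise affine with jump discontinuities."""
    value = relu_net(x, W1, b1, v1)
    gate = relu_net(x, W2, b2, v2) > 0.0   # Heaviside-style 0/1 gate
    return value * gate

xs = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)
print([float(delu_like(x)) for x in xs])
```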
arXiv Detail & Related papers (2023-07-05T14:20:20Z)
- Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf.
We design a provably sample-efficient algorithm that tailors the UCB algorithm to our model.
Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
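For reference, the building block being tailored is the standard UCB1 rule; this hedged sketch shows plain UCB1 on a stochastic bandit (the paper's estimation procedure and correction scheme for incentives are not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(3)
means = np.array([0.3, 0.5, 0.7])   # unknown Bernoulli arm means
K, T = len(means), 10_000

counts = np.zeros(K)
sums = np.zeros(K)
for t in range(1, T + 1):
    if t <= K:                       # pull each arm once to initialize
        arm = t - 1
    else:
        # Empirical mean plus an optimism bonus that shrinks with pulls.
        ucb = sums / counts + np.sqrt(2.0 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = rng.binomial(1, means[arm])
    counts[arm] += 1
    sums[arm] += reward

print(counts)            # most pulls go to the best arm
print(sums.sum() / T)    # average reward approaches max(means)
```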
arXiv Detail & Related papers (2023-03-15T13:40:16Z)
- Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs [65.23158435596518]
We solve the multi-vehicle routing problem as a team Markov game with partially observable costs.
Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
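The "iteratively rewriting solutions" loop can be illustrated with a classical hand-coded rewrite rule in place of the learned multi-agent policy: repeatedly apply a 2-opt reversal to a single route until no rewrite improves the cost. Everything below is an illustrative stand-in, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(4)
points = rng.random((10, 2))   # illustrative customer locations

def route_cost(route):
    """Total length of a closed tour visiting points in the given order."""
    hops = np.diff(points[np.r_[route, route[0]]], axis=0)
    return np.linalg.norm(hops, axis=1).sum()

def rewrite_once(route):
    """One rewriting step: try 2-opt segment reversals and keep the first
    one that lowers the cost; return None if no rewrite improves."""
    base = route_cost(route)
    n = len(route)
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            cand = route[:i] + route[i:j + 1][::-1] + route[j + 1:]
            if route_cost(cand) < base - 1e-12:
                return cand
    return None

route = list(range(len(points)))
while (nxt := rewrite_once(route)) is not None:
    route = nxt
print(route, route_cost(route))
```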
arXiv Detail & Related papers (2022-06-13T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.