Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL
- URL: http://arxiv.org/abs/2208.10469v4
- Date: Mon, 29 Jan 2024 16:37:55 GMT
- Title: Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL
- Authors: Andreas A. Haupt, Phillip J.K. Christoffersen, Mehul Damani, Dylan
Hadfield-Menell
- Abstract summary: Multi-agent Reinforcement Learning (MARL) is a powerful tool for training autonomous agents acting independently in a common environment.
MARL can lead to sub-optimal behavior when individual incentives and group incentives diverge.
We propose an augmentation to a Markov game where agents voluntarily agree to binding transfers of reward, under pre-specified conditions.
- Score: 4.969697978555126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent Reinforcement Learning (MARL) is a powerful tool for training
autonomous agents acting independently in a common environment. However, it can
lead to sub-optimal behavior when individual incentives and group incentives
diverge. Humans are remarkably capable at solving these social dilemmas. It is
an open problem in MARL to replicate such cooperative behaviors in selfish
agents. In this work, we draw upon the idea of formal contracting from
economics to overcome diverging incentives between agents in MARL. We propose
an augmentation to a Markov game where agents voluntarily agree to binding
transfers of reward, under pre-specified conditions. Our contributions are
theoretical and empirical. First, we show that this augmentation makes all
subgame-perfect equilibria of all Fully Observable Markov Games exhibit
socially optimal behavior, given a sufficiently rich space of contracts. Next,
we show that for general contract spaces, and even under partial observability,
richer contract spaces lead to higher welfare. Hence, contract space design
solves an exploration-exploitation tradeoff, sidestepping incentive issues. We
complement our theoretical analysis with experiments. Issues of exploration in
the contracting augmentation are mitigated using a training methodology
inspired by multi-objective reinforcement learning: Multi-Objective Contract
Augmentation Learning (MOCA). We test our methodology in static, single-move
games, as well as dynamic domains that simulate traffic, pollution management
and common pool resource management.
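To make the contract augmentation concrete, here is a minimal sketch of a single contracted reward step in Python. It assumes a two-agent game with one fixed, pre-agreed contract; the pollution condition, the transfer amount, and all names are illustrative assumptions, not details from the paper.

```python
# Sketch of a contract-augmented reward step (illustrative only).
# The contract is a binding, pre-specified transfer: if an agent's action
# satisfies the condition, that agent pays `amount` to every other agent.
from dataclasses import dataclass
from typing import Callable, Dict

Agent = str
State = Dict[str, float]
Action = str

@dataclass
class Contract:
    condition: Callable[[Agent, Action, State], bool]
    amount: float

def contracted_step(base_rewards: Dict[Agent, float],
                    actions: Dict[Agent, Action],
                    state: State,
                    contract: Contract) -> Dict[Agent, float]:
    """Apply the contract's transfers on top of the base game rewards.
    Each transfer is zero-sum, so the group total is unchanged."""
    rewards = dict(base_rewards)
    agents = list(rewards)
    for payer in agents:
        if contract.condition(payer, actions[payer], state):
            for payee in agents:
                if payee != payer:
                    rewards[payer] -= contract.amount
                    rewards[payee] += contract.amount
    return rewards

# Example: a polluting action triggers a transfer of 0.5 to the other agent,
# internalizing the externality so a selfish best response favors not polluting.
pollute_penalty = Contract(
    condition=lambda agent, action, state: action == "pollute",
    amount=0.5,
)
rewards = contracted_step(
    base_rewards={"a1": 1.0, "a2": -0.3},
    actions={"a1": "pollute", "a2": "clean"},
    state={},
    contract=pollute_penalty,
)
print(rewards)  # {'a1': 0.5, 'a2': 0.2}
```

Because transfers only redistribute reward, a sufficiently rich space of such conditions lets agents voluntarily agree to contracts that align selfish play with group welfare, which is the intuition behind the paper's equilibrium results.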
Related papers
- Contracting with a Learning Agent [32.950708673180436]
We study repeated contracts with a learning agent, focusing on agents that achieve no-regret outcomes.
We give an optimal solution to this problem for a canonical contract setting, in which the agent's choice among multiple actions leads to success or failure.
Our results generalize beyond success/failure to arbitrary non-linear contracts that the principal rescales dynamically.
arXiv Detail & Related papers (2024-01-29T14:53:22Z)
- Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark [61.43264961005614]
We develop a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse scenarios.
We evaluate agents' tendencies to be power-seeking, cause disutility, and commit ethical violations.
Our results show that agents can act both competently and morally, suggesting that concrete progress can be made in machine ethics.
arXiv Detail & Related papers (2023-04-06T17:59:03Z)
- Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations [16.48389671789281]
We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market.
By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors.
We show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption.
arXiv Detail & Related papers (2022-10-13T17:06:08Z)
- Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents that incorporate an established model of human irrationality: the Rational Inattention (RI) model.
RIRL models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
arXiv Detail & Related papers (2022-01-18T20:54:00Z)
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes.
arXiv Detail & Related papers (2022-01-03T17:00:17Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Emergent Prosociality in Multi-Agent Games Through Gifting [14.943238230772264]
Reinforcement learning algorithms often suffer from converging to socially less desirable equilibria when multiple equilibria exist.
We propose using a less restrictive peer-rewarding mechanism, gifting, that guides the agents toward more socially desirable equilibria.
We employ a theoretical framework that captures the benefit of gifting in converging to the prosocial equilibrium.
arXiv Detail & Related papers (2021-05-13T23:28:30Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
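The learned incentive function in the last entry above can be illustrated with a short, self-contained sketch. This is a schematic of the general mechanism only; the network shape, the bound on gifts, and all names are assumptions, not the paper's architecture.

```python
# Illustrative sketch of a learned incentive function: one agent learns a
# small network that maps its observation to bonus rewards paid to the other
# agents. Schematic only; not the cited paper's exact method.
import torch
import torch.nn as nn

class IncentiveFunction(nn.Module):
    """Maps the giver's observation to a bounded reward gift per recipient."""
    def __init__(self, obs_dim: int, n_recipients: int, max_gift: float = 1.0):
        super().__init__()
        self.max_gift = max_gift
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_recipients),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Sigmoid keeps each gift in [0, max_gift]; one natural design choice
        # is to also subtract the total paid out from the giver's own return,
        # so that gifting is costly rather than free.
        return self.max_gift * torch.sigmoid(self.net(obs))

# Usage: recipients add the gift to their environment reward before learning.
incentive = IncentiveFunction(obs_dim=8, n_recipients=2)
obs = torch.randn(1, 8)
gifts = incentive(obs)                  # shape (1, 2): gift to each other agent
env_rewards = torch.tensor([[0.3, -0.1]])
shaped_rewards = env_rewards + gifts    # recipients learn from shaped rewards
```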
This list is automatically generated from the titles and abstracts of the papers on this site.