Influencing Long-Term Behavior in Multiagent Reinforcement Learning
- URL: http://arxiv.org/abs/2203.03535v1
- Date: Mon, 7 Mar 2022 17:32:35 GMT
- Title: Influencing Long-Term Behavior in Multiagent Reinforcement Learning
- Authors: Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How
- Abstract summary: We propose a principled framework for considering the limiting policies of other agents as the time approaches infinity.
Specifically, we develop a new optimization objective that maximizes each agent's average reward by directly accounting for the impact of its behavior on the limiting set of policies that other agents will take on.
Thanks to our farsighted evaluation, we demonstrate better long-term performance than state-of-the-art baselines in various domains.
- Score: 59.98329270954098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main challenge of multiagent reinforcement learning is the difficulty of
learning useful policies in the presence of other simultaneously learning
agents whose changing behaviors jointly affect the environment's transition and
reward dynamics. An effective approach that has recently emerged for addressing
this non-stationarity is for each agent to anticipate the learning of other
interacting agents and influence the evolution of their future policies towards
desirable behavior for its own benefit. Unfortunately, all previous approaches
for achieving this suffer from myopic evaluation, considering only a few or a
finite number of updates to the policies of other agents. In this paper, we
propose a principled framework for considering the limiting policies of other
agents as the time approaches infinity. Specifically, we develop a new
optimization objective that maximizes each agent's average reward by directly
accounting for the impact of its behavior on the limiting set of policies that
other agents will take on. Thanks to our farsighted evaluation, we demonstrate
better long-term performance than state-of-the-art baselines in various
domains, including the full spectrum of general-sum, competitive, and
cooperative settings.
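The contrast between myopic and farsighted evaluation described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or objective. It assumes a hypothetical 2x2 coordination game (`R_I`, `R_J`), a hypothetical other agent `j` that learns by plain gradient ascent on a single logit `theta`, and a fixed policy `p` for agent `i`. The function name, payoffs, learning rate, and initial logit are all illustrative choices.

```python
import math

# Hypothetical 2x2 coordination game (an assumption, not taken from the paper).
# Payoffs are indexed as [action of agent i][action of agent j].
R_I = [[2.0, 0.0], [0.0, 1.0]]
R_J = [[2.0, 0.0], [0.0, 1.0]]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reward_after_updates(p, num_updates, theta0=-2.0, lr=1.0):
    """Expected per-step reward of agent i after agent j has learned.

    p           -- fixed probability that agent i plays action 0
    num_updates -- number of gradient-ascent steps agent j takes
    theta0, lr  -- illustrative initial logit and step size for agent j
    """
    theta = theta0
    for _ in range(num_updates):
        q = sigmoid(theta)  # probability that agent j plays action 0
        # Agent j's expected payoff for each of its actions, given p.
        u0 = p * R_J[0][0] + (1.0 - p) * R_J[1][0]
        u1 = p * R_J[0][1] + (1.0 - p) * R_J[1][1]
        # Exact gradient of q * u0 + (1 - q) * u1 with respect to theta.
        theta += lr * q * (1.0 - q) * (u0 - u1)

    q = sigmoid(theta)
    probs_i = [p, 1.0 - p]
    probs_j = [q, 1.0 - q]
    return sum(
        probs_i[a] * probs_j[b] * R_I[a][b]
        for a in range(2)
        for b in range(2)
    )


if __name__ == "__main__":
    for p in (0.0, 1.0):
        myopic = reward_after_updates(p, num_updates=1)
        farsighted = reward_after_updates(p, num_updates=5000)
        print(f"p={p}: after 1 update {myopic:.3f}, after 5000 updates {farsighted:.3f}")
```

In this toy setting, scoring agent `i`'s policy after a single update of agent `j` favors `p = 0.0`, while scoring it after many updates (a stand-in for the limiting policy) favors `p = 1.0`, because agent `j` eventually moves to the higher-payoff coordination outcome.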
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with that of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
- Informative Policy Representations in Multi-Agent Reinforcement Learning via Joint-Action Distributions [17.129962954873587]
In multi-agent reinforcement learning, the inherent non-stationarity of the environment caused by other agents' actions makes it significantly harder for an agent to learn a good policy independently.
We propose a general method to learn representations of other agents' policies via the joint-action distributions sampled in interactions.
We empirically demonstrate that our method outperforms existing work in multi-agent tasks when facing unseen agents.
arXiv Detail & Related papers (2021-06-10T15:09:33Z)
- Stateful Strategic Regression [20.7177095411398]
We describe the Stackelberg equilibrium of the resulting game and provide novel algorithms for computing it.
Our analysis reveals several intriguing insights about the role of multiple interactions in shaping the game's outcome.
Most importantly, we show that with multiple rounds of interaction at her disposal, the principal is more effective at incentivizing the agent to accumulate effort in her desired direction.
arXiv Detail & Related papers (2021-06-07T17:46:29Z)
- Exploring the Impact of Tunable Agents in Sequential Social Dilemmas [0.0]
We leverage multi-objective reinforcement learning to create tunable agents.
We apply this technique to sequential social dilemmas.
We demonstrate that the tunable agents framework allows easy adaption between cooperative and competitive behaviours.
arXiv Detail & Related papers (2021-01-28T12:44:31Z)
- Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z)
- A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning [47.154539984501895]
We propose a novel meta-multiagent policy gradient theorem that accounts for the non-stationary policy dynamics inherent to multiagent learning settings.
This is achieved by modeling our gradient updates to consider both an agent's own non-stationary policy dynamics and the non-stationary policy dynamics of other agents in the environment.
arXiv Detail & Related papers (2020-10-31T22:50:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.