Non-cooperative Multi-agent Systems with Exploring Agents
- URL: http://arxiv.org/abs/2005.12360v1
- Date: Mon, 25 May 2020 19:34:29 GMT
- Title: Non-cooperative Multi-agent Systems with Exploring Agents
- Authors: Jalal Etesami, Christoph-Nikolas Straehle
- Abstract summary: We develop a prescriptive model of multi-agent behavior using Markov games.
We focus on models in which the agents play "exploration but near optimum strategies".
- Score: 10.736626320566707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent learning is a challenging problem in machine learning that has
applications in different domains such as distributed control, robotics, and
economics. We develop a prescriptive model of multi-agent behavior using Markov
games. Since in many multi-agent systems, agents do not necessarily select their
optimum strategies against other agents (e.g., multi-pedestrian interaction),
we focus on models in which the agents play "exploration but near optimum
strategies". We model such policies using the Boltzmann-Gibbs distribution.
This leads to a set of coupled Bellman equations that describes the behavior of
the agents. We introduce a set of conditions under which this set of equations
admits a unique solution and propose two algorithms that provably provide the
solution in finite and infinite time-horizon scenarios. We also study a
practical setting in which the interactions can be described using occupancy
measures, and we propose a simplified Markov game with lower complexity.
Furthermore, we establish the connection between the Markov games with
exploration strategies and the principle of maximum causal entropy for
multi-agent systems. Finally, we evaluate the performance of our algorithms via
several well-known games from the literature and some games that are designed
based on real world applications.
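The abstract's central object is a pair of coupled "soft" Bellman equations in which each agent plays a Boltzmann-Gibbs policy over its own Q-function while averaging over the other agent's policy. Below is a minimal, hypothetical sketch of such a coupled recursion for a two-agent discounted Markov game; the random game, the inverse temperature `beta`, and the naive fixed-point iteration are illustrative assumptions, not the authors' algorithms.

```python
import numpy as np

# Hypothetical two-agent discounted Markov game with a random transition
# kernel P[s, a1, a2, s'] and per-agent rewards R_i[s, a1, a2].
rng = np.random.default_rng(0)
nS, nA1, nA2, gamma, beta = 4, 3, 3, 0.9, 1.0  # beta = inverse temperature
P = rng.random((nS, nA1, nA2, nS))
P /= P.sum(axis=-1, keepdims=True)
R1, R2 = rng.random((nS, nA1, nA2)), rng.random((nS, nA1, nA2))

Q1 = np.zeros((nS, nA1, nA2))
Q2 = np.zeros((nS, nA1, nA2))
pi1 = np.full((nS, nA1), 1.0 / nA1)  # agent 1's policy pi1[s, a1]
pi2 = np.full((nS, nA2), 1.0 / nA2)  # agent 2's policy pi2[s, a2]

for _ in range(1000):
    # Expected Q of each agent's own action against the opponent's current policy.
    q1 = np.einsum('sab,sb->sa', Q1, pi2)  # q1[s, a1]
    q2 = np.einsum('sab,sa->sb', Q2, pi1)  # q2[s, a2]
    # Boltzmann-Gibbs ("exploration but near optimum") policies.
    pi1 = np.exp(beta * (q1 - q1.max(axis=1, keepdims=True)))
    pi1 /= pi1.sum(axis=1, keepdims=True)
    pi2 = np.exp(beta * (q2 - q2.max(axis=1, keepdims=True)))
    pi2 /= pi2.sum(axis=1, keepdims=True)
    # Soft (log-sum-exp) state values: the Boltzmann analogue of the max.
    V1 = q1.max(axis=1) + np.log(np.exp(beta * (q1 - q1.max(axis=1, keepdims=True))).sum(axis=1)) / beta
    V2 = q2.max(axis=1) + np.log(np.exp(beta * (q2 - q2.max(axis=1, keepdims=True))).sum(axis=1)) / beta
    # Coupled Bellman backups over the joint action space.
    Q1 = R1 + gamma * np.einsum('sabt,t->sab', P, V1)
    Q2 = R2 + gamma * np.einsum('sabt,t->sab', P, V2)
```

If this iteration converges, pi1 and pi2 are mutually consistent Boltzmann responses; the conditions under which such a fixed point exists and is unique are exactly what the paper analyzes.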
Related papers
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that uses these inferred goals to plan the agent's best response.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity of multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making, but with hidden (unobserved) rewards, a framework known as partial monitoring.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - Breaking the Curse of Multiagents in a Large State Space: RL in Markov
Games with Independent Linear Function Approximation [56.715186432566576]
We propose a new model, independent linear Markov game, for reinforcement learning with a large state space and a large number of agents.
We design new algorithms for learning Markov coarse correlated equilibria (CCE) and Markov correlated equilibria (CE) with sample complexity bounds that scale only with each agent's own function class complexity.
Our algorithms rely on two key technical innovations: (1) utilizing policy replay to tackle non-stationarity incurred by multiple agents and the use of function approximation; and (2) separating learning Markov equilibria and exploration in the Markov games.
arXiv Detail & Related papers (2023-02-07T18:47:48Z) - Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z) - Decentralized Cooperative Multi-Agent Reinforcement Learning with
Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\propto\widetilde{O}(1/\epsilon^{4})$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Model Free Reinforcement Learning Algorithm for Stationary Mean field
Equilibrium for Multiple Types of Agents [43.21120427632336]
We consider a multi-agent strategic interaction over an infinite horizon where agents can be of multiple types.
Each agent has a private state; the state evolves depending on the distribution of the state of the agents of different types and the action of the agent.
We show how this kind of interaction can model cyber attacks between defenders and adversaries.
arXiv Detail & Related papers (2020-12-31T00:12:46Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm outperforms all solutions from the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z) - Learning to Model Opponent Learning [11.61673411387596]
Multi-Agent Reinforcement Learning (MARL) considers settings in which a set of coexisting agents interact with one another and their environment.
This poses a great challenge for value function-based algorithms whose convergence usually relies on the assumption of a stationary environment.
We develop a novel approach to modelling an opponent's learning dynamics, which we term Learning to Model Opponent Learning (LeMOL).
arXiv Detail & Related papers (2020-06-06T17:19:04Z) - Multi Type Mean Field Reinforcement Learning [26.110052366068533]
We extend mean field multiagent algorithms to multiple types.
We conduct experiments on three different testbeds for the field of many agent reinforcement learning.
arXiv Detail & Related papers (2020-02-06T20:58:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.