Algorithms in Multi-Agent Systems: A Holistic Perspective from
Reinforcement Learning and Game Theory
- URL: http://arxiv.org/abs/2001.06487v3
- Date: Fri, 31 Jan 2020 02:16:05 GMT
- Title: Algorithms in Multi-Agent Systems: A Holistic Perspective from
Reinforcement Learning and Game Theory
- Authors: Yunlong Lu and Kai Yan
- Abstract summary: Deep reinforcement learning has achieved outstanding results in recent years.
Recent works are exploring learning beyond single-agent scenarios and considering multi-agent scenarios.
Traditional game-theoretic algorithms, in turn, show bright application promise when combined with modern algorithms and growing computing power.
- Score: 2.5147566619221515
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep reinforcement learning (RL) has achieved outstanding results in
recent years, which has led to a dramatic increase in the number of methods and
applications. Recent works are exploring learning beyond single-agent scenarios
and considering multi-agent scenarios. However, they face many challenges and
seek help from traditional game-theoretic algorithms, which, in turn, show
bright application promise when combined with modern algorithms and growing
computing power. In this survey, we first introduce basic concepts and
algorithms in single-agent RL and multi-agent systems; then, we summarize the
related algorithms from three aspects. Solution concepts from game theory
inspire algorithms that try to evaluate the agents or find better solutions in
multi-agent systems. Fictitious self-play has become popular and has had a
great impact on multi-agent reinforcement learning algorithms. Counterfactual
regret minimization is an important tool for solving games with incomplete
information and has shown great strength when combined with deep learning.
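The survey leans on counterfactual regret minimization (CFR) and fictitious self-play as core tools. As a concrete anchor, below is a minimal regret-matching loop, the tabular building block that CFR extends, run on rock-paper-scissors; it is an illustrative sketch, not code from the survey.

```python
# Minimal regret-matching sketch (the tabular core that counterfactual regret
# minimization builds on), run on rock-paper-scissors. Illustrative only.
import numpy as np

PAYOFF = np.array([[0, -1, 1],    # row player's payoff for rock/paper/scissors
                   [1, 0, -1],
                   [-1, 1, 0]])

def strategy_from_regret(regret):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = np.maximum(regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(len(regret), 1.0 / len(regret))

def run(iterations=20000, rng=np.random.default_rng(0)):
    regret = [np.zeros(3), np.zeros(3)]        # cumulative regrets per player
    strategy_sum = [np.zeros(3), np.zeros(3)]  # accumulates the average strategy
    for _ in range(iterations):
        sigma = [strategy_from_regret(r) for r in regret]
        actions = [rng.choice(3, p=s) for s in sigma]
        u0 = PAYOFF[:, actions[1]]             # row payoffs vs. sampled column action
        u1 = -PAYOFF[actions[0], :]            # column payoffs (zero-sum)
        regret[0] += u0 - u0[actions[0]]
        regret[1] += u1 - u1[actions[1]]
        for p in range(2):
            strategy_sum[p] += sigma[p]
    return [s / s.sum() for s in strategy_sum]

print(run())  # both average strategies approach the uniform Nash equilibrium
```

Note that the average strategy, not the last iterate, is the quantity that converges to equilibrium, which is also what CFR tracks.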
Related papers
- Diagnosing and exploiting the computational demands of videos games for
deep reinforcement learning [13.98405611352641]
We introduce the Learning Challenge Diagnosticator (LCD), a tool that measures the perceptual and reinforcement learning demands of a task.
We use LCD to discover a novel taxonomy of challenges in the Procgen benchmark, and demonstrate that these predictions are both highly reliable and can instruct algorithmic development.
arXiv Detail & Related papers (2023-09-22T21:03:33Z)
- Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results [60.4817465598352]
We introduce an original variant of Monte-Carlo Tree Search (MCTS) tailored to multi-agent pathfinding.
Specifically, we use individual paths to assist the agents with goal-reaching behavior.
We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure.
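For orientation, the snippet below shows a generic UCT selection step, the rule most MCTS variants share; the paper's multi-agent specifics (individual-path guidance, branching-factor decomposition) are not reproduced, and the child-statistics layout is an assumption.

```python
# Generic UCT child selection; the multi-agent pathfinding specifics from the
# paper above (individual paths, decomposition) are intentionally omitted.
import math

def uct_select(children, exploration=1.4):
    """Pick the child maximizing value estimate + exploration bonus."""
    total_visits = sum(c["visits"] for c in children)
    def score(c):
        if c["visits"] == 0:
            return float("inf")  # try unvisited children first
        mean = c["value"] / c["visits"]
        bonus = exploration * math.sqrt(math.log(total_visits) / c["visits"])
        return mean + bonus
    return max(children, key=score)

# Each child tracks cumulative value and visit count (layout assumed here).
children = [{"value": 3.0, "visits": 5}, {"value": 1.0, "visits": 1}, {"value": 0.0, "visits": 0}]
print(uct_select(children))  # the unvisited child is expanded first
```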
arXiv Detail & Related papers (2023-07-25T12:33:53Z)
- Scaling Laws for Imitation Learning in Single-Agent Games [29.941613597833133]
We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack.
We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
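To make the reported functional form concrete, the hedged sketch below fits loss ≈ a · compute^(−b) to made-up (compute, loss) pairs; the numbers are placeholders, not the paper's measurements.

```python
# Fit a power law loss ~ a * compute**(-b) in log-log space. The data points
# here are invented placeholders, not results from the paper above.
import numpy as np

compute = np.array([1e15, 1e16, 1e17, 1e18, 1e19])  # hypothetical FLOPs budgets
loss = np.array([2.10, 1.55, 1.17, 0.86, 0.65])     # hypothetical IL losses

slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope
print(f"loss ~= {a:.3g} * compute^(-{b:.3g})")
```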
arXiv Detail & Related papers (2023-07-18T16:43:03Z)
- Efficient Cooperation Strategy Generation in Multi-Agent Video Games via
Hypergraph Neural Network [16.226702761758595]
The performance of deep reinforcement learning in single-agent video games is astounding.
However, researchers face additional difficulties when working with video games in multi-agent environments.
We propose a novel algorithm based on the actor-critic method, which adapts the hypergraph structure of agents and employs hypergraph convolution to extract and represent information features among agents.
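For reference, a standard hypergraph convolution (X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ) is sketched below in NumPy; the actor-critic wiring and the way agents are grouped into hyperedges in the paper are not shown, and the toy incidence matrix is an assumption.

```python
# Standard hypergraph convolution in NumPy; the paper's actor-critic integration
# and hyperedge construction are not reproduced, and the toy inputs are assumptions.
import numpy as np

def hypergraph_conv(X, H, Theta, edge_weights=None):
    """X: (n_agents, d_in) features, H: (n_agents, n_edges) incidence, Theta: (d_in, d_out)."""
    n, m = H.shape
    W = np.diag(edge_weights if edge_weights is not None else np.ones(m))
    Dv = np.diag(1.0 / np.sqrt(H @ W @ np.ones(m)))  # inverse sqrt vertex degrees
    De = np.diag(1.0 / (H.T @ np.ones(n)))           # inverse hyperedge degrees
    return Dv @ H @ W @ De @ H.T @ Dv @ X @ Theta

# Toy usage: 4 agents, 2 hyperedges (agent groups), 3-dim features -> 2-dim output.
rng = np.random.default_rng(0)
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
print(hypergraph_conv(rng.normal(size=(4, 3)), H, rng.normal(size=(3, 2))).shape)  # (4, 2)
```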
arXiv Detail & Related papers (2022-03-07T10:34:40Z)
- TiKick: Toward Playing Multi-agent Football Full Games from Single-agent
Demonstrations [31.596018856092513]
To the best of our knowledge, Tikick is the first learning-based AI system that can take over the multi-agent Google Research Football full game.
arXiv Detail & Related papers (2021-10-09T08:34:58Z)
- The Information Geometry of Unsupervised Reinforcement Learning [133.20816939521941]
Unsupervised skill discovery is a class of algorithms that learn a set of policies without access to a reward function.
We show that unsupervised skill discovery algorithms do not learn skills that are optimal for every possible reward function.
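Most algorithms in this family maximize mutual information between a sampled skill and the visited states; the snippet below writes the familiar DIAYN-style intrinsic reward log q(z|s) − log p(z) as a minimal illustration, not as the analysis in the paper above.

```python
# DIAYN-style intrinsic reward for skill discovery: log q(z|s) - log p(z) with a
# uniform prior over skills. Illustrative only; not the paper's derivation.
import numpy as np

def skill_reward(log_q_z_given_s, n_skills):
    """Reward is high when the discriminator can tell which skill produced the state."""
    return log_q_z_given_s - np.log(1.0 / n_skills)

print(skill_reward(np.log(0.9), n_skills=8))      # ~ +1.97: confident discriminator
print(skill_reward(np.log(1.0 / 8), n_skills=8))  # 0.0: chance-level discriminator
```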
arXiv Detail & Related papers (2021-10-06T13:08:36Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
It includes both 'what to predict' (e.g. value functions) and 'how to learn from it' by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep
Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
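A rough sketch of those two ingredients follows; the weighting function and constants are simplified assumptions rather than the authors' implementation.

```python
# Sketch of SUNRISE's two ingredients: confidence-weighted Bellman targets and
# UCB action selection over a Q-ensemble. Weighting details are simplified here.
import numpy as np

def weighted_backup_loss(q_pred, q_target_ensemble, temperature=10.0):
    """Scale each TD error by a weight that shrinks as the ensemble disagrees."""
    target_mean = q_target_ensemble.mean(axis=0)
    target_std = q_target_ensemble.std(axis=0)
    weight = 1.0 / (1.0 + np.exp(target_std * temperature)) + 0.5  # in (0.5, 1.0]
    return np.mean(weight * (q_pred - target_mean) ** 2)

def ucb_action(q_ensemble, lam=1.0):
    """q_ensemble: (n_models, n_actions); pick argmax of mean + lam * std."""
    return int(np.argmax(q_ensemble.mean(axis=0) + lam * q_ensemble.std(axis=0)))

q_ens = np.array([[1.0, 0.2], [1.0, 2.0], [1.0, 0.2]])
print(ucb_action(q_ens))                                  # 1: uncertainty bonus wins
print(weighted_backup_loss(np.array([0.9, 0.7]), q_ens))  # scalar weighted TD loss
```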
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all the solutions for the famous MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
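A hypothetical reading of the "forgetful" replay idea is sketched below: demonstration transitions stay in the buffer but are sampled with a decaying ratio as the agent collects its own data. This is an assumption for illustration, not the authors' mechanism.

```python
# Hypothetical "forgetful" replay sketch: the share of demonstration samples in a
# batch decays as the agent gathers its own experience. Not the authors' design.
import random

class ForgetfulReplay:
    def __init__(self, demo, capacity=10000, decay=0.999):
        self.demo = list(demo)        # (possibly low-quality) demonstration transitions
        self.own = []                 # agent-collected transitions
        self.capacity = capacity
        self.demo_ratio = 1.0         # start by relying entirely on demonstrations
        self.decay = decay

    def add(self, transition):
        self.own.append(transition)
        if len(self.own) > self.capacity:
            self.own.pop(0)
        self.demo_ratio *= self.decay  # gradually "forget" the demonstrations

    def sample(self, batch_size):
        n_demo = int(batch_size * self.demo_ratio) if self.demo else 0
        batch = random.choices(self.demo, k=n_demo) if n_demo else []
        if self.own:
            batch += random.choices(self.own, k=batch_size - n_demo)
        return batch

buf = ForgetfulReplay(demo=[("s", "a", 0.0)] * 100)
buf.add(("s2", "a2", 1.0))
print(len(buf.sample(32)))  # 32: mostly demo samples early in training
```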
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Non-cooperative Multi-agent Systems with Exploring Agents [10.736626320566707]
We develop a prescriptive model of multi-agent behavior using Markov games.
We focus on models in which the agents play "exploration but near optimum strategies".
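One simple way to read "exploration but near optimum strategies" is an epsilon-perturbed greedy policy, sketched below as a hypothetical illustration; the paper's formal model is not reproduced here.

```python
# Hypothetical epsilon-perturbed greedy policy: play the (near-)optimal action
# with probability 1 - eps and explore uniformly otherwise. Illustrative only.
import numpy as np

def exploring_policy(q_values, eps=0.1):
    """Return an action distribution that is mostly greedy but keeps exploring."""
    n = len(q_values)
    dist = np.full(n, eps / n)
    dist[int(np.argmax(q_values))] += 1.0 - eps
    return dist

print(exploring_policy(np.array([0.2, 1.0, 0.5])))  # [0.033..., 0.933..., 0.033...]
```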
arXiv Detail & Related papers (2020-05-25T19:34:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.