Investigation of Independent Reinforcement Learning Algorithms in
Multi-Agent Environments
- URL: http://arxiv.org/abs/2111.01100v1
- Date: Mon, 1 Nov 2021 17:14:38 GMT
- Title: Investigation of Independent Reinforcement Learning Algorithms in
Multi-Agent Environments
- Authors: Ken Ming Lee, Sriram Ganapathi Subramanian, Mark Crowley
- Abstract summary: We show that independent algorithms can perform on par with multi-agent algorithms in cooperative and competitive settings.
We also show that adding recurrence improves the learning of independent algorithms in cooperative partially observable environments.
- Score: 0.9281671380673306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Independent reinforcement learning algorithms have no theoretical guarantees
for finding the best policy in multi-agent settings. However, in practice,
prior works have reported good performance with independent algorithms in some
domains and poor performance in others. Moreover, a comprehensive study of the
strengths and weaknesses of independent algorithms is lacking in the
literature. In this paper, we carry out an empirical comparison of the
performance of independent algorithms on four PettingZoo environments that span
the three main categories of multi-agent environments, i.e., cooperative,
competitive, and mixed. We show that in fully-observable environments,
independent algorithms can perform on par with multi-agent algorithms in
cooperative and competitive settings. For the mixed environments, we show that
agents trained via independent algorithms learn to perform well individually,
but fail to learn to cooperate with allies and compete with enemies. We also
show that adding recurrence improves the learning of independent algorithms in
cooperative partially observable environments.
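As a concrete illustration of the setup studied in the paper, the sketch below runs one independent learner per agent in a PettingZoo parallel environment; each learner optimizes from its own observations, actions, and rewards only, treating the other agents as part of the environment. This is a minimal sketch, not the paper's code: IndependentLearner is a hypothetical placeholder for any single-agent algorithm (e.g., DQN or PPO), simple_spread_v3 merely stands in for the four PettingZoo tasks used in the paper, and the reset signature assumes a recent PettingZoo release.
```python
# Minimal sketch (not the paper's code) of independent learning on a
# PettingZoo parallel environment. Each agent owns a separate learner
# and updates from its local experience only; from its point of view,
# the other agents are just part of a (non-stationary) environment.
from pettingzoo.mpe import simple_spread_v3  # stand-in for the paper's tasks


class IndependentLearner:
    """Hypothetical placeholder for any single-agent RL algorithm."""

    def __init__(self, action_space):
        self.action_space = action_space

    def act(self, observation):
        # Placeholder policy: uniform random. A real learner (DQN, PPO, ...)
        # would map the local observation to an action here. The recurrent
        # variant discussed in the paper would also carry an RNN hidden
        # state across timesteps in this call.
        return self.action_space.sample()

    def update(self, obs, action, reward, next_obs, done):
        # A real learner would run its TD / policy-gradient update here,
        # using only this agent's own transition.
        pass


env = simple_spread_v3.parallel_env(max_cycles=25)
observations, infos = env.reset(seed=0)  # recent PettingZoo API

# One learner per agent: no parameter sharing, no centralized critic.
learners = {a: IndependentLearner(env.action_space(a)) for a in env.agents}

while env.agents:
    actions = {a: learners[a].act(observations[a]) for a in env.agents}
    next_obs, rewards, terminations, truncations, infos = env.step(actions)
    for a in actions:
        learners[a].update(observations[a], actions[a], rewards[a],
                           next_obs[a], terminations[a] or truncations[a])
    observations = next_obs

env.close()
```
Swapping the random placeholder for a recurrent policy (e.g., an LSTM over observation histories) corresponds to the "adding recurrence" variant whose benefit the paper reports in cooperative partially observable environments.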
Related papers
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity of multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making with hidden (unobserved) rewards.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- A Reinforcement Learning-assisted Genetic Programming Algorithm for Team
Formation Problem Considering Person-Job Matching [70.28786574064694]
A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions.
The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
arXiv Detail & Related papers (2023-04-08T14:32:12Z)
- Learning from Multiple Independent Advisors in Multi-agent Reinforcement
Learning [15.195932300563541]
This paper considers the problem of simultaneously learning from multiple independent advisors in multi-agent reinforcement learning.
We provide principled algorithms that incorporate a set of advisors by both evaluating the advisors at each state and subsequently using the advisors to guide action selection.
arXiv Detail & Related papers (2023-01-26T15:00:23Z)
- Learning Rationalizable Equilibria in Multiplayer Games [38.922957434291554]
Existing algorithms require a number of samples exponential in the number of players to learn rationalizable equilibria under bandit feedback.
This paper develops the first line of efficient algorithms for learning rationalizable Coarse Correlated Equilibria (CCE) and Correlated Equilibria (CE).
Our algorithms incorporate several novel techniques to guarantee rationalizability and no (swap-)regret simultaneously, including a correlated exploration scheme and adaptive learning rates.
arXiv Detail & Related papers (2022-10-20T16:49:00Z)
- Developing cooperative policies for multi-stage reinforcement learning
tasks [0.0]
Many hierarchical reinforcement learning algorithms utilise a series of independent skills as a basis to solve tasks at a higher level of reasoning.
This paper proposes the Cooperative Consecutive Policies (CCP) method of enabling consecutive agents to cooperatively solve long time horizon multi-stage tasks.
arXiv Detail & Related papers (2022-05-11T01:31:04Z)
- Decentralized Cooperative Multi-Agent Reinforcement Learning with
Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\propto\widetilde{O}(1/\epsilon^4)$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Is Independent Learning All You Need in the StarCraft Multi-Agent
Challenge? [100.48692829396778]
Independent PPO (IPPO) is a form of independent learning in which each agent simply estimates its local value function (a minimal sketch of this local-critic setup appears after this list).
IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
arXiv Detail & Related papers (2020-11-18T20:29:59Z)
- Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in
Cooperative Tasks [11.480994804659908]
Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria.
We provide a systematic evaluation and comparison of three different classes of MARL algorithms.
Our experiments serve as a reference for the expected performance of algorithms across different learning tasks.
arXiv Detail & Related papers (2020-06-14T11:22:53Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability for large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
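To make the IPPO entry above concrete: its defining choice is that each agent's value function conditions only on that agent's local observation, in contrast to centralized-critic methods (e.g., MAPPO) whose critic sees the joint observation. The sketch below shows just this distinction; network sizes and names are illustrative assumptions, and the PPO policy/advantage machinery is omitted.
```python
# Illustrative sketch of the local-vs-centralized critic distinction only;
# the PPO update itself is omitted. Sizes are arbitrary assumptions.
import torch
import torch.nn as nn

obs_dim, n_agents = 8, 3

# IPPO: every agent estimates a *local* value function from its own obs.
local_critics = [nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                               nn.Linear(64, 1)) for _ in range(n_agents)]

# Centralized-critic baselines (e.g. MAPPO): one critic over the joint obs.
central_critic = nn.Sequential(nn.Linear(obs_dim * n_agents, 64), nn.Tanh(),
                               nn.Linear(64, 1))

local_obs = torch.randn(n_agents, obs_dim)            # one row per agent
ippo_values = [c(o) for c, o in zip(local_critics, local_obs)]
central_value = central_critic(local_obs.flatten())   # sees everything
```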
This list is automatically generated from the titles and abstracts of the papers on this site.