Coach-assisted Multi-Agent Reinforcement Learning Framework for
Unexpected Crashed Agents
- URL: http://arxiv.org/abs/2203.08454v1
- Date: Wed, 16 Mar 2022 08:22:45 GMT
- Title: Coach-assisted Multi-Agent Reinforcement Learning Framework for
Unexpected Crashed Agents
- Authors: Jian Zhao, Youpeng Zhao, Weixun Wang, Mingyu Yang, Xunhan Hu, Wengang
Zhou, Jianye Hao, Houqiang Li
- Abstract summary: We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes.
We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training.
To the best of our knowledge, this work is the first to study the unexpected crashes in the multi-agent system.
- Score: 120.91291581594773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent reinforcement learning is difficult to be applied in practice,
which is partially due to the gap between the simulated and real-world
scenarios. One reason for the gap is that the simulated systems always assume
that the agents can work normally all the time, while in practice, one or more
agents may unexpectedly "crash" during the coordination process due to
inevitable hardware or software failures. Such crashes will destroy the
cooperation among agents, leading to performance degradation. In this work, we
present a formal formulation of a cooperative multi-agent reinforcement
learning system with unexpected crashes. To enhance the robustness of the
system to crashes, we propose a coach-assisted multi-agent reinforcement
learning framework, which introduces a virtual coach agent to adjust the crash
rate during training. We design three coaching strategies and the re-sampling
strategy for our coach agent. To the best of our knowledge, this work is the
first to study the unexpected crashes in the multi-agent system. Extensive
experiments on grid-world and StarCraft II micromanagement tasks demonstrate
the efficacy of adaptive strategy compared with the fixed crash rate strategy
and curriculum learning strategy. The ablation study further illustrates the
effectiveness of our re-sampling strategy.
Related papers
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z) - Hierarchical Reinforcement Learning with Opponent Modeling for
Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments.
Traditional DRL solutions suffer from the high dimensions of multiple agents with continuous action space during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z) - Robust Reinforcement Learning via Genetic Curriculum [5.421464476555662]
Genetic curriculum is an algorithm that automatically identifies scenarios in which the agent currently fails and generates an associated curriculum.
Our empirical studies show improvement in robustness over the existing state of the art algorithms, providing training curricula that result in agents being 2 - 8x times less likely to fail.
arXiv Detail & Related papers (2022-02-17T01:14:20Z) - Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
arXiv Detail & Related papers (2022-01-05T04:40:13Z) - Relative Distributed Formation and Obstacle Avoidance with Multi-agent
Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL)
Our method achieves better performance regarding formation error, formation convergence rate and on-par success rate of obstacle avoidance compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z) - HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with
Dual Coordination Mechanism [17.993973801986677]
Multi-agent reinforcement learning often suffers from the exponentially larger action space caused by a large number of agents.
We propose a novel value decomposition framework HAVEN based on hierarchical reinforcement learning for the fully cooperative multi-agent problems.
arXiv Detail & Related papers (2021-10-14T10:43:47Z) - SA-MATD3:Self-attention-based multi-agent continuous control method in
cooperative environments [12.959163198988536]
Existing algorithms suffer from the problem of uneven learning degree with the increase of the number of agents.
A new structure for a multi-agent actor critic is proposed, and the self-attention mechanism is applied in the critic network.
The proposed algorithm makes full use of the samples in the replay memory buffer to learn the behavior of a class of agents.
arXiv Detail & Related papers (2021-07-01T08:15:05Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via
Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z) - Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
arXiv Detail & Related papers (2020-06-22T10:48:17Z) - Forgetful Experience Replay in Hierarchical Reinforcement Learning from
Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all the solutions for the famous MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.