Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning
- URL: http://arxiv.org/abs/2102.04775v1
- Date: Tue, 9 Feb 2021 11:46:12 GMT
- Title: Structured Diversification Emergence via Reinforced Organization Control and Hierarchical Consensus Learning
- Authors: Wenhao Li, Xiangfeng Wang, Bo Jin, Junjie Sheng, Yun Hua and Hongyuan Zha
- Abstract summary: We propose a structured diversification emergence MARL framework named Rochico based on reinforced organization control and hierarchical consensus learning.
Rochico is significantly better than the current SOTA algorithms in terms of exploration efficiency and cooperation strength.
- Score: 48.525944995851965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When solving a complex task, humans spontaneously form teams, with each team completing a different part of the whole task. Meanwhile, cooperation between teammates improves efficiency. However, in current cooperative MARL methods, the cooperation team is constructed through either heuristics or end-to-end black-box optimization. To improve the efficiency of cooperation and exploration, we propose a structured diversification emergence MARL framework named Rochico, based on reinforced organization control and hierarchical consensus learning. Rochico first learns an adaptive grouping policy through the organization control module, which is established by independent multi-agent reinforcement learning. After team formation, a hierarchical consensus module based on hierarchical intentions with a consensus constraint is introduced. Combining the hierarchical consensus module with a self-supervised intrinsic-reward-enhanced decision module, Rochico outputs the final diversified multi-agent cooperative policy. All three modules are organically combined to promote the structured diversification emergence. Comparative experiments on four large-scale cooperation tasks show that Rochico is significantly better than the current SOTA algorithms in terms of exploration efficiency and cooperation strength.
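As a rough illustration of how the three modules named in the abstract could fit together, here is a minimal sketch. Only the module names come from the paper; every interface, shape, and update rule below is an assumption made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the three-module pipeline described in the abstract.
import numpy as np

rng = np.random.default_rng(0)

class OrganizationControl:
    """Adaptive grouping: each agent independently picks a team id."""
    def __init__(self, n_agents, n_teams):
        self.logits = np.zeros((n_agents, n_teams))  # learned by independent MARL in the paper

    def group(self):
        probs = np.exp(self.logits) / np.exp(self.logits).sum(axis=1, keepdims=True)
        return np.array([rng.choice(len(p), p=p) for p in probs])

class HierarchicalConsensus:
    """Team-level intention shared by all members (the consensus constraint)."""
    def intentions(self, obs, teams):
        # One intention vector per team: here, the mean observation of its members.
        return {t: obs[teams == t].mean(axis=0) for t in np.unique(teams)}

class DecisionModule:
    """Low-level policy conditioned on own obs + team intention, plus a novelty bonus."""
    def act(self, ob, intention):
        return int(np.argmax(ob + intention))  # placeholder scoring over a discrete action space

    def intrinsic_reward(self, ob, visited):
        key = tuple(np.round(ob, 1))
        visited[key] = visited.get(key, 0) + 1
        return 1.0 / np.sqrt(visited[key])  # count-based novelty bonus (an assumption)

# One environment step through the pipeline.
n_agents, obs_dim = 6, 4
obs = rng.normal(size=(n_agents, obs_dim))
org, consensus, decider = OrganizationControl(n_agents, n_teams=2), HierarchicalConsensus(), DecisionModule()
teams = org.group()
intents = consensus.intentions(obs, teams)
visited = {}
actions = [decider.act(obs[i], intents[teams[i]]) for i in range(n_agents)]
bonuses = [decider.intrinsic_reward(obs[i], visited) for i in range(n_agents)]
print(teams, actions, np.round(bonuses, 2))
```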
Related papers
- CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation [98.11670473661587]
CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution.
Experimental results on the ThreeDWorld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves a much higher task completion rate and efficiency than state-of-the-art methods.
arXiv Detail & Related papers (2024-11-07T13:08:04Z)
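A toy rendering of the two-phase loop named in the CaPo summary above: generate a shared meta-plan, then adapt it to progress during execution. The functions and task API here are hypothetical stand-ins; the paper's actual interfaces may differ.

```python
# Illustrative two-phase cooperation loop, following the CaPo summary above.
from typing import List

def generate_meta_plan(task: str, agents: List[str]) -> List[str]:
    # Phase 1: agents jointly draft a shared meta-plan (a list of subtasks).
    return [f"{task}: subtask {i} ({a})" for i, a in enumerate(agents)]

def adapt_meta_plan(plan: List[str], done: set) -> List[str]:
    # Phase 2: drop completed subtasks so the plan tracks actual progress.
    return [step for step in plan if step not in done]

agents = ["alice", "bob"]
plan = generate_meta_plan("transport objects", agents)
done: set = set()
while plan:
    step = plan[0]           # execute the next pending subtask
    done.add(step)           # assume it succeeds for this sketch
    plan = adapt_meta_plan(plan, done)
print("all subtasks completed")
```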
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
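A minimal sketch of the opponent-modeling idea in the HOP entry above: infer another agent's goal from its observed actions, then respond with a goal-conditioned policy. The goal set, likelihood model, and response table are illustrative assumptions (a toy stag-hunt, a classic mixed-motive setting), not the paper's components.

```python
# Infer an opponent's goal via a Bayesian update, then pick a best response.
import numpy as np

GOALS = ["hunt_stag", "hunt_hare"]
# P(observed action | opponent goal): a toy likelihood model.
LIKELIHOOD = {
    "move_to_stag": np.array([0.8, 0.2]),
    "move_to_hare": np.array([0.2, 0.8]),
}
BEST_RESPONSE = {"hunt_stag": "join_stag_hunt", "hunt_hare": "hunt_hare_alone"}

def infer_goal(actions, prior=None):
    """Bayesian update over goals given a sequence of observed actions."""
    belief = np.ones(len(GOALS)) / len(GOALS) if prior is None else prior
    for a in actions:
        belief = belief * LIKELIHOOD[a]
        belief /= belief.sum()
    return belief

belief = infer_goal(["move_to_stag", "move_to_stag"])
goal = GOALS[int(np.argmax(belief))]
print(dict(zip(GOALS, np.round(belief, 2))), "->", BEST_RESPONSE[goal])
```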
- Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination [16.74629849552254]
We propose a model-based consensus mechanism to explicitly coordinate multiple agents.
The proposed Multi-agent Goal Imagination (MAGI) framework guides agents to reach consensus through an imagined common goal.
We show that such an efficient consensus mechanism can guide all agents to cooperatively reach valuable future states.
arXiv Detail & Related papers (2024-03-05T18:07:34Z)
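A hedged sketch of the "imagined common goal" idea from the MAGI entry above: sample candidate future states, score them, and broadcast the best one as a shared goal every agent conditions on. The generative model and value function below are toy placeholders, not the paper's architecture.

```python
# Sample imagined goals, pick the most valuable, condition all agents on it.
import numpy as np

rng = np.random.default_rng(1)

def imagine_goals(state, n_candidates=16):
    # Stand-in generative model: perturb the current state.
    return state + rng.normal(scale=0.5, size=(n_candidates, state.shape[0]))

def value(goal):
    # Toy "valuable future state" score: prefer goals near a known landmark.
    landmark = np.array([1.0, 1.0])
    return -np.linalg.norm(goal - landmark)

def goal_conditioned_action(obs, goal):
    # Each agent moves toward the shared imagined goal.
    return np.sign(goal - obs)

state = np.zeros(2)
candidates = imagine_goals(state)
common_goal = candidates[np.argmax([value(g) for g in candidates])]
agent_obs = rng.normal(size=(3, 2))  # three agents
actions = [goal_conditioned_action(o, common_goal) for o in agent_obs]
print("common goal:", np.round(common_goal, 2))
```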
- CoMIX: A Multi-agent Reinforcement Learning Training Architecture for Efficient Decentralized Coordination and Independent Decision-Making [2.4555276449137042]
Robust coordination skills enable agents to operate cohesively in shared environments, working together toward a common goal and, ideally, without hindering each other's progress.
This paper presents Coordinated QMIX (CoMIX), a novel training framework for decentralized agents that enables emergent coordination through flexible policies while still allowing independent decision-making at the individual level.
arXiv Detail & Related papers (2023-08-21T13:45:44Z)
- Learning to Learn Group Alignment: A Self-Tuning Credo Framework with Multiagent Teams [1.370633147306388]
Mixed incentives among a population of multiagent teams have been shown to have advantages over a fully cooperative system.
We propose a framework in which individual learning agents self-regulate their configuration of incentives through various parts of their reward function.
arXiv Detail & Related papers (2023-04-14T18:16:19Z)
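A toy illustration of the self-regulated incentive mix suggested by the credo entry above: each agent weights self, team, and system reward components and can adapt those weights over time. The component names and weights are assumptions for illustration.

```python
# Blend reward components with per-agent "credo" weights.
import numpy as np

def credo_reward(r_self, r_team, r_system, w):
    """Mix reward components with credo weights w (normalized to sum to 1)."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return w @ np.array([r_self, r_team, r_system])

# A fully system-aligned agent vs. a self-leaning agent on the same outcome.
print(credo_reward(r_self=1.0, r_team=0.2, r_system=0.1, w=[0.0, 0.0, 1.0]))
print(credo_reward(r_self=1.0, r_team=0.2, r_system=0.1, w=[0.6, 0.3, 0.1]))
```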
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
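A reward machine, as used in the entry above, is a finite-state machine over high-level events whose transitions emit rewards, which is what lets it encode non-Markovian reward structure such as sub-task sequencing. The two-sub-task machine below ("get key", then "open door") is a generic textbook-style example, not one from the paper.

```python
# A minimal reward machine: rewards depend on event history via an FSM state.
class RewardMachine:
    def __init__(self):
        # (state, event) -> (next_state, reward)
        self.delta = {
            ("u0", "got_key"): ("u1", 0.0),
            ("u1", "opened_door"): ("u_accept", 1.0),
        }
        self.state = "u0"

    def step(self, event: str) -> float:
        """Advance on an observed event; unknown events leave the state unchanged."""
        self.state, reward = self.delta.get((self.state, event), (self.state, 0.0))
        return reward

rm = RewardMachine()
for e in ["opened_door", "got_key", "opened_door"]:  # door before key gives no reward
    print(e, "->", rm.step(e), rm.state)
```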
- HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism [17.993973801986677]
Multi-agent reinforcement learning often suffers from a joint action space that grows exponentially with the number of agents.
We propose HAVEN, a novel value decomposition framework based on hierarchical reinforcement learning for fully cooperative multi-agent problems.
arXiv Detail & Related papers (2021-10-14T10:43:47Z)
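A generic sketch of the hierarchical pattern behind frameworks like the HAVEN entry above: a high-level policy picks an option every k steps and low-level policies act conditioned on it, shrinking the decision problem at each level. The option set, policies, and k below are illustrative, not HAVEN's actual design.

```python
# Two-level control: options chosen every k steps, primitives chosen every step.
import random

random.seed(0)
OPTIONS = ["explore", "attack", "defend"]
PRIMITIVES = {"explore": ["move_n", "move_s"], "attack": ["fire"], "defend": ["hold"]}

def high_level_policy(team_obs):
    return random.choice(OPTIONS)  # placeholder for a learned inter-level policy

def low_level_policy(agent_obs, option):
    return random.choice(PRIMITIVES[option])  # placeholder per-agent controller

k, n_agents = 3, 4
option = None
for t in range(6):
    if t % k == 0:  # re-select the option every k steps
        option = high_level_policy(team_obs=None)
    joint_action = [low_level_policy(agent_obs=None, option=option) for _ in range(n_agents)]
    print(t, option, joint_action)
```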
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
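A worked sketch of the linear successor-feature decomposition mentioned in the UneVEn entry above: when rewards are linear in features, r = phi(s, a) . w, the action value factorizes as Q_w(s, a) = psi(s, a) . w, so one learned psi can score many related tasks by swapping the weight vector w. The feature table below is synthetic, purely for illustration.

```python
# Successor features: evaluate related tasks by changing only the weights w.
import numpy as np

psi = np.array([[1.0, 0.2],   # psi(s, a) for a fixed state s, one row per action
                [0.4, 0.9],
                [0.6, 0.6]])

def greedy_action(psi, w):
    """Greedy action under task weights w via Q_w(s, a) = psi(s, a) . w."""
    return int(np.argmax(psi @ w))

w_task_a = np.array([1.0, 0.0])   # task A rewards the first feature
w_task_b = np.array([0.0, 1.0])   # a related task rewards the second feature
print(greedy_action(psi, w_task_a), greedy_action(psi, w_task_b))  # 0, then 1
```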
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
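A bare-bones sketch of the fully decentralized actor-critic pattern that the F2A2 entry above generalizes: every agent holds its own actor and critic and updates from purely local information, with no shared parameters. This tabular, single-state version shows only the architectural skeleton; F2A2 itself is far more elaborate.

```python
# Each agent: softmax actor over action preferences plus a scalar baseline critic.
import numpy as np

rng = np.random.default_rng(2)

class LocalActorCritic:
    def __init__(self, n_actions, lr=0.1):
        self.prefs = np.zeros(n_actions)  # actor: action preferences
        self.baseline = 0.0               # critic: value estimate
        self.lr = lr

    def act(self):
        p = np.exp(self.prefs - self.prefs.max())
        p /= p.sum()
        return rng.choice(len(p), p=p), p

    def update(self, action, probs, reward):
        advantage = reward - self.baseline        # critic error
        grad = -probs
        grad[action] += 1.0                       # grad log pi for a softmax policy
        self.prefs += self.lr * advantage * grad  # local policy-gradient step
        self.baseline += self.lr * advantage      # local critic step

agents = [LocalActorCritic(n_actions=2) for _ in range(3)]
for _ in range(200):
    choices = [a.act() for a in agents]
    # Toy cooperative reward: 1 if all agents pick action 0, else 0.
    r = float(all(c[0] == 0 for c in choices))
    for agent, (action, probs) in zip(agents, choices):
        agent.update(action, probs, r)  # no parameters are shared across agents
print([np.round(a.prefs, 2) for a in agents])
```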