Self-Motivated Multi-Agent Exploration
- URL: http://arxiv.org/abs/2301.02083v2
- Date: Wed, 27 Sep 2023 11:53:46 GMT
- Title: Self-Motivated Multi-Agent Exploration
- Authors: Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan
- Abstract summary: In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration.
Recent works mainly concentrate on agents' coordinated exploration, which requires exploring a joint state space that grows exponentially with the number of agents.
We propose Self-Motivated Multi-Agent Exploration (SMMAE), which aims to achieve success in team tasks by adaptively finding a trade-off between self-exploration and team cooperation.
- Score: 38.55811936029999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration. However, agents can hardly accomplish the team task without coordination, and without enough individual exploration they can be trapped in a local optimum where only easily reached cooperation is attained. Recent works mainly concentrate on agents' coordinated exploration, which requires exploring a joint state space that grows exponentially with the number of agents. To address this issue, we propose Self-Motivated Multi-Agent Exploration (SMMAE), which aims to achieve success in team tasks by adaptively finding a trade-off between self-exploration and team cooperation. In SMMAE, we train an independent exploration policy for each agent to maximize its own visited state space. Each agent learns an adjustable exploration probability based on the stability of the joint team policy. Experiments on highly cooperative tasks in the StarCraft II micromanagement benchmark (SMAC) demonstrate that SMMAE can explore task-related states more efficiently, accomplish coordinated behaviours, and boost learning performance.
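To make the mechanism concrete, here is a minimal sketch of the two moving parts the abstract describes: a per-agent mix of self-exploration and team policy governed by a learned probability, and a stability signal computed from the joint team policy. The total-variation stability proxy, the update rule, and all names below are illustrative assumptions, not the paper's exact formulation.
```python
import numpy as np

rng = np.random.default_rng(0)

def policy_stability(prev_probs, curr_probs):
    """Stability proxy: 1 minus the mean total-variation distance between
    the joint policy's action distributions before and after an update
    (an assumption; the paper defines its own stability measure)."""
    tv = 0.5 * np.abs(prev_probs - curr_probs).sum(axis=-1).mean()
    return 1.0 - tv  # in [0, 1]; close to 1 when the team policy has settled

def update_explore_prob(p_explore, stability, lr=0.1, target=0.8):
    """Raise an agent's self-exploration probability while the joint
    policy is stable, lower it while the policy is still shifting."""
    return float(np.clip(p_explore + lr * (stability - target), 0.05, 0.95))

def act(team_q_values, exploration_action, p_explore):
    """With probability p_explore take the agent's own exploration
    policy's action; otherwise act greedily w.r.t. the team values."""
    if rng.random() < p_explore:
        return exploration_action
    return int(np.argmax(team_q_values))

# Toy usage: a stable joint policy gradually licenses more self-exploration.
prev = np.array([[0.70, 0.20, 0.10]])
curr = np.array([[0.68, 0.22, 0.10]])
p = 0.2
p = update_explore_prob(p, policy_stability(prev, curr))
print(p)  # slightly above 0.2, since the joint policy barely moved
```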
Related papers
- MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure [37.56309011441144]
This paper introduces MESA, a novel meta-exploration method for cooperative multi-agent learning.
It learns to explore by first identifying the agents' high-rewarding joint state-action subspace from training tasks and then learning a set of diverse exploration policies to "cover" the subspace.
Experiments show that with learned exploration policies, MESA achieves significantly better performance in sparse-reward tasks in several multi-agent particle environments and multi-agent MuJoCo environments.
arXiv Detail & Related papers (2024-05-01T23:19:48Z) - Settling Decentralized Multi-Agent Coordinated Exploration by Novelty Sharing [34.299478481229265]
We propose MACE, a simple yet effective multi-agent coordinated exploration method.
By communicating only local novelty, each agent can take the other agents' local novelty into account to approximate the global novelty.
We show that MACE achieves superior performance in three multi-agent environments with sparse rewards.
arXiv Detail & Related papers (2024-02-03T09:35:25Z) - Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning [2.296343533657165]
- Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning [2.296343533657165]
Recent works have shown that intricate cooperative behaviors can emerge in agents trained with meta-reinforcement learning and self-play on open-ended task distributions.
We argue that self-play and other centralized training techniques do not accurately reflect how general collective exploration strategies emerge in the natural world.
arXiv Detail & Related papers (2023-11-01T16:56:44Z) - Building Cooperative Embodied Agents Modularly with Large Language
Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that the resulting agent, CoELA, driven by GPT-4, can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z) - Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z) - Multi-agent Deep Covering Skill Discovery [50.812414209206054]
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs multi-agent options by minimizing the expected cover time of the agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works that use single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z) - Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
Agents share a common exploration goal, which is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z) - UneVEn: Universal Value Exploration for Multi-Agent Reinforcement
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.