Imagine, Initialize, and Explore: An Effective Exploration Method in
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2402.17978v2
- Date: Fri, 1 Mar 2024 11:08:48 GMT
- Title: Imagine, Initialize, and Explore: An Effective Exploration Method in
Multi-Agent Reinforcement Learning
- Authors: Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen,
Xuguang Lan
- Abstract summary: We propose a novel method for efficient multi-agent exploration in complex scenarios.
We formulate the imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at the critical states, IIE significantly increases the likelihood of discovering potentially important underexplored regions.
- Score: 27.81925751697255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective exploration is crucial to discovering optimal strategies for
multi-agent reinforcement learning (MARL) in complex coordination tasks.
Existing methods mainly utilize intrinsic rewards to enable committed
exploration or use role-based learning for decomposing joint action spaces
instead of directly conducting a collective search in the entire
action-observation space. However, they often face challenges in obtaining
specific joint action sequences to reach successful states in long-horizon
tasks. To address this limitation, we propose Imagine, Initialize, and Explore
(IIE), a novel method that offers a promising solution for efficient
multi-agent exploration in complex scenarios. IIE employs a transformer model
to imagine how the agents reach a critical state that can influence each
other's transition functions. Then, we initialize the environment at this state
using a simulator before the exploration phase. We formulate the imagination as
a sequence modeling problem, where the states, observations, prompts, actions,
and rewards are predicted autoregressively. The prompt consists of
timestep-to-go, return-to-go, influence value, and one-shot demonstration,
specifying the desired state and trajectory as well as guiding the action
generation. By initializing agents at the critical states, IIE significantly
increases the likelihood of discovering potentially important under-explored
regions. Despite its simplicity, empirical results demonstrate that our method
outperforms multi-agent exploration baselines on the StarCraft Multi-Agent
Challenge (SMAC) and SMACv2 environments. Particularly, IIE shows improved
performance in the sparse-reward SMAC tasks and produces more effective
curricula over the initialized states than other generative methods, such as
CVAE-GAN and diffusion models.
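To make the imagination step concrete, the sketch below shows a minimal, prompt-conditioned autoregressive sequence model in PyTorch (decision-transformer style). It is an illustrative reconstruction only: the module names, tensor shapes, prediction heads, and the encoding of the prompt (timestep-to-go, return-to-go, influence value, and a one-shot demonstration embedding) are assumptions, not the authors' released implementation.

```python
# Minimal, illustrative sketch of a prompt-conditioned autoregressive
# imagination model (decision-transformer style). Shapes, module names,
# and the prompt layout are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class ImaginationModel(nn.Module):
    def __init__(self, state_dim, obs_dim, n_actions, demo_dim,
                 d_model=128, n_layers=3, n_heads=4):
        super().__init__()
        # Prompt = [timestep-to-go, return-to-go, influence value, demo embedding].
        self.prompt_embed = nn.Linear(3 + demo_dim, d_model)
        self.state_embed = nn.Linear(state_dim, d_model)
        self.obs_embed = nn.Linear(obs_dim, d_model)
        self.act_embed = nn.Embedding(n_actions, d_model)
        self.rew_embed = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.state_head = nn.Linear(d_model, state_dim)   # next-state prediction
        self.act_head = nn.Linear(d_model, n_actions)     # next-action logits

    def forward(self, prompt, states, obs, actions, rewards):
        # Interleave tokens as (prompt, s_1, o_1, a_1, r_1, s_2, ...) and
        # apply a causal mask so each token only attends to earlier ones.
        per_step = torch.stack(
            [self.state_embed(states), self.obs_embed(obs),
             self.act_embed(actions), self.rew_embed(rewards.unsqueeze(-1))],
            dim=2,
        ).flatten(1, 2)                                    # (B, 4*T, d_model)
        tokens = torch.cat([self.prompt_embed(prompt).unsqueeze(1), per_step], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(
            tokens.size(1)).to(tokens.device)
        h = self.backbone(tokens, mask=mask)
        # Each head is read at the positions preceding the token it predicts.
        return self.state_head(h), self.act_head(h)
```

As described in the abstract, a trajectory imagined by such a model ends at a critical state, and the simulator is then initialized to that state so that exploration starts from it rather than from the episode's default initial state.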
Related papers
- FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL [19.236153474365747]
Existing MARL approaches often rely on the restrictive assumption that the number of entities remains constant between training and inference.
In this paper, we tackle the challenge of intra-trajectory dynamic entity composition under zero-shot out-of-domain (OOD) generalization.
We propose FlickerFusion, a novel OOD generalization method that acts as a universally applicable augmentation technique for MARL backbone methods.
arXiv Detail & Related papers (2024-10-21T10:57:45Z)
- Action abstractions for amortized sampling [49.384037138511246]
We propose an approach to incorporate the discovery of action abstractions, or high-level actions, into the policy optimization process.
Our approach involves iteratively extracting action subsequences commonly used across many high-reward trajectories and 'chunking' them into a single action that is added to the action space (a generic sketch of this idea follows the related-papers list).
arXiv Detail & Related papers (2024-10-19T19:22:50Z)
- SVDE: Scalable Value-Decomposition Exploration for Cooperative Multi-Agent Reinforcement Learning [22.389803019100423]
We propose a scalable value-decomposition exploration (SVDE) method, which includes a scalable training mechanism, intrinsic reward design, and explorative experience replay.
Our method achieves the best performance on almost all maps compared to other popular algorithms in a set of StarCraft II micromanagement games.
arXiv Detail & Related papers (2023-03-16T03:17:20Z)
- Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique (a generic sketch of this selection step follows the related-papers list).
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents (a generic copula illustration follows the related-papers list).
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human-provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- REMAX: Relational Representation for Multi-Agent Exploration [13.363887960136102]
We propose a learning-based exploration strategy to generate the initial states of a game.
We demonstrate that our method improves the training and performance of the MARL model more than the existing exploration methods.
arXiv Detail & Related papers (2020-08-12T10:23:35Z)
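The action-abstraction entry above describes extracting action subsequences that recur in high-reward trajectories and adding them to the action space as single 'chunked' actions. Below is a minimal, generic sketch of that frequency-based extraction step; the threshold, chunk length, and data format are illustrative assumptions rather than the paper's actual procedure.

```python
# Generic sketch of frequency-based action chunking (not the paper's exact
# algorithm): find the action subsequence that appears most often across
# high-reward trajectories and add it to the action space as one macro-action.
from collections import Counter

def extract_chunk(trajectories, returns, reward_threshold, chunk_len=3):
    """trajectories: list of action-index lists; returns: one return per trajectory."""
    counts = Counter()
    for actions, ret in zip(trajectories, returns):
        if ret < reward_threshold:          # keep only high-reward trajectories
            continue
        for i in range(len(actions) - chunk_len + 1):
            counts[tuple(actions[i:i + chunk_len])] += 1
    return max(counts, key=counts.get) if counts else None

# Usage: the returned tuple becomes a new macro-action id in the policy's action space.
trajs = [[0, 1, 2, 2, 3], [4, 1, 2, 2, 0], [1, 2, 2, 3, 4]]
rets = [10.0, 9.0, 2.0]
print(extract_chunk(trajs, rets, reward_threshold=5.0))   # -> (1, 2, 2)
```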
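The CMAE entry mentions selecting exploration goals from multiple projected (restricted) state spaces with a normalized-entropy criterion. The sketch below shows one generic way such a criterion could work, counting visits in each projected space and choosing a rarely visited state from the lowest-entropy projection; the counting and selection details are assumptions, not CMAE's exact rule.

```python
# Generic sketch of normalized-entropy goal selection over projected state
# spaces (illustrative; not CMAE's exact procedure).
import numpy as np
from collections import Counter

def select_goal(visited_states, projections):
    """visited_states: list of state tuples; projections: list of index tuples,
    each defining a restricted (projected) state space."""
    best = None
    for proj in projections:
        counts = Counter(tuple(s[i] for i in proj) for s in visited_states)
        p = np.array(list(counts.values()), dtype=float)
        p /= p.sum()
        # Normalized entropy in [0, 1]; low entropy means visitation is
        # concentrated, so this projection likely hides under-explored states.
        ent = -(p * np.log(p)).sum() / np.log(max(len(p), 2))
        if best is None or ent < best[0]:
            goal = min(counts, key=counts.get)   # least-visited projected state
            best = (ent, proj, goal)
    return best  # (entropy, chosen projection, goal in that projection)
```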
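The copula entry above separates per-agent marginal behavior from a copula that captures cross-agent dependence. As a generic illustration of that separation (using a simple Gaussian copula rather than the paper's learned model), the sketch below fits marginals empirically and samples correlated joint actions:

```python
# Generic Gaussian-copula illustration of "marginals + dependence" modeling
# (not the paper's learned copula): fit per-agent marginals empirically and
# capture cross-agent dependence with a correlation matrix in Gaussian space.
import numpy as np
from scipy import stats

def fit_gaussian_copula(actions):
    """actions: (n_samples, n_agents) array of continuous per-agent actions."""
    n, _ = actions.shape
    ranks = stats.rankdata(actions, axis=0) / (n + 1)    # empirical marginal CDFs
    z = stats.norm.ppf(ranks)                            # map to Gaussian space
    return np.corrcoef(z, rowvar=False)                  # dependence structure

def sample_joint_actions(actions, corr, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    d = corr.shape[0]
    z = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u = stats.norm.cdf(z)                                # correlated uniforms
    # Invert each empirical marginal: agent i keeps its own behavior pattern,
    # while the copula supplies the coordination between agents.
    return np.column_stack(
        [np.quantile(actions[:, i], u[:, i]) for i in range(d)]
    )
```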
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.