Episodic Multi-agent Reinforcement Learning with Curiosity-Driven
Exploration
- URL: http://arxiv.org/abs/2111.11032v1
- Date: Mon, 22 Nov 2021 07:34:47 GMT
- Title: Episodic Multi-agent Reinforcement Learning with Curiosity-Driven
Exploration
- Authors: Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng
Chen, Changjie Fan, Yang Gao, Chongjie Zhang
- Abstract summary: We introduce EMC, a novel Episodic Multi-agent reinforcement learning method with Curiosity-driven exploration.
We use prediction errors of individual Q-values as intrinsic rewards for coordinated exploration and utilize episodic memory to exploit explored informative experience to boost policy training.
- Score: 40.87053312548429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient exploration in deep cooperative multi-agent reinforcement learning
(MARL) remains challenging in complex coordination problems. In this paper, we introduce
EMC, a novel Episodic Multi-agent reinforcement learning method with Curiosity-driven
exploration. We leverage an insight from popular factorized MARL algorithms: the "induced"
individual Q-values, i.e., the individual utility functions used for local execution, are
embeddings of local action-observation histories and, owing to reward backpropagation
during centralized training, capture the interaction between agents. Therefore, we use
prediction errors of individual Q-values as intrinsic rewards for coordinated exploration
and utilize episodic memory to exploit explored informative experience to boost policy
training. Because the dynamics of an agent's individual Q-value function capture both the
novelty of states and the influence of other agents, our intrinsic reward can induce
coordinated exploration toward new or promising states. We illustrate the advantages of our
method with didactic examples and demonstrate that it significantly outperforms
state-of-the-art MARL baselines on challenging tasks in the StarCraft II micromanagement
benchmark.
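The two ingredients above, a curiosity bonus computed from the prediction error of individual Q-values and an episodic memory of the best returns seen so far, can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the predictor architecture, the state hashing, and how the memorized return enters the TD target are all assumptions.

```python
import torch
import torch.nn as nn

class QValuePredictor(nn.Module):
    """Illustrative predictor of each agent's individual Q-values from its local
    observation; the prediction error acts as the curiosity-driven intrinsic reward."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def intrinsic_reward(self, obs, individual_q):
        # obs: [n_agents, obs_dim], individual_q: [n_agents, n_actions]
        pred = self.net(obs)
        return ((pred - individual_q.detach()) ** 2).mean()

class EpisodicMemory:
    """Illustrative memory: keeps the highest return observed from each hashed
    global state and uses it to bias the one-step TD target."""
    def __init__(self):
        self.best_return = {}

    def update(self, state_key, episodic_return):
        prev = self.best_return.get(state_key, float("-inf"))
        self.best_return[state_key] = max(prev, episodic_return)

    def target(self, state_key, td_target):
        # prefer the memorized return when it exceeds the bootstrapped target
        return max(td_target, self.best_return.get(state_key, float("-inf")))
```

In training, the intrinsic reward would be scaled by a coefficient and added to the environment reward, while the memorized returns pull the TD target toward previously successful trajectories.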
Related papers
- Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning [27.81925751697255]
We propose a novel method for efficient multi-agent exploration in complex scenarios.
We formulate imagination as a sequence modeling problem, where states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at critical states, IIE significantly increases the likelihood of discovering potentially important underexplored regions.
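One way to read "imagination as a sequence modeling problem" is an autoregressive model over interleaved trajectory elements (states, observations, prompts, actions, rewards) that is rolled out to propose critical initialization states. The sketch below is a simplified, hypothetical rendering with a GRU backbone and concatenated per-step features; the paper's actual architecture and prompt handling are not reproduced here.

```python
import torch
import torch.nn as nn

class ImaginationModel(nn.Module):
    """Illustrative autoregressive model: predicts the next trajectory element
    (state/observation/action/reward features concatenated per step)."""
    def __init__(self, elem_dim, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(elem_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, elem_dim)

    def forward(self, seq):
        # seq: [batch, time, elem_dim]; element t+1 is predicted from elements <= t
        h, _ = self.rnn(seq)
        return self.head(h)

model = ImaginationModel(elem_dim=32)
seq = torch.randn(4, 10, 32)                      # a batch of trajectories
loss = nn.functional.mse_loss(model(seq[:, :-1]), seq[:, 1:])
loss.backward()
```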
arXiv Detail & Related papers (2024-02-28T01:45:01Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
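A reward machine is a finite-state machine whose transitions fire on high-level events and emit rewards, which turns a non-Markovian reward into a Markovian one once the machine state is tracked alongside the observation. A generic hand-written example follows (the paper learns such machines and decomposes them across agents, which is not shown here):

```python
class RewardMachine:
    """Illustrative reward machine: transitions fire on high-level events
    (labels) and emit rewards, exposing sub-task structure to the learner."""
    def __init__(self, transitions, initial_state):
        # transitions: {(machine_state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, event):
        self.state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        return reward

# Reward only for pressing a button after picking up a key: the reward depends
# on history, so it is non-Markovian w.r.t. raw observations alone.
rm = RewardMachine(
    transitions={
        ("u0", "got_key"): ("u1", 0.0),
        ("u1", "pressed_button"): ("u2", 1.0),
    },
    initial_state="u0",
)
print(rm.step("pressed_button"))  # 0.0: button before key gives nothing
print(rm.step("got_key"))         # 0.0: first sub-task completed
print(rm.step("pressed_button"))  # 1.0: second sub-task completed in order
```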
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Strangeness-driven Exploration in Multi-Agent Reinforcement Learning [0.0]
We introduce a new exploration method based on strangeness that can be easily incorporated into any centralized training and decentralized execution (CTDE)-based MARL algorithm.
The exploration bonus is obtained from the strangeness, and the proposed exploration method is not much affected by the transitions commonly observed in MARL tasks.
arXiv Detail & Related papers (2022-12-27T11:08:49Z)
- Credit-cognisant reinforcement learning for multi-agent cooperation [0.0]
We introduce the concept of credit-cognisant rewards, which allows an agent to perceive the effect its actions had on the environment as well as on its co-agents.
We show that by manipulating these experiences so that the reward contained within them includes the rewards received by all agents within the same action sequence, we can significantly improve the performance of independent deep Q-learning.
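One simple reading of "constructing the reward ... to include the rewards received by all agents within the same action sequence" is to relabel each agent's stored reward with a discounted sum of all agents' rewards over the remainder of that sequence. The sketch below is an illustrative guess at such a relabeling, not the paper's exact construction; the discount factor and the suffix-sum form are assumptions.

```python
import numpy as np

def credit_cognisant_rewards(per_agent_rewards, gamma=0.95):
    """per_agent_rewards: array [T, n_agents] for one action sequence.
    Returns a relabeled array where each agent's reward at step t includes the
    discounted rewards all agents collect from step t onward (assumed form)."""
    T, _ = per_agent_rewards.shape
    team = per_agent_rewards.sum(axis=1)      # joint reward per step
    relabeled = np.zeros_like(per_agent_rewards)
    running = 0.0
    for t in reversed(range(T)):              # discounted suffix sum
        running = team[t] + gamma * running
        relabeled[t, :] = running
    return relabeled

rewards = np.array([[0.0, 1.0], [2.0, 0.0]])
print(credit_cognisant_rewards(rewards))      # [[2.9, 2.9], [2.0, 2.0]]
```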
arXiv Detail & Related papers (2022-11-18T09:00:25Z)
- Curiosity-Driven Multi-Agent Exploration with Mixed Objectives [7.247148291603988]
Intrinsic rewards have been increasingly used to mitigate the sparse reward problem in single-agent reinforcement learning.
Curiosity-driven exploration is a simple yet efficient approach that quantifies the novelty of a state as the prediction error of the agent's curiosity module.
We show here, however, that naively using this curiosity-driven approach to guide exploration in sparse reward cooperative multi-agent environments does not consistently lead to improved results.
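The curiosity module referenced here is typically a learned forward model whose error in predicting the next observation (or its embedding) defines the intrinsic reward. A minimal sketch of that signal, with the embedding and network sizes assumed:

```python
import torch
import torch.nn as nn

class CuriosityModule(nn.Module):
    """Illustrative forward model: predicts the next observation embedding from
    the current embedding and action; the prediction error is the intrinsic reward."""
    def __init__(self, obs_dim, act_dim, emb=32):
        super().__init__()
        self.embed = nn.Linear(obs_dim, emb)
        self.forward_model = nn.Sequential(
            nn.Linear(emb + act_dim, 64), nn.ReLU(), nn.Linear(64, emb)
        )

    def intrinsic_reward(self, obs, action, next_obs):
        phi, phi_next = self.embed(obs), self.embed(next_obs)
        pred = self.forward_model(torch.cat([phi, action], dim=-1))
        return ((pred - phi_next.detach()) ** 2).mean(dim=-1)  # per-sample bonus
```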
arXiv Detail & Related papers (2022-10-29T02:45:38Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
A shared exploration goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
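A projected state space keeps only a subset of the state dimensions; the "normalized entropy-based technique" can be illustrated by choosing the projection whose visit counts have the lowest normalized entropy (i.e., the most unevenly explored one) and proposing its least-visited value as the goal. The projections, counts, and tie-breaking below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_goal(visit_counts):
    """visit_counts: dict mapping a projection name to visit counts over that
    projected space's discrete values. Returns (projection, goal index)."""
    best_proj, best_score = None, np.inf
    for proj, counts in visit_counts.items():
        p = counts / counts.sum()
        entropy = -(p[p > 0] * np.log(p[p > 0])).sum()
        normalized = entropy / np.log(len(counts))   # in [0, 1]
        if normalized < best_score:                  # low entropy: unevenly explored
            best_proj, best_score = proj, normalized
    goal = int(np.argmin(visit_counts[best_proj]))   # least-visited value as goal
    return best_proj, goal

counts = {"agent_positions": np.array([50.0, 3.0, 1.0]),
          "box_position": np.array([20.0, 18.0, 16.0])}
print(select_goal(counts))   # ('agent_positions', 2)
```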
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
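The "relabeling experience" in the title refers to rewriting the rewards stored in the off-policy replay buffer whenever the reward model learned from human feedback is updated, so off-policy learning always reflects the latest preference estimate. A schematic of that single step, with the buffer layout and reward-model interface assumed:

```python
import numpy as np

def relabel_replay_buffer(replay_buffer, reward_model):
    """replay_buffer: list of dicts with 'obs', 'action', 'reward' (assumed layout).
    reward_model: callable (obs, action) -> scalar estimate (assumed interface)."""
    for transition in replay_buffer:
        transition["reward"] = float(reward_model(transition["obs"],
                                                  transition["action"]))
    return replay_buffer

# toy usage with a stand-in reward model
buffer = [{"obs": np.ones(3), "action": 0, "reward": 0.0}]
relabel_replay_buffer(buffer, lambda obs, act: obs.sum() - act)
print(buffer[0]["reward"])  # 3.0
```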
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
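Universal successor features represent action values as a linear read-out of task weights, Q(s, a; w) = psi(s, a) . w, so one shared feature map can evaluate a set of related tasks at once. A small numeric illustration with made-up features, task weights, and a GPI-style action choice (all values and the selection rule are assumptions for illustration):

```python
import numpy as np

# successor features psi(s, a) for one state, 3 actions, feature dimension 4
psi = np.array([[1.0, 0.0, 0.5, 0.0],
                [0.0, 1.0, 0.0, 0.5],
                [0.5, 0.5, 0.5, 0.5]])

# weight vectors for the target task and two related tasks
tasks = {"target":    np.array([1.0, 0.0, 0.0, 1.0]),
         "related_1": np.array([0.0, 1.0, 1.0, 0.0]),
         "related_2": np.array([0.5, 0.5, 0.5, 0.5])}

# Q(s, a; w) = psi(s, a) . w for every task, using the same shared features
q_values = {name: psi @ w for name, w in tasks.items()}

# act greedily w.r.t. the best value across tasks (generalized policy improvement)
best_action = int(np.argmax(np.stack(list(q_values.values())).max(axis=0)))
print(q_values, best_action)
```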
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Planning to Explore via Self-Supervised World Models [120.31359262226758]
We present Plan2Explore, a self-supervised reinforcement learning agent built on a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
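Plan2Explore's exploration signal is usually described as the disagreement of an ensemble of one-step latent predictors: the agent plans, inside its learned world model, toward states where the ensemble members disagree most. A reduced sketch of that disagreement bonus only (the world model and planner are omitted, and all sizes are assumptions):

```python
import torch
import torch.nn as nn

class DisagreementBonus(nn.Module):
    """Illustrative ensemble of one-step latent predictors; the variance of
    their predictions is the intrinsic reward the agent plans to maximize."""
    def __init__(self, latent_dim, act_dim, n_models=5, hidden=64):
        super().__init__()
        self.models = nn.ModuleList([
            nn.Sequential(nn.Linear(latent_dim + act_dim, hidden), nn.ReLU(),
                          nn.Linear(hidden, latent_dim))
            for _ in range(n_models)
        ])

    def forward(self, latent, action):
        x = torch.cat([latent, action], dim=-1)
        preds = torch.stack([m(x) for m in self.models])  # [n_models, batch, latent]
        return preds.var(dim=0).mean(dim=-1)              # disagreement per sample

bonus = DisagreementBonus(latent_dim=8, act_dim=2)
print(bonus(torch.randn(4, 8), torch.randn(4, 2)).shape)  # torch.Size([4])
```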
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.