Strangeness-driven Exploration in Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2212.13448v1
- Date: Tue, 27 Dec 2022 11:08:49 GMT
- Title: Strangeness-driven Exploration in Multi-Agent Reinforcement Learning
- Authors: Ju-Bong Kim, Ho-Bin Choi, Youn-Hee Han
- Abstract summary: We introduce a new exploration method based on strangeness that can be easily incorporated into any centralized training and decentralized execution (CTDE)-based MARL algorithm.
The exploration bonus is obtained from the strangeness, and the proposed exploration method is not much affected by stochastic transitions commonly observed in MARL tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An efficient exploration strategy is one of the essential issues in
cooperative multi-agent reinforcement learning (MARL) algorithms requiring
complex coordination. In this study, we introduce a new exploration method
based on strangeness that can be easily incorporated into any centralized
training and decentralized execution (CTDE)-based MARL algorithm. The
strangeness refers to the degree of unfamiliarity of the observations that an
agent visits. To give the observation strangeness a global perspective, it is
also augmented with the degree of unfamiliarity of the visited entire state.
The exploration bonus is obtained from the strangeness, and the proposed
exploration method is not much affected by the stochastic transitions commonly
observed in MARL tasks. To prevent a high exploration bonus from making the
MARL training insensitive to extrinsic rewards, we also propose a separate
action-value function trained on both the extrinsic reward and the exploration
bonus, on which the behavioral policy used to generate transitions is based.
This makes CTDE-based MARL algorithms more stable when they are combined with
an exploration method. Through a comparative evaluation on didactic examples
and the StarCraft Multi-Agent Challenge, we show that the proposed exploration
method achieves significant performance improvements in CTDE-based MARL
algorithms.
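For a concrete picture of the two ideas above, the following is a minimal
PyTorch-style sketch. It assumes, as an illustration only and not the authors'
exact formulation, that strangeness is measured as the reconstruction error of
learned autoencoders over per-agent observations and the global state, and that
the behavioral action-value target simply adds a weighted bonus to the
extrinsic reward; all names and coefficients (beta, lam) are hypothetical.

```python
import torch
import torch.nn as nn


class StrangenessBonus(nn.Module):
    """Illustrative strangeness measure (assumption): reconstruction error of an
    observation autoencoder, augmented with the reconstruction error of a state
    autoencoder to give the bonus a global perspective."""

    def __init__(self, obs_dim: int, state_dim: int, hidden: int = 64, beta: float = 0.5):
        super().__init__()
        self.obs_ae = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, obs_dim)
        )
        self.state_ae = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, state_dim)
        )
        self.beta = beta  # weight of the global (state) strangeness term

    def forward(self, obs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_agents, obs_dim); state: (batch, state_dim)
        obs_err = ((self.obs_ae(obs) - obs) ** 2).mean(dim=(-1, -2))    # per-agent unfamiliarity, averaged
        state_err = ((self.state_ae(state) - state) ** 2).mean(dim=-1)  # global unfamiliarity
        return obs_err + self.beta * state_err                          # exploration bonus


def td_targets(r_ext, bonus, q_eval_next, q_behave_next, gamma=0.99, lam=0.1):
    """Sketch of the separate action-value functions described in the abstract:
    the evaluation head is trained on the extrinsic reward only, while the
    behavioral head, which drives data collection, also receives the bonus."""
    y_eval = r_ext + gamma * q_eval_next                      # extrinsic-only target
    y_behave = (r_ext + lam * bonus) + gamma * q_behave_next  # bonus-augmented target
    return y_eval, y_behave
```

Under these assumptions, training the autoencoders on replayed transitions
makes familiar observations yield small errors, so the bonus decays in
well-visited regions, while keeping a separate extrinsic-only head prevents the
bonus from dominating the learned policy, which is the stability property the
abstract emphasizes.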
Related papers
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE)
RLE combines the strengths of bonus-based and noise-based (two popular approaches for effective exploration in deep RL) exploration strategies.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE exhibits higher overall scores across all the tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z) - FoX: Formation-aware exploration in multi-agent reinforcement learning [10.554220876480297]
We propose a formation-based equivalence relation on the exploration space and aim to reduce the search space by exploring only meaningful states in different formations.
Numerical results show that the proposed FoX framework significantly outperforms the state-of-the-art MARL algorithms on Google Research Football (GRF) and sparse StarCraft II Multi-Agent Challenge (SMAC) tasks.
arXiv Detail & Related papers (2023-08-22T08:39:44Z) - On the Importance of Exploration for Generalization in Reinforcement
Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z) - Rewarding Episodic Visitation Discrepancy for Exploration in
Reinforcement Learning [64.8463574294237]
We propose Rewarding Episodic Visitation Discrepancy (REVD) as an efficient and quantified exploration method.
REVD provides intrinsic rewards by evaluating the Rényi divergence-based visitation discrepancy between episodes.
It is tested on PyBullet Robotics Environments and Atari games.
arXiv Detail & Related papers (2022-09-19T08:42:46Z) - Episodic Multi-agent Reinforcement Learning with Curiosity-Driven
Exploration [40.87053312548429]
We introduce a novel Episodic Multi-agent reinforcement learning with Curiosity-driven exploration, called EMC.
We use prediction errors of individual Q-values as intrinsic rewards for coordinated exploration and utilize episodic memory to exploit explored informative experience to boost policy training.
arXiv Detail & Related papers (2021-11-22T07:34:47Z) - Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z) - MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven
Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z) - UneVEn: Universal Value Exploration for Multi-Agent Reinforcement
Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn)
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z) - REMAX: Relational Representation for Multi-Agent Exploration [13.363887960136102]
We propose a learning-based exploration strategy to generate the initial states of a game.
We demonstrate that our method improves the training and performance of the MARL model more than the existing exploration methods.
arXiv Detail & Related papers (2020-08-12T10:23:35Z) - Intrinsic Exploration as Multi-Objective RL [29.124322674133]
Intrinsic motivation enables reinforcement learning (RL) agents to explore when rewards are very sparse.
We propose a framework based on multi-objective RL where both exploration and exploitation are being optimized as separate objectives.
This formulation sets the balance between exploration and exploitation at the policy level, resulting in advantages over traditional methods.
arXiv Detail & Related papers (2020-04-06T02:37:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.