The StarCraft Multi-Agent Challenges+ : Learning of Multi-Stage Tasks
and Environmental Factors without Precise Reward Functions
- URL: http://arxiv.org/abs/2207.02007v2
- Date: Thu, 7 Jul 2022 08:30:16 GMT
- Title: The StarCraft Multi-Agent Challenges+ : Learning of Multi-Stage Tasks
and Environmental Factors without Precise Reward Functions
- Authors: Mingyu Kim, Jihwan Oh, Yongsik Lee, Joonkee Kim, Seonghwan Kim, Song
Chong and Se-Young Yun
- Abstract summary: We propose a novel benchmark called the StarCraft Multi-Agent Challenges+.
This challenge focuses on the exploration capability of MARL algorithms to efficiently learn implicit multi-stage tasks and environmental factors as well as micro-control.
We investigate MARL algorithms under SMAC+ and observe that recent approaches work well in settings similar to the previous challenges but misbehave in offensive scenarios.
- Score: 14.399479538886064
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we propose a novel benchmark called the StarCraft Multi-Agent
Challenges+, where agents learn to perform multi-stage tasks and to use
environmental factors without precise reward functions. The previous challenge
(SMAC), recognized as a standard benchmark for Multi-Agent Reinforcement
Learning, is mainly concerned with ensuring that all agents cooperatively
eliminate approaching adversaries solely through fine-grained control under
explicit reward functions. This challenge, in contrast, focuses on the
exploration capability of MARL algorithms: efficiently learning implicit
multi-stage tasks and environmental factors in addition to micro-control. This
study covers both
offensive and defensive scenarios. In the offensive scenarios, agents must
learn to first find opponents and then eliminate them. The defensive scenarios
require agents to use topographic features. For example, agents need to
position themselves behind protective structures to make it harder for enemies
to attack. We investigate MARL algorithms under SMAC+ and observe that recent
approaches work well in settings similar to the previous challenges, but
misbehave in offensive scenarios. Additionally, we observe that an enhanced
exploration approach has a positive effect on performance but is not able to
completely solve all scenarios. This study proposes new directions for future
research.
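As a point of reference for how such scenarios are typically driven, the sketch below runs a single random-policy episode. It assumes SMAC+ exposes the same `StarCraft2Env` interface as the original `smac` package, and the map name "Off_Near" is only illustrative; the released scenario names may differ.

```python
# Minimal random-policy rollout sketch, assuming the standard SMAC interface.
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="Off_Near")   # hypothetical offensive SMAC+ map
info = env.get_env_info()
n_agents = info["n_agents"]

env.reset()
terminated, episode_return = False, 0.0
while not terminated:
    actions = []
    for agent_id in range(n_agents):
        # Only actions flagged as available (move, attack, no-op, ...) are legal.
        avail = np.nonzero(env.get_avail_agent_actions(agent_id))[0]
        actions.append(int(np.random.choice(avail)))
    reward, terminated, step_info = env.step(actions)
    episode_return += reward

print("episode return:", episode_return, "battle won:", step_info.get("battle_won"))
env.close()
```

A learning algorithm would replace the random action choice with its per-agent policies while respecting the same available-action masks.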
Related papers
- MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure [37.56309011441144]
This paper introduces MESA, a novel meta-exploration method for cooperative multi-agent learning.
It learns to explore by first identifying the agents' high-rewarding joint state-action subspace from training tasks and then learning a set of diverse exploration policies to "cover" the subspace.
Experiments show that with learned exploration policies, MESA achieves significantly better performance in sparse-reward tasks in several multi-agent particle environments and multi-agent MuJoCo environments.
arXiv Detail & Related papers (2024-05-01T23:19:48Z)
- Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning [27.81925751697255]
We propose a novel method for efficient multi-agent exploration in complex scenarios.
We formulate the imagination as a sequence modeling problem, where the states, observations, prompts, actions, and rewards are predicted autoregressively.
By initializing agents at the critical states, IIE significantly increases the likelihood of discovering potentially important underexplored regions.
arXiv Detail & Related papers (2024-02-28T01:45:01Z)
- Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z)
- Off-Beat Multi-Agent Reinforcement Learning [62.833358249873704]
We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent.
We propose a novel episodic memory, LeGEM, for model-free MARL algorithms.
We evaluate LeGEM on various multi-agent scenarios with off-beat actions, including Stag-Hunter Game, Quarry Game, Afforestation Game, and StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2022-05-27T02:21:04Z)
- AutoDIME: Automatic Design of Interesting Multi-Agent Environments [3.1546318469750205]
We examine a set of intrinsic teacher rewards derived from prediction problems that can be applied in multi-agent settings.
Of the intrinsic rewards considered we found value disagreement to be most consistent across tasks.
Our results suggest that intrinsic teacher rewards, and in particular value disagreement, are a promising approach for automating both single and multi-agent environment design.
arXiv Detail & Related papers (2022-03-04T18:25:33Z)
- Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments [62.997667081978825]
We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
- Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
Agents share a common exploration goal, which is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
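A minimal sketch of how a normalized-entropy selection step of this kind could look is given below; it is an illustration under assumptions, not CMAE's actual implementation, and the function names, count-based statistics, and low-visit goal sampling are all hypothetical.

```python
import numpy as np

def normalized_entropy(visit_counts, space_size):
    """Entropy of the empirical visitation distribution in one projected
    state space, scaled by the maximum possible entropy log(space_size)."""
    p = np.array(list(visit_counts.values()), dtype=float)
    p /= p.sum()
    h = -(p * np.log(p + 1e-12)).sum()
    return h / np.log(space_size)  # assumes space_size >= 2

def select_shared_goal(projected_counts, space_sizes, rng):
    """Pick the least-explored projected space (lowest normalized entropy),
    then sample a rarely visited state from it as the shared goal."""
    scores = [normalized_entropy(c, n) for c, n in zip(projected_counts, space_sizes)]
    k = int(np.argmin(scores))                     # most under-explored projection
    states, freqs = zip(*projected_counts[k].items())
    weights = 1.0 / np.array(freqs, dtype=float)   # favour rarely visited states
    goal = states[rng.choice(len(states), p=weights / weights.sum())]
    return k, goal

# Example: two projected spaces with toy visitation counts.
rng = np.random.default_rng(0)
counts = [{(0, 0): 50, (0, 1): 49, (1, 1): 1}, {(0,): 90, (1,): 10}]
print(select_shared_goal(counts, space_sizes=[4, 2], rng=rng))
```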
arXiv Detail & Related papers (2021-07-23T20:06:32Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
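One way to instantiate "competing over the amount of surprise" is a zero-sum reward derived from the negative log-likelihood of observations under a density model. The toy count-based sketch below illustrates that reading; the class and function names are hypothetical and this is not the paper's algorithm.

```python
import math

class CountDensityModel:
    """Toy count-based density model over discretized observations."""
    def __init__(self):
        self.counts, self.total = {}, 0

    def update(self, obs_key):
        self.counts[obs_key] = self.counts.get(obs_key, 0) + 1
        self.total += 1

    def surprise(self, obs_key):
        # Laplace-smoothed negative log-probability of the observation.
        p = (self.counts.get(obs_key, 0) + 1) / (self.total + len(self.counts) + 1)
        return -math.log(p)

def adversarial_surprise_rewards(model, obs_key):
    """Zero-sum split: the explorer policy is rewarded for surprising
    observations, the controller policy for keeping them predictable."""
    s = model.surprise(obs_key)
    model.update(obs_key)
    return {"explorer": +s, "controller": -s}

model = CountDensityModel()
for obs in ["room_a", "room_a", "room_b"]:
    print(obs, adversarial_surprise_rewards(model, obs))
```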
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Exploration and Incentives in Reinforcement Learning [107.42240386544633]
We consider complex exploration problems, where each agent faces the same (but unknown) MDP.
Agents control the choice of policies, whereas an algorithm can only issue recommendations.
We design an algorithm which explores all reachable states in the MDP.
arXiv Detail & Related papers (2021-02-28T00:15:53Z)
- Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)