A further exploration of deep Multi-Agent Reinforcement Learning with
Hybrid Action Space
- URL: http://arxiv.org/abs/2208.14447v1
- Date: Tue, 30 Aug 2022 07:40:15 GMT
- Title: A further exploration of deep Multi-Agent Reinforcement Learning with
Hybrid Action Space
- Authors: Hongzhi Hua, Guixuan Wen, Kaigui Wu
- Abstract summary: We propose two algorithms: deep multi-agent hybrid soft actor-critic (MAHSAC) and multi-agent hybrid deep deterministic policy gradients (MAHDDPG)
Our experiences are running on multi-agent particle environment which is an easy multi-agent particle world, along with some basic simulated physics.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The research of extending deep reinforcement learning (drl) to multi-agent
field has solved many complicated problems and made great achievements.
However, almost all these studies only focus on discrete or continuous action
space and there are few works having ever used multi-agent deep reinforcement
learning to real-world environment problems which mostly have a hybrid action
space. Therefore, in this paper, we propose two algorithms: deep multi-agent
hybrid soft actor-critic (MAHSAC) and multi-agent hybrid deep deterministic
policy gradients (MAHDDPG) to fill this gap. This two algorithms follow the
centralized training and decentralized execution (CTDE) paradigm and could
handle hybrid action space problems. Our experiences are running on multi-agent
particle environment which is an easy multi-agent particle world, along with
some basic simulated physics. The experimental results show that these
algorithms have good performances.
Related papers
- MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure [37.56309011441144]
This paper introduces MESA, a novel meta-exploration method for cooperative multi-agent learning.
It learns to explore by first identifying the agents' high-rewarding joint state-action subspace from training tasks and then learning a set of diverse exploration policies to "cover" the subspace.
Experiments show that with learned exploration policies, MESA achieves significantly better performance in sparse-reward tasks in several multi-agent particle environments and multi-agent MuJoCo environments.
arXiv Detail & Related papers (2024-05-01T23:19:48Z) - Accelerating Search-Based Planning for Multi-Robot Manipulation by Leveraging Online-Generated Experiences [20.879194337982803]
Multi-Agent Path-Finding (MAPF) algorithms have shown promise in discrete 2D domains, providing rigorous guarantees.
We propose an approach for accelerating conflict-based search algorithms by leveraging their repetitive and incremental nature.
arXiv Detail & Related papers (2024-03-29T20:31:07Z) - Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
arXiv Detail & Related papers (2023-05-31T17:40:43Z) - Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z) - Deep Multi-Agent Reinforcement Learning with Hybrid Action Spaces based
on Maximum Entropy [0.0]
We propose Deep Multi-Agent Hybrid Soft Actor-Critic (MAHSAC) to handle multi-agent problems with hybrid action spaces.
This algorithm follows the centralized training but decentralized execution (CTDE) paradigm, and extend the Soft Actor-Critic algorithm (SAC) to handle hybrid action space problems.
Our experiences are running on an easy multi-agent particle world with a continuous observation and discrete action space, along with some basic simulated physics.
arXiv Detail & Related papers (2022-06-10T13:52:59Z) - Cooperative Exploration for Multi-Agent Deep Reinforcement Learning [127.4746863307944]
We propose cooperative multi-agent exploration (CMAE) for deep reinforcement learning.
The goal is selected from multiple projected state spaces via a normalized entropy-based technique.
We demonstrate that CMAE consistently outperforms baselines on various tasks.
arXiv Detail & Related papers (2021-07-23T20:06:32Z) - The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games [67.47961797770249]
Multi-Agent PPO (MAPPO) is a multi-agent PPO variant which adopts a centralized value function.
We show that MAPPO achieves performance comparable to the state-of-the-art in three popular multi-agent testbeds.
arXiv Detail & Related papers (2021-03-02T18:59:56Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - FACMAC: Factored Multi-Agent Centralised Policy Gradients [103.30380537282517]
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC)
It is a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
We evaluate FACMAC on variants of the multi-agent particle environments, a novel multi-agent MuJoCo benchmark, and a challenging set of StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2020-03-14T21:29:09Z) - Optimizing Cooperative path-finding: A Scalable Multi-Agent RRT* with Dynamic Potential Fields [11.872579571976903]
This study introduces the multi-agent RRT* potential field (MA-RRT*PF), an innovative algorithm that addresses computational efficiency and path-finding efficacy in dense scenarios.
The empirical evaluations highlight MA-RRT*PF's significant superiority over conventional multi-agent RRT* (MA-RRT*) in dense environments.
arXiv Detail & Related papers (2019-11-16T13:02:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.