Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
- URL: http://arxiv.org/abs/2008.04109v1
- Date: Thu, 6 Aug 2020 15:16:05 GMT
- Title: Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
- Authors: Abdul Mueed Hafiz and Ghulam Mohiuddin Bhat
- Abstract summary: Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes wherein the agents have to learn and communicate.
We propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions.
The benefits of the approach are overall simplicity, faster convergence and better performance compared to conventional DQN based approaches.
- Score: 1.8782750537161614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes wherein the agents have to learn and communicate. The learning is, however, specific to each agent, and the communication must be satisfactorily designed for the agents. As more complex Deep Q-Networks come to the fore, the overall complexity of the multi-agent system increases, leading to issues like difficulty in training, need for higher resources and more training time, difficulty in fine-tuning, etc. To address these issues we propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions, for updating the experience replay pool of the DQNs, where each agent is a DQN. The benefits of the approach are overall simplicity, faster convergence and better performance compared to conventional DQN based approaches. Notably, the method can be extended to any DQN variant. We use the simple DQN and the DDQN (Double Q-learning) respectively on three separate tasks, i.e. CartPole-v1 (OpenAI Gym environment), LunarLander-v2 (OpenAI Gym environment) and Maze Traversal (a customized environment). The proposed approach outperforms the baselines on these tasks by decent margins.
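As an illustration of the scheme described in the abstract, below is a minimal sketch (not the authors' code) of two binary-action DQN agents that observe a shared state, receive a shared reward, and push agent-specific actions into their own experience replay pools. The ToyEnv class, network sizes, and hyperparameters are placeholder assumptions; in the paper's setting the joint binary actions would index the discrete actions of a task such as LunarLander-v2.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

class ToyEnv:
    """Placeholder for a Gym-style task whose 4 discrete actions
    factor into two binary sub-actions (one per agent)."""
    def reset(self):
        return np.random.randn(4).astype(np.float32)

    def step(self, joint_action):
        next_state = np.random.randn(4).astype(np.float32)
        return next_state, float(np.random.rand()), bool(np.random.rand() < 0.05)

class BinaryAgent:
    """One DQN per agent; each agent's action space is just {0, 1}."""
    def __init__(self, obs_dim):
        self.q = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, 2))
        self.opt = torch.optim.Adam(self.q.parameters(), lr=1e-3)
        self.replay = deque(maxlen=10_000)   # this agent's experience replay pool

    def act(self, state, eps=0.1):
        if random.random() < eps:
            return random.randrange(2)
        with torch.no_grad():
            return int(self.q(torch.from_numpy(state)).argmax())

    def learn(self, gamma=0.99, batch=32):
        if len(self.replay) < batch:
            return
        s, a, r, s2, d = map(np.array, zip(*random.sample(self.replay, batch)))
        q = self.q(torch.from_numpy(s)).gather(1, torch.from_numpy(a).view(-1, 1)).squeeze(1)
        with torch.no_grad():
            tgt = torch.from_numpy(r).float() + gamma * (1 - torch.from_numpy(d).float()) \
                  * self.q(torch.from_numpy(s2)).max(1).values
        loss = nn.functional.mse_loss(q, tgt)
        self.opt.zero_grad(); loss.backward(); self.opt.step()

env, agents = ToyEnv(), [BinaryAgent(4), BinaryAgent(4)]
state = env.reset()
for _ in range(500):
    bits = [agent.act(state) for agent in agents]   # agent-specific actions
    next_state, reward, done = env.step(bits[0] * 2 + bits[1])
    for agent, b in zip(agents, bits):              # shared state/reward, own action bit
        agent.replay.append((state, b, reward, next_state, float(done)))
        agent.learn()
    state = env.reset() if done else next_state
```

The point of the sketch is the replay update: every agent stores the same (state, reward, next state) tuple but only its own action bit, so each DQN stays small and trains independently without any learned communication.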
Related papers
- Multi-agent Reinforcement Learning with Deep Networks for Diverse Q-Vectors [3.9801926395657325]
This paper proposes a deep Q-networks (DQN) algorithm capable of learning various Q-vectors using Max, Nash, and Maximin strategies.
The effectiveness of this approach is demonstrated in an environment where dual robotic arms collaborate to lift a pot.
arXiv Detail & Related papers (2024-06-12T03:30:10Z)
- Weakly Coupled Deep Q-Networks [5.76924666595801]
We propose a novel deep reinforcement learning algorithm, WCDQN, that enhances performance in weakly coupled Markov decision processes (WCMDPs).
WCDQN employs a single network to train multiple DQN "subagents", one for each subproblem, and then combines their solutions to establish an upper bound on the optimal action value.
arXiv Detail & Related papers (2023-10-28T20:07:57Z)
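A rough sketch of the subagent-combination idea summarized above, under assumed shapes and interfaces (the Lagrangian-relaxation details of the actual WCDQN are not shown): each subproblem gets its own Q-head, and summing the heads' values over a decomposed state-action pair yields an upper bound that can be used to penalize main Q-estimates that exceed it.

```python
import torch
import torch.nn as nn

class SubagentHeads(nn.Module):
    """One Q-head per subproblem; their sum bounds the full problem's Q."""
    def __init__(self, sub_dim, n_sub, n_actions):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(sub_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))
            for _ in range(n_sub))

    def upper_bound(self, sub_states, sub_actions):
        # sub_states: (batch, n_sub, sub_dim); sub_actions: (batch, n_sub), int64
        total = torch.zeros(sub_states.shape[0])
        for i, head in enumerate(self.heads):
            q_i = head(sub_states[:, i, :])                       # (batch, n_actions)
            total = total + q_i.gather(1, sub_actions[:, i:i+1]).squeeze(1)
        return total                                              # (batch,)

heads = SubagentHeads(sub_dim=3, n_sub=4, n_actions=2)
states = torch.randn(8, 4, 3)
actions = torch.randint(0, 2, (8, 4))
q_main = torch.randn(8, requires_grad=True)          # stand-in for the main Q-estimate
bound = heads.upper_bound(states, actions).detach()
penalty = torch.relu(q_main - bound).pow(2).mean()   # discourage estimates above the bound
```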
- Multi-Agent Reinforcement Learning with Action Masking for UAV-enabled Mobile Communications [1.3053649021965603]
Unmanned Aerial Vehicles (UAVs) are increasingly used as aerial base stations to provide ad hoc communications infrastructure.
This paper focuses on the use of multiple UAVs for providing wireless communication to mobile users in the absence of terrestrial communications infrastructure.
We jointly optimize the UAVs' 3D trajectories and NOMA (non-orthogonal multiple access) power allocation to maximize system throughput.
arXiv Detail & Related papers (2023-03-29T14:41:03Z)
- MA2QL: A Minimalist Approach to Fully Decentralized Multi-Agent Reinforcement Learning [63.46052494151171]
We propose multi-agent alternate Q-learning (MA2QL), where agents take turns updating their Q-functions by Q-learning.
We prove that when each agent guarantees $\varepsilon$-convergence at each turn, their joint policy converges to a Nash equilibrium.
Results show MA2QL consistently outperforms IQL despite such minimal changes, which verifies the effectiveness of MA2QL.
arXiv Detail & Related papers (2022-09-17T04:54:32Z)
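A toy, stateless rendering of MA2QL's alternate-update idea (an assumption-level sketch, not the paper's full algorithm): two agents play a made-up random matrix game, and on each turn only the active agent updates its Q-values while the other keeps its policy fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
payoff = rng.standard_normal((3, 3))   # shared reward for each joint action
Q = [np.zeros(3), np.zeros(3)]         # one (stateless) Q-table per agent
alpha, eps = 0.1, 0.2

for turn in range(200):
    learner = turn % 2                 # agents alternate: only one learns per turn
    for _ in range(50):
        acts = [int(np.argmax(q)) if rng.random() > eps else int(rng.integers(3))
                for q in Q]
        reward = payoff[acts[0], acts[1]]
        # the non-learning agent keeps its Q-table (hence its policy) fixed
        Q[learner][acts[learner]] += alpha * (reward - Q[learner][acts[learner]])

print("greedy joint action:", int(np.argmax(Q[0])), int(np.argmax(Q[1])))
```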
- M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network [6.689964384669018]
We propose a framework which uses the Max-Mean loss in Deep Q-Network (M$^2$DQN).
Instead of sampling one batch of experiences in the training step, we sample several batches from the experience replay and update the parameters such that the maximum TD-error of these batches is minimized.
We verify the effectiveness of this framework with one of the most widely used techniques, Double DQN (DDQN), in several Gym games.
arXiv Detail & Related papers (2022-09-16T09:20:35Z)
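A hedged sketch of the max-mean loss as summarized above: draw several batches, compute each batch's mean TD loss, and backpropagate only the largest one. The sample_batch helper, the networks, and the optimizer are assumptions with the usual DQN shapes.

```python
import random
import torch
import torch.nn as nn

def sample_batch(replay, n):
    # hypothetical helper: replay holds (s, a, r, s2, done) tensor tuples
    s, a, r, s2, d = zip(*random.sample(replay, n))
    return torch.stack(s), torch.stack(a), torch.stack(r), torch.stack(s2), torch.stack(d)

def max_mean_update(q_net, target_net, replay, opt, k=4, batch=32, gamma=0.99):
    losses = []
    for _ in range(k):                                    # several batches, not one
        s, a, r, s2, d = sample_batch(replay, batch)
        q = q_net(s).gather(1, a.view(-1, 1)).squeeze(1)  # a: int64 action indices
        with torch.no_grad():
            tgt = r + gamma * (1 - d) * target_net(s2).max(1).values
        losses.append(nn.functional.mse_loss(q, tgt))     # mean TD loss per batch
    loss = torch.stack(losses).max()                      # the worst (max) batch loss
    opt.zero_grad(); loss.backward(); opt.step()
```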
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
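A loose sketch of the retrieval step only: the paper trains the retrieval process end-to-end, whereas this stand-in uses plain cosine-similarity nearest neighbours over stored experience embeddings.

```python
import numpy as np

class ExperienceIndex:
    """Stores (embedding, transition) pairs; retrieval is cosine k-NN."""
    def __init__(self, dim):
        self.keys = np.empty((0, dim), dtype=np.float32)
        self.values = []

    def add(self, embedding, transition):
        self.keys = np.vstack([self.keys, embedding[None]])
        self.values.append(transition)

    def retrieve(self, query, k=5):
        sims = self.keys @ query / (np.linalg.norm(self.keys, axis=1)
                                    * np.linalg.norm(query) + 1e-8)
        return [self.values[i] for i in np.argsort(-sims)[:k]]

index = ExperienceIndex(dim=4)
for _ in range(100):
    index.add(np.random.randn(4).astype(np.float32), transition=("s", "a", "r", "s2"))
context = np.random.randn(4).astype(np.float32)
neighbours = index.retrieve(context, k=5)   # fed to the agent as extra context
```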
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) offers a promising energy-efficient way to handle realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge.
CollaQ is evaluated on various StarCraft maps and shows that it outperforms existing state-of-the-art techniques.
arXiv Detail & Related papers (2020-10-16T17:42:11Z)
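A much-simplified sketch of the reward-attribution-style decomposition named in the title, under the assumption that each agent's Q-value splits into a self term and a collaborative term conditioned on the other agents' observations; the paper's actual training objective and regularizers are omitted.

```python
import torch
import torch.nn as nn

class DecomposedQ(nn.Module):
    """Q_i = Q_alone(own obs) + Q_collab(own obs, other agents' obs)."""
    def __init__(self, obs_dim, others_dim, n_actions):
        super().__init__()
        self.q_alone = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_actions))
        self.q_collab = nn.Sequential(nn.Linear(obs_dim + others_dim, 64), nn.ReLU(),
                                      nn.Linear(64, n_actions))

    def forward(self, obs, others):
        return self.q_alone(obs) + self.q_collab(torch.cat([obs, others], dim=-1))

net = DecomposedQ(obs_dim=8, others_dim=16, n_actions=5)
q_values = net(torch.randn(8), torch.randn(16))   # (5,) action values for one agent
```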
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
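A short sketch of SUNRISE's ingredient (b), the upper-confidence-bound action selection, using the ensemble mean and standard deviation as the optimism signal; the weighted Bellman backup of ingredient (a) is not shown, and the toy ensemble here is an assumption.

```python
import torch
import torch.nn as nn

def ucb_action(q_ensemble, state, lam=1.0):
    """Pick the action maximizing ensemble mean + lam * ensemble std."""
    qs = torch.stack([q(state) for q in q_ensemble])   # (ensemble, n_actions)
    score = qs.mean(dim=0) + lam * qs.std(dim=0)       # optimism from disagreement
    return int(score.argmax())

ensemble = [nn.Linear(4, 2) for _ in range(5)]         # toy Q-ensemble
action = ucb_action(ensemble, torch.randn(4))
```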
- Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning [0.0]
Ride-sharing services such as Uber and Lyft let passengers order a car to pick them up.
Traditional Reinforcement Learning (RL) based methods attempting to solve the ridesharing problem are unable to accurately model the complex environment in which taxis operate.
We show that our model performs better than the IDQN baseline on a fixed grid size and is able to generalize well to smaller or larger grid sizes.
Our algorithm also outperforms the IDQN baseline in the scenario where the number of passengers and cars varies by episode.
arXiv Detail & Related papers (2020-06-18T23:37:53Z)
- Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation [53.262360083572005]
We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL).
We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL) and game-theoretic RL (GT-RL).
Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
arXiv Detail & Related papers (2020-03-21T00:43:54Z)