Centralizing State-Values in Dueling Networks for Multi-Robot
Reinforcement Learning Mapless Navigation
- URL: http://arxiv.org/abs/2112.09012v1
- Date: Thu, 16 Dec 2021 16:47:00 GMT
- Title: Centralizing State-Values in Dueling Networks for Multi-Robot
Reinforcement Learning Mapless Navigation
- Authors: Enrico Marchesini, Alessandro Farinelli
- Abstract summary: We study the problem of multi-robot mapless navigation in the popular Centralized Training and Decentralized Execution (CTDE) paradigm.
This problem is challenging when each robot considers its path without explicitly sharing observations with other robots.
We propose a novel architecture for CTDE that uses a centralized state-value network to compute a joint state-value.
- Score: 87.85646257351212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of multi-robot mapless navigation in the popular
Centralized Training and Decentralized Execution (CTDE) paradigm. This problem
is challenging when each robot considers its path without explicitly sharing
observations with other robots, and can lead to non-stationarity issues in Deep
Reinforcement Learning (DRL). The typical CTDE algorithm factorizes the joint
action-value function into individual ones, to favor cooperation and achieve
decentralized execution. Such factorization involves constraints (e.g.,
monotonicity) that limit the emergence of novel behaviors in individual agents,
as each agent is trained starting from a joint action-value. In contrast, we
propose a novel architecture for CTDE that uses a centralized state-value
network to compute a joint state-value, which is used to inject global state
information in the value-based updates of the agents. Consequently, each model
computes its gradient update for the weights, considering the overall state of
the environment. Our idea follows the insight of Dueling Networks: a separate
estimate of the joint state-value both improves sample efficiency and provides
each robot with information on whether the global state is (or is not)
valuable. Experiments in a robotic navigation task with 2, 4, and 8 robots
confirm the superior performance of our approach over prior CTDE methods
(e.g., VDN, QMIX).
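To make the proposed architecture concrete, below is a minimal PyTorch sketch of a per-robot dueling head paired with a centralized state-value network, following the description in the abstract. The class names (DuelingAgent, CentralV), layer sizes, and the exact way the joint state-value enters the TD target are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class DuelingAgent(nn.Module):
    """Per-robot dueling head: Q_i(o, a) = V_i(o) + A_i(o, a) - mean_a' A_i(o, a')."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # local state-value stream
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.trunk(obs)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)    # Q-values, shape (..., n_actions)


class CentralV(nn.Module):
    """Centralized state-value network: maps the global state to a joint V(s).
    Used only during training (CTDE); execution needs just the agents."""

    def __init__(self, state_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def td_target(reward: torch.Tensor, next_state: torch.Tensor,
              central_v: CentralV, gamma: float = 0.99) -> torch.Tensor:
    """Illustrative value-based target: bootstrapping from the joint
    state-value V(s') injects global state information into each agent's
    update (one plausible reading of the abstract)."""
    with torch.no_grad():
        return reward + gamma * central_v(next_state).squeeze(-1)
```

During training, each robot's value-based loss would regress its Q-values toward this shared target, so every gradient update reflects the overall state of the environment; at execution time each robot acts greedily on its own dueling head, with no communication required.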
Related papers
- Generalizability of Graph Neural Networks for Decentralized Unlabeled Motion Planning [72.86540018081531]
Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance.
This problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation.
We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets.
arXiv Detail & Related papers (2024-09-29T23:57:25Z)
- Attention Graph for Multi-Robot Social Navigation with Deep Reinforcement Learning [0.0]
We present MultiSoc, a new method for learning multi-agent socially aware navigation strategies using deep reinforcement learning (RL).
Inspired by recent works on multi-agent deep RL, our method leverages a graph-based representation of agent interactions, combining the positions and fields of view of entities (pedestrians and agents).
Our method learns faster than social navigation deep RL mono-agent techniques, and enables efficient multi-agent implicit coordination in challenging crowd navigation with multiple heterogeneous humans.
arXiv Detail & Related papers (2024-01-31T15:24:13Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Decentralized Multi-Agent Reinforcement Learning with Global State Prediction [3.5843971648706296]
A critical challenge is non-stationarity, which occurs when two or more robots update individual or shared policies concurrently.
We pose our problem as a Partially Observable Markov Decision Process, due to the absence of global knowledge on other agents.
In the first approach, the robots exchange no messages and are trained to rely on implicit communication through push-and-pull on the object they transport.
In the second approach, we introduce Global State Prediction (GSP), a network trained to form a belief over the swarm as a whole and predict its future states.
arXiv Detail & Related papers (2023-06-22T14:38:12Z)
- Distributed Reinforcement Learning for Robot Teams: A Review [10.92709534981466]
Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems (MRS).
The community has leveraged model-free multi-agent reinforcement learning to devise efficient, scalable controllers for MRS.
Recent findings: decentralized MRS face fundamental challenges, such as non-stationarity and partial observability.
arXiv Detail & Related papers (2022-04-07T15:34:19Z)
- CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning [114.69155066932046]
This work proposes a novel Centralized Teacher with Decentralized Student (CTDS) framework, which consists of a teacher model and a student model.
Specifically, the teacher model allocates the team reward by learning individual Q-values conditioned on global observation.
The student model utilizes the partial observations to approximate the Q-values estimated by the teacher model.
arXiv Detail & Related papers (2022-03-16T06:03:14Z)
- Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients [43.862956745961654]
LSF-SAC is a novel framework that features a variational inference-based information-sharing mechanism that supplies extra state information.
We evaluate LSF-SAC on the StarCraft II micromanagement challenge and demonstrate that it outperforms several state-of-the-art methods in challenging collaborative tasks.
arXiv Detail & Related papers (2022-01-04T17:05:07Z)
- Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z)
- Robot Navigation in a Crowd by Integrating Deep Reinforcement Learning and Online Planning [8.211771115758381]
Navigating along time-efficient and collision-free paths in a crowd is still an open and challenging problem for mobile robots.
Deep reinforcement learning is a promising solution to this problem.
We propose a graph-based deep reinforcement learning method, SG-DQN.
Our model can help the robot better understand the crowd and achieve a high success rate of more than 0.99 in the crowd navigation task.
arXiv Detail & Related papers (2021-02-26T02:17:13Z)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
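For contrast with the factorization baselines discussed above (VDN, QMIX), the sketch below shows their usual formulations: VDN sums per-agent Q-values, while QMIX mixes them with state-conditioned hypernetwork weights whose absolute value enforces the monotonicity constraint $\partial Q_{tot} / \partial Q_i \geq 0$. Names, shapes, and layer sizes here are illustrative assumptions, not code from either paper.

```python
import torch
import torch.nn as nn


def vdn_mix(agent_qs: torch.Tensor) -> torch.Tensor:
    """VDN: Q_tot = sum_i Q_i. agent_qs has shape (batch, n_agents)."""
    return agent_qs.sum(dim=-1)


class QMixer(nn.Module):
    """QMIX-style mixer: hypernetworks produce mixing weights from the global
    state; taking their absolute value keeps dQ_tot/dQ_i >= 0 (monotonicity)."""

    def __init__(self, n_agents: int, state_dim: int, embed: int = 32):
        super().__init__()
        self.n_agents, self.embed = n_agents, embed
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed)
        self.hyper_b1 = nn.Linear(state_dim, embed)
        self.hyper_w2 = nn.Linear(state_dim, embed)
        self.hyper_b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        b = agent_qs.size(0)
        # First mixing layer: non-negative, state-conditioned weights.
        w1 = self.hyper_w1(state).abs().view(b, self.n_agents, self.embed)
        b1 = self.hyper_b1(state).view(b, 1, self.embed)
        h = torch.relu(agent_qs.view(b, 1, self.n_agents) @ w1 + b1)
        # Second mixing layer collapses the embedding to a scalar Q_tot.
        w2 = self.hyper_w2(state).abs().view(b, self.embed, 1)
        b2 = self.hyper_b2(state).view(b, 1, 1)
        return (h @ w2 + b2).view(b)  # Q_tot, shape (batch,)
```

It is precisely this monotonic mixing that the main abstract argues can limit the emergence of novel individual behaviors, which motivates replacing the factorized joint action-value with a centralized joint state-value instead.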
This list is automatically generated from the titles and abstracts of the papers on this site.