Model-based Reinforcement Learning for Decentralized Multiagent
Rendezvous
- URL: http://arxiv.org/abs/2003.06906v2
- Date: Mon, 9 Nov 2020 05:34:13 GMT
- Title: Model-based Reinforcement Learning for Decentralized Multiagent
Rendezvous
- Authors: Rose E. Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Edward Lee, Tingnan
Zhang, Brian Ichter, Jie Tan, Aleksandra Faust
- Abstract summary: Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
- Score: 66.6895109554163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaboration requires agents to align their goals on the fly. Underlying the
human ability to align goals with other agents is their ability to predict the
intentions of others and actively update their own plans. We propose
hierarchical predictive planning (HPP), a model-based reinforcement learning
method for decentralized multiagent rendezvous. Starting with pretrained,
single-agent point-to-point navigation policies and using noisy,
high-dimensional sensor inputs such as lidar, we first learn, via
self-supervision, motion prediction models for all agents on the team. Next, HPP uses the prediction
models to propose and evaluate navigation subgoals for completing the
rendezvous task without explicit communication among agents. We evaluate HPP in
a suite of unseen environments, with increasing complexity and numbers of
obstacles. We show that HPP outperforms alternative reinforcement learning,
path planning, and heuristic-based baselines on challenging, unseen
environments. Experiments in the real world demonstrate successful transfer of
the prediction models from sim to real world without any additional
fine-tuning. Altogether, HPP removes the need for a centralized operator in
multiagent systems by combining model-based RL and inference methods, enabling
agents to dynamically align plans.
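The subgoal proposal-and-evaluation loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `predict_position` is a hypothetical straight-line stand-in for the learned motion-prediction models, and all function names and parameters are illustrative assumptions.

```python
import math
import random

def predict_position(agent_pos, goal, horizon=5, step=1.0):
    # Hypothetical stand-in for a learned motion-prediction model:
    # assumes the agent moves straight toward the goal at a fixed speed.
    x, y = agent_pos
    gx, gy = goal
    for _ in range(horizon):
        dx, dy = gx - x, gy - y
        dist = math.hypot(dx, dy)
        if dist < step:
            return (gx, gy)
        x += step * dx / dist
        y += step * dy / dist
    return (x, y)

def propose_subgoal(own_pos, teammate_positions, n_samples=128, spread=5.0):
    """Sample candidate subgoals near the agent and keep the one that
    minimizes the predicted dispersion of the team after a short horizon
    (a crude rendezvous cost). Each agent runs this reasoning on its own
    predictions of teammates; no explicit communication is needed."""
    best_goal, best_cost = None, float("inf")
    for _ in range(n_samples):
        candidate = (own_pos[0] + random.uniform(-spread, spread),
                     own_pos[1] + random.uniform(-spread, spread))
        # Predict where everyone ends up if all agents head to the candidate.
        futures = [predict_position(own_pos, candidate)]
        futures += [predict_position(p, candidate) for p in teammate_positions]
        cx = sum(f[0] for f in futures) / len(futures)
        cy = sum(f[1] for f in futures) / len(futures)
        cost = sum(math.hypot(f[0] - cx, f[1] - cy) for f in futures)
        if cost < best_cost:
            best_goal, best_cost = candidate, cost
    return best_goal
```

In the paper the prediction models are learned from noisy sensor inputs and the evaluation is richer; the sketch only conveys the propose-then-score structure.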
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Learning Control Admissibility Models with Graph Neural Networks for Multi-Agent Navigation [9.05607520128194]
Control admissibility models (CAMs) can be easily composed and used for online inference for an arbitrary number of agents.
We show that the CAM models can be trained in environments with only a few agents and be easily composed for deployment in dense environments with hundreds of agents, achieving better performance than state-of-the-art methods.
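The composability that makes CAMs scale to arbitrary team sizes can be illustrated with a small sketch. All names here are hypothetical: `toy_cam` stands in for a trained graph-neural-network CAM, and the selection rule (conjunction over pairs, maximize the worst-case margin) is one plausible reading of online composed inference, not the paper's exact procedure.

```python
def select_control(candidates, cam, own_state, neighbor_states):
    """Score each candidate control against every neighbor with a pairwise
    CAM; keep only controls admissible for all pairs (score > 0) and return
    the one with the largest worst-case margin. Because composition is just
    a conjunction over pairs, the same model handles any number of agents."""
    best, best_margin = None, float("-inf")
    for u in candidates:
        scores = [cam(own_state, s, u) for s in neighbor_states]
        if all(sc > 0 for sc in scores) and min(scores) > best_margin:
            best, best_margin = u, min(scores)
    return best

# Toy 1-D CAM stand-in: a control is admissible if the predicted
# separation from the neighbor stays above a safety radius of 1.0.
def toy_cam(own, other, u):
    return abs((own + u) - other) - 1.0
```

For an agent at 0.0 between neighbors at 2.0 and -2.0, staying put (control 0.0) maximizes the worst-case clearance, so `select_control` returns it.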
arXiv Detail & Related papers (2022-10-17T19:20:58Z)
- Fully Decentralized Model-based Policy Optimization for Networked Systems [23.46407780093797]
This work aims to improve data efficiency of multi-agent control by model-based learning.
We consider networked systems where agents are cooperative and communicate only locally with their neighbors.
In our method, each agent learns a dynamics model to predict future states and broadcasts its predictions to its neighbors; the policies are then trained on model rollouts.
arXiv Detail & Related papers (2022-07-13T23:52:14Z)
- Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
- Learning Efficient Multi-Agent Cooperative Visual Exploration [18.42493808094464]
We consider the task of visual indoor exploration with multiple agents, where the agents need to cooperatively explore the entire indoor region using as few steps as possible.
We extend the state-of-the-art single-agent RL solution, Active Neural SLAM (ANS), to the multi-agent setting by introducing a novel RL-based global-goal planner, the Spatial Coordination Planner (SCP).
SCP leverages spatial information from each individual agent in an end-to-end manner and effectively guides the agents to navigate towards different spatial goals with high exploration efficiency.
arXiv Detail & Related papers (2021-10-12T04:48:10Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties, each holding their own machine learning model, want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
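The core idea of aggregating pre-trained models at test time without sharing data or parameters can be sketched in a few lines. The averaging rule and the toy models below are illustrative assumptions, not the mechanism proposed in the paper.

```python
def collective_predict(models, x):
    # Each party shares only its prediction on the query point, never its
    # training data or parameters; a simple unweighted average aggregates them.
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Example: three parties with privately trained (here: hand-made) regressors.
party_models = [lambda x: 2.0 * x, lambda x: 2.2 * x, lambda x: 1.8 * x]
```

A decentralized mechanism would additionally weight or select among the agents' predictions; plain averaging is the simplest baseline instance.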
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
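The decentralized-planning idea above, planning locally while simulating teammates with a learned policy model instead of communicating, can be illustrated with a rollout-based simplification of MCTS. Everything here is a hypothetical sketch: the toy 1-D environment, the hand-coded `teammate_policy` standing in for a learned approximator, and flat rollouts in place of full tree search.

```python
import random

def plan_decentralized(state, own_actions, teammate_policy, step_fn,
                       n_rollouts=30, depth=4):
    """Evaluate each of our actions by averaging simulated returns, with
    teammates' moves drawn from a learned policy model (rollout-based
    simplification of decentralized MCTS; no communication needed)."""
    best_action, best_value = None, float("-inf")
    for a in own_actions:
        total = 0.0
        for _ in range(n_rollouts):
            s, ret, my_a = state, 0.0, a
            for _ in range(depth):
                s, r = step_fn(s, my_a, teammate_policy(s))
                ret += r
                my_a = random.choice(own_actions)  # random continuation
            total += ret
        value = total / n_rollouts
        if value > best_value:
            best_action, best_value = a, value
    return best_action

# Toy 1-D rendezvous: two agents on a line try to meet.
def step_fn(state, a_own, a_mate):
    own, mate = state[0] + a_own, state[1] + a_mate
    return (own, mate), -abs(own - mate)

def teammate_policy(state):
    # Stand-in for a learned teammate model: moves one step toward us.
    own, mate = state
    return -1 if mate > own else 1
```

Starting at positions 0 and 10, the planner picks the action that closes the gap (+1), since every simulated future scores better when the first step moves toward the predicted teammate.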
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.