Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment
- URL: http://arxiv.org/abs/2310.16659v2
- Date: Fri, 25 Oct 2024 08:56:21 GMT
- Title: Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment
- Authors: Qizhen Wu, Kexin Liu, Lei Chen, Jinhu Lü,
- Abstract summary: We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency.
- Score: 12.122881147337505
- License:
- Abstract: Traditional methods plan feasible paths for multiple agents in the stochastic environment. However, the methods' iterations with the changes in the environment result in computation complexities, especially for the decentralized agents without a centralized planner. Although reinforcement learning provides a plausible solution because of the generalization for different environments, it struggles with enormous agent-environment interactions in training. Here, we propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning, which is improved based on the idea of model predictive control. In our approach, agents communicate only with the centralized planner to make decentralized decisions online in the stochastic environment. Furthermore, considering the communication constraint with the centralized planner, each agent plans feasible paths through the extended observation, which combines information on neighboring agents based on the distance-weighted mean field approach. Inspired by the rolling optimization approach of model predictive control, we conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency, which reduces the expensive interactions in convergence. Experiment results in both comparison, ablation, and real-robot studies validate the effectiveness and generalization performance of our method.
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Imitation Learning based Alternative Multi-Agent Proximal Policy
Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance [15.498559530889839]
In this paper, we put forward a decentralized learning based Alternative Multi-Agent Proximal Policy Optimization (IA-MAPPO) algorithm to execute the pursuit avoidance task in well-formed swarm.
We utilize imitation learning to decentralize the formation controller, so as to reduce the communication overheads and enhance the scalability.
The simulation results validate the effectiveness of IA-MAPPO and extensive ablation experiments further show the performance comparable to a centralized solution with significant decrease in communication overheads.
arXiv Detail & Related papers (2023-11-06T06:58:16Z) - Decentralized Adaptive Formation via Consensus-Oriented Multi-Agent
Communication [9.216867817261493]
We propose a novel Consensus-based Decentralized Adaptive Formation (Cons-DecAF) framework.
Specifically, we develop a novel multi-agent reinforcement learning method, Consensus-oriented Multi-Agent Communication (ConsMAC)
Instead of pre-assigning specific positions of agents, we employ a displacement-based formation by Hausdorff distance to significantly improve the formation efficiency.
arXiv Detail & Related papers (2023-07-23T10:41:17Z) - Expeditious Saliency-guided Mix-up through Random Gradient Thresholding [89.59134648542042]
Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks.
In this paper, inspired by the superior qualities of each direction over one another, we introduce a novel method that lies at the junction of the two routes.
We name our method R-Mix following the concept of "Random Mix-up"
In order to address the question of whether there exists a better decision protocol, we train a Reinforcement Learning agent that decides the mix-up policies.
arXiv Detail & Related papers (2022-12-09T14:29:57Z) - Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during execution phase while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z) - The Gradient Convergence Bound of Federated Multi-Agent Reinforcement
Learning with Efficient Communication [20.891460617583302]
The paper considers independent reinforcement learning (IRL) for collaborative decision-making in the paradigm of federated learning (FL)
FL generates excessive communication overheads between agents and a remote central server.
This paper proposes two advanced optimization schemes to improve the system's utility value.
arXiv Detail & Related papers (2021-03-24T07:21:43Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication-efficient overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z) - Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z) - Model-based Reinforcement Learning for Decentralized Multiagent
Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.