Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2409.16720v1
- Date: Wed, 25 Sep 2024 08:09:52 GMT
- Title: Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning
- Authors: Xian Wang, Jin Zhou, Yuanli Feng, Jiahao Mei, Jiming Chen, Shuo Li
- Abstract summary: This paper presents a decentralized policy network for time-optimal multi-drone flight using multi-agent reinforcement learning.
To strike a balance between flight efficiency and collision avoidance, we introduce a soft collision penalty inspired by optimization-based methods.
Extensive simulations show that, despite slight performance trade-offs compared to single-drone systems, our multi-drone approach maintains near-time-optimal performance with low collision rates.
- Score: 10.579847782542982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent innovations in autonomous drones have facilitated time-optimal flight in single-drone configurations and enhanced maneuverability in multi-drone systems through the application of optimal control and learning-based methods. However, few studies have achieved time-optimal motion planning for multi-drone systems, particularly during highly agile maneuvers or in dynamic scenarios. This paper presents a decentralized policy network for time-optimal multi-drone flight using multi-agent reinforcement learning. To strike a balance between flight efficiency and collision avoidance, we introduce a soft collision penalty inspired by optimization-based methods. By customizing PPO in a centralized training, decentralized execution (CTDE) fashion, we unlock higher efficiency and stability in training, while ensuring lightweight implementation. Extensive simulations show that, despite slight performance trade-offs compared to single-drone systems, our multi-drone approach maintains near-time-optimal performance with low collision rates. Real-world experiments validate our method, with two quadrotors using the same network as simulation achieving a maximum speed of 13.65 m/s and a maximum body rate of 13.4 rad/s in a 5.5 m × 5.5 m × 2.0 m space across various tracks, relying entirely on onboard computation.
Related papers
- Hybrid Imitation-Learning Motion Planner for Urban Driving [0.0]
We propose a novel hybrid motion planner that integrates both learning-based and optimization-based techniques.
Our model effectively balances safety and human-likeness, mitigating the trade-off inherent in these objectives.
We validate our approach through simulation experiments and further demonstrate its efficacy by deploying it in real-world self-driving vehicles.
arXiv Detail & Related papers (2024-09-04T16:54:31Z)
- UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning [79.16150966434299]
We formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs.
We use the heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework, and then propose an improved HATRPO algorithm, namely HATRPO-UCB.
arXiv Detail & Related papers (2024-04-11T03:19:22Z)
- AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights [1.947822083318316]
AirPilot is a nonlinear Deep Reinforcement Learning (DRL) - enhanced Proportional Integral Derivative (PID) drone controller.
AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL.
AirPilot reduces the navigation error of the default PX4 PID position controller by 90% and improves the effective navigation speed of a fine-tuned PID controller by 21%.
arXiv Detail & Related papers (2024-03-30T00:46:43Z)
- TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos [57.92385818430939]
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones.
Existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices.
We propose a simple yet effective framework, TransVisDrone, that provides an end-to-end solution with higher computational efficiency.
arXiv Detail & Related papers (2022-10-16T03:05:13Z)
- Learning a Single Near-hover Position Controller for Vastly Different Quadcopters [56.37274861303324]
This paper proposes an adaptive near-hover position controller for quadcopters.
It can be deployed to quadcopters of very different mass, size and motor constants.
It also shows rapid adaptation to unknown disturbances during runtime.
arXiv Detail & Related papers (2022-09-19T17:55:05Z)
- Motion Planning and Control for Multi Vehicle Autonomous Racing at High Speeds [100.61456258283245]
This paper presents a multi-layer motion planning and control architecture for autonomous racing.
The proposed solution has been applied on a Dallara AV-21 racecar and tested at oval race tracks, achieving lateral accelerations up to 25 $m/s^2$.
arXiv Detail & Related papers (2022-07-22T15:16:54Z)
- Time-Optimal Planning for Quadrotor Waypoint Flight [50.016821506107455]
Planning time-optimal trajectories at the actuation limit of a quadrotor is an open problem.
We propose a solution that exploits the quadrotor's full actuator potential.
We validate our method in real-world flights in one of the world's largest motion-capture systems.
arXiv Detail & Related papers (2021-08-10T09:26:43Z)
- Identification and Avoidance of Static and Dynamic Obstacles on Point Cloud for UAVs Navigation [7.14505983271756]
We introduce a technique to distinguish dynamic obstacles from static ones with only point cloud input.
A computationally efficient obstacle avoidance motion planning approach is proposed and it is in line with an improved relative velocity method.
The approach is able to avoid both static obstacles and dynamic ones in the same framework.
arXiv Detail & Related papers (2021-05-14T02:44:18Z)
- Time-Efficient Mars Exploration of Simultaneous Coverage and Charging with Multiple Drones [14.160624396972707]
This paper presents a time-efficient scheme for Mars exploration by the cooperation of multiple drones and a rover.
A comprehensive framework has been developed with joint consideration for limited energy, sensor model, communication range and safety radius.
Extensive simulations have been conducted to demonstrate the remarkable performance of TIME-SC2.
arXiv Detail & Related papers (2020-11-16T07:28:37Z)
- Multi-Agent Reinforcement Learning in NOMA-aided UAV Networks for Cellular Offloading [59.32570888309133]
A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs).
Non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network.
A mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs.
arXiv Detail & Related papers (2020-10-18T20:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.