Collision-Free Flocking with a Dynamic Squad of Fixed-Wing UAVs Using
Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2101.08074v1
- Date: Wed, 20 Jan 2021 11:23:35 GMT
- Title: Collision-Free Flocking with a Dynamic Squad of Fixed-Wing UAVs Using
Deep Reinforcement Learning
- Authors: Chao Yan, Xiaojia Xiang, Chang Wang, Zhen Lan
- Abstract summary: We deal with the decentralized leader-follower flocking control problem through deep reinforcement learning (DRL).
We propose a novel reinforcement learning algorithm, CACER-II, for training a shared control policy for all the followers.
A plug-n-play embedding module based on convolutional neural networks and the attention mechanism encodes the variable-length system state into a fixed-length embedding vector, which makes the learned DRL policies independent of the number or the order of followers.
- Score: 2.555094847583209
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing the collision-free flocking behavior for a dynamic squad of
fixed-wing UAVs is still a challenge due to kinematic complexity and
environmental uncertainty. In this paper, we deal with the decentralized
leader-follower flocking control problem through deep reinforcement learning
(DRL). Specifically, we formulate a decentralized DRL-based decision making
framework from the perspective of every follower, where a collision avoidance
mechanism is integrated into the flocking controller. Then, we propose a novel
reinforcement learning algorithm CACER-II for training a shared control policy
for all the followers. Besides, we design a plug-n-play embedding module based
on convolutional neural networks and the attention mechanism. As a result, the
variable-length system state can be encoded into a fixed-length embedding
vector, which makes the learned DRL policies independent of the number or the
order of followers. Finally, numerical simulation results demonstrate the
effectiveness of the proposed method, and the learned policies can be directly
transferred to semiphysical simulation without any parameter finetuning.
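
The abstract's claim that a variable-length system state can be encoded into a fixed-length vector, independent of the number or order of followers, can be illustrated with a minimal attention-pooling sketch. This is not the paper's module (which combines convolutional neural networks with attention and is trained with CACER-II); the function name `attention_pool`, the fixed `query` vector standing in for learned parameters, and the example follower states are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def attention_pool(states: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse any number of per-follower state vectors into one vector.

    Hypothetical stand-in for a learned embedding module: `query` plays
    the role of trained attention parameters.
    """
    dim = len(query)
    if not states:
        return [0.0] * dim  # no followers observed: zero embedding

    # Attention score of each follower = dot product with the query.
    scores = [sum(s_k * q_k for s_k, q_k in zip(s, query)) for s in states]

    # Numerically stable softmax over followers.
    top = max(scores)
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The weighted sum over followers has length `dim`, regardless of
    # how many followers were passed in or in which order.
    return [sum(w * s[k] for w, s in zip(weights, states))
            for k in range(dim)]


# Illustrative (made-up) follower states, e.g. relative x, y, heading.
followers = [[1.0, 0.0, 2.0], [0.5, 1.5, -1.0], [2.0, 2.0, 0.0]]
query = [0.1, -0.2, 0.3]

embedding = attention_pool(followers, query)
shuffled = attention_pool(list(reversed(followers)), query)
smaller = attention_pool(followers[:2], query)
```

Because the softmax weights and the weighted sum are both computed over the set of followers, reordering the inputs changes the result only by floating-point rounding, and adding or removing a follower leaves the output length unchanged. That is the property that lets one shared policy network consume the embedding for a squad of any size.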
Related papers
- Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance [0.0]
Deep Reinforcement Learning (DRL) has emerged as a promising control framework.
Current DRL algorithms require disproportionally large computational resources to find near-optimal policies.
This paper presents a comprehensive exploration of our proposed approach in maritime control systems.
arXiv Detail & Related papers (2024-03-31T09:25:28Z)
- Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability.
We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
arXiv Detail & Related papers (2024-02-04T15:54:03Z)
- Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing [0.0]
This paper addresses the issue of increasing the performance of reinforcement learning (RL) solutions for autonomous racing cars.
We propose a partial end-to-end algorithm that decouples the planning and control tasks.
By leveraging the robustness of a classical controller, our partial end-to-end driving algorithm exhibits better robustness towards model mismatches than standard end-to-end algorithms.
arXiv Detail & Related papers (2023-12-11T14:27:10Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Model-Based Reinforcement Learning with Isolated Imaginations [61.67183143982074]
We propose Iso-Dream++, a model-based reinforcement learning approach.
We perform policy optimization based on the decoupled latent imaginations.
This enables long-horizon visuomotor control tasks to benefit from isolating mixed dynamics sources in the wild.
arXiv Detail & Related papers (2023-03-27T02:55:56Z)
- Enhancing Cyber Resilience of Networked Microgrids using Vertical Federated Reinforcement Learning [3.9338764026621758]
We propose a novel federated reinforcement learning (Fed-RL) methodology to enhance the cyber resiliency of networked microgrids.
To circumvent data-sharing issues and concerns for proprietary privacy in multi-party-owned networked grids, we propose a novel Fed-RL algorithm to train the RL agents.
The proposed methodology is validated with numerical examples of modified IEEE 123-bus benchmark test systems.
arXiv Detail & Related papers (2022-12-17T22:56:02Z)
- Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems.
It exploits the combination of reinforcement learning and latent variable generative models.
We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
- Integrating Deep Reinforcement Learning with Model-based Path Planners for Automated Driving [0.0]
We propose a hybrid approach for integrating a path planning pipe into a vision based DRL framework.
In summary, the DRL agent is trained to follow the path planner's waypoints as closely as possible.
Experimental results show that the proposed method can plan its path and navigate between randomly chosen origin-destination points.
arXiv Detail & Related papers (2020-02-02T17:10:19Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.