Related papers: A Dual Curriculum Learning Framework for Multi-UAV Pursuit-Evasion in Diverse Environments

URL: http://arxiv.org/abs/2312.12255v2
Date: Tue, 30 Apr 2024 06:18:21 GMT
Title: A Dual Curriculum Learning Framework for Multi-UAV Pursuit-Evasion in Diverse Environments
Authors: Jiayu Chen, Guosheng Li, Chao Yu, Xinyi Yang, Botian Xu, Huazhong Yang, Yu Wang
Abstract summary: This paper addresses multi-UAV pursuit-evasion, where a group of drones cooperate to capture a fast evader in a confined environment with obstacles. Existing algorithms, which simplify the pursuit-evasion problem, often lack expressive coordination strategies and struggle to capture the evader in extreme scenarios. We introduce a dual curriculum learning framework, named DualCL, which addresses multi-UAV pursuit-evasion in diverse environments and demonstrates zero-shot transfer ability to unseen scenarios.
Score: 15.959963737956848
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper addresses multi-UAV pursuit-evasion, where a group of drones cooperates to capture a fast evader in a confined environment with obstacles. Existing heuristic algorithms, which simplify the pursuit-evasion problem, often lack expressive coordination strategies and struggle to capture the evader in extreme scenarios, such as when the evader moves at high speeds. In contrast, reinforcement learning (RL) has been applied to this problem and has the potential to obtain highly cooperative capture strategies. However, RL-based methods face challenges in training for complex 3-dimensional scenarios with diverse task settings due to the vast exploration space. The dynamics constraints of drones further restrict the ability of reinforcement learning to acquire high-performance capture strategies. In this work, we introduce a dual curriculum learning framework, named DualCL, which addresses multi-UAV pursuit-evasion in diverse environments and demonstrates zero-shot transfer ability to unseen scenarios. DualCL comprises two main components: the Intrinsic Parameter Curriculum Proposer, which progressively suggests intrinsic parameters from easy to hard to improve the capture capability of drones, and the External Environment Generator, tasked with exploring unresolved scenarios and generating appropriate training distributions of external environment parameters. The simulation experimental results show that DualCL significantly outperforms baseline methods, achieving over 90% capture rate and reducing the capture timestep by at least 27.5% in the training scenarios. Additionally, it exhibits the best zero-shot generalization ability in unseen environments. Moreover, we demonstrate the transferability of our pursuit strategy from simulation to real-world environments. Further details can be found on the project website at https://sites.google.com/view/dualcl.
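The two components named in the abstract can be pictured with a toy sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration, not the authors' implementation: the class names, the choice of a shrinking capture radius as the intrinsic parameter, the 0.9 advancement threshold, and the rule that an environment counts as "unresolved" while its capture rate lies between 0.2 and 0.8.

```python
import random


class IntrinsicCurriculum:
    """Hypothetical easy-to-hard schedule for one intrinsic parameter.

    Assumption: the parameter is a capture radius that shrinks from
    `easy` to `hard` in evenly spaced steps.
    """

    def __init__(self, easy, hard, steps, threshold=0.9):
        self.levels = [easy + (hard - easy) * i / (steps - 1) for i in range(steps)]
        self.index = 0
        self.threshold = threshold  # assumed capture rate needed to advance

    @property
    def value(self):
        return self.levels[self.index]

    def update(self, capture_rate):
        """Move to the next, harder level once the policy is good enough."""
        if capture_rate >= self.threshold and self.index < len(self.levels) - 1:
            self.index += 1
            return True
        return False


class EnvironmentBuffer:
    """Hypothetical store of external environment parameters not yet solved."""

    def __init__(self, low=0.2, high=0.8, explore_prob=0.3, seed=0):
        self.low, self.high = low, high
        self.explore_prob = explore_prob
        self.rng = random.Random(seed)
        self.unsolved = []

    def sample(self, sample_new):
        """Replay an unsolved environment, or explore a freshly drawn one."""
        if self.unsolved and self.rng.random() > self.explore_prob:
            return self.rng.choice(self.unsolved)
        return sample_new(self.rng)

    def record(self, env_params, capture_rate):
        """Keep environments that are neither trivially solved nor hopeless."""
        unresolved = self.low <= capture_rate <= self.high
        if unresolved and env_params not in self.unsolved:
            self.unsolved.append(env_params)
        elif not unresolved and env_params in self.unsolved:
            self.unsolved.remove(env_params)
```

In a training loop built on this sketch, each iteration would draw environment parameters with `sample`, train and evaluate the pursuers at the current `value`, then call `record` and `update` with the measured capture rate. Splitting the two schedules keeps the difficulty of the capture condition independent of which obstacle layouts or start positions are being replayed.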

Related papers

Reinforcement Learning for Decision-Level Interception Prioritization in Drone Swarm Defense [56.47577824219207]
We present a case study demonstrating the practical advantages of reinforcement learning in addressing this challenge. We introduce a high-fidelity simulation environment that captures realistic operational constraints. The agent learns to coordinate multiple effectors for optimal interception prioritization. We evaluate the learned policy against a handcrafted rule-based baseline across hundreds of simulated attack scenarios.
arXiv Detail & Related papers (2025-08-01T13:55:39Z)
Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles [0.9217021281095907]
Multi-Agent Reinforcement Learning is notoriously sample-inefficient. High-fidelity simulators like Gazebo's LRAUV provide 100x faster-than-real-time single-robot simulations. We propose an iterative distillation method that transfers high-fidelity simulations into a simplified, GPU-accelerated environment.
arXiv Detail & Related papers (2025-05-13T04:42:30Z)
A General Infrastructure and Workflow for Quadrotor Deep Reinforcement Learning and Reality Deployment [48.90852123901697]
We propose a platform that enables seamless transfer of end-to-end deep reinforcement learning (DRL) policies to quadrotors. Our platform provides rich types of environments including hovering, dynamic obstacle avoidance, trajectory tracking, balloon hitting, and planning in unknown environments.
arXiv Detail & Related papers (2025-04-21T14:25:23Z)
RAPID: Robust and Agile Planner Using Inverse Reinforcement Learning for Vision-Based Drone Navigation [9.25068777307471]
This paper introduces a learning-based visual planner for agile drone flight in cluttered environments. The proposed planner generates collision-free waypoints in milliseconds, enabling drones to perform agile maneuvers in complex environments without building separate perception, mapping, and planning modules.
arXiv Detail & Related papers (2025-02-04T06:42:08Z)
Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning [16.761470423715338]
Multi-UAV pursuit-evasion poses a key challenge for UAV swarm intelligence. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. We derive a feasible policy via a two-stage reward refinement and deploy the policy on real quadrotors in a zero-shot manner.
arXiv Detail & Related papers (2024-09-24T08:40:04Z)
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning. Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques. Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight [20.92646531472541]
We propose a novel approach that combines the performance of Reinforcement Learning (RL) and the sample efficiency of Imitation Learning (IL). Our framework contains three phases: training a teacher policy using RL with privileged state information, distilling it into a student policy via IL, and adaptive fine-tuning via RL. Tests show our approach can not only learn in scenarios where RL from scratch fails but also outperforms existing IL methods in both robustness and performance.
arXiv Detail & Related papers (2024-03-18T19:25:57Z)
Learning to Fly in Seconds [7.259696592534715]
We show how curriculum learning and a highly optimized simulator enhance sample complexity and lead to fast training times. Our framework enables Simulation-to-Reality (Sim2Real) transfer for direct control after only 18 seconds of training on a consumer-grade laptop.
arXiv Detail & Related papers (2023-11-22T01:06:45Z)
Contrastive Initial State Buffer for Reinforcement Learning [25.849626996870526]
In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. We introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent in the environment. We validate our approach on two complex robotic tasks without relying on any prior information about the environment.
arXiv Detail & Related papers (2023-09-18T13:26:40Z)
Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems. It exploits the combination of reinforcement learning and latent variable generative models. We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions. We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV. We show that the proposed transferable deep reinforcement learning framework reduces the obstacle miss detection probability by the AV up to 67% compared to other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z)
Decentralized Reinforcement Learning for Multi-Target Search and Detection by a Team of Drones [12.055303570215335]
Target search and detection encompasses a variety of decision problems such as coverage, surveillance, search, observing and pursuit-evasion. We develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) for the purpose of locating a set of static targets in an unknown area.
arXiv Detail & Related papers (2021-03-17T09:04:47Z)
Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning [66.9937776799536]
The emerging vision-and-language navigation (VLN) problem aims at learning to navigate an agent to the target location in unseen photo-realistic environments. The main challenges of VLN arise mainly from two aspects: first, the agent needs to attend to the meaningful paragraphs of the language instruction corresponding to the dynamically-varying visual environments. We propose a cross-modal grounding module to equip the agent with a better ability to track the correspondence between the textual and visual modalities.
arXiv Detail & Related papers (2020-11-22T09:13:46Z)
ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals. Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments. ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics [51.234482917047835]
We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents. We present a novel, scalable two-player RL algorithm, which is a sampling variant of the two-player policy method.
arXiv Detail & Related papers (2020-02-14T14:59:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.