Curriculum-Based Iterative Self-Play for Scalable Multi-Drone Racing
- URL: http://arxiv.org/abs/2510.22570v1
- Date: Sun, 26 Oct 2025 08:03:06 GMT
- Title: Curriculum-Based Iterative Self-Play for Scalable Multi-Drone Racing
- Authors: Onur Akgün,
- Abstract summary: CRUISE is a reinforcement learning framework for multi-drone racing. It combines a progressive difficulty curriculum with an efficient self-play mechanism to foster robust competitive behaviors. It achieves nearly double the mean racing speed of a state-of-the-art game-theoretic planner, maintains high success rates, and demonstrates robust scalability as agent density increases.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The coordination of multiple autonomous agents in high-speed, competitive environments represents a significant engineering challenge. This paper presents CRUISE (Curriculum-Based Iterative Self-Play for Scalable Multi-Drone Racing), a reinforcement learning framework designed to solve this challenge in the demanding domain of multi-drone racing. CRUISE overcomes key scalability limitations by synergistically combining a progressive difficulty curriculum with an efficient self-play mechanism to foster robust competitive behaviors. Validated in high-fidelity simulation with realistic quadrotor dynamics, the resulting policies significantly outperform both a standard reinforcement learning baseline and a state-of-the-art game-theoretic planner. CRUISE achieves nearly double the planner's mean racing speed, maintains high success rates, and demonstrates robust scalability as agent density increases. Ablation studies confirm that the curriculum structure is the critical component for this performance leap. By providing a scalable and effective training methodology, CRUISE advances the development of autonomous systems for dynamic, competitive tasks and serves as a blueprint for future real-world deployment.
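The curriculum-plus-self-play recipe the abstract describes can be illustrated with a minimal, hypothetical sketch. Nothing below reproduces the paper's actual environment, policy, or update rule: the linear difficulty schedule, the scalar "skill" stand-in for policy training, and all function names are assumptions made purely for illustration.

```python
import random

def difficulty_schedule(stage, n_stages):
    """Progressive curriculum: fraction of full task difficulty per stage."""
    return (stage + 1) / n_stages

def train_stage(policy, opponents, difficulty, episodes=10):
    """Toy stand-in for RL training: nudge a scalar 'skill' toward the
    current difficulty while racing against snapshots of past selves."""
    for _ in range(episodes):
        opponent = random.choice(opponents)  # self-play: sample a frozen past snapshot
        # A real implementation would roll out a multi-drone race against the
        # opponent and apply a policy-gradient update; here we just move the
        # scalar skill toward the stage's difficulty target.
        policy["skill"] += 0.5 * (difficulty - policy["skill"])
    return policy

def cruise_like_training(n_stages=4):
    policy = {"skill": 0.0}
    opponents = [dict(policy)]            # opponent pool starts with the initial policy
    for stage in range(n_stages):
        d = difficulty_schedule(stage, n_stages)
        policy = train_stage(policy, opponents, d)
        opponents.append(dict(policy))    # freeze a snapshot for future self-play
    return policy, opponents

policy, pool = cruise_like_training()
print(round(policy["skill"], 2), len(pool))  # → 1.0 5
```

The key structural idea this sketch captures is the iteration order: each curriculum stage raises difficulty first, trains against the existing opponent pool, and only then adds the improved policy to the pool, so later stages always face progressively stronger opposition.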
Related papers
- Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training [63.34044358216334]
ACTOR-CURATOR is a scalable and fully automated curriculum learning framework for reinforcement learning post-training of large language models.
Empirically, ACTOR-CURATOR consistently outperforms uniform sampling and strong curriculum baselines.
arXiv Detail & Related papers (2026-02-24T04:19:48Z)
- LongCat-Flash-Thinking-2601 Technical Report [134.89732115690705]
LongCat-Flash-Thinking-2601 is an open-source Mixture-of-Experts (MoE) reasoning model with superior agentic reasoning capability.
LongCat-Flash-Thinking-2601 achieves state-of-the-art performance among open-source models on a wide range of agentic benchmarks.
arXiv Detail & Related papers (2026-01-23T13:20:09Z)
- FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization [61.10456021136654]
We introduce FASTer, a unified framework for efficient and general robot learning.
FASTerVQ encodes action chunks as single-channel images, capturing global-temporal dependencies while maintaining a high compression ratio.
FASTerVLA builds on this tokenizer with block-wise autoregressive decoding and a lightweight action expert, achieving both faster inference and higher task performance.
arXiv Detail & Related papers (2025-12-04T16:21:38Z)
- DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models [51.76664843721462]
DeepThinkVLA is a new architecture for Vision-Language-Action models.
It generates sequential CoT with causal attention and switches to bidirectional attention for fast decoding of action vectors.
It achieves a 97.0% success rate on the LIBERO benchmark.
arXiv Detail & Related papers (2025-10-31T05:26:16Z)
- SPIRAL: Self-Play Incremental Racing Algorithm for Learning in Multi-Drone Competitions [0.0]
This paper introduces SPIRAL, a novel approach for training autonomous drones in multi-agent racing competitions.
SPIRAL distinctively employs a self-play mechanism to incrementally cultivate complex racing behaviors.
Our method is designed for versatility, allowing integration with any state-of-the-art Deep Reinforcement Learning (DRL) algorithm.
arXiv Detail & Related papers (2025-10-26T07:59:44Z)
- Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning [15.539607264374242]
Multi-agent reinforcement learning (MARL) has achieved strong performance in cooperative and adversarial tasks.
We propose a dynamic curriculum learning framework that employs a self-adaptive difficulty adjustment mechanism.
Our method improves both training stability and final performance, achieving competitive results against state-of-the-art methods.
arXiv Detail & Related papers (2025-06-09T08:38:18Z)
- Digital Twin Synchronization: Bridging the Sim-RL Agent to a Real-Time Robotic Additive Manufacturing Control [2.5709786140685633]
This research advances the integration of Soft Actor-Critic with digital twins for industrial robotics applications.
The system architecture combines Unity's simulation environment with ROS2 for seamless digital twin synchronization.
Results show rapid policy convergence and robust task execution in both simulated and physical environments.
arXiv Detail & Related papers (2025-01-29T22:06:53Z)
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer.
By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration [62.3751291442432]
We propose LoRA-IR, a flexible framework that dynamically leverages compact low-rank experts to facilitate efficient all-in-one image restoration.
LoRA-IR consists of two training stages: degradation-guided pre-training and parameter-efficient fine-tuning.
Experiments demonstrate that LoRA-IR achieves SOTA performance across 14 IR tasks and 29 benchmarks, while maintaining computational efficiency.
arXiv Detail & Related papers (2024-10-20T13:00:24Z)
- Scalable Supervisory Architecture for Autonomous Race Cars [0.0]
This paper presents a scalable architecture designed for autonomous racing.
It emphasizes modularity, adaptability to diverse configurations, and the ability to supervise parallel execution of pipelines.
The results confirm the architecture's scalability and versatility, providing a robust foundation for the development of competitive autonomous racing systems.
arXiv Detail & Related papers (2024-08-27T13:19:17Z)
- An Imitative Reinforcement Learning Framework for Pursuit-Lock-Launch Missions [9.002353110876529]
Unmanned Combat Aerial Vehicle (UCAV) Within-Visual-Range (WVR) engagement plays a decisive role on aerial battlefields.
We propose a novel imitative reinforcement learning framework, which efficiently leverages expert data while enabling autonomous exploration.
Our framework can quickly learn the critical knowledge in complex aerial combat tasks, achieving up to a 100% success rate and demonstrating excellent robustness.
arXiv Detail & Related papers (2024-06-17T13:59:52Z)
- RILe: Reinforced Imitation Learning [60.63173816209543]
RILe (Reinforced Imitation Learning) is a framework that combines the strengths of imitation learning and inverse reinforcement learning to learn a dense reward function efficiently.
Our framework produces high-performing policies in high-dimensional tasks where direct imitation fails to replicate complex behaviors.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
- When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? [99.4914671654374]
We propose AdvCL, a novel adversarial contrastive pretraining framework.
We show that AdvCL is able to enhance cross-task robustness transferability without loss of model accuracy and finetuning efficiency.
arXiv Detail & Related papers (2021-11-01T17:59:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.