Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking
- URL: http://arxiv.org/abs/2506.17832v1
- Date: Sat, 21 Jun 2025 22:03:00 GMT
- Title: Leveling the Playing Field: Carefully Comparing Classical and Learned Controllers for Quadrotor Trajectory Tracking
- Authors: Pratik Kunapuli, Jake Welde, Dinesh Jayaraman, Vijay Kumar,
- Abstract summary: Learning-based control approaches like reinforcement learning (RL) have recently produced a slew of impressive results for tasks like quadrotor trajectory tracking and drone racing.<n>We observe, however, that reliably comparing the performance of such very different classes of controllers is more complicated than might appear at first sight.<n>We develop a set of best practices for synthesizing the best-in-class RL and geometric controllers for benchmarking.
- Score: 26.134736322861443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning-based control approaches like reinforcement learning (RL) have recently produced a slew of impressive results for tasks like quadrotor trajectory tracking and drone racing. Naturally, it is common to demonstrate the advantages of these new controllers against established methods like analytical controllers. We observe, however, that reliably comparing the performance of such very different classes of controllers is more complicated than might appear at first sight. As a case study, we take up the problem of agile tracking of an end-effector for a quadrotor with a fixed arm. We develop a set of best practices for synthesizing the best-in-class RL and geometric controllers (GC) for benchmarking. In the process, we resolve widespread RL-favoring biases in prior studies that provide asymmetric access to: (1) the task definition, in the form of an objective function, (2) representative datasets, for parameter optimization, and (3) feedforward information, describing the desired future trajectory. The resulting findings are the following: our improvements to the experimental protocol for comparing learned and classical controllers are critical, and each of the above asymmetries can yield misleading conclusions. Prior works have claimed that RL outperforms GC, but we find the gaps between the two controller classes are much smaller than previously published when accounting for symmetric comparisons. Geometric control achieves lower steady-state error than RL, while RL has better transient performance, resulting in GC performing better in relatively slow or less agile tasks, but RL performing better when greater agility is required. Finally, we open-source implementations of geometric and RL controllers for these aerial vehicles, implementing best practices for future development. Website and code is available at https://pratikkunapuli.github.io/rl-vs-gc/
Related papers
- VerIF: Verification Engineering for Reinforcement Learning in Instruction Following [55.60192044049083]
Reinforcement learning with verifiable rewards (RLVR) has become a key technique for enhancing large language models (LLMs)<n>We propose VerIF, a verification method that combines rule-based code verification with LLM-based verification from a large reasoning model.<n>We apply RL training with VerIF to two models, achieving significant improvements across several representative instruction-following benchmarks.
arXiv Detail & Related papers (2025-06-11T17:10:36Z) - Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving [27.551399472250168]
Large Language Models (LLMs) often struggle with mathematical reasoning tasks requiring precise, verifiable computation.<n>While Reinforcement Learning (RL) from outcome-based rewards enhances text-based reasoning, understanding how agents autonomously learn to leverage external tools like code execution remains crucial.
arXiv Detail & Related papers (2025-05-12T17:23:34Z) - Reaching the Limit in Autonomous Racing: Optimal Control versus
Reinforcement Learning [66.10854214036605]
A central question in robotics is how to design a control system for an agile mobile robot.
We show that a neural network controller trained with reinforcement learning (RL) outperformed optimal control (OC) methods in this setting.
Our findings allowed us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 times the gravitational acceleration and a peak velocity of 108 kilometers per hour.
arXiv Detail & Related papers (2023-10-17T02:40:27Z) - Plume: A Framework for High Performance Deep RL Network Controllers via
Prioritized Trace Sampling [8.917042313344943]
We introduce a framework, Plume, to automatically identify and balance the skewed input trace distribution in DRL training datasets.
We evaluate Plume on three networking environments, including Adaptive Bitrate Streaming, Congestion Control, and Load Balancing.
Plume offers superior performance in both simulation and real-world settings, across different controllers and DRL algorithms.
arXiv Detail & Related papers (2023-02-24T02:09:33Z) - Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice, these learneds do not work well even in simple RL tasks.
Agent-gradient distribution is non-independent and identically distributed, leading to inefficient meta-training.
We show that, although only trained in toy tasks, our learned can generalize unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z) - Training Efficient Controllers via Analytic Policy Gradient [44.0762454494769]
Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately.
Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power.
We propose an Analytic Policy Gradient (APG) method to tackle this problem.
arXiv Detail & Related papers (2022-09-26T22:04:35Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-Word RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - RL-DARTS: Differentiable Architecture Search for Reinforcement Learning [62.95469460505922]
We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL)
By replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is also compatible with off-policy and on-policy RL algorithms, needing only minor changes in preexisting code.
We show that the supernet gradually learns better cells, leading to alternative architectures which can be highly competitive against manually designed policies, but also verify previous design choices for RL policies.
arXiv Detail & Related papers (2021-06-04T03:08:43Z) - RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning
Workloads [4.575381867242508]
We propose RL-Scope, a cross-stack profiler that scopes low-level CPU/GPU resource usage to high-level algorithmic operations.
We demonstrate RL-Scope's utility through in-depth case studies.
arXiv Detail & Related papers (2021-02-08T15:42:48Z) - Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z) - Decoupling Representation Learning from Reinforcement Learning [89.82834016009461]
We introduce an unsupervised learning task called Augmented Temporal Contrast (ATC)
ATC trains a convolutional encoder to associate pairs of observations separated by a short time difference.
In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL.
arXiv Detail & Related papers (2020-09-14T19:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.