Related papers: Improving the Resilience of Quadrotors in Underground Environments by Combining Learning-based and Safety Controllers

URL: http://arxiv.org/abs/2509.02808v1
Date: Tue, 02 Sep 2025 20:22:54 GMT
Title: Improving the Resilience of Quadrotors in Underground Environments by Combining Learning-based and Safety Controllers
Authors: Isaac Ronald Ward, Mark Paral, Kristopher Riordan, Mykel J. Kochenderfer
Abstract summary: We train a normalizing flow-based prior over the environment, which provides a measure of how far out-of-distribution the quadrotor is at any given time. We use this measure as a runtime monitor, allowing us to switch between a learning-based controller and a safe controller when we are sufficiently out-of-distribution.
Score: 22.566692834880396
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Autonomously controlling quadrotors in large-scale subterranean environments is applicable to many areas such as environmental surveying, mining operations, and search and rescue. Learning-based controllers represent an appealing approach to autonomy, but are known to not generalize well to 'out-of-distribution' environments not encountered during training. In this work, we train a normalizing flow-based prior over the environment, which provides a measure of how far out-of-distribution the quadrotor is at any given time. We use this measure as a runtime monitor, allowing us to switch between a learning-based controller and a safe controller when we are sufficiently out-of-distribution. Our methods are benchmarked on a point-to-point navigation task in a simulated 3D cave environment based on real-world point cloud data from the DARPA Subterranean Challenge Final Event Dataset. Our experimental results show that our combined controller simultaneously possesses the liveness of the learning-based controller (completing the task quickly) and the safety of the safety controller (avoiding collision).
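
The switching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's normalizing flow is replaced here by a diagonal Gaussian density as a simple stand-in, and the names `GaussianDensity`, `make_switching_controller`, the toy observations, and the threshold calibration (lowest log-likelihood seen on the training data) are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


class GaussianDensity:
    """Stand-in for a learned density model (ASSUMPTION: the paper uses a
    normalizing flow; a diagonal Gaussian is used here only for brevity)."""

    def __init__(self, samples: List[Vector]) -> None:
        n = len(samples)
        dim = len(samples[0])
        # Per-dimension mean and variance of the training observations.
        self.mean = [sum(s[i] for s in samples) / n for i in range(dim)]
        self.var = [
            max(sum((s[i] - self.mean[i]) ** 2 for s in samples) / n, 1e-6)
            for i in range(dim)
        ]

    def log_prob(self, x: Vector) -> float:
        """Log-likelihood of x; lower values mean further out-of-distribution."""
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


def make_switching_controller(
    density: GaussianDensity,
    learned: Callable[[Vector], str],
    safe: Callable[[Vector], str],
    threshold: float,
) -> Callable[[Vector], str]:
    """Runtime monitor: use the safe controller whenever the observation's
    log-likelihood falls below the threshold, otherwise the learned one."""

    def controller(obs: Vector) -> str:
        if density.log_prob(obs) < threshold:
            return safe(obs)
        return learned(obs)

    return controller


# Toy "training" observations (hypothetical data, not from the paper).
train = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
density = GaussianDensity(train)

# Hypothetical calibration: the lowest log-likelihood seen during training.
threshold = min(density.log_prob(s) for s in train)

controller = make_switching_controller(
    density,
    learned=lambda obs: "learned",  # placeholder for the learning-based policy
    safe=lambda obs: "safe",        # placeholder for the safety controller
    threshold=threshold,
)

in_dist_action = controller([0.5, 0.5])     # near the training data
out_dist_action = controller([10.0, 10.0])  # far from the training data
print(in_dist_action, out_dist_action)
```

In this sketch the threshold is the single design knob: raising it hands control to the safe controller more often (favoring safety), while lowering it keeps the learned controller active longer (favoring liveness).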

Related papers

Plug-and-Play Benchmarking of Reinforcement Learning Algorithms for Large-Scale Flow Control [61.155940786140455]
Reinforcement learning (RL) has shown promising results in active flow control (AFC). Current AFC benchmarks rely on external computational fluid dynamics (CFD) solvers, are not fully differentiable, and provide limited 3D and multi-agent support. We introduce FluidGym, the first standalone, fully differentiable benchmark suite for RL in AFC.
arXiv Detail & Related papers (2026-01-21T14:13:44Z)
Designing Control Barrier Function via Probabilistic Enumeration for Safe Reinforcement Learning Navigation [55.02966123945644]
We propose a hierarchical control framework leveraging neural network verification techniques to design control barrier functions (CBFs) and policy correction mechanisms. Our approach relies on probabilistic enumeration to identify unsafe regions of operation, which are then used to construct a safe CBF-based control layer. These experiments demonstrate the ability of the proposed solution to correct unsafe actions while preserving efficient navigation behavior.
arXiv Detail & Related papers (2025-04-30T13:47:25Z)
A General Infrastructure and Workflow for Quadrotor Deep Reinforcement Learning and Reality Deployment [48.90852123901697]
We propose a platform that enables seamless transfer of end-to-end deep reinforcement learning (DRL) policies to quadrotors. Our platform provides rich types of environments including hovering, dynamic obstacle avoidance, trajectory tracking, balloon hitting, and planning in unknown environments.
arXiv Detail & Related papers (2025-04-21T14:25:23Z)
Extensive Exploration in Complex Traffic Scenarios using Hierarchical Reinforcement Learning [7.380119332658803]
Our research introduces a pioneering hierarchical framework that efficiently decomposes intricate decision-making problems into manageable subtasks. We adopt a two-step training process that trains the high-level controller and low-level controller separately. The high-level controller exhibits an enhanced exploration potential with long-term delayed rewards, and the low-level controller provides longitudinal and lateral control ability using short-term instantaneous rewards.
arXiv Detail & Related papers (2025-01-25T00:00:11Z)
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
A comparison of RL-based and PID controllers for 6-DOF swimming robots: hybrid underwater object tracking [8.362739554991073]
We present an exploration and assessment of employing a centralized deep Q-network (DQN) controller as a substitute for PID controllers. Our primary focus centers on illustrating this transition with the specific case of underwater object tracking. Our experiments, conducted within a Unity-based simulator, validate the effectiveness of a centralized RL agent over separated PID controllers.
arXiv Detail & Related papers (2024-01-29T23:14:15Z)
In-Distribution Barrier Functions: Self-Supervised Policy Filters that Avoid Out-of-Distribution States [84.24300005271185]
We propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations. Our method is effective for two different visuomotor control tasks in simulation environments, including both top-down and egocentric view settings.
arXiv Detail & Related papers (2023-01-27T22:28:19Z)
Learning to Control Direct Current Motor for Steering in Real Time via Reinforcement Learning [2.3554584457413483]
We make use of the NFQ algorithm for steering position control of a golf cart in both a real hardware and a simulated environment. We were able to increase the rate of successful control under four minutes in simulation and under 11 minutes in real hardware.
arXiv Detail & Related papers (2021-07-31T03:24:36Z)
Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking [52.44199258132215]
We present an open-source library of differentiable physics and robotics environments. The library features several popular environments, including classical control settings from OpenAI Gym. We give several use-cases of new scientific results obtained using the library.
arXiv Detail & Related papers (2021-02-19T15:06:47Z)
Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion [95.1825179206694]
We present a framework that synthesizes robust controllers for a quadruped robot. A high-level controller learns to choose from a set of primitives in response to changes in the environment. A low-level controller utilizes an established control method to robustly execute the primitives.
arXiv Detail & Related papers (2020-09-21T16:49:26Z)
Vision-Based Autonomous Drone Control using Supervised Learning in Simulation [0.0]
We propose a vision-based control approach using Supervised Learning for autonomous navigation and landing of MAVs in indoor environments. We trained a Convolutional Neural Network (CNN) that maps low resolution image and sensor input to high-level control commands. Our approach requires shorter training times than similar Reinforcement Learning approaches and can potentially overcome the limitations of manual data collection faced by comparable Supervised Learning approaches.
arXiv Detail & Related papers (2020-09-09T13:45:41Z)
Learning Power Control from a Fixed Batch of Data [28.618312473850974]
We exploit power control data, gathered from a monitored environment, for performing power control in an unexplored environment. We adopt offline deep reinforcement learning, whereby the agent learns the policy to produce the transmission powers solely by using the data.
arXiv Detail & Related papers (2020-08-05T01:00:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.