Data-Efficient Deep Reinforcement Learning for Attitude Control of
Fixed-Wing UAVs: Field Experiments
- URL: http://arxiv.org/abs/2111.04153v2
- Date: Wed, 19 Apr 2023 09:32:40 GMT
- Title: Data-Efficient Deep Reinforcement Learning for Attitude Control of
Fixed-Wing UAVs: Field Experiments
- Authors: Eivind Bøhn, Erlend M. Coates, Dirk Reinhardt, and Tor Arne Johansen
- Abstract summary: We show that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics.
We deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller.
- Score: 0.37798600249187286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult
control problem in part due to uncertain nonlinear dynamics, actuator
constraints, and coupled longitudinal and lateral motions. Current
state-of-the-art autopilots are based on linear control and are thus limited in
their effectiveness and performance. Deep reinforcement learning (DRL) is a
machine learning method to automatically discover optimal control laws through
interaction with the controlled system, which can handle complex nonlinear
dynamics. We show in this paper that DRL can successfully learn to perform
attitude control of a fixed-wing UAV operating directly on the original
nonlinear dynamics, requiring as little as three minutes of flight data. We
initially train our model in a simulation environment and then deploy the
learned controller on the UAV in flight tests, demonstrating comparable
performance to the state-of-the-art ArduPlane proportional-integral-derivative
(PID) attitude controller with no further online learning required. Learning
with significant actuation delay and diversified simulated dynamics were found
to be crucial for successful transfer to control of the real UAV. In addition
to a qualitative comparison with the ArduPlane autopilot, we present a
quantitative assessment based on linear analysis to better understand the
learning controller's behavior.
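The abstract singles out two ingredients for successful sim-to-real transfer: training with significant actuation delay and with diversified (randomized) simulated dynamics. As a rough illustration only, the sketch below shows how a toy training environment might implement both; the first-order roll model, parameter ranges, and class name are assumptions, not the authors' simulator.

```python
# Rough sketch (not the authors' simulator) of the two transfer ingredients
# the abstract highlights: an actuation-delay buffer and randomized
# ("diversified") dynamics. Roll model and parameter ranges are assumed.
from collections import deque

import numpy as np


class RandomizedRollEnv:
    """Toy roll-attitude environment with delayed aileron commands."""

    def __init__(self, delay_steps=3, dt=0.02, seed=0):
        self.dt = dt
        self.delay_steps = delay_steps
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Diversified dynamics: resample model coefficients every episode.
        self.roll_damping = self.rng.uniform(2.0, 6.0)    # 1/s (assumed range)
        self.aileron_gain = self.rng.uniform(20.0, 60.0)  # rad/s^2 per unit
        self.phi = self.rng.uniform(-0.3, 0.3)            # roll angle [rad]
        self.p = 0.0                                      # roll rate [rad/s]
        # Actuation delay: a command issued now is applied delay_steps later.
        self.action_buf = deque([0.0] * self.delay_steps)
        return np.array([self.phi, self.p])

    def step(self, action, phi_ref=0.0):
        self.action_buf.append(float(np.clip(action, -1.0, 1.0)))
        delayed = self.action_buf.popleft()   # command from delay_steps ago
        p_dot = -self.roll_damping * self.p + self.aileron_gain * delayed
        self.p += self.dt * p_dot
        self.phi += self.dt * self.p
        reward = -abs(self.phi - phi_ref)     # penalize roll tracking error
        return np.array([self.phi, self.p]), reward
```

An agent trained across many resets of such an environment never fits one fixed plant, which is the property the abstract identifies as crucial for transfer to the real UAV.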
Related papers
- Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments [0.0]
This paper introduces a Non-linear Model Predictive Control (NMPC) framework for the DJI Matrice 100.
The framework supports various trajectory types and employs a penalty-based cost function for control accuracy in tight maneuvers.
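As an illustration of what such a penalty-based cost can look like, here is a minimal sketch; the weights, safety radius, and function names are assumptions rather than the paper's formulation.

```python
# Illustrative penalty-based stage cost for obstacle avoidance: quadratic
# tracking and effort terms plus a soft penalty that activates inside a
# safety radius. Weights and radius are assumptions, not the paper's values.
import numpy as np

def stage_cost(p, u, p_ref, obstacles, q=1.0, r=0.1, w_obs=50.0, d_safe=1.0):
    cost = q * np.sum((p - p_ref) ** 2) + r * np.sum(u ** 2)
    for obs in obstacles:
        d = np.linalg.norm(p - obs)
        # Soft constraint: zero outside d_safe, grows quadratically inside.
        cost += w_obs * max(0.0, d_safe - d) ** 2
    return cost

def trajectory_cost(positions, controls, p_ref, obstacles):
    # An NMPC solver would minimize this over the prediction horizon subject
    # to the vehicle dynamics; here we only evaluate a candidate rollout.
    return sum(stage_cost(p, u, p_ref, obstacles)
               for p, u in zip(positions, controls))
```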
arXiv Detail & Related papers (2024-10-03T17:50:19Z)
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
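The idea behind analytic policy gradients is to differentiate the return through the simulator's dynamics instead of estimating gradients from sampled returns. Below is a minimal sketch on an assumed scalar single-integrator with a linear policy, with the reverse-mode sweep written out by hand; none of it is the paper's simulator.

```python
# Minimal illustration of analytic policy gradients: backpropagate the cost
# through differentiable dynamics rather than estimate the gradient from
# samples. Toy system: x' = x + dt*u with policy u = -k*x (assumed).

def rollout_and_grad(k, x0=1.0, dt=0.1, T=50):
    """Return cost J = sum_t x_t^2 and dJ/dk via reverse-mode through time."""
    a = 1.0 - dt * k                 # closed-loop map: x_{t+1} = a * x_t
    xs = [x0]
    for _ in range(T):               # forward pass through the simulator
        xs.append(a * xs[-1])
    J = sum(x * x for x in xs[1:])
    # Reverse pass: g = dJ/dx_t; each transition contributes via dx_t/dk = -dt*x_{t-1}.
    g = 2.0 * xs[T]
    dJ_dk = g * (-dt) * xs[T - 1]
    for t in range(T - 1, 0, -1):
        g = 2.0 * xs[t] + a * g      # local cost grad + chain through dynamics
        dJ_dk += g * (-dt) * xs[t - 1]
    return J, dJ_dk

# Gradient descent on the policy gain using exact simulator gradients.
k = 0.1
for _ in range(100):
    J, grad = rollout_and_grad(k)
    k -= 0.02 * grad
```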
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Modelling, Positioning, and Deep Reinforcement Learning Path Tracking Control of Scaled Robotic Vehicles: Design and Experimental Validation [3.807917169053206]
Scaled robotic cars are commonly equipped with a hierarchical control architecture that includes tasks dedicated to vehicle state estimation and control.
This paper covers both aspects by proposing (i) a federated extended Kalman filter (FEKF) and (ii) a novel deep reinforcement learning (DRL) path tracking controller trained via an expert demonstrator.
The experimentally validated model is used for (i) supporting the design of the FEKF and (ii) serving as a digital twin for training the proposed DRL-based path tracking algorithm.
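In a federated architecture, local filters run independently and a master filter fuses their estimates. A minimal sketch of the information-weighted fusion step follows, with a hypothetical two-sensor example; the paper's actual FEKF design is not reproduced here.

```python
# Illustrative fusion step of a federated filter: local filters run
# independently and a master filter combines their estimates by
# information-weighted averaging. The two-sensor setup is hypothetical.
import numpy as np

def fuse_estimates(estimates):
    """Fuse (x_i, P_i) pairs: P = (sum_i P_i^-1)^-1, x = P sum_i P_i^-1 x_i."""
    info = sum(np.linalg.inv(P) for _, P in estimates)
    P = np.linalg.inv(info)
    x = P @ sum(np.linalg.inv(P_i) @ x_i for x_i, P_i in estimates)
    return x, P

# Hypothetical example: two local 2-D position estimates, odometry more
# confident than GNSS, so the fused estimate leans toward odometry.
x_gnss = np.array([1.0, 2.1]); P_gnss = np.diag([0.5, 0.5])
x_odom = np.array([1.2, 1.9]); P_odom = np.diag([0.1, 0.1])
x_fused, P_fused = fuse_estimates([(x_gnss, P_gnss), (x_odom, P_odom)])
```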
arXiv Detail & Related papers (2024-01-10T14:40:53Z)
- DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control [62.24301794794304]
Deep Adaptive Trajectory Tracking (DATT) is a learning-based approach that can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances in the real world.
DATT significantly outperforms competitive adaptive nonlinear and model predictive controllers for both feasible smooth and infeasible trajectories in unsteady wind fields.
It can efficiently run online with an inference time less than 3.2 ms, less than 1/4 of the adaptive nonlinear model predictive control baseline.
arXiv Detail & Related papers (2023-10-13T12:22:31Z)
- Real-Time Model-Free Deep Reinforcement Learning for Force Control of a Series Elastic Actuator [56.11574814802912]
State-of-the-art robotic applications utilize series elastic actuators (SEAs) with closed-loop force control to achieve complex tasks such as walking, lifting, and manipulation.
Model-free PID control methods are more prone to instability due to nonlinearities in the SEA.
Deep reinforcement learning has proved to be an effective model-free method for continuous control tasks.
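For context, force control of an SEA typically closes a loop on the measured spring deflection. The sketch below shows the PID baseline style of controller the paper compares against on a toy SEA model, not its DRL method; the spring stiffness, gains, and locked-load assumption are all illustrative.

```python
# Illustrative model-free PID force loop on a toy SEA: measured force is
# spring deflection times stiffness, and the PID output commands motor
# velocity. Load locked at zero; stiffness and gains are assumptions.
import numpy as np

def simulate_sea_pid(f_ref=5.0, k_s=100.0, dt=0.001, steps=2000,
                     kp=0.5, ki=5.0, kd=0.001):
    theta_m = 0.0                  # motor-side angle [rad]
    theta_l = 0.0                  # load-side angle, locked for simplicity
    integral, prev_err = 0.0, 0.0
    forces = []
    for _ in range(steps):
        force = k_s * (theta_m - theta_l)   # spring force = sensor reading
        err = f_ref - force
        integral += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        theta_m += (kp * err + ki * integral + kd * deriv) * dt
        forces.append(force)
    return np.array(forces)        # should settle near f_ref
```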
arXiv Detail & Related papers (2023-04-11T00:51:47Z)
- Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control [46.81433026280051]
We present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems.
Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions.
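One plausible way to realize active, uncertainty-aware model learning is to train an ensemble of dynamics models and query inputs where the members disagree most. The sketch below follows that assumption; the model class, ensemble size, and toy data are illustrative, and the paper's exact method may differ.

```python
# A bootstrap-ensemble sketch of active dynamics learning: disagreement
# between ensemble members serves as an epistemic-uncertainty signal that
# guides where to collect data next. All details here are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_ensemble(X, y, n_models=5, seed=0):
    models = []
    for i in range(n_models):
        idx = np.random.default_rng(seed + i).integers(0, len(X), len(X))
        m = MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000,
                         random_state=seed + i)
        m.fit(X[idx], y[idx])             # bootstrap resample per member
        models.append(m)
    return models

def most_informative(models, candidates):
    # Epistemic-uncertainty proxy: variance across ensemble predictions.
    preds = np.stack([m.predict(candidates) for m in models])
    return candidates[int(np.argmax(preds.var(axis=0)))]

# Toy usage: disagreement is largest outside the region seen in training.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]
models = train_ensemble(X, y)
x_query = most_informative(models, rng.uniform(-2, 2, (50, 2)))
```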
arXiv Detail & Related papers (2022-10-23T00:45:05Z)
- Control-oriented meta-learning [25.316358215670274]
We use data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of nonlinear features.
We meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective.
arXiv Detail & Related papers (2022-04-14T03:02:27Z)
- Learning to Control Direct Current Motor for Steering in Real Time via Reinforcement Learning [2.3554584457413483]
We apply the NFQ algorithm to steering position control of a golf cart, both on real hardware and in a simulated environment.
The agent achieved successful control within four minutes of training in simulation and within 11 minutes on the real hardware.
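NFQ (Neural Fitted Q-iteration) alternates collecting transitions with refitting a Q-function by batch supervised regression on Bellman targets. A minimal sketch with a discrete steering-action grid follows; the action set, network size, and hyperparameters are illustrative, and scikit-learn's MLPRegressor stands in for the original multilayer perceptron.

```python
# Sketch of Neural Fitted Q-iteration (NFQ): repeatedly refit a Q-function by
# batch regression on Bellman targets over logged transitions. Action grid,
# model, and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

ACTIONS = np.array([-1.0, 0.0, 1.0])   # coarse steering-command grid
GAMMA = 0.95

def q_value(q, s, a):
    return q.predict(np.append(s, a).reshape(1, -1))[0]

def nfq(transitions, iterations=10):
    """transitions: list of (state, action, cost, next_state) tuples."""
    X = np.array([np.append(s, a) for s, a, _, _ in transitions])
    q = None
    for _ in range(iterations):
        if q is None:
            y = np.array([c for _, _, c, _ in transitions])
        else:  # Bellman targets: immediate cost + discounted best successor
            y = np.array([c + GAMMA * min(q_value(q, s2, a2) for a2 in ACTIONS)
                          for _, _, c, s2 in transitions])
        q = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=1000)
        q.fit(X, y)                     # NFQ refits from scratch each pass
    return q

def greedy_action(q, state):
    # Choose the action with minimal predicted cost-to-go.
    return ACTIONS[int(np.argmin([q_value(q, state, a) for a in ACTIONS]))]
```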
arXiv Detail & Related papers (2021-07-31T03:24:36Z)
- Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems [29.579737941918022]
We learn, offline from past data, an adaptive controller with an internal parametric model of nonlinear features.
We meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective.
With a nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning.
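The control structure being meta-learned pairs a tracking term with learned nonlinear features whose coefficients adapt online. Below is a scalar sketch in which hand-picked features stand in for the meta-learned ones; the plant, gains, and adaptation rate are illustrative assumptions.

```python
# Scalar sketch of an adaptive controller with parametric features: the
# control cancels the estimated nonlinearity Y(x) @ a_hat, and a_hat adapts
# online from the tracking error. All numbers here are illustrative.
import numpy as np

def features(x):
    # Stand-in for the learned parametric model of nonlinear effects.
    return np.array([x, x * abs(x), np.sin(x)])

def run_adaptive_tracking(steps=5000, dt=0.001, k=5.0, gamma=2.0):
    a_true = np.array([0.5, -0.3, 1.0])   # unknown plant coefficients
    a_hat = np.zeros(3)                   # online estimate, starts at zero
    x = 0.0
    for t in range(steps):
        x_ref = np.sin(0.002 * t)         # reference trajectory
        e = x - x_ref
        u = -k * e - features(x) @ a_hat  # cancel the estimated nonlinearity
        x_dot = features(x) @ a_true + u  # true (unknown) dynamics
        x += dt * x_dot
        a_hat += dt * gamma * features(x) * e   # gradient adaptation law
    return x, a_hat
```

Meta-learning enters where this sketch hand-picks `features`: the papers instead learn those features offline so that closed-loop tracking, not one-step prediction error, is what the model is good at.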
arXiv Detail & Related papers (2021-03-07T23:49:59Z)
- Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion [95.1825179206694]
We present a framework that synthesizes robust controllers for a quadruped robot.
A high-level controller learns to choose from a set of primitives in response to changes in the environment.
A low-level controller utilizes an established control method to robustly execute the primitives.
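A minimal sketch of that two-level split follows; the primitive names, commanded values, and hand-written selection rule are hypothetical stand-ins for the learned parts.

```python
# Sketch of the hierarchical split: a high-level policy picks a motion
# primitive, and a fixed low-level controller executes it. Primitives and
# the selection rule below are hypothetical stand-ins for learned parts.
import numpy as np

PRIMITIVES = {
    "trot":  lambda state: np.array([0.2, -0.2, -0.2, 0.2]),  # joint targets
    "walk":  lambda state: np.array([0.1, 0.0, 0.0, 0.1]),
    "stand": lambda state: np.zeros(4),
}

def high_level_select(state, slip_detected):
    # Stand-in for the learned selector: fall back to a conservative
    # primitive when the environment changes (e.g., foot slip).
    if slip_detected:
        return "stand"
    return "trot" if state["speed_cmd"] > 0.5 else "walk"

def control_step(state, slip_detected):
    name = high_level_select(state, slip_detected)
    # Low level: an established controller (e.g., joint PD) would track
    # these primitive-specific targets on the robot.
    return PRIMITIVES[name](state)

u = control_step({"speed_cmd": 0.8}, slip_detected=False)
```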
arXiv Detail & Related papers (2020-09-21T16:49:26Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}(T)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)