Using Soft Actor-Critic for Low-Level UAV Control
- URL: http://arxiv.org/abs/2010.02293v1
- Date: Mon, 5 Oct 2020 19:16:57 GMT
- Title: Using Soft Actor-Critic for Low-Level UAV Control
- Authors: Gabriel Moraes Barros and Esther Luna Colombini
- Abstract summary: We present a framework to train the Soft Actor-Critic (SAC) algorithm for low-level control of a quadrotor in a go-to-target task.
SAC can not only learn a robust policy, but it can also cope with unseen scenarios.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Unmanned Aerial Vehicles (UAVs), or drones, have recently been used in
several civil application domains from organ delivery to remote locations to
wireless network coverage. These platforms, however, are naturally unstable
systems for which many different control approaches have been proposed.
Generally based on classic and modern control, these algorithms require
knowledge of the robot's dynamics. However, recently, model-free reinforcement
learning has been successfully used for controlling drones without any prior
knowledge of the robot model. In this work, we present a framework for training
the Soft Actor-Critic (SAC) algorithm to perform low-level control of a quadrotor
in a go-to-target task. All experiments were conducted in simulation. Through
these experiments, we show that SAC can not only learn a robust policy, but also
cope with unseen scenarios. Videos of the simulations are available at
https://www.youtube.com/watch?v=9z8vGs0Ri5g and the code at
https://github.com/larocs/SAC_uav.
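To illustrate the kind of setup the abstract describes (an off-policy SAC agent learning a go-to-target policy), here is a minimal sketch. It is not the authors' implementation (that lives at https://github.com/larocs/SAC_uav): it assumes stable-baselines3 and gymnasium are installed, and it swaps the quadrotor simulator for a simplified 3D point-mass environment whose names and constants (GoToTargetEnv, reward weights, bounds) are illustrative only.

```python
# Minimal sketch: SAC on a go-to-target task with a point-mass stand-in for the
# quadrotor. Hypothetical example, not the paper's code.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC


class GoToTargetEnv(gym.Env):
    """Double-integrator stand-in: actions are bounded accelerations,
    reward penalizes distance to a randomly sampled target."""

    def __init__(self, dt=0.05, horizon=200):
        super().__init__()
        self.dt, self.horizon = dt, horizon
        # Observation: position error (3) + velocity (3).
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)
        # Action: 3D acceleration command in [-1, 1].
        self.action_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.pos = self.np_random.uniform(-1.0, 1.0, size=3)
        self.vel = np.zeros(3)
        self.target = self.np_random.uniform(-1.0, 1.0, size=3)
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        # Integrate simplified dynamics one step.
        self.vel += self.dt * np.clip(action, -1.0, 1.0)
        self.pos += self.dt * self.vel
        self.t += 1
        dist = np.linalg.norm(self.target - self.pos)
        # Reach the target while damping velocity.
        reward = -dist - 0.01 * np.linalg.norm(self.vel)
        terminated = bool(dist < 0.05)
        truncated = self.t >= self.horizon
        return self._obs(), reward, terminated, truncated, {}

    def _obs(self):
        return np.concatenate([self.target - self.pos, self.vel]).astype(np.float32)


if __name__ == "__main__":
    env = GoToTargetEnv()
    model = SAC("MlpPolicy", env, verbose=1)  # entropy-regularized off-policy actor-critic
    model.learn(total_timesteps=50_000)       # train the go-to-target policy
```

In the paper itself the observations and actions correspond to the quadrotor's state and low-level rotor commands inside a physics simulator; the stand-in above only shows how a go-to-target reward and SAC's training loop fit together.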
Related papers
- HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit [52.12750762494588]
Current humanoid teleoperation systems either lack reliable low-level control policies, or struggle to acquire accurate whole-body control commands.
We propose a novel humanoid teleoperation cockpit that integrates a humanoid loco-manipulation policy and a low-cost exoskeleton-based hardware system.
arXiv Detail & Related papers (2025-02-18T16:33:38Z)
- Hand-Object Interaction Pretraining from Videos [77.92637809322231]
We learn general robot manipulation priors from 3D hand-object interaction trajectories.
We do so by representing both the human hand and the manipulated object in a shared 3D space and mapping human motions to robot actions.
We empirically demonstrate that finetuning this policy, with both reinforcement learning (RL) and behavior cloning (BC), enables sample-efficient adaptation to downstream tasks and simultaneously improves robustness and generalizability compared to prior approaches.
arXiv Detail & Related papers (2024-09-12T17:59:07Z)
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Learning a Single Near-hover Position Controller for Vastly Different Quadcopters [56.37274861303324]
This paper proposes an adaptive near-hover position controller for quadcopters.
It can be deployed on quadcopters with very different masses, sizes, and motor constants.
It also shows rapid adaptation to unknown disturbances during runtime.
arXiv Detail & Related papers (2022-09-19T17:55:05Z)
- A Deep Reinforcement Learning Strategy for UAV Autonomous Landing on a Platform [0.0]
We propose a reinforcement learning framework (ROS-RL) based on Gazebo, a physics simulation platform.
We use three continuous-action-space reinforcement learning algorithms within this framework to address the problem of autonomous drone landing.
arXiv Detail & Related papers (2022-09-07T06:33:57Z)
- Adapting Rapid Motor Adaptation for Bipedal Robots [73.5914982741483]
We leverage recent advances in rapid adaptation for locomotion control, and extend them to work on bipedal robots.
A-RMA adapts the base policy for the imperfect extrinsics estimator by finetuning it using model-free RL.
We demonstrate that A-RMA outperforms a number of RL-based baseline controllers and model-based controllers in simulation.
arXiv Detail & Related papers (2022-05-30T17:59:09Z)
- Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
- Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks with Base Controllers [26.807673929816026]
We propose a method of learning long-horizon sparse-reward tasks utilizing one or more traditional base controllers.
Our algorithm incorporates the existing base controllers into stages of exploration, value learning, and policy update.
Our method bears the potential of leveraging existing industrial robot manipulation systems to build more flexible and intelligent controllers.
arXiv Detail & Related papers (2020-11-24T14:23:57Z)
- AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning [38.429105809093116]
We introduce a deep reinforcement learning (RL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap).
We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple aerial vehicles.
arXiv Detail & Related papers (2020-07-13T12:30:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.