Hovering Flight of Soft-Actuated Insect-Scale Micro Aerial Vehicles using Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2502.12355v1
- Date: Mon, 17 Feb 2025 22:45:59 GMT
- Title: Hovering Flight of Soft-Actuated Insect-Scale Micro Aerial Vehicles using Deep Reinforcement Learning
- Authors: Yi-Hsuan Hsiao, Wei-Tung Chen, Yun-Sheng Chang, Pulkit Agrawal, YuFeng Chen
- Abstract summary: Soft-actuated insect-scale micro aerial vehicles (IMAVs) pose unique challenges for designing robust and computationally efficient controllers.
Here, we design a deep reinforcement learning (RL) controller that addresses system delay and uncertainties.
We deploy this controller on two different insect-scale aerial robots that weigh 720 mg and 850 mg, respectively.
- Score: 25.353235604712562
- Abstract: Soft-actuated insect-scale micro aerial vehicles (IMAVs) pose unique challenges for designing robust and computationally efficient controllers. At the millimeter scale, fast robot dynamics ($\sim$ms), together with system delay, model uncertainty, and external disturbances significantly affect flight performances. Here, we design a deep reinforcement learning (RL) controller that addresses system delay and uncertainties. To initialize this neural network (NN) controller, we propose a modified behavior cloning (BC) approach with state-action re-matching to account for delay and domain-randomized expert demonstration to tackle uncertainty. Then we apply proximal policy optimization (PPO) to fine-tune the policy during RL, enhancing performance and smoothing commands. In simulations, our modified BC substantially increases the mean reward compared to baseline BC; and RL with PPO improves flight quality and reduces command fluctuations. We deploy this controller on two different insect-scale aerial robots that weigh 720 mg and 850 mg, respectively. The robots demonstrate multiple successful zero-shot hovering flights, with the longest lasting 50 seconds and root-mean-square errors of 1.34 cm in lateral direction and 0.05 cm in altitude, marking the first end-to-end deep RL-based flight on soft-driven IMAVs.
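The abstract describes the training pipeline only at a high level. As a rough illustration of the behavior-cloning data preparation it mentions, the sketch below shows one plausible way to re-match states and actions for a fixed actuation delay before cloning a domain-randomized expert. The function name, array shapes, and the specific shift convention are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def rematch_state_actions(states, actions, delay_steps):
    """Shift the action sequence by the actuation delay before behavior cloning.

    One plausible convention (an assumption, not the authors' implementation):
    a command issued at step t only takes effect `delay_steps` steps later, so
    the policy that observes state s_t is trained on the expert command that is
    in effect at t + delay_steps. Pairing states[:-d] with actions[d:] produces
    that shifted dataset.
    """
    d = delay_steps
    if d == 0:
        return states, actions
    return states[:-d], actions[d:]


# Toy usage: a 200-step domain-randomized expert rollout with a 3-step delay.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 12))   # e.g. position, attitude, linear/angular rates
actions = rng.normal(size=(200, 4))   # e.g. per-actuator drive commands
s_bc, a_bc = rematch_state_actions(states, actions, delay_steps=3)
assert s_bc.shape[0] == a_bc.shape[0] == 197
```

The re-matched pairs would then be used for supervised behavior cloning, after which PPO fine-tuning (as described in the abstract) adjusts the policy directly on the randomized simulation.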
Related papers
- Task Delay and Energy Consumption Minimization for Low-altitude MEC via Evolutionary Multi-objective Deep Reinforcement Learning [52.64813150003228]
The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring.
In the upcoming six-generation (6G) era, UAV-assisted mobile edge computing (MEC) is particularly crucial in challenging environments such as mountainous or disaster-stricken areas.
The task offloading problem is one of the key issues in UAV-assisted MEC, primarily addressing the trade-off between minimizing task delay and minimizing the UAV's energy consumption.
arXiv Detail & Related papers (2025-01-11T02:32:42Z) - Monocular Obstacle Avoidance Based on Inverse PPO for Fixed-wing UAVs [29.207513994002202]
Fixed-wing Unmanned Aerial Vehicles (UAVs) are one of the most commonly used platforms for the Low-altitude Economy (LAE) and Urban Air Mobility (UAM).
Classical obstacle avoidance systems, which rely on prior maps or sophisticated sensors, face limitations in unknown low-altitude environments and small UAV platforms.
This paper proposes a lightweight deep reinforcement learning (DRL) based UAV collision avoidance system.
arXiv Detail & Related papers (2024-11-27T03:03:37Z) - Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning [10.579847782542982]
This paper presents a decentralized policy network for time-optimal multi-drone flight using multi-agent reinforcement learning.
To strike a balance between flight efficiency and collision avoidance, we introduce a soft collision penalty inspired by optimization-based methods.
Extensive simulations show that, despite slight performance trade-offs compared to single-drone systems, our multi-drone approach maintains near-time-optimal performance with low collision rates.
arXiv Detail & Related papers (2024-09-25T08:09:52Z) - AirPilot: Interpretable PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights [0.046873264197900916]
AirPilot is a nonlinear Deep Reinforcement Learning (DRL)-enhanced Proportional Integral Derivative (PID) drone controller.
The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL.
AirPilot reduces the navigation error of the default PX4 PID position controller by 90% and improves the effective navigation speed of a fine-tuned PID controller by 21%.
arXiv Detail & Related papers (2024-03-30T00:46:43Z) - Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning [66.10854214036605]
A central question in robotics is how to design a control system for an agile mobile robot.
We show that a neural network controller trained with reinforcement learning (RL) outperformed optimal control (OC) methods in this setting.
Our findings allowed us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 times the gravitational acceleration and a peak velocity of 108 kilometers per hour.
arXiv Detail & Related papers (2023-10-17T02:40:27Z) - Real-Time Model-Free Deep Reinforcement Learning for Force Control of a Series Elastic Actuator [56.11574814802912]
State-of-the art robotic applications utilize series elastic actuators (SEAs) with closed-loop force control to achieve complex tasks such as walking, lifting, and manipulation.
Model-free PID control methods are more prone to instability due to nonlinearities in the SEA.
Deep reinforcement learning has proved to be an effective model-free method for continuous control tasks.
arXiv Detail & Related papers (2023-04-11T00:51:47Z) - Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated Aerial Robots with Deep-Learned Tube MPC [0.0]
We present an approach for agile and computationally efficient trajectory tracking on the MIT SoftFly, a sub-gram MAV (0.7 grams).
Our strategy employs a cascaded control scheme, where an adaptive attitude controller is combined with a neural network policy trained to imitate a trajectory-tracking robust tube model predictive controller (RTMPC).
We experimentally evaluate our approach, achieving position root-mean-square errors below 1.8 cm even in the most challenging maneuvers, obtaining a 60% reduction in maximum position error compared to our previous work, and demonstrating robustness to large external disturbances.
arXiv Detail & Related papers (2022-09-20T21:30:16Z) - Learning a Single Near-hover Position Controller for Vastly Different Quadcopters [56.37274861303324]
This paper proposes an adaptive near-hover position controller for quadcopters.
It can be deployed to quadcopters of very different mass, size and motor constants.
It also shows rapid adaptation to unknown disturbances during runtime.
arXiv Detail & Related papers (2022-09-19T17:55:05Z) - Adapting Rapid Motor Adaptation for Bipedal Robots [73.5914982741483]
We leverage recent advances in rapid adaptation for locomotion control, and extend them to work on bipedal robots.
A-RMA adapts the base policy for the imperfect extrinsics estimator by finetuning it using model-free RL.
We demonstrate that A-RMA outperforms a number of RL-based baseline controllers and model-based controllers in simulation.
arXiv Detail & Related papers (2022-05-30T17:59:09Z) - OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
arXiv Detail & Related papers (2021-10-02T01:21:38Z)