Self-Supervised Monocular Visual Drone Model Identification through Improved Occlusion Handling
- URL: http://arxiv.org/abs/2504.21695v1
- Date: Wed, 30 Apr 2025 14:38:01 GMT
- Title: Self-Supervised Monocular Visual Drone Model Identification through Improved Occlusion Handling
- Authors: Stavrow A. Bahnam, Christophe De Wagter, Guido C. H. E. de Croon
- Abstract summary: Ego-motion estimation is vital for drones when flying in GPS-denied environments. We propose a self-supervised learning scheme to train a neural-network-based drone model using only onboard monocular video and flight controller data. We demonstrate the value of the neural drone model by integrating it into a traditional filter-based VIO system.
- Score: 17.368574409020475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ego-motion estimation is vital for drones when flying in GPS-denied environments. Vision-based methods struggle when flight speed increases and close-by objects lead to difficult visual conditions with considerable motion blur and large occlusions. To tackle this, vision is typically complemented by state estimation filters that combine a drone model with inertial measurements. However, these drone models are currently learned in a supervised manner with ground-truth data from external motion capture systems, limiting scalability to different environments and drones. In this work, we propose a self-supervised learning scheme to train a neural-network-based drone model using only onboard monocular video and flight controller data (IMU and motor feedback). We achieve this by first training a self-supervised relative pose estimation model, which then serves as a teacher for the drone model. To allow this to work at high speed close to obstacles, we propose an improved occlusion handling method for training self-supervised pose estimation models. Due to this method, the root mean squared error of resulting odometry estimates is reduced by an average of 15%. Moreover, the student neural drone model can be successfully obtained from the onboard data. It even becomes more accurate at higher speeds compared to its teacher, the self-supervised vision-based model. We demonstrate the value of the neural drone model by integrating it into a traditional filter-based VIO system (ROVIO), resulting in superior odometry accuracy on aggressive 3D racing trajectories near obstacles. Self-supervised learning of ego-motion estimation represents a significant step toward bridging the gap between flying in controlled, expensive lab environments and real-world drone applications. The fusion of vision and drone models will enable higher-speed flight and improve state estimation, on any drone in any environment.
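The abstract describes a teacher-student scheme: a self-supervised vision-based relative pose estimator acts as the teacher, and a drone model driven only by flight controller data (IMU and motor feedback) is distilled from its outputs, with no motion-capture ground truth. The following is a deliberately minimal sketch of that distillation idea under strong simplifying assumptions: both teacher and student are reduced to linear maps, and `teacher_pose`, `W_teacher`, and `W_student` are illustrative names, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the self-supervised vision teacher: a fixed
# linear map from onboard features to a relative pose (3D translation +
# 3D rotation), playing the role of the pseudo-label source.
W_teacher = rng.normal(size=(6, 10))

def teacher_pose(u):
    """Pseudo-label: the relative pose the vision teacher would estimate."""
    return W_teacher @ u

# Student "neural drone model": here a single linear layer mapping
# IMU/motor-feedback features to the same relative pose, trained only on
# the teacher's outputs (no external motion-capture ground truth).
W_student = np.zeros((6, 10))

lr = 0.01
for step in range(2000):
    u = rng.normal(size=10)            # simulated IMU + motor features
    target = teacher_pose(u)           # pseudo-label from vision teacher
    pred = W_student @ u
    grad = np.outer(pred - target, u)  # gradient of 0.5 * ||pred - target||^2
    W_student -= lr * grad

# After distillation the student should closely reproduce the teacher.
err = np.linalg.norm(W_student - W_teacher) / np.linalg.norm(W_teacher)
print(f"relative weight error: {err:.4f}")
```

In the paper the student can surpass the teacher at high speeds because its inputs (IMU and motor feedback) do not degrade with motion blur or occlusion the way vision does; this toy version only illustrates the supervision flow, not that effect.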
Related papers
- Drone Detection and Tracking with YOLO and a Rule-based Method [0.0]
An increased volume of drone activity in public spaces requires regulatory action for privacy protection and safety. Detection tasks are usually automated and performed by deep learning models trained on annotated image datasets. This paper builds on previous work and extends an already published open-source dataset. Since the detection models operate on single-image input, a simple cross-correlation-based tracker is used to reduce detection drops and improve tracking performance in videos.
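The cross-correlation tracker mentioned above can be illustrated with a minimal template-matching step: the last detected drone patch is slid over the next frame and re-localized at the position of maximum zero-mean normalized cross-correlation. This is an illustrative reconstruction, not the paper's implementation; all names and the toy frame are made up.

```python
import numpy as np

def ncc_match(frame, template):
    """Slide the template over the frame and return the top-left position
    maximizing zero-mean normalized cross-correlation (a minimal tracker)."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    best, best_pos = -np.inf, (0, 0)
    H, W = frame.shape
    for y in range(H - th + 1):
        for x in range(W - tw + 1):
            patch = frame[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.linalg.norm(p) * t_norm
            if denom == 0:
                continue  # flat patch: correlation undefined, skip
            score = float((p * t).sum() / denom)
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos

# Toy example: a bright 3x3 "drone" placed at (5, 7) in a noisy frame.
rng = np.random.default_rng(1)
frame = rng.normal(0.0, 0.1, size=(20, 20))
frame[5:8, 7:10] += 1.0
template = frame[5:8, 7:10].copy()
print(ncc_match(frame, template))  # (5, 7)
```

In practice one would search only a small window around the previous detection rather than the whole frame, which is what makes such a tracker cheap enough to bridge frames where the detector drops out.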
arXiv Detail & Related papers (2025-02-07T19:53:10Z)
- A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations. We propose a unified cross-scene, cross-domain benchmark for open-world drone active tracking called DAT. We also propose a reinforcement learning-based drone tracking method called R-VAT.
arXiv Detail & Related papers (2024-12-01T09:37:46Z)
- Active Human Pose Estimation via an Autonomous UAV Agent [13.188563931419056]
This paper focuses on the task of human pose estimation from videos capturing a person's activity. When the current viewpoint yields ambiguous or occluded observations, relocating the camera to a new vantage point is necessary to clarify the view.
Our proposed solution comprises three main components: a NeRF-based Drone-View Data Generation Framework, an On-Drone Network for Camera View Error Estimation, and a Combined Planner.
arXiv Detail & Related papers (2024-07-01T21:20:52Z)
- Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing [52.50284630866713]
Existing systems often require hand-engineered components for state estimation, planning, and control.
This paper tackles the vision-based autonomous-drone-racing problem by learning deep sensorimotor policies.
arXiv Detail & Related papers (2022-10-26T19:03:17Z)
- TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos [57.92385818430939]
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones.
Existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices.
We propose a simple yet effective framework, TransVisDrone, that provides an end-to-end solution with higher computational efficiency.
arXiv Detail & Related papers (2022-10-16T03:05:13Z)
- Learning a Single Near-hover Position Controller for Vastly Different Quadcopters [56.37274861303324]
This paper proposes an adaptive near-hover position controller for quadcopters.
It can be deployed on quadcopters of very different masses, sizes, and motor constants.
It also shows rapid adaptation to unknown disturbances during runtime.
arXiv Detail & Related papers (2022-09-19T17:55:05Z)
- Visual Attention Prediction Improves Performance of Autonomous Drone Racing Agents [45.36060508554703]
Humans race drones faster than neural networks trained for end-to-end autonomous flight.
This work investigates whether neural networks that imitate human eye-gaze behavior and attention can improve the performance of end-to-end autonomous flight networks.
arXiv Detail & Related papers (2022-01-07T18:07:51Z)
- Dogfight: Detecting Drones from Drones Videos [58.158988162743825]
This paper addresses the problem of detecting drones in videos captured by other flying drones. The erratic movement of the source and target drones, small size, arbitrary shape, large intensity variations, and occlusion make this problem quite challenging.
To handle this, instead of using region-proposal based methods, we propose to use a two-stage segmentation-based approach.
arXiv Detail & Related papers (2021-03-31T17:43:31Z)
- Learn by Observation: Imitation Learning for Drone Patrolling from Videos of A Human Navigator [22.06785798356346]
We propose to let the drone learn patrolling in the air by observing and imitating how a human navigator does it on the ground.
The observation process enables the automatic collection and annotation of data using inter-frame geometric consistency.
A newly designed neural network is trained based on the annotated data to predict appropriate directions and translations.
arXiv Detail & Related papers (2020-08-30T15:20:40Z)
- Learning to Fly via Deep Model-Based Reinforcement Learning [37.37420200406336]
We learn a thrust-attitude controller for a quadrotor through model-based reinforcement learning.
We show that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone.
arXiv Detail & Related papers (2020-03-19T15:55:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.