Deep Learning Computer Vision Algorithms for Real-time UAVs On-board
Camera Image Processing
- URL: http://arxiv.org/abs/2211.01037v1
- Date: Wed, 2 Nov 2022 11:10:42 GMT
- Title: Deep Learning Computer Vision Algorithms for Real-time UAVs On-board
Camera Image Processing
- Authors: Alessandro Palmas, Pietro Andronico
- Abstract summary: This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes how advanced deep learning based computer vision
algorithms are applied to enable real-time on-board sensor processing for small
UAVs. Four use cases are considered: target detection, classification and
localization, road segmentation for autonomous navigation in GNSS-denied zones,
human body segmentation, and human action recognition. All algorithms have been
developed using state-of-the-art image processing methods based on deep neural
networks. Acquisition campaigns have been carried out to collect custom
datasets reflecting typical operational scenarios, where the peculiar point of
view of a multi-rotor UAV is replicated. Algorithm architectures and trained
model performance are reported, showing high levels of both accuracy and
inference speed. Output examples and on-field videos are presented,
demonstrating model operation when deployed on a GPU-powered commercial
embedded device (NVIDIA Jetson Xavier) mounted on board a custom quad-rotor,
paving the way to high-level autonomy.
Related papers
- Semantic Segmentation of Unmanned Aerial Vehicle Remote Sensing Images using SegFormer [0.14999444543328289]
This paper evaluates the effectiveness and efficiency of SegFormer, a semantic segmentation framework, for the semantic segmentation of UAV images.
SegFormer variants, ranging from real-time (B0) to high-performance (B5) models, are assessed using the UAVid dataset tailored for semantic segmentation tasks.
Experimental results showcase the model's performance on the benchmark dataset, highlighting its ability to accurately delineate objects and land cover features in diverse UAV scenarios.
arXiv Detail & Related papers (2024-10-01T21:40:15Z)
- TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes [58.180556221044235]
We present a new approach to bridge the domain gap between synthetic and real-world data for unmanned aerial vehicle (UAV)-based perception.
Our formulation is designed for dynamic scenes, consisting of small moving objects or human actions.
We evaluate its performance on challenging datasets, including Okutama Action and UG2.
arXiv Detail & Related papers (2024-05-04T21:55:33Z)
- High Speed Human Action Recognition using a Photonic Reservoir Computer [1.7403133838762443]
We introduce a new training method for the reservoir computer, based on "Timesteps Of Interest".
We solve the task with high accuracy and speed, to the point of allowing for processing multiple video streams in real time.
arXiv Detail & Related papers (2023-05-24T16:04:42Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- VideoPose: Estimating 6D object pose from videos [14.210010379733017]
We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos.
Our proposed network takes a pre-trained 2D object detector as input, and aggregates visual features through a recurrent neural network to make predictions at each frame.
Experimental evaluation on the YCB-Video dataset shows that our approach is on par with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-11-20T20:57:45Z)
- Polyline Based Generative Navigable Space Segmentation for Autonomous Visual Navigation [57.3062528453841]
We propose a representation-learning-based framework to enable robots to learn the navigable space segmentation in an unsupervised manner.
We show that the proposed PSV-Nets can learn the visual navigable space with high accuracy, even without a single label.
arXiv Detail & Related papers (2021-10-29T19:50:48Z)
- Deep Direct Volume Rendering: Learning Visual Feature Mappings From Exemplary Images [57.253447453301796]
We introduce Deep Direct Volume Rendering (DeepDVR), a generalization of Direct Volume Rendering (DVR) that allows for the integration of deep neural networks into the DVR algorithm.
We conceptualize the rendering in a latent color space, thus enabling the use of deep architectures to learn implicit mappings for feature extraction and classification.
Our generalization serves to derive novel volume rendering architectures that can be trained end-to-end directly from examples in image space.
arXiv Detail & Related papers (2021-06-09T23:03:00Z)
- On Deep Learning Techniques to Boost Monocular Depth Estimation for Autonomous Navigation [1.9007546108571112]
Inferring the depth of images is a fundamental inverse problem within the field of Computer Vision.
We propose a new lightweight and fast supervised CNN architecture combined with novel feature extraction models.
We also introduce an efficient surface normals module, together with a simple geometric 2.5D loss function, to solve single image depth estimation (SIDE) problems.
arXiv Detail & Related papers (2020-10-13T18:37:38Z)
- Deep Learning Based Vehicle Tracking System Using License Plate Detection And Recognition [0.0]
The proposed system takes a novel approach to vehicle tracking, based on vehicle license plate detection and recognition (OCR).
Results were obtained at a speed of 30 frames per second with close-to-human accuracy.
arXiv Detail & Related papers (2020-05-10T14:03:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.