Real-Time Navigation for Autonomous Aerial Vehicles Using Video
- URL: http://arxiv.org/abs/2504.01996v1
- Date: Tue, 01 Apr 2025 01:14:42 GMT
- Title: Real-Time Navigation for Autonomous Aerial Vehicles Using Video
- Authors: Khizar Anjum, Parul Pandey, Vidyasagar Sadhu, Roberto Tron, Dario Pompili
- Abstract summary: We introduce a novel Markov Decision Process (MDP) framework to reduce the workload of Computer Vision (CV) algorithms. We apply our proposed framework to both feature-based and neural-network-based object-detection tasks. These holistic tests show significant benefits in energy consumption and speed with only a limited loss in accuracy.
- Score: 11.414350041043326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most applications in autonomous navigation using mounted cameras rely on the construction and processing of geometric 3D point clouds, which is an expensive process. There is, however, a simpler way to make a space navigable quickly: use semantic information (e.g., traffic signs) to guide the agent. Detecting and acting on semantic information nonetheless involves Computer Vision (CV) algorithms such as object detection, which are themselves demanding for agents such as aerial drones with limited onboard resources. To solve this problem, we introduce a novel Markov Decision Process (MDP) framework to reduce the workload of these CV approaches. We apply our proposed framework to both feature-based and neural-network-based object-detection tasks, using open-loop and closed-loop simulations as well as hardware-in-the-loop emulations. These holistic tests show significant benefits in energy consumption and speed, with only a limited loss in accuracy compared to models based on static features and neural networks.
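To make the scheduling idea concrete, the following is a minimal sketch of how a detector-scheduling MDP could look: a toy two-action problem (run the full detector vs. skip) solved by value iteration. The state (frames since the last full detection), rewards, and transition model here are illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch of an MDP for scheduling heavy CV work (illustrative only;
# the states, rewards, and transitions below are assumptions, not the
# formulation used in the paper).
import numpy as np

# State: number of frames since the last full detection (capped).
N_STATES = 5                        # 0 .. 4 frames of "staleness"
ACTIONS = ["run_detector", "skip"]  # heavy CV pass vs. cheap tracking/no-op

ENERGY_COST = 1.0     # cost of running the full detector
MISS_PENALTY = 0.6    # per-step penalty that grows as detections go stale
GAMMA = 0.95          # discount factor


def reward(state: int, action: str) -> float:
    if action == "run_detector":
        return -ENERGY_COST          # pay energy, staleness resets
    return -MISS_PENALTY * state     # accuracy degrades with staleness


def next_state(state: int, action: str) -> int:
    if action == "run_detector":
        return 0
    return min(state + 1, N_STATES - 1)


def value_iteration(n_iters: int = 200):
    """Compute the value function and the greedy policy for the toy MDP."""
    V = np.zeros(N_STATES)
    for _ in range(n_iters):
        V = np.array([
            max(reward(s, a) + GAMMA * V[next_state(s, a)] for a in ACTIONS)
            for s in range(N_STATES)
        ])
    policy = [
        max(ACTIONS, key=lambda a: reward(s, a) + GAMMA * V[next_state(s, a)])
        for s in range(N_STATES)
    ]
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    for s, (v, a) in enumerate(zip(V, policy)):
        print(f"staleness={s}: value={v:.2f}, action={a}")
```

In practice the state, rewards, and transition probabilities would be derived from the navigation task (and evaluated in open-loop, closed-loop, and hardware-in-the-loop settings as the abstract describes), rather than fixed by hand as in this toy example.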
Related papers
- Efficient Baselines for Motion Prediction in Autonomous Driving [7.608073471097835]
Motion Prediction (MP) of multiple surrounding agents is a crucial task in arbitrarily complex environments.
We aim to develop compact models using State-Of-The-Art (SOTA) techniques for MP, including attention mechanisms and GNNs.
arXiv Detail & Related papers (2023-09-06T22:18:16Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- Performance Study of YOLOv5 and Faster R-CNN for Autonomous Navigation around Non-Cooperative Targets [0.0]
This paper discusses how the combination of cameras and machine learning algorithms can accomplish the relative navigation task.
The performance of two deep learning-based object detection algorithms, Faster Region-based Convolutional Neural Networks (R-CNN) and You Only Look Once (YOLOv5), is tested.
The paper discusses the path to implementing the feature recognition algorithms and integrating them into the spacecraft Guidance, Navigation and Control system.
arXiv Detail & Related papers (2023-01-22T04:53:38Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- DOTIE -- Detecting Objects through Temporal Isolation of Events using a Spiking Architecture [5.340730281227837]
Vision-based autonomous navigation systems rely on fast and accurate object detection algorithms to avoid obstacles.
We propose a novel technique that utilizes the temporal information inherently present in the events to efficiently detect moving objects.
We show that by utilizing our architecture, autonomous navigation systems can have minimal latency and energy overheads for performing object detection.
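As a rough illustration of exploiting the temporal structure of event streams (not the spiking architecture itself), the sketch below keeps only events with dense spatio-temporal neighborhoods, which tends to isolate fast-moving objects from sparse background activity. The thresholds and the Event type are placeholder assumptions.

```python
# Illustrative event filter: keep events with dense spatio-temporal
# neighborhoods (a crude stand-in for temporal isolation of events;
# the thresholds below are arbitrary assumptions).
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    x: int      # pixel column
    y: int      # pixel row
    t: float    # timestamp in seconds


def isolate_moving_events(events: List[Event],
                          radius_px: int = 3,
                          window_s: float = 0.005,
                          min_neighbors: int = 4) -> List[Event]:
    """Keep events whose spatio-temporal neighborhood is dense enough.

    Naive O(n^2) scan for clarity; a real implementation would use a
    spatial/temporal index or, as in the paper, a spiking architecture.
    """
    kept = []
    for e in events:
        neighbors = sum(
            1 for o in events
            if o is not e
            and abs(o.t - e.t) <= window_s
            and abs(o.x - e.x) <= radius_px
            and abs(o.y - e.y) <= radius_px
        )
        if neighbors >= min_neighbors:
            kept.append(e)
    return kept
```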
arXiv Detail & Related papers (2022-10-03T14:43:11Z)
- CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking [9.62721286522053]
We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion.
Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association.
We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
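The greedy association step can be pictured with a generic nearest-first matcher (an illustration, not CFTrack's exact implementation): repeatedly match the closest detection-track pair in center distance until a gating threshold is exceeded.

```python
# Generic greedy detection-to-track association (illustrative; not
# CFTrack's exact algorithm). Pairs are matched nearest-first in
# center distance until the gating threshold is exceeded.
import numpy as np


def greedy_associate(track_centers: np.ndarray,
                     det_centers: np.ndarray,
                     max_dist: float = 2.0):
    """Return (track_idx, det_idx) matches plus unmatched indices."""
    cost = np.linalg.norm(
        track_centers[:, None, :] - det_centers[None, :, :], axis=-1
    )
    matches = []
    free_tracks = set(range(len(track_centers)))
    free_dets = set(range(len(det_centers)))
    # Visit candidate pairs in order of increasing distance.
    for flat in np.argsort(cost, axis=None):
        ti, di = np.unravel_index(flat, cost.shape)
        if cost[ti, di] > max_dist:
            break
        if ti in free_tracks and di in free_dets:
            matches.append((int(ti), int(di)))
            free_tracks.remove(ti)
            free_dets.remove(di)
    return matches, free_tracks, free_dets
```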
arXiv Detail & Related papers (2021-07-11T23:56:53Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device [53.323878851563414]
We propose a compiler-aware unified framework incorporating network enhancement and pruning search with reinforcement learning techniques.
Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically.
The proposed framework achieves real-time 3D object detection on mobile devices with competitive detection performance.
arXiv Detail & Related papers (2020-12-26T19:41:15Z)
- Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform [62.91011959772665]
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles.
Current approaches that operate directly on the point cloud use complex spatial aggregation operations.
We propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
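Projection-based approaches of this kind first rasterize the point cloud into a 2D range image so that ordinary image convolutions apply; a bare-bones spherical projection could look like the sketch below, where the resolution and vertical field of view are placeholder values rather than MINet's settings.

```python
# Bare-bones spherical projection of a LiDAR point cloud into a range
# image, the kind of input a projection-based segmentation network
# consumes (resolution and vertical field of view are placeholders).
import numpy as np


def spherical_projection(points: np.ndarray,
                         height: int = 64,
                         width: int = 1024,
                         fov_up_deg: float = 3.0,
                         fov_down_deg: float = -25.0) -> np.ndarray:
    """Project (N, 3) XYZ points into an (H, W) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)

    yaw = np.arctan2(y, x)                         # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8)) # elevation

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    u = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    image = np.full((height, width), -1.0, dtype=np.float32)
    image[v, u] = depth    # later points overwrite earlier ones in each cell
    return image
```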
arXiv Detail & Related papers (2020-08-20T19:06:11Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our method is validated on complex quadruped robot dynamics, and the approach can be generally applied to most robotic platforms.
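One simple way a planner can consume such a covariance estimate (a hedged illustration, not the paper's MPC formulation) is to inflate the required clearance around each predicted obstacle position by a multiple of the standard deviation along the approach direction:

```python
# Illustrative chance-constraint-style check: inflate the required
# clearance around a predicted obstacle position by its uncertainty.
# This is a simplification, not the paper's MPC formulation.
import numpy as np


def is_safe(robot_pos: np.ndarray,
            obstacle_mean: np.ndarray,
            obstacle_cov: np.ndarray,
            base_margin: float = 0.5,
            kappa: float = 2.0) -> bool:
    """Check clearance against an uncertainty-inflated safety margin.

    kappa controls how conservative the margin is (e.g., roughly two
    standard deviations along the robot-obstacle direction).
    """
    diff = robot_pos - obstacle_mean
    dist = np.linalg.norm(diff)
    if dist < 1e-9:
        return False
    direction = diff / dist
    # Predicted position std. dev. along the line between robot and obstacle.
    sigma_along = np.sqrt(direction @ obstacle_cov @ direction)
    return dist > base_margin + kappa * sigma_along
```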
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Streaming Object Detection for 3-D Point Clouds [29.465873948076766]
LiDAR provides a prominent sensory modality that informs many existing perceptual systems.
The latency for perceptual systems based on point cloud data can be dominated by the amount of time for a complete rotational scan.
We show how operating on LiDAR data in its native streaming formulation offers several advantages for self-driving object detection.
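The streaming formulation can be pictured as handing each angular wedge of points to the detector as soon as it arrives, instead of buffering a full rotation; the schematic below contrasts the two, with `read_slices` and `detect` as placeholders rather than a real sensor or model API.

```python
# Schematic contrast between full-sweep and streaming LiDAR processing.
# `read_slices` yields angular wedges of points as the sensor rotates and
# `detect` is any object detector; both are placeholders, not a real API.

def process_full_sweep(read_slices, detect):
    """Baseline: buffer an entire rotation, then run the detector once."""
    full_cloud = [p for wedge in read_slices for p in wedge]
    return detect(full_cloud)                # latency ~ one full rotation


def process_stream(read_slices, detect):
    """Streaming: run the detector on each wedge as soon as it arrives."""
    detections = []
    for wedge in read_slices:
        detections.extend(detect(wedge))     # latency ~ one wedge
    return detections
```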
arXiv Detail & Related papers (2020-05-04T21:55:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.