Related papers: An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone Navigation

URL: http://arxiv.org/abs/2404.17745v1
Date: Sat, 27 Apr 2024 01:22:45 GMT
Title: An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone Navigation
Authors: Olivier Brochu Dufour, Abolfazl Mohebbi, Sofiane Achiche
Abstract summary: Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Recent studies utilizing deep neural networks (DNNs) have shown improved performance. This paper presents a novel real-time monocular visual odometry model for drones.
Score: 0.9558392439655012
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Drones are increasingly used in fields like industry, medicine, research, disaster relief, defense, and security. Technical challenges, such as navigation in GPS-denied environments, hinder further adoption. Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Traditional visual odometry methods use geometry-based pipelines which, while popular, often suffer from error accumulation and high computational demands. Recent studies utilizing deep neural networks (DNNs) have shown improved performance, addressing these drawbacks. Deep visual odometry typically employs convolutional neural networks (CNNs) and sequence modeling networks like recurrent neural networks (RNNs) to interpret scenes and deduce visual odometry from video sequences. This paper presents a novel real-time monocular visual odometry model for drones, using a deep neural architecture with a self-attention module. It estimates the ego-motion of a camera on a drone, using consecutive video frames. An inference utility processes the live video feed, employing deep learning to estimate the drone's trajectory. The architecture combines a CNN for image feature extraction and a long short-term memory (LSTM) network with a multi-head attention module for video sequence modeling. Tested on two visual odometry datasets, this model converged 48% faster than a previous RNN model and showed a 22% reduction in mean translational drift and a 12% improvement in mean translational absolute trajectory error, demonstrating enhanced robustness to noise.
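The abstract describes estimating frame-to-frame ego-motion and turning it into a trajectory, then scoring that trajectory by translational absolute trajectory error. The following is a minimal sketch of those two steps, not the authors' implementation. It assumes a simplified planar (x, y, heading) motion model instead of full 6-DoF poses, and it computes the error without the trajectory alignment step that benchmark tools usually apply first. The function names `integrate_relative_poses` and `translational_ate_rmse` are hypothetical.

```python
import math


def integrate_relative_poses(relative_poses, start=(0.0, 0.0, 0.0)):
    """Chain per-frame relative motions (dx, dy, dtheta) into absolute poses.

    Assumption: each (dx, dy) is expressed in the previous camera frame,
    and dtheta is the heading change in radians (planar simplification).
    """
    x, y, theta = start
    trajectory = [(x, y, theta)]
    for dx, dy, dtheta in relative_poses:
        # Rotate the local translation into the world frame, then add it.
        x += dx * math.cos(theta) - dy * math.sin(theta)
        y += dx * math.sin(theta) + dy * math.cos(theta)
        # Apply the heading change after the translation.
        theta += dtheta
        trajectory.append((x, y, theta))
    return trajectory


def translational_ate_rmse(estimated, ground_truth):
    """Root-mean-square position error between two equally long trajectories.

    Assumption: both trajectories are already in the same reference frame,
    so no alignment is performed here.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        (est[0] - gt[0]) ** 2 + (est[1] - gt[1]) ** 2
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: three relative motions such as a network
# might predict, and a made-up reference trajectory.
predicted_steps = [(1.0, 0.0, 0.0), (1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]
estimated = integrate_relative_poses(predicted_steps)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
error = translational_ate_rmse(estimated, reference)
```

Because each absolute pose is built from all earlier relative estimates, a small per-frame error keeps propagating, which is the drift that the abstract reports as reduced.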

Related papers

Robot Localization and Mapping Final Report -- Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry [2.512491726995032]
Visual odometry (VO) and SLAM have relied on multi-view geometry via local structure from motion for decades. Using deep neural networks to extract high-level features is ubiquitous in computer vision. The goal of this work is to tackle the limitations of past approaches and to develop a method that provides better depth and pose estimates.
arXiv Detail & Related papers (2023-09-08T06:24:17Z)
Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture. To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy. Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
Lost Vibration Test Data Recovery Using Convolutional Neural Network: A Case Study [0.0]
This paper proposes a CNN algorithm and applies it to a real structure, the Alamosa Canyon Bridge. Three different CNN models were considered to predict the data of one and two malfunctioning sensors. The accuracy of the model was increased by adding a convolutional layer.
arXiv Detail & Related papers (2022-04-11T23:24:03Z)
FuNNscope: Visual microscope for interactively exploring the loss landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks. We generalize observations on small neural networks to more complex systems. An interactive dashboard opens up a number of possible application networks.
arXiv Detail & Related papers (2022-04-09T16:41:53Z)
Spatial-Temporal Map Vehicle Trajectory Detection Using Dynamic Mode Decomposition and Res-UNet+ Neural Networks [0.0]
This paper presents a machine-learning-enhanced longitudinal scanline method to extract vehicle trajectories from high-angle traffic cameras. The Dynamic Mode Decomposition (DMD) method is applied to extract vehicle strands by decomposing the Spatial-Temporal Map (STMap) into the sparse foreground and low-rank background. A deep neural network named Res-UNet+ was designed for the semantic segmentation task by adapting two prevalent deep learning architectures.
arXiv Detail & Related papers (2022-01-13T00:49:24Z)
Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras. Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement. In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
Automatic Extraction of Road Networks from Satellite Images by using Adaptive Structural Deep Belief Network [0.0]
Our model is applied to RoadTracer, an automatic recognition method for road network systems. RoadTracer can generate a road map of the ground surface from aerial photograph data. To improve accuracy and calculation time, our Adaptive DBN was implemented in RoadTracer in place of the CNN.
arXiv Detail & Related papers (2021-10-25T07:06:10Z)
Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware. Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks. We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
UAV-AdNet: Unsupervised Anomaly Detection using Deep Neural Networks for Aerial Surveillance [20.318367304051176]
We propose a holistic anomaly detection system using deep neural networks for surveillance of critical infrastructures. First, we present a method for the explicit representation of spatial layouts of objects in bird-view images. Then, we propose a deep neural network architecture for unsupervised anomaly detection (UAV-AdNet). Unlike studies in the literature, we combine GPS and image data to predict abnormal observations.
arXiv Detail & Related papers (2020-11-05T14:26:29Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics, and the methods can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings. DNNs are often treated as black box systems, which complicates their evaluation and validation. One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.