Match and Locate: low-frequency monocular odometry based on deep feature
matching
- URL: http://arxiv.org/abs/2311.10034v1
- Date: Thu, 16 Nov 2023 17:32:58 GMT
- Title: Match and Locate: low-frequency monocular odometry based on deep feature
matching
- Authors: Stepan Konev, Yuriy Biktairov
- Abstract summary: We introduce a novel approach to robotic odometry which requires only a single camera.
The approach is based on matching image features between consecutive frames of the video stream using deep feature matching models.
We evaluate the performance of the approach in the AISG-SLA Visual Localisation Challenge and find that, while being computationally efficient and easy to implement, our method shows competitive results.
- Score: 0.65268245109828
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and robust pose estimation plays a crucial role in many robotic
systems. Popular algorithms for pose estimation typically rely on high-fidelity
and high-frequency signals from various sensors. Including these sensors
makes the system less affordable and much more complicated. In this work we
introduce a novel approach to robotic odometry which requires only a
single camera and, importantly, can produce reliable estimates even from an
extremely low-frequency signal of around one frame per second. The approach is
based on matching image features between consecutive frames of the video
stream using deep feature matching models. The resulting coarse estimate is
then adjusted by a convolutional neural network, which is also responsible for
estimating the scale of the transition, which cannot be recovered from
feature matching information alone. We evaluate the performance of the approach in
the AISG-SLA Visual Localisation Challenge and find that, while being
computationally efficient and easy to implement, our method shows competitive
results, with only around $3^{\circ}$ of orientation estimation error and $2$ m
of translation estimation error, taking third place in the challenge.
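The abstract notes that feature matches alone cannot fix the scale of the motion between frames. The sketch below illustrates why, under stated assumptions: it is not the authors' pipeline. It uses synthetic, noise-free matches in place of a deep feature matcher and the classical eight-point algorithm in place of whatever coarse estimator the paper uses. All function names (`simulate_matches`, `essential_from_matches`, `translation_direction`) and the scene parameters are hypothetical.

```python
import numpy as np

def project(points):
    """Pinhole projection of N x 3 camera-frame points to normalized image coordinates."""
    return points[:, :2] / points[:, 2:3]

def simulate_matches(scale, n=50, seed=0):
    """Synthetic matches for a scene and camera translation that are both multiplied by `scale`."""
    rng = np.random.default_rng(seed)
    pts1 = rng.uniform([-2, -2, 4], [2, 2, 10], size=(n, 3)) * scale
    angle = np.deg2rad(5.0)  # assumed small rotation about the vertical axis
    R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(angle), 0.0, np.cos(angle)]])
    t = np.array([0.3, 0.0, 1.0]) * scale
    pts2 = pts1 @ R.T + t  # the same points expressed in the second camera frame
    return project(pts1), project(pts2), t

def essential_from_matches(x1, x2):
    """Eight-point estimate of E such that x2^T E x1 = 0 for every match."""
    x1h = np.column_stack([x1, np.ones(len(x1))])
    x2h = np.column_stack([x2, np.ones(len(x2))])
    # Each match contributes one linear constraint on the 9 entries of E (row-major).
    A = np.einsum("ni,nj->nij", x2h, x1h).reshape(-1, 9)
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: two equal singular values and one zero.
    u, _, vt = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt

def translation_direction(E):
    """Unit translation direction (sign ambiguous): the left null vector of E."""
    u, _, _ = np.linalg.svd(E)
    return u[:, 2]

x1_a, x2_a, t_a = simulate_matches(scale=1.0)
x1_b, x2_b, t_b = simulate_matches(scale=5.0)

# Scaling the scene and the translation together leaves every match unchanged,
# so no matching-based estimator can tell the two motions apart.
same_matches = np.allclose(x1_a, x1_b) and np.allclose(x2_a, x2_b)

direction = translation_direction(essential_from_matches(x1_a, x2_a))
alignment = abs(direction @ t_a / np.linalg.norm(t_a))
print(same_matches, round(float(alignment), 6))
```

In this toy setup the matches recover only the direction of the translation, which is consistent with the abstract's use of a separate network to estimate scale.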
Related papers
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning [17.99904937160487]
We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for the camera pose estimation task on the KITTI Odometry dataset.
arXiv Detail & Related papers (2024-07-07T06:52:51Z)
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - Vanishing Point Estimation in Uncalibrated Images with Prior Gravity
Direction [82.72686460985297]
We tackle the problem of estimating a Manhattan frame.
We derive two new 2-line solvers, one of which does not suffer from singularities affecting existing solvers.
We also design a new non-minimal method, running on an arbitrary number of lines, to boost the performance in local optimization.
arXiv Detail & Related papers (2023-08-21T13:03:25Z) - RCDN -- Robust X-Corner Detection Algorithm based on Advanced CNN Model [3.580983453285039]
We present a novel detection algorithm which can maintain high sub-pixel precision on inputs under multiple interferences.
The whole algorithm, adopting a coarse-to-fine strategy, contains an X-corner detection network and three post-processing techniques.
Evaluations on real and synthetic images indicate that the presented algorithm has a higher detection rate, sub-pixel accuracy and robustness than other commonly used methods.
arXiv Detail & Related papers (2023-07-07T10:40:41Z) - The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss which can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, called KFIoU, is easier to implement and works better than the exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z) - HHP-Net: A light Heteroscedastic neural network for Head Pose estimation
with uncertainty [2.064612766965483]
We introduce a novel method to estimate the head pose of people in single images starting from a small set of head keypoints.
Our model is simple to implement and more efficient than the state of the art.
arXiv Detail & Related papers (2021-11-02T08:55:45Z) - SignalNet: A Low Resolution Sinusoid Decomposition and Estimation
Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z) - Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z) - Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved a strong speed-accuracy trade-off.
We propose a lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion.
Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z) - SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of a one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it presents state-of-the-art results and a real-time speed of $20$ FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.