CMSG Cross-Media Semantic-Graph Feature Matching Algorithm for
Autonomous Vehicle Relocalization
- URL: http://arxiv.org/abs/2305.08318v1
- Date: Mon, 15 May 2023 03:08:10 GMT
- Title: CMSG Cross-Media Semantic-Graph Feature Matching Algorithm for
Autonomous Vehicle Relocalization
- Authors: Shuhang Tan, Hengyu Liu, Zhiling Wang
- Abstract summary: Cross-media methods, which combine live image data with a LiDAR map, are emerging.
We propose CMSG, a novel cross-media algorithm for AV relocalization tasks.
Semantic features are utilized for better interpretation of the correlation between point clouds and image features.
- Score: 0.36832029288386137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Relocalization is the basis of map-based localization algorithms. Camera and
LiDAR map-based methods are pervasive because of their robustness under different
scenarios. Generally, mapping and localization using the same sensor have
better accuracy since matching features between the same type of data is
easier. However, due to the camera's lack of 3D information and the high cost
of LiDAR, cross-media methods, which combine live image data with a LiDAR map,
are emerging. Although matching features between different media is
challenging, we believe cross-media is the trend for AV relocalization because
it is low-cost and its accuracy can be comparable to that of same-sensor-based methods.
In this paper, we propose CMSG, a novel cross-media algorithm for AV
relocalization tasks. Semantic features are utilized for better interpretation
of the correlation between point clouds and image features. What's more,
abstracted semantic graph nodes are introduced, and a graph network
architecture is integrated to better extract the similarity of semantic
features. Validation experiments are conducted on the KITTI odometry dataset.
Our results show that CMSG can achieve accuracy comparable to or even better
than current single-sensor-based methods at a speed of 25 FPS on an NVIDIA 1080 Ti
GPU.
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
(2024-02-25T13:22:17Z)
- A Hierarchical Descriptor Framework for On-the-Fly Anatomical Location
Matching between Longitudinal Studies [0.07499722271664144]
We propose a method to match anatomical locations between pairs of medical images in longitudinal comparisons.
The matching is made possible by computing a descriptor of the query point in a source image.
A hierarchical search operation finds the corresponding point with the most similar descriptor in the target image.
(2023-08-11T18:01:27Z)
- LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D
Signals [9.201550006194994]
Learnable matchers often underperform when there exist only small regions of co-visibility between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
(2023-03-22T17:46:27Z)
- Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
(2022-08-17T20:12:23Z)
- Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization
Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
(2022-04-10T19:16:58Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
(2021-08-12T15:22:33Z)
- Video-based Person Re-identification without Bells and Whistles [49.51670583977911]
Video-based person re-identification (Re-ID) aims at matching the video tracklets with cropped video frames for identifying the pedestrians under different cameras.
Severe spatial and temporal misalignment exists in those cropped tracklets due to the imperfect detection and tracking results generated with obsolete methods.
We present a simple re-Detect and Link (DL) module which can effectively reduce this unexpected noise by applying deep learning-based detection and tracking to the cropped tracklets.
(2021-05-22T10:17:38Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
(2020-07-20T12:07:48Z)
- High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
(2020-03-18T12:18:35Z)
- On the Texture Bias for Few-Shot CNN Segmentation [21.349705243254423]
Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks.
Recent evidence suggests texture bias in CNNs provides higher performing models when learning on large labeled training datasets.
We propose a novel architecture that integrates a set of Difference of Gaussians (DoG) to attenuate high-frequency local components in the feature space.
(2020-03-09T11:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.