Semantic Signatures for Large-scale Visual Localization
- URL: http://arxiv.org/abs/2005.03388v1
- Date: Thu, 7 May 2020 11:33:10 GMT
- Title: Semantic Signatures for Large-scale Visual Localization
- Authors: Li Weng, Valerie Gouet-Brunet, Bahman Soheilian
- Abstract summary: This work explores a different path by utilizing high-level semantic information.
It is found that object information in a street view can facilitate localization.
Several metrics and protocols are proposed for signature comparison and retrieval.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual localization is a useful alternative to standard localization
techniques. It works with camera images: in a typical scenario, features are
extracted from captured images and compared against geo-referenced databases,
and location information is then inferred from the matching results.
Conventional schemes mainly use low-level visual features; these approaches
offer good accuracy but suffer from scalability issues. To assist localization
in large urban areas, this work explores a different path by utilizing
high-level semantic information. It is found that object information in a
street view can facilitate localization. A novel descriptor scheme called
"semantic signature" is proposed to summarize this information. A semantic
signature consists of the type and angle information of the objects visible
from a spatial location. Several metrics and protocols are proposed for
signature comparison and retrieval; they illustrate different trade-offs
between accuracy and complexity. Extensive simulation results confirm the
potential of the proposed scheme in large-scale applications. This paper is an
extended version of a conference paper from CBMI'18; a more efficient retrieval
protocol is presented, together with additional experimental results.
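The abstract gives enough detail to sketch the data structure: a signature is a list of (object type, bearing angle) pairs observed from a location, and two signatures can be compared with a tolerance on angles. The sketch below is one plausible reading; the greedy metric, the 15-degree tolerance, and all names are illustrative assumptions, not the paper's actual protocol.

```python
# Illustrative sketch only: a plausible reading of "type and angle
# information of visible objects", not the authors' implementation.
from typing import List, Tuple

# One signature entry: an object type (e.g. "lamp") and its bearing in
# degrees as seen from the location.
Signature = List[Tuple[str, float]]

def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def signature_distance(query: Signature, ref: Signature,
                       angle_tol: float = 15.0) -> float:
    """Greedy matching: fraction of query objects with no compatible
    (same type, similar bearing) object left in the reference."""
    unmatched = list(ref)
    misses = 0
    for obj_type, angle in query:
        hit = next((e for e in unmatched
                    if e[0] == obj_type
                    and angular_diff(e[1], angle) <= angle_tol), None)
        if hit is None:
            misses += 1
        else:
            unmatched.remove(hit)
    return misses / max(len(query), 1)

# A query view with three visible objects vs. one database location.
query = [("lamp", 10.0), ("tree", 95.0), ("sign", 200.0)]
ref = [("lamp", 12.0), ("tree", 100.0), ("bin", 310.0)]
print(signature_distance(query, ref))  # -> 0.333..., "sign" unmatched
```

In a retrieval setting the query signature would be compared against many candidate locations, which is where the paper's comparison metrics and retrieval protocols trade accuracy against complexity.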
Related papers
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
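FUSELOC's key step, per the summary above, is a weighted average of local and global descriptors. A minimal sketch of that operator, assuming equal-length L2-normalized vectors and a hand-set weight w, might look like this (not the authors' code):

```python
# Minimal sketch of weighted-average descriptor fusion; the dimensions,
# the weight w, and L2 normalization are assumptions, not FUSELOC itself.
import numpy as np

def fuse(local_desc: np.ndarray, global_desc: np.ndarray,
         w: float = 0.5) -> np.ndarray:
    """Weighted average of two L2-normalized descriptors of equal length."""
    ld = local_desc / np.linalg.norm(local_desc)
    gd = global_desc / np.linalg.norm(global_desc)
    fused = w * ld + (1.0 - w) * gd
    return fused / np.linalg.norm(fused)

# The fused vector then drives nearest-neighbor search over 3D point
# descriptors, narrowing the ambiguous 2D-3D search space.
q = fuse(np.random.rand(128), np.random.rand(128), w=0.7)
```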
- MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics [41.94295877935867]
We study the impact of leveraging a multi-camera setup and integrating diverse data sources for multimodal place recognition.
Our proposed method named MSSPlace utilizes images from multiple cameras, LiDAR point clouds, semantic segmentation masks, and text annotations to generate comprehensive place descriptors.
arXiv Detail & Related papers (2024-07-22T14:24:56Z)
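MSSPlace's exact fusion is not described here; as a generic illustration of building one place descriptor from several modalities, the following late-fusion-by-concatenation sketch uses stand-in encoders and hypothetical dimensions:

```python
# Generic late-fusion sketch for a multimodal place descriptor; the
# concatenation rule and embedding sizes are assumptions, not the
# verified MSSPlace architecture.
import numpy as np

def l2n(x: np.ndarray) -> np.ndarray:
    return x / (np.linalg.norm(x) + 1e-12)

def place_descriptor(camera_embs, lidar_emb, semantic_emb, text_emb):
    """Concatenate L2-normalized embeddings from multiple cameras, a
    LiDAR encoder, a semantic-mask encoder, and a text encoder."""
    parts = [l2n(e) for e in camera_embs]
    parts += [l2n(lidar_emb), l2n(semantic_emb), l2n(text_emb)]
    return l2n(np.concatenate(parts))

# Hypothetical 256-d embeddings: two cameras + LiDAR + masks + text.
d = place_descriptor([np.random.rand(256)] * 2, np.random.rand(256),
                     np.random.rand(256), np.random.rand(256))
```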
- AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization [57.34659640776723]
We propose an end-to-end framework named AddressCLIP to solve the problem with more semantics.
We have built three datasets from Pittsburgh and San Francisco on different scales specifically for the IAL problem.
arXiv Detail & Related papers (2024-07-11T03:18:53Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
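DHE learns geometric verification; the classical RANSAC-homography baseline it replaces can be sketched with OpenCV, re-ranking retrieval candidates by inlier count (this is the conventional pipeline, not the DHE network):

```python
# Classical RANSAC-homography verification, the baseline that DHE makes
# fast and learnable: re-rank retrieval candidates by inlier count.
import cv2
import numpy as np

def inlier_count(pts_query: np.ndarray, pts_cand: np.ndarray,
                 reproj_thresh: float = 4.0) -> int:
    """pts_*: (N, 2) float arrays of matched keypoint coordinates."""
    if len(pts_query) < 4:  # a homography needs >= 4 correspondences
        return 0
    _, mask = cv2.findHomography(pts_query, pts_cand,
                                 cv2.RANSAC, reproj_thresh)
    return int(mask.sum()) if mask is not None else 0

# Candidates from a global-descriptor search would be re-ranked like:
# candidates.sort(key=lambda c: inlier_count(c.q_pts, c.db_pts), reverse=True)
```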
- Are Local Features All You Need for Cross-Domain Visual Place Recognition? [13.519413608607781]
Visual Place Recognition aims to predict the coordinates of an image based solely on visual clues.
Despite recent advances, recognizing the same place when the query comes from a significantly different distribution is still a major hurdle for state-of-the-art retrieval methods.
In this work we explore whether re-ranking methods based on spatial verification can tackle these challenges.
arXiv Detail & Related papers (2023-04-12T14:46:57Z)
- Location retrieval using visible landmarks based qualitative place signatures [0.7119463843130092]
A qualitative location retrieval method is proposed in this work by describing locations/places using qualitative place signatures (QPS).
After dividing the space into place cells each with individual signatures attached, a coarse-to-fine location retrieval method is proposed to efficiently identify the possible location(s) of viewers based on their qualitative observations.
arXiv Detail & Related papers (2022-07-26T13:57:49Z)
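The coarse-to-fine idea can be illustrated with a two-stage lookup over place cells; the specific coarse key (the set of visible object types) and the staging below are assumptions for illustration, not the published QPS method:

```python
# Hypothetical coarse-to-fine lookup over place cells: a cheap key (the
# set of visible object types) prunes cells before a finer, angle-aware
# comparison supplied by the caller.
from collections import defaultdict

def coarse_key(signature):
    """Coarse key: the set of object types, ignoring angles."""
    return frozenset(t for t, _ in signature)

def build_index(cells):
    """cells: dict mapping cell_id -> list of (type, angle) pairs."""
    index = defaultdict(list)
    for cell_id, sig in cells.items():
        index[coarse_key(sig)].append(cell_id)
    return index

def retrieve(query_sig, cells, index, fine_distance):
    """Rank surviving cells by a caller-supplied fine distance."""
    qk = coarse_key(query_sig)
    # Coarse stage: keep cells whose type set covers the query's types.
    cands = [cid for key, ids in index.items() if qk <= key for cid in ids]
    # Fine stage: exact comparison only on the survivors.
    return sorted(cands, key=lambda cid: fine_distance(query_sig, cells[cid]))
```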
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
The Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
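A bare-bones version of a coordinate-conditioned decoder, the general mechanism behind arbitrary-resolution outputs, could look as follows; this is a generic implicit-representation sketch, not the IFA architecture:

```python
# Generic coordinate-conditioned decoder (not the IFA architecture):
# bilinearly sample a feature map at a continuous (x, y), append the
# coordinate, and decode with a tiny MLP, so the segmentation can be
# queried at any output resolution.
import numpy as np

def bilinear_sample(feat: np.ndarray, x: float, y: float) -> np.ndarray:
    """feat: (H, W, C); x, y are normalized coordinates in [0, 1]."""
    H, W, _ = feat.shape
    fx, fy = x * (W - 1), y * (H - 1)
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = fx - x0, fy - y0
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bot = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bot

def decode(feat: np.ndarray, x: float, y: float,
           W1: np.ndarray, W2: np.ndarray) -> np.ndarray:
    """One-hidden-layer ReLU MLP over [sampled features, (x, y)]."""
    z = np.concatenate([bilinear_sample(feat, x, y), [x, y]])
    return W2 @ np.maximum(W1 @ z, 0.0)  # class logits for this pixel
```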
- Sparse Spatial Transformers for Few-Shot Learning [6.271261279657655]
Learning from limited data is challenging because data scarcity leads to poor generalization of the trained model.
We propose a novel transformer-based neural network architecture called sparse spatial transformers.
Our method finds task-relevant features and suppresses task-irrelevant features.
arXiv Detail & Related papers (2021-09-27T10:36:32Z)
- SSC: Semantic Scan Context for Large-Scale Place Recognition [13.228580954956342]
We explore the use of high-level features, namely semantics, to improve the representation ability of descriptors.
We propose a novel global descriptor, Semantic Scan Context, which explores semantic information to represent scenes more effectively.
Our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-07-01T11:51:19Z)
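Scan-context descriptors bin a LiDAR scan into a ring-by-sector polar grid; the semantic variant stores a class label per bin. The sketch below follows that general recipe with an assumed resolution and a majority-vote rule, not necessarily the paper's exact construction:

```python
# Generic semantic scan-context sketch: bin labeled LiDAR points into a
# ring x sector polar grid, keeping the most frequent semantic class per
# bin. Grid resolution and the majority rule are assumptions.
import numpy as np

def semantic_scan_context(points: np.ndarray, labels: np.ndarray,
                          n_rings: int = 20, n_sectors: int = 60,
                          max_range: float = 80.0) -> np.ndarray:
    """points: (N, 2) x/y coordinates; labels: (N,) integer class ids."""
    r = np.linalg.norm(points, axis=1)
    theta = np.arctan2(points[:, 1], points[:, 0])          # [-pi, pi]
    ring = np.clip((r / max_range * n_rings).astype(int), 0, n_rings - 1)
    sector = ((theta + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    desc = np.zeros((n_rings, n_sectors), dtype=int)        # 0 = empty bin
    for i in range(n_rings):
        for j in range(n_sectors):
            cell = labels[(ring == i) & (sector == j)]
            if cell.size:
                desc[i, j] = np.bincount(cell).argmax()     # majority class
    return desc
```

Two such descriptors would then be compared with a similarity that tolerates column (heading) shifts, as in the original scan-context family.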
- Geography-Aware Self-Supervised Learning [79.4009241781968]
We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
arXiv Detail & Related papers (2020-11-19T17:29:13Z)
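One common way to exploit spatially aligned remote-sensing data, assumed here rather than taken from the paper, is to treat two images of the same location acquired at different times as the positive pair in an InfoNCE-style loss:

```python
# Sketch of a geography-aware positive pair in an InfoNCE-style loss:
# two images of the SAME location from different acquisition times form
# the positive. The loss form and temperature are standard assumptions,
# not necessarily the paper's exact objective.
import numpy as np

def info_nce(anchor: np.ndarray, bank: np.ndarray,
             temperature: float = 0.07) -> float:
    """anchor: (D,) embedding of one view of a location; bank: (B, D)
    where row 0 is the other view of the same place (positive) and the
    remaining rows are embeddings of other places (negatives)."""
    a = anchor / np.linalg.norm(anchor)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    logits = b @ a / temperature
    return float(np.log(np.exp(logits).sum()) - logits[0])
```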
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)