Benchmarking Image Retrieval for Visual Localization
- URL: http://arxiv.org/abs/2011.11946v2
- Date: Tue, 1 Dec 2020 07:19:03 GMT
- Title: Benchmarking Image Retrieval for Visual Localization
- Authors: Noé Pion, Martin Humenberger, Gabriela Csurka, Yohann Cabon, Torsten Sattler
- Abstract summary: Visual localization is a core component of technologies such as autonomous driving and augmented reality.
It is common practice to use state-of-the-art image retrieval algorithms for these tasks.
This paper focuses on understanding the role of image retrieval for multiple visual localization tasks.
- Score: 41.38065116577011
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual localization, i.e., camera pose estimation in a known scene, is a core
component of technologies such as autonomous driving and augmented reality.
State-of-the-art localization approaches often rely on image retrieval
techniques for one of two tasks: (1) provide an approximate pose estimate or
(2) determine which parts of the scene are potentially visible in a given query
image. It is common practice to use state-of-the-art image retrieval algorithms
for these tasks. These algorithms are often trained for the goal of retrieving
the same landmark under a large range of viewpoint changes. However, robustness
to viewpoint changes is not necessarily desirable in the context of visual
localization. This paper focuses on understanding the role of image retrieval
for multiple visual localization tasks. We introduce a benchmark setup and
compare state-of-the-art retrieval representations on multiple datasets. We
show that retrieval performance on classical landmark retrieval/recognition
tasks correlates with localization performance for some, but not all, tasks.
This indicates a need for retrieval approaches specifically designed for
localization tasks. Our benchmark and evaluation protocols are available at
https://github.com/naver/kapture-localization.
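The first retrieval task the abstract describes, obtaining an approximate pose estimate, is commonly realized by matching a global descriptor of the query image against a database of mapping images with known poses and adopting the pose of the best match. The sketch below illustrates that idea only; the descriptor dimensions, the random data, and the cosine-similarity search are illustrative placeholders, not the benchmark's actual pipeline or the kapture-localization API.

```python
import numpy as np

def build_index(db_descriptors):
    # L2-normalize global image descriptors (in practice these would come
    # from a learned model such as NetVLAD or AP-GeM; here they are random).
    norms = np.linalg.norm(db_descriptors, axis=1, keepdims=True)
    return db_descriptors / norms

def retrieve_topk(index, query_descriptor, k=5):
    # Cosine similarity between the query and every database image,
    # returned as indices sorted from most to least similar.
    q = query_descriptor / np.linalg.norm(query_descriptor)
    sims = index @ q
    return np.argsort(-sims)[:k]

# Toy database: 100 images with 128-dim descriptors and known poses.
rng = np.random.default_rng(0)
db = rng.random((100, 128)).astype(np.float32)
poses = [f"pose_{i}" for i in range(100)]
index = build_index(db)

# A query taken near database image 42 (its descriptor, lightly perturbed).
query = db[42] + 0.01 * rng.random(128).astype(np.float32)
top = retrieve_topk(index, query, k=5)

# Task (1): approximate the query pose by the pose of the best match.
approx_pose = poses[top[0]]
```

The top-k shortlist also serves task (2): the retrieved images indicate which parts of the scene are likely visible in the query, restricting subsequent local feature matching to those map regions.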
Related papers
- Teaching VLMs to Localize Specific Objects from In-context Examples [56.797110842152]
Vision-Language Models (VLMs) have shown remarkable capabilities across diverse visual tasks.
However, current VLMs lack a fundamental cognitive ability: learning to localize objects in a scene by taking the context into account.
This work is the first to explore and benchmark personalized few-shot localization for VLMs.
arXiv Detail & Related papers (2024-11-20T13:34:22Z)
- Revisit Anything: Visual Place Recognition via Image Segment Retrieval [8.544326445217369]
Existing visual place recognition pipelines encode the "whole" image and search for matches.
We address this by encoding and searching for "image segments" instead of the whole images.
We show that retrieving these partial representations leads to significantly higher recognition recall than the typical whole image based retrieval.
arXiv Detail & Related papers (2024-09-26T16:49:58Z)
- Are Local Features All You Need for Cross-Domain Visual Place Recognition? [13.519413608607781]
Visual Place Recognition aims to predict the coordinates of an image based solely on visual cues.
Despite recent advances, recognizing the same place when the query comes from a significantly different distribution remains a major hurdle for state-of-the-art retrieval methods.
In this work we explore whether re-ranking methods based on spatial verification can tackle these challenges.
arXiv Detail & Related papers (2023-04-12T14:46:57Z)
- Visual Localization via Few-Shot Scene Region Classification [84.34083435501094]
Visual (re)localization addresses the problem of estimating the 6-DoF camera pose of a query image captured in a known scene.
Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates.
We propose a scene region classification approach to achieve fast and effective scene memorization with few-shot images.
arXiv Detail & Related papers (2022-08-14T22:39:02Z)
- Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark [46.166955777187816]
This paper focuses on understanding the role of image retrieval for multiple visual localization paradigms.
We introduce a novel benchmark setup and compare state-of-the-art retrieval representations on multiple datasets.
Using these tools and in-depth analysis, we show that retrieval performance on classical landmark retrieval or place recognition tasks correlates with localization performance for some, but not all, paradigms.
arXiv Detail & Related papers (2022-05-31T12:59:01Z)
- Deep Metric Learning for Ground Images [4.864819846886142]
We deal with the initial localization task, in which we have no prior knowledge of the robot's current position.
We propose a deep metric learning approach that retrieves the most similar reference images to the query image.
In contrast to existing approaches to image retrieval for ground images, our approach achieves significantly better recall performance and improves the localization performance of a state-of-the-art ground texture based localization method.
arXiv Detail & Related papers (2021-09-03T14:43:59Z)
- Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.