Fugu-MT paper translation (abstract): Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark

Paper overview: Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark

arxiv url: http://arxiv.org/abs/2205.15761v1
Date: Tue, 31 May 2022 12:59:01 GMT
Status: Translation complete
Last updated in system: 2022-06-01 12:30:56.592358
Title: Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark
Title (reference translation): Examining the Role of Image Retrieval in Visual Localization - A Thorough Benchmark
Authors: Martin Humenberger, Yohann Cabon, Noé Pion, Philippe Weinzaepfel, Donghwan Lee, Nicolas Guérin, Torsten Sattler, and Gabriela Csurka
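Abstract summary: This paper focuses on understanding the role of image retrieval in multiple visual localization paradigms. It introduces a novel benchmark setup and compares state-of-the-art retrieval representations on multiple datasets. Using these tools and in-depth analysis, it shows that retrieval performance on classical landmark retrieval or place recognition tasks correlates with localization performance for only some, not all, paradigms.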
Reference score (independently computed attention score): 46.166955777187816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Visual localization, i.e., camera pose estimation in a known scene, is a core component of technologies such as autonomous driving and augmented reality. State-of-the-art localization approaches often rely on image retrieval techniques for one of two purposes: (1) provide an approximate pose estimate or (2) determine which parts of the scene are potentially visible in a given query image. It is common practice to use state-of-the-art image retrieval algorithms for both of them. These algorithms are often trained for the goal of retrieving the same landmark under a large range of viewpoint changes which often differs from the requirements of visual localization. In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms. First, we introduce a novel benchmark setup and compare state-of-the-art retrieval representations on multiple datasets using localization performance as metric. Second, we investigate several definitions of "ground truth" for image retrieval. Using these definitions as upper bounds for the visual localization paradigms, we show that there is still significant room for improvement. Third, using these tools and in-depth analysis, we show that retrieval performance on classical landmark retrieval or place recognition tasks correlates only for some but not all paradigms to localization performance. Finally, we analyze the effects of blur and dynamic scenes in the images. We conclude that there is a need for retrieval approaches specifically designed for localization paradigms. Our benchmark and evaluation protocols are available at https://github.com/naver/kapture-localization.
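Abstract (reference translation): Visual localization, i.e., estimating the camera pose in a known scene, is a core component of technologies such as autonomous driving and augmented reality. State-of-the-art localization approaches often rely on image retrieval techniques for one of two purposes: (1) approximate pose estimation, or (2) determining which parts of the scene are potentially visible in a given query image. It is common to use state-of-the-art image retrieval algorithms for both. These algorithms are often trained to retrieve the same landmark under a wide range of viewpoint changes, a goal that often differs from the requirements of visual localization. To clarify the consequences for visual localization, we focus on understanding the role of image retrieval in multiple visual localization paradigms. First, we introduce a novel benchmark setup and compare state-of-the-art retrieval representations on multiple datasets, using localization performance as the metric. Second, we examine several definitions of "ground truth" for image retrieval. Using these definitions as upper bounds for the visual localization paradigms, we show that there is still significant room for improvement. Third, using these tools and in-depth analysis, we show that retrieval performance on classical landmark retrieval or place recognition tasks correlates with localization performance for only some, not all, paradigms. Finally, we analyze the effects of blur and dynamic scenes in the images. We conclude that retrieval approaches designed specifically for localization paradigms are needed. The benchmark and evaluation protocols are available at https://github.com/naver/kapture-localization.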

Related papers