Fugu-MT Paper Translation (Abstract): DualGeo: A Dual-View Framework for Worldwide Image Geo-localization

Paper abstract: DualGeo: A Dual-View Framework for Worldwide Image Geo-localization

arxiv url: http://arxiv.org/abs/2604.25533v1
Date: Tue, 28 Apr 2026 12:00:04 GMT
Status: Translation complete
Last updated in system: 2026-04-29 16:49:17.841598
Title: DualGeo: A Dual-View Framework for Worldwide Image Geo-localization
Authors: Junchao Cui, Wenqi Shi, Shaoyong Du, Hang He, Xuanzi Ma, Hao Tang, Xiangyang Luo
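Abstract summary: This work proposes DualGeo, a two-stage framework for worldwide image geo-localization. First, it establishes a geo-representational foundation by fusing image and semantic segmentation features. Second, it performs geo-cognitive refinement by re-ranking retrieved candidates using geographic clustering. Experiments show that DualGeo outperforms state-of-the-art methods, improving street-level (<1 km) and city-level (<25 km) localization accuracy by 3.6%-16.58% and 1.29%-8.77%, respectively.
Reference score (attention score computed independently by the site): 24.463319677769405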
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Worldwide image geo-localization aims to infer the geographic location of an image captured anywhere on Earth, spanning street, city, regional, national, and continental scales. Existing methods rely on visual features that are sensitive to environmental variations (e.g., lighting, season, and weather) and lack effective post-processing to filter outlier candidates, limiting localization accuracy. To address these limitations, we propose DualGeo, a two-stage framework for worldwide image geo-localization. First, it establishes a geo-representational foundation by fusing image and semantic segmentation features via bidirectional cross-attention. The fused features are then aligned with GPS coordinates through dual-view contrastive learning to build a global retrieval database. Second, it performs geo-cognitive refinement by re-ranking retrieved candidates using geographic clustering. It then feeds them into large multimodal models (LMMs) for final coordinate prediction. Experiments on IM2GPS, IM2GPS3k, and YFCC4k show that DualGeo outperforms state-of-the-art methods, improving street-level (<1 km) and city-level (<25 km) localization accuracy by 3.6%-16.58% and 1.29%-8.77%, respectively. Our code and datasets are available at https://github.com/CJ310177/DualGeo.
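The street-level (<1 km) and city-level (<25 km) figures in the abstract are threshold accuracies: the fraction of test images whose predicted coordinates fall within a given distance of the ground truth. A minimal sketch of such a metric follows, assuming great-circle (haversine) distance and a mean Earth radius of 6371 km. The abstract does not state the exact distance formula used, and the names `haversine_km` and `accuracy_at` are illustrative, not taken from the DualGeo repository.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_at(predictions, ground_truth, threshold_km):
    """Fraction of predictions strictly within threshold_km of the ground truth.

    Both arguments are equal-length lists of (lat, lon) tuples in degrees.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have equal length")
    if not predictions:
        return 0.0
    hits = sum(
        haversine_km(p[0], p[1], g[0], g[1]) < threshold_km
        for p, g in zip(predictions, ground_truth)
    )
    return hits / len(predictions)


if __name__ == "__main__":
    preds = [(48.8584, 2.2945), (40.0, -74.0)]     # hypothetical predictions
    truth = [(48.8606, 2.3376), (40.7128, -74.0060)]  # hypothetical ground truth
    for km in (1, 25):
        print(f"accuracy@{km}km = {accuracy_at(preds, truth, km):.2f}")
```

Reporting the same predictions at several thresholds is what allows one model to be compared across street, city, and coarser scales.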
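The second stage described in the abstract re-ranks retrieved candidates using geographic clustering so that isolated outliers are demoted. The abstract does not specify the clustering algorithm, so the sketch below is only one possible reading: a greedy "leader" clustering with a fixed radius, followed by ordering clusters by size. The function name `rerank_by_geo_cluster`, the 25 km default radius, and the tie-breaking rule are all assumptions made for illustration and are not the authors' method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) tuples in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def rerank_by_geo_cluster(candidates, radius_km=25.0):
    """Re-rank retrieved (lat, lon) candidates so that dense regions come first.

    `candidates` is in retrieval order (best match first). Each candidate joins
    the first existing cluster whose leader lies within `radius_km`; otherwise
    it starts a new cluster. Clusters are then ordered by size (largest first),
    with ties broken by the best original retrieval rank.
    """
    clusters = []  # each entry: list of candidate indices, leader at position 0
    for idx, point in enumerate(candidates):
        for members in clusters:
            if haversine_km(candidates[members[0]], point) <= radius_km:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    clusters.sort(key=lambda members: (-len(members), members[0]))
    return [candidates[i] for members in clusters for i in members]


if __name__ == "__main__":
    # Hypothetical retrieval result: one far-away outlier ranked first,
    # followed by three mutually close candidates.
    retrieved = [(35.0, 139.0), (48.85, 2.35), (48.86, 2.34), (48.87, 2.33)]
    for lat, lon in rerank_by_geo_cluster(retrieved):
        print(lat, lon)
```

In this toy input the three nearby candidates form the largest cluster and move ahead of the isolated top-ranked outlier. In the pipeline the abstract describes, the re-ranked candidates would then be passed to a large multimodal model for the final coordinate prediction; that step is not sketched here.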


Related papers