Fugu-MT Paper Translation (Abstract): Satellite-Free Training for Drone-View Geo-Localization

Paper overview: Satellite-Free Training for Drone-View Geo-Localization

arxiv url: http://arxiv.org/abs/2604.01581v2
Date: Fri, 03 Apr 2026 03:13:11 GMT
Status: Translation complete
Last updated in system: 2026-04-06 12:42:34.273268
Title: Satellite-Free Training for Drone-View Geo-Localization
Title (reference translation): Satellite-Free Training for Drone-View Geo-Localization
Authors: Tao Liu, Yingzhi Zhang, Kan Ren, Xiaoqi Zhao
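Abstract summary: Drone-view geo-localization aims to determine the location of drones in GPS-denied environments by retrieving the corresponding geotagged satellite tile. This paper proposes a satellite-free training framework that converts drone imagery into cross-view compatible representations.
Reference score (independently computed attention measure): 23.183491899036138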
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Drone-view geo-localization (DVGL) aims to determine the location of drones in GPS-denied environments by retrieving the corresponding geotagged satellite tile from a reference gallery given UAV observations of a location. In many existing formulations, these observations are represented by a single oblique UAV image. In contrast, our satellite-free setting is designed for multi-view UAV sequences, which are used to construct a geometry-normalized UAV-side location representation before cross-view retrieval. Existing approaches rely on satellite imagery during training, either through paired supervision or unsupervised alignment, which limits practical deployment when satellite data are unavailable or restricted. In this paper, we propose a satellite-free training (SFT) framework that converts drone imagery into cross-view compatible representations through three main stages: drone-side 3D scene reconstruction, geometry-based pseudo-orthophoto generation, and satellite-free feature aggregation for retrieval. Specifically, we first reconstruct dense 3D scenes from multi-view drone images using 3D Gaussian splatting and project the reconstructed geometry into pseudo-orthophotos via PCA-guided orthographic projection. This rendering stage operates directly on reconstructed scene geometry without requiring camera parameters at rendering time. Next, we refine these orthophotos with lightweight geometry-guided inpainting to obtain texture-complete drone-side views. Finally, we extract DINOv3 patch features from the generated orthophotos, learn a Fisher vector aggregation model solely from drone data, and reuse it at test time to encode satellite tiles for cross-view retrieval. Experimental results on University-1652 and SUES-200 show that our SFT framework substantially outperforms satellite-free generalization baselines and narrows the gap to methods trained with satellite imagery.
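Abstract (reference translation): Drone-view geo-localization (DVGL) aims to determine the location of drones in GPS-denied environments: given UAV observations of a location, it retrieves the corresponding geotagged satellite tile from a reference gallery. In many existing formulations, these observations are represented by a single oblique UAV image. In contrast, our satellite-free setting is designed for multi-view UAV sequences, which are used to build a geometry-normalized UAV-side location representation before cross-view retrieval. Existing approaches rely on satellite imagery during training, through either paired supervision or unsupervised alignment, which limits practical deployment when satellite data are unavailable or restricted. This paper proposes a satellite-free training (SFT) framework that converts drone imagery into cross-view compatible representations through three main stages: drone-side 3D scene reconstruction, geometry-based pseudo-orthophoto generation, and satellite-free feature aggregation for retrieval. Specifically, dense 3D scenes are first reconstructed from multi-view drone images using 3D Gaussian splatting, and the reconstructed geometry is projected into pseudo-orthophotos via PCA-guided orthographic projection. This rendering stage operates directly on the reconstructed scene geometry and requires no camera parameters at rendering time. Next, the orthophotos are refined with lightweight geometry-guided inpainting to obtain texture-complete drone-side views. Finally, DINOv3 patch features are extracted from the generated orthophotos, a Fisher vector aggregation model is learned solely from drone data, and the same model is reused at test time to encode satellite tiles for cross-view retrieval. Experimental results on University-1652 and SUES-200 show that the SFT framework substantially outperforms satellite-free generalization baselines and narrows the gap to methods trained with satellite imagery.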
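The PCA-guided orthographic projection step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the reconstructed scene is available as a plain colored point cloud (for example, Gaussian centers with one color each), that the principal axis with the smallest variance approximates the ground-plane normal, and that the highest point along that axis should win each pixel. The function name `pca_orthographic_projection` and the `resolution` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np


def pca_orthographic_projection(points, colors, resolution=64):
    """Rasterize a colored point cloud into a top-down image.

    Assumptions (not taken from the paper): the two principal axes with the
    largest variance span the ground plane, and the remaining axis is the
    plane normal. The sign of that normal is ambiguous under PCA, so a real
    pipeline would need an extra cue to decide which side is "up".
    """
    points = np.asarray(points, dtype=np.float64)
    colors = np.asarray(colors, dtype=np.float64)

    # PCA via eigen-decomposition of the 3x3 covariance matrix.
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    normal = eigvecs[:, 0]  # smallest variance: assumed plane normal
    axis_v = eigvecs[:, 1]
    axis_u = eigvecs[:, 2]  # largest variance

    # Orthographic projection: keep in-plane coordinates, treat the rest as height.
    u = centered @ axis_u
    v = centered @ axis_v
    height = centered @ normal

    # Map plane coordinates onto a square pixel grid (aspect ratio preserved).
    extent = max(np.ptp(u), np.ptp(v), 1e-12)
    scale = (resolution - 1) / extent
    cols = np.clip(np.rint((u - u.min()) * scale).astype(int), 0, resolution - 1)
    rows = np.clip(np.rint((v - v.min()) * scale).astype(int), 0, resolution - 1)
    flat = rows * resolution + cols

    # Z-buffer: sort by pixel, then by height, and keep the last (highest) point.
    order = np.lexsort((height, flat))
    flat_sorted = flat[order]
    is_last = np.append(flat_sorted[1:] != flat_sorted[:-1], True)
    winners = order[is_last]

    image = np.zeros((resolution * resolution, 3))
    mask = np.zeros(resolution * resolution, dtype=bool)
    image[flat[winners]] = colors[winners]
    mask[flat[winners]] = True
    return image.reshape(resolution, resolution, 3), mask.reshape(resolution, resolution)
```

The returned `mask` marks pixels that received at least one point. The empty pixels are the holes that an inpainting stage, such as the geometry-guided one described in the abstract, would fill.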

Related papers