Fugu-MT Paper Translation (Abstract): GeoMamba: A Geometry-driven MambaVision Framework and Dataset for Fine-grained Optical-SAR Object Retrieval

arxiv url: http://arxiv.org/abs/2605.19734v1
Date: Tue, 19 May 2026 12:08:09 GMT
Status: Translation complete
System update date: 2026-05-20 15:03:09.316079
Title: GeoMamba: A Geometry-driven MambaVision Framework and Dataset for Fine-grained Optical-SAR Object Retrieval
Title (reference translation): GeoMamba: A Geometry-Driven MambaVision Framework and Dataset for Fine-Grained Optical-SAR Object Retrieval
Authors: Tiantong Fang, Xiuwei Wang, Jing Xiao, Wujie Zhou, Liang Liao, Mi Wang
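Abstract summary: GeoMamba is a geometry-driven framework for fine-grained optical-SAR retrieval. Its GFI module enhances cross-modal feature interaction and incorporates structural priors. GeoMamba outperforms existing methods, achieving 63.3% mAP and 77.0% Rank-1 accuracy in the all-to-all retrieval setting.
Reference score (independently computed attention metric): 54.741349848771144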
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-source remote sensing enables complementary observation of ground objects, while cross-modal fine-grained object retrieval remains challenging, especially under unaligned optical and SAR conditions. Unlike conventional retrieval settings that rely on paired or spatially aligned samples, practical optical-SAR retrieval is affected by substantial modality discrepancy, speckle noise, and structural inconsistency, which limit robust cross-modal representation learning. To address this problem, we propose GeoMamba, a geometry-driven framework tailored for optical-SAR fine-grained retrieval. Specifically, GeoMamba introduces a Geometric Feature Injection (GFI) module that enhances cross-modal feature interaction and incorporates structural priors, thereby improving the robustness of SAR representations and promoting geometry-consistent feature learning. In addition, a Geometric Consistency Constraint (GCC) module, together with a Deep Supervision (DS) strategy, imposes hierarchical geometric constraints using classical operators, which helps preserve informative object structures during representation learning. We further construct a new dataset, FGOS-as, containing 11 aerospace and maritime categories for evaluating unaligned cross-modal fine-grained object retrieval in realistic remote sensing scenarios. Extensive experiments on FGOS-as demonstrate that GeoMamba outperforms existing methods, achieving 63.3% mAP and 77.0% Rank-1 accuracy in the all-to-all retrieval setting.
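Abstract (reference translation): Multi-source remote sensing enables complementary observation of ground objects, but cross-modal fine-grained object retrieval remains difficult, particularly under unaligned optical and SAR conditions. Unlike conventional retrieval settings that depend on paired or spatially aligned samples, practical optical-SAR retrieval is affected by substantial modality discrepancy, speckle noise, and structural inconsistency, all of which limit robust cross-modal representation learning. To address this problem, we propose GeoMamba, a geometry-driven framework suited to fine-grained optical-SAR retrieval. Specifically, GeoMamba introduces a Geometric Feature Injection (GFI) module that strengthens cross-modal feature interaction and incorporates structural priors, improving the robustness of SAR representations and promoting geometry-consistent feature learning. In addition, a Geometric Consistency Constraint (GCC) module, together with a Deep Supervision (DS) strategy, imposes hierarchical geometric constraints using classical operators, preserving informative object structures during representation learning. We also construct a new dataset, FGOS-as, containing 11 aerospace and maritime categories, to evaluate unaligned cross-modal fine-grained object retrieval in realistic remote sensing scenarios. Extensive experiments on FGOS-as show that GeoMamba outperforms existing methods, achieving 63.3% mAP and 77.0% Rank-1 accuracy in the all-to-all retrieval setting.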
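
Note on the reported metrics: the abstract quotes mAP and Rank-1 accuracy but does not spell out the evaluation protocol. The sketch below shows how these two retrieval metrics are commonly computed from a query-by-gallery similarity matrix. It is an illustration under assumptions, not the authors' evaluation code: the function name `rank1_and_map`, the label-equality definition of relevance, and the choice to skip queries with no relevant gallery item are all hypothetical, and the paper's all-to-all setting may treat self-matches or modality splits differently.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    scores: Sequence[Sequence[float]],
    query_labels: Sequence[int],
    gallery_labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery similarity matrix.

    scores[i][j] is the similarity between query i and gallery item j;
    higher means more similar. A gallery item counts as relevant when its
    category label equals the query's label (an assumption of this sketch).
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(scores):
        # Gallery indices ordered from most to least similar.
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        relevant = [gallery_labels[j] == query_labels[q] for j in order]
        if not any(relevant):
            continue  # assumption: skip queries with no relevant gallery item
        rank1_hits += int(relevant[0])
        hits = 0
        precision_sum = 0.0
        for rank, is_relevant in enumerate(relevant, start=1):
            if is_relevant:
                hits += 1
                precision_sum += hits / rank  # precision at this rank
        average_precisions.append(precision_sum / hits)
    n = len(average_precisions)
    if n == 0:
        raise ValueError("no query has a relevant gallery item")
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: 2 queries, 3 gallery items, 2 categories.
toy_scores = [[0.9, 0.1, 0.8], [0.2, 0.6, 0.7]]
rank1, mean_ap = rank1_and_map(toy_scores, [0, 1], [0, 1, 0])
print(rank1, mean_ap)  # 0.5 0.75
```

In the toy example the first query retrieves both of its relevant items at ranks 1 and 2 (AP = 1.0), while the second finds its only relevant item at rank 2 (AP = 0.5), giving Rank-1 = 0.5 and mAP = 0.75.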
