Fugu-MT Paper Translation (Abstract): Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding

Paper overview: Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding

arxiv url: http://arxiv.org/abs/2604.11415v1
Date: Mon, 13 Apr 2026 13:00:29 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.544643
Title: Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding
Title (reference translation): Cost-aware Cross-scale Observation for Remote Sensing Understanding
Authors: Zhenghao Xie, Jing Xiao, Zhenqi Wang, Kexin Ma, Liang Liao, Gui-Song Xia, Mi Wang
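Abstract summary: High-resolution (HR) imagery provides critical local details, but at much higher acquisition cost and with limited coverage. This motivates a cross-scale sensing strategy that selectively acquires HR imagery based on low-resolution (LR) global perception. GL-10M is a large-scale benchmark of 10 million spatially aligned multi-resolution images.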
Reference score (independently computed attention metric): 49.97682794425118
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Remote sensing understanding inherently requires multi-resolution observation, since different targets and application tasks demand different levels of spatial detail. While low-resolution (LR) imagery enables efficient global observation, high-resolution (HR) imagery provides critical local details at much higher acquisition cost and with limited coverage. This motivates a cross-scale sensing strategy that selectively acquires HR imagery from LR-based global perception to improve task performance under constrained cost. Existing HR sampling methods typically make selection decisions from isolated LR patches, ignoring fine-grained intra-patch importance and cross-patch contextual interactions, which leads to fragmented feature representation and suboptimal scene reasoning under sparse HR observations. To address this issue, we formulate cross-scale remote sensing understanding as a unified cost-aware problem that couples fine-grained HR sampling with cross-patch representation prediction, enabling more effective task reasoning with fewer HR observations. Furthermore, we present GL-10M, a large-scale benchmark of 10 million spatially aligned multi-resolution images, enabling systematic evaluation of budget-constrained cross-scale reasoning in remote sensing. Extensive experiments on recognition and retrieval tasks show that our method consistently achieves a superior performance-cost trade-off.
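Abstract (reference translation): Understanding remote sensing data inherently requires multi-resolution observation, because different targets and application tasks need different levels of spatial detail. Low-resolution (LR) imagery allows efficient global observation, whereas high-resolution (HR) imagery supplies critical local details at a much higher acquisition cost and with limited coverage. This motivates a cross-scale sensing strategy that selectively acquires HR imagery based on LR global perception, improving task performance under a constrained cost. Existing HR sampling methods usually make selection decisions from isolated LR patches; they ignore fine-grained intra-patch importance and cross-patch contextual interactions, which results in fragmented feature representations and suboptimal scene reasoning under sparse HR observations. To address this problem, the authors formulate cross-scale remote sensing understanding as a unified cost-aware problem that couples fine-grained HR sampling with cross-patch representation prediction, enabling more effective task reasoning with fewer HR observations. They also present GL-10M, a large-scale benchmark of 10 million spatially aligned multi-resolution images, which enables systematic evaluation of budget-constrained cross-scale reasoning in remote sensing. Extensive experiments on recognition and retrieval tasks show that the method consistently achieves a superior performance-cost trade-off.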

Related papers