Fugu-MT paper translation (abstract): ZODS-RS -- Zero-training Oriented Detection & Segmentation for Remote Sensing

Paper overview: ZODS-RS -- Zero-training Oriented Detection & Segmentation for Remote Sensing

arxiv url: http://arxiv.org/abs/2606.10769v1
Date: Tue, 09 Jun 2026 12:24:55 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:58.484544
Title: ZODS-RS -- Zero-training Oriented Detection & Segmentation for Remote Sensing
Authors: Zuan Gu, Tianhan Gao, Langxu Zhao,
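Abstract summary: ZODS-RS is a training-free, closed-form pipeline that outputs horizontal boxes (HBB) and instance masks. On our UAV dataset, ZODS-RS achieves mask $\mathrm{mIoU}=\mathbf{31.10}$ and improves small-object AP by $\mathbf{+30.70}$ over Grounded-SAM on a single 5090.
Reference score (independently computed attention measure): 0.28675177318965045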
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Remote-sensing and UAV applications need models that generalize across platforms and viewpoints without task-specific training. Yet training-free pipelines often falter on oriented geometry, scale/rotation variation, and crowded ports or airfields, and rarely unify detection and segmentation. We introduce ZODS-RS, a training-free, closed-form pipeline that outputs horizontal boxes (HBB) and instance masks. Built on DINOv3 dense features and SAM-style proposals, ZODS-RS chains: PP (prototype purification via Tyler covariance), R-SEM (rotation-scale equivariant matching with separable kernels and global Hungarian assignment), and UAM (uncertainty-aware pixelwise merging with adaptive priors and optional negative prototypes). A lightweight CWLA fuses multiple DINOv3 layers. On FAIR1M (HBB) we obtain $\mathrm{mAP}_{0.50:0.95}=\mathbf{13.06}$ and $\mathrm{AP}_S=\mathbf{2.93}$ \emph{(class-averaged over ship/airplane)}; on xView (HBB) we report $\mathrm{mAP}=\mathbf{16.69}$. On our UAV dataset, ZODS-RS achieves mask $\mathrm{mIoU}=\mathbf{31.10}$ and improves small-object AP by $\mathbf{+30.70}$ over Grounded-SAM on a single 5090. This work offers a unified, \emph{no-training} solution for horizontal-box detection plus instance segmentation in aerial imagery; provides explicit closed-form formulations for PP/R-SEM/UAM tightly coupled with DINOv3; and demonstrates \emph{consistent} gains on small and crowded targets and under cross-domain shifts while keeping deployment simple.
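The abstract names two standard building blocks: Tyler covariance (used in PP) and global Hungarian assignment (used in R-SEM). The following is a minimal sketch of those two generic techniques only, not the authors' implementation. The function names `tyler_scatter` and `match_prototypes`, the cosine-distance cost, and the assumption that features are already centered are all illustrative choices that do not come from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def tyler_scatter(x, n_iter=100, tol=1e-6):
    """Tyler's M-estimator of scatter for the rows of x.

    Assumes x has shape (n, p) with n > p and is already centered.
    The result is normalized so that its trace equals p.
    """
    n, p = x.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        # Quadratic forms x_i^T sigma^{-1} x_i, one per sample.
        d = np.einsum("ij,jk,ik->i", x, inv, x)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Fixed-point update: (p / n) * sum_i x_i x_i^T / d_i.
        new = (p / n) * (x / d[:, None]).T @ x
        new *= p / np.trace(new)  # fix the arbitrary scale
        converged = np.linalg.norm(new - sigma, ord="fro") < tol
        sigma = new
        if converged:
            break
    return sigma


def match_prototypes(prototypes, proposals):
    """One-to-one matching of prototype features to proposal features.

    Uses cosine distance as the cost (an illustrative assumption) and
    solves the assignment globally with SciPy's Hungarian-style solver.
    Returns a list of (prototype_index, proposal_index) pairs.
    """
    a = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    b = proposals / np.linalg.norm(proposals, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))
```

In a pipeline of this kind, a robust scatter estimate such as `tyler_scatter` could be used to whiten or down-weight outlying prototype features before matching, and `match_prototypes` would then assign each purified prototype to at most one region proposal. How ZODS-RS actually combines these steps is not specified in the abstract.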


Related papers