Fugu-MT Paper Translation (Summary): Generalized Small Object Detection: A Point-Prompted Paradigm and Benchmark

Paper summary: Generalized Small Object Detection: A Point-Prompted Paradigm and Benchmark

arxiv url: http://arxiv.org/abs/2604.02773v1
Date: Fri, 03 Apr 2026 06:32:18 GMT
Status: Translation complete
Last updated in system: 2026-04-06 17:20:24.350567
Title: Generalized Small Object Detection: A Point-Prompted Paradigm and Benchmark
Authors: Haoran Zhu, Wen Yang, Guangyou Yang, Chang Xu, Ruixiang Zhang, Fang Xu, Haijian Zhang, Gui-Song Xia
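Abstract summary: Small object detection (SOD) remains challenging due to extremely limited pixels and ambiguous object boundaries. These characteristics lead to challenging annotation, limited availability of large-scale high-quality datasets, and inherently weak semantic representations for small objects. This work addresses the data limitation by introducing TinySet-9M, a large-scale, multi-domain dataset for small object detection.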
Reference score (independently computed attention level): 54.91847070147244
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Small object detection (SOD) remains challenging due to extremely limited pixels and ambiguous object boundaries. These characteristics lead to challenging annotation, limited availability of large-scale high-quality datasets, and inherently weak semantic representations for small objects. In this work, we first address the data limitation by introducing TinySet-9M, the first large-scale, multi-domain dataset for small object detection. Beyond filling the gap in large-scale datasets, we establish a benchmark to evaluate the effectiveness of existing label-efficient detection methods for small objects. Our evaluation reveals that weak visual cues further exacerbate the performance degradation of label-efficient methods in small object detection, highlighting a critical challenge in label-efficient SOD. Secondly, to tackle the limitation of insufficient semantic representation, we move beyond training-time feature enhancement and propose a new paradigm termed Point-Prompt Small Object Detection (P2SOD). This paradigm introduces sparse point prompts at inference time as an efficient information bridge for category-level localization, enabling semantic augmentation. Building upon the P2SOD paradigm and the large-scale TinySet-9M dataset, we further develop DEAL (DEtect Any smalL object), a scalable and transferable point-prompted detection framework that learns robust, prompt-conditioned representations from large-scale data. With only a single click at inference time, DEAL achieves a 31.4% relative improvement over fully supervised baselines under strict localization metrics (e.g., AP75) on TinySet-9M, while generalizing effectively to unseen categories and unseen datasets. Our project is available at https://zhuhaoraneis.github.io/TinySet-9M/.


Related papers