Fugu-MT Paper Translation (Abstract): TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors

Paper overview: TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors

arxiv url: http://arxiv.org/abs/2605.25046v1
Date: Sun, 24 May 2026 12:42:13 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:18.6728
Title: TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors
Title (reference translation): TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors
Authors: Jun-Wei Hsieh, Meng-Yu Kao, Ghufron Wahyu Kurniawan, Kuan-Chuan Peng,
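Abstract summary: YOLO-series and DETR-based detectors struggle with tiny-object detection. TinyFormer is a real-time detector that combines ViT representations, NMS-free set prediction, and a YOLO-style pyramid neck. TinyFormer consistently outperforms recent YOLO-series detectors and the strong DEIMv2 baseline.
Reference score (independently computed attention score): 16.413336628781064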
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: YOLO-series and DETR-based detectors struggle with tiny-object detection. YOLO-style models benefit from efficient dense prediction, but their large-stride backbones may suppress tiny instances in deep feature maps and make grid assignment ambiguous. DETR-based models remove hand-crafted post-processing through set prediction, yet they reason over coarse token grids, where tiny objects occupy only a few weak tokens and are easily overlooked during matching. To address these limitations, we propose TinyFormer, a unified YOLO-DETR hybrid real-time detector that combines ViT representations, NMS-free set prediction, and a YOLO-style pyramid neck for accurate small-object detection. TinyFormer introduces a Parallel Bi-fusion Module (PBM), which builds high-resolution shortcuts from shallow stages to the feature pyramid, preserving fine spatial details during multi-scale fusion. We further design a Spatial Semantic Adapter (SSA) to compensate for the spatial loss caused by coarse tokenization. SSA extracts high-resolution cues from early stages and injects them into transformer token embeddings, improving tiny-object localization without sacrificing the global modeling ability of DETR. Experiments on MS COCO show that TinyFormer consistently outperforms recent YOLO-series detectors and the strong DEIMv2 baseline. TinyFormer-X achieves 58.4% AP even without PBM, while adding PBM improves the overall AP to 58.5% and brings a 1.6% AP gain on small objects. With Objects365 pre-training, TinyFormer-X-PBM reaches 60.2% AP, surpassing RF-DETR and other Objects365-pretrained detectors with fewer parameters and lower computation. These results demonstrate that TinyFormer bridges dense YOLO-style feature fusion and DETR-style set prediction, providing a strong accuracy-efficiency trade-off for real-time tiny-object detection. Code is available at https://github.com/mmpmmpmmpjosh/TinyFormer.
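Abstract (reference translation): YOLO-series and DETR-based detectors struggle to detect tiny objects. YOLO-style models benefit from efficient dense prediction, but their large-stride backbones can suppress tiny instances in deep feature maps and make grid assignment ambiguous. DETR-based models eliminate hand-crafted post-processing through set prediction, but they reason over coarse token grids in which a tiny object occupies only a few weak tokens and is easily missed during matching. To address these limitations, the authors propose TinyFormer, a unified YOLO-DETR hybrid real-time detector that combines ViT representations, NMS-free set prediction, and a YOLO-style pyramid neck for accurate small-object detection. TinyFormer introduces a Parallel Bi-fusion Module (PBM) that builds high-resolution shortcuts from shallow stages to the feature pyramid, preserving fine spatial detail during multi-scale fusion. A Spatial Semantic Adapter (SSA) is also designed to compensate for the spatial loss caused by coarse tokenization. SSA extracts high-resolution cues from early stages and injects them into the transformer token embeddings, improving tiny-object localization without sacrificing the global modeling ability of DETR. In experiments on MS COCO, TinyFormer consistently outperforms recent YOLO-series detectors and the strong DEIMv2 baseline. TinyFormer-X achieves 58.4% AP without PBM; adding PBM raises overall AP to 58.5% and yields a 1.6% AP gain on small objects. With Objects365 pre-training, TinyFormer-X-PBM reaches 60.2% AP, surpassing RF-DETR and other Objects365-pretrained detectors with fewer parameters and less computation. These results show that TinyFormer bridges dense YOLO-style feature fusion and DETR-style set prediction, offering a strong accuracy-efficiency trade-off for real-time tiny-object detection. Code is available at https://github.com/mmpmmpmmpjosh/TinyFormer.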
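
The abstract's point that tiny objects occupy only a few tokens on coarse grids can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function name `footprint_in_cells` and the chosen object sizes and strides are illustrative assumptions. It only divides an object's side length by the feature-map stride to show how quickly its footprint shrinks.

```python
def footprint_in_cells(obj_side_px: float, stride: int) -> float:
    """Area of a square object, measured in feature-map cells (or tokens),
    when the feature map downsamples the image by `stride` per side.

    Hypothetical helper for illustration only; not part of TinyFormer.
    """
    side_in_cells = obj_side_px / stride
    return side_in_cells ** 2


if __name__ == "__main__":
    # Illustrative object sizes (pixels per side) and strides; assumed values.
    for side in (8, 16, 32):
        for stride in (4, 8, 16, 32):
            cells = footprint_in_cells(side, stride)
            print(f"object {side}x{side} px at stride {stride}: {cells:g} cells")
```

Under these assumed values, a 16x16 pixel object covers a single cell at stride 16 and a quarter of a cell at stride 32, which is the regime where high-resolution shortcuts from shallow stages become relevant.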

Related papers