UFO-DETR: Frequency-Guided End-to-End Detector for UAV Tiny Objects
- URL: http://arxiv.org/abs/2602.22712v1
- Date: Thu, 26 Feb 2026 07:37:45 GMT
- Title: UFO-DETR: Frequency-Guided End-to-End Detector for UAV Tiny Objects
- Authors: Yuankai Chen, Kai Lin, Qihong Wu, Xinxuan Yang, Jiashuo Lai, Ruoen Chen, Haonan Shi, Minfan He, Meihua Wang,
- Abstract summary: Small target detection in UAV imagery faces significant challenges such as scale variations, dense distribution, and the dominance of small targets.<n>This paper proposes an end-to-end object detection framework, UFO-DETR, which integrates an LSKNet-based backbone network to optimize the receptive field and reduce the number of parameters.
- Score: 7.1136620172045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Small target detection in UAV imagery faces significant challenges such as scale variations, dense distribution, and the dominance of small targets. Existing algorithms rely on manually designed components, and general-purpose detectors are not optimized for UAV images, making it difficult to balance accuracy and complexity. To address these challenges, this paper proposes an end-to-end object detection framework, UFO-DETR, which integrates an LSKNet-based backbone network to optimize the receptive field and reduce the number of parameters. By combining the DAttention and AIFI modules, the model flexibly models multi-scale spatial relationships, improving multi-scale target detection performance. Additionally, the DynFreq-C3 module is proposed to enhance small target detection capability through cross-space frequency feature enhancement. Experimental results show that, compared to RT-DETR-L, the proposed method offers significant advantages in both detection performance and computational efficiency, providing an efficient solution for UAV edge computing.
Related papers
- TransBridge: Boost 3D Object Detection by Scene-Level Completion with Transformer Decoder [66.22997415145467]
This paper presents a joint completion and detection framework that improves the detection feature in sparse areas.<n> Specifically, we propose TransBridge, a novel transformer-based up-sampling block that fuses the features from the detection and completion networks.<n>The results show that our framework consistently improves end-to-end 3D object detection, with the mean average precision (mAP) ranging from 0.7 to 1.5 across multiple methods.
arXiv Detail & Related papers (2025-12-12T00:08:03Z) - Source-Free Object Detection with Detection Transformer [59.33653163035064]
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data.<n>Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR)<n>In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs.
arXiv Detail & Related papers (2025-10-13T07:35:04Z) - TinyDef-DETR: A Transformer-Based Framework for Defect Detection in Transmission Lines from UAV Imagery [12.48571944931548]
TinyDef-DETR is a framework designed to achieve accurate and efficient detection of transmission line defects from UAV-acquired images.<n>The model integrates four major components: an edge-enhanced ResNet backbone to strengthen boundary-sensitive representations, a stride-free space-to-depth module to enable detail-preserving downsampling, and a Focaler-Wise-SIoU regression loss to improve the localization of small and difficult objects.
arXiv Detail & Related papers (2025-09-07T12:36:33Z) - High-Frequency Semantics and Geometric Priors for End-to-End Detection Transformers in Challenging UAV Imagery [6.902247657565531]
We introduce HEDS-DETR, a holistically enhanced real-time Detection Transformer tailored for aerial scenes.<n>First, we propose a novel High-Frequency Enhanced Semantics Network (HFESNet) backbone, which yields highly discriminative features.<n>Second, our Efficient Small Object Pyramid (ESOP) counteracts information loss by efficiently fusing high-resolution features.<n>Third, we enhance decoder stability and localization precision with two synergistic components.
arXiv Detail & Related papers (2025-07-01T14:56:56Z) - FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection [21.38164867490915]
We propose a new family of real-time detectors for aerial image detection, named FBRT-YOLO, to address the imbalance between detection accuracy and efficiency.<n>FCM focuses on alleviating the problem of information imbalance caused by the loss of small target information in deep networks.<n>MKP, leverages convolutions with kernels of different sizes to enhance the relationships between targets of various scales.
arXiv Detail & Related papers (2025-04-29T11:53:54Z) - An Efficient Aerial Image Detection with Variable Receptive Fields [0.0]
We propose a transformer-based detector incorporating three key components.<n>VRF-DETR achieves 51.4% mAPtextsubscript50 and 31.8% mAPtextsubscript50:95 with only 13.5M parameters.
arXiv Detail & Related papers (2025-04-21T15:16:13Z) - Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult.<n>Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions.<n>We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z) - Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment [59.61554561979589]
Edge computing has emerged as a key paradigm for deploying deep learning-based object detection in time-sensitive scenarios.<n>Existing edge detection methods face challenges: difficulty balancing detection precision with lightweight models, limited adaptability, and insufficient real-world validation.<n>We propose the Edge Detection Toolbox (ED-TOOLBOX), which utilizes generalizable plug-and-play components to adapt object detection models for edge environments.
arXiv Detail & Related papers (2024-12-24T07:28:10Z) - Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images [2.9138705529771123]
We present a novel enhancement to the YOLOv8 model, tailored for oriented object detection tasks.<n>Our model features a wavelet transform-based C2f module for capturing associative features and an Adaptive Scale Feature Pyramid (ASFP) module that leverages P2 layer details.<n>Our approach provides a more efficient architectural design than DecoupleNet, which has 23.3M parameters, all while maintaining detection accuracy.
arXiv Detail & Related papers (2024-12-17T05:45:48Z) - Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables synergistic focusing'' of multi-scale features.
RCs extend the multi-level feature's divide-and-conquer'' mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - SALISA: Saliency-based Input Sampling for Efficient Video Object
Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.