FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection
- URL: http://arxiv.org/abs/2504.20670v1
- Date: Tue, 29 Apr 2025 11:53:54 GMT
- Title: FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection
- Authors: Yao Xiao, Tingfa Xu, Yu Xin, Jianan Li,
- Abstract summary: We propose a new family of real-time detectors for aerial image detection, named FBRT-YOLO, to address the imbalance between detection accuracy and efficiency.<n>FCM focuses on alleviating the problem of information imbalance caused by the loss of small target information in deep networks.<n>MKP, leverages convolutions with kernels of different sizes to enhance the relationships between targets of various scales.
- Score: 21.38164867490915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedded flight devices with visual capabilities have become essential for a wide range of applications. In aerial image detection, while many existing methods have partially addressed the issue of small target detection, challenges remain in optimizing small target detection and balancing detection accuracy with efficiency. These issues are key obstacles to the advancement of real-time aerial image detection. In this paper, we propose a new family of real-time detectors for aerial image detection, named FBRT-YOLO, to address the imbalance between detection accuracy and efficiency. Our method comprises two lightweight modules: Feature Complementary Mapping Module (FCM) and Multi-Kernel Perception Unit(MKP), designed to enhance object perception for small targets in aerial images. FCM focuses on alleviating the problem of information imbalance caused by the loss of small target information in deep networks. It aims to integrate spatial positional information of targets more deeply into the network,better aligning with semantic information in the deeper layers to improve the localization of small targets. We introduce MKP, which leverages convolutions with kernels of different sizes to enhance the relationships between targets of various scales and improve the perception of targets at different scales. Extensive experimental results on three major aerial image datasets, including Visdrone, UAVDT, and AI-TOD,demonstrate that FBRT-YOLO outperforms various real-time detectors in terms of performance and speed.
Related papers
- Enhanced Small Target Detection via Multi-Modal Fusion and Attention Mechanisms: A YOLOv5 Approach [1.90298817989995]
We propose a small target detection method based on multi-modal image fusion and attention mechanisms.
This method leverages YOLOv5, integrating infrared and visible light data along with a convolutional attention module to enhance detection performance.
Experimental results on anti-UAV and Visdrone datasets demonstrate the effectiveness and practicality of our approach.
arXiv Detail & Related papers (2025-04-15T15:02:10Z) - YOLO-RS: Remote Sensing Enhanced Crop Detection Methods [0.32985979395737786]
Existing target detection methods show poor performance when dealing with small targets in remote sensing images.<n>YOLO-RS is based on the latest Yolov11 which significantly enhances the detection of small targets.<n>Experiments validate the effectiveness and application potential of YOLO-RS in the task of detecting small targets in remote sensing images.
arXiv Detail & Related papers (2025-04-15T13:13:22Z) - Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult.<n>Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions.<n>We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z) - Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables synergistic focusing'' of multi-scale features.
RCs extend the multi-level feature's divide-and-conquer'' mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z) - Towards Robust Infrared Small Target Detection: A Feature-Enhanced and Sensitivity-Tunable Framework [2.4810332733338623]
We propose a feature-enhanced and sensitivity-tunable (FEST) framework, which is compatible with existing SIRST detection networks.<n>We show that our FEST framework can significantly enhance the performance of existing SIRST detection networks.<n> Notably, the multi-scale direction-aware network (MSDA-Net) equipped with the FEST framework won the first prize in the PRCV 2024 wide-area infrared small target detection competition.
arXiv Detail & Related papers (2024-07-29T15:22:02Z) - DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection [6.635903943457569]
The original YOLO algorithm has low overall detection accuracy due to its weak ability to perceive targets of different scales.
This paper proposes a dynamic-attention scale-sequence fusion algorithm (DASSF) for small target detection in aerial images.
Experimental results show that when the DASSF method is applied to YOLOv8, compared to YOLOv8n, the model shows an increase of 9.2% and 2.4% in the mean average precision (mAP)
arXiv Detail & Related papers (2024-06-18T05:26:44Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - RRNet: Relational Reasoning Network with Parallel Multi-scale Attention
for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.
We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs.
Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z) - Infrared Small-Dim Target Detection with Transformer under Complex
Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - PENet: Object Detection using Points Estimation in Aerial Images [9.33900415971554]
A novel network structure, Points Estimated Network (PENet), is proposed in this work to answer these challenges.
PENet uses a Mask Resampling Module (MRM) to augment the imbalanced datasets, a coarse anchor-free detector (CPEN) to effectively predict the center points of the small object clusters, and a fine anchor-free detector FPEN to locate the precise positions of the small objects.
Our experiments on aerial datasets visDrone and UAVDT showed that PENet achieved higher precision results than existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-01-22T19:43:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.