RFWNet: A Lightweight Remote Sensing Object Detector Integrating Multi-Scale Receptive Fields and Foreground Focus Mechanism
- URL: http://arxiv.org/abs/2503.00545v1
- Date: Sat, 01 Mar 2025 16:02:15 GMT
- Title: RFWNet: A Lightweight Remote Sensing Object Detector Integrating Multi-Scale Receptive Fields and Foreground Focus Mechanism
- Authors: Yujie Lei, Wenjie Sun, Sen Jia, Qingquan Li, Jie Zhang
- Abstract summary: This study proposes an efficient and lightweight RSOD algorithm integrating multi-scale receptive fields and a foreground focus mechanism, named RFWNet. Experimental evaluations on the DOTA V1.0 and NWPU VHR-10 datasets demonstrate that RFWNet achieves advanced performance with 6.0M parameters and runs at 52 FPS.
- Score: 10.997183129304409
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Challenges in remote sensing object detection (RSOD), such as high inter-class similarity, imbalanced foreground-background distribution, and the small size of objects in remote sensing images, significantly hinder detection accuracy. Moreover, the trade-off between model accuracy and computational complexity poses additional constraints on the application of RSOD algorithms. To address these issues, this study proposes an efficient and lightweight RSOD algorithm integrating multi-scale receptive fields and a foreground focus mechanism, named RFWNet. Specifically, we propose a lightweight backbone, the Receptive Field Adaptive Selection Network (RFASNet), leveraging the rich context information of remote sensing images to enhance class separability. Additionally, we develop a Foreground Background Separation Module (FBSM) consisting of a background redundant information filtering module and a foreground information enhancement module to emphasize critical regions within images while filtering redundant background information. Finally, we design a loss function, the Weighted CIoU-Wasserstein (WCW) loss, which weights the IoU-based loss by using the Normalized Wasserstein Distance to mitigate model sensitivity to small object position deviations. Experimental evaluations on the DOTA V1.0 and NWPU VHR-10 datasets demonstrate that RFWNet achieves advanced performance with 6.0M parameters and runs at 52 FPS.
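The abstract's WCW loss weights an IoU-based loss by the Normalized Wasserstein Distance (NWD), which models boxes as 2D Gaussians so that small position deviations decay smoothly rather than collapsing the overlap to zero. The paper's exact weighting scheme is not given here, so the sketch below makes several labeled assumptions: boxes are `(cx, cy, w, h)` tuples, the Gaussian covariance is `diag(w/2, h/2)^2` (giving the standard closed-form 2-Wasserstein distance), `C = 12.8` is the normalizing constant suggested in the original NWD work, plain IoU stands in for CIoU, and a simple convex combination is used as the hypothetical weighting.

```python
import math

def wasserstein_2(box_a, box_b):
    """Closed-form 2-Wasserstein distance between two boxes modeled as
    2D Gaussians N((cx, cy), diag(w/2, h/2)^2). Boxes: (cx, cy, w, h)."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    d2 = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
          + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.sqrt(d2)

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance in (0, 1]; c is a dataset-dependent
    constant (12.8 follows the original NWD paper, an assumption here)."""
    return math.exp(-wasserstein_2(box_a, box_b) / c)

def iou(box_a, box_b):
    """Plain IoU for axis-aligned (cx, cy, w, h) boxes (CIoU adds center
    distance and aspect-ratio terms, omitted for brevity)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def wcw_loss(pred, target, alpha=0.5, c=12.8):
    """Hypothetical WCW-style weighting: blend an IoU-based loss with an
    NWD-based loss so small position shifts are penalized less harshly."""
    return (alpha * (1.0 - iou(pred, target))
            + (1.0 - alpha) * (1.0 - nwd(pred, target, c)))
```

For identical boxes both terms vanish; as a small box shifts by a few pixels, IoU can drop to zero while the NWD term decays smoothly, which is the sensitivity to small-object position deviations that the WCW loss is meant to mitigate.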
Related papers
- LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection [18.804394986840887]
LEGNet is a lightweight network that incorporates a novel edge-Gaussian aggregation module for low-quality remote sensing images.
Our key innovation lies in the synergistic integration of Scharr operator-based edge priors with uncertainty-aware Gaussian modeling.
LEGNet achieves state-of-the-art performance across five benchmark datasets while ensuring computational efficiency.
arXiv Detail & Related papers (2025-03-18T08:20:24Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation [10.919956120261539]
High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas.
Objects of the same category within HRS images show significant differences in scale and shape across diverse geographical environments.
We propose a High-resolution remote sensing network (Hi-ResNet) with efficient network structure designs.
arXiv Detail & Related papers (2023-05-22T03:58:25Z) - Near-field SAR Image Restoration with Deep Learning Inverse Technique: A Preliminary Study [5.489791364472879]
Near-field synthetic aperture radar (SAR) provides a high-resolution image of a target's scattering distribution (hot spots).
Meanwhile, the imaging result suffers inevitable degradation from sidelobes, clutter, and noise.
To restore the image, current methods make simplified assumptions; for example, the point spread function (PSF) is spatially consistent, the target consists of sparse point scatters, etc.
We reformulate the degradation model into a spatially variable complex-convolution model, where the near-field SAR's system response is considered.
A model-based deep learning network is designed to restore the image.
arXiv Detail & Related papers (2022-11-28T01:28:33Z) - Pyramid Grafting Network for One-Stage High Resolution Saliency Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z) - Context-Preserving Instance-Level Augmentation and Deformable Convolution Networks for SAR Ship Detection [50.53262868498824]
Shape deformation of targets in SAR images due to random orientation and partial information loss is an essential challenge in SAR ship detection.
We propose a data augmentation method to train a deep network that is robust to partial information loss within the targets.
arXiv Detail & Related papers (2022-02-14T07:01:01Z) - Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images [15.404024559652534]
We present an effective Dynamic Enhancement Anchor (DEA) network to construct a novel training sample generator.
Our method achieves state-of-the-art performance in accuracy with moderate inference speed and computational overhead for training.
arXiv Detail & Related papers (2021-12-13T14:37:20Z) - RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.
We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs.
Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z) - Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z) - A Light-Weight Object Detection Framework with FPA Module for Optical Remote Sensing Imagery [12.762588615997624]
We propose an efficient anchor free object detector, CenterFPANet.
To pursue speed, we use a lightweight backbone and introduce the asymmetric revolution block.
This strategy can improve the accuracy of remote sensing image object detection without reducing the detection speed.
arXiv Detail & Related papers (2020-09-07T12:41:17Z) - Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection.
We propose a progressively guided alternate refinement network to refine it.
Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z) - A Single Stream Network for Robust and Real-time RGB-D Salient Object
Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.