Related papers: DPNet: Dynamic Pooling Network for Tiny Object Detection

DPNet: Dynamic Pooling Network for Tiny Object Detection

URL: http://arxiv.org/abs/2505.02797v1
Date: Mon, 05 May 2025 17:13:35 GMT
Title: DPNet: Dynamic Pooling Network for Tiny Object Detection
Authors: Luqi Gong, Haotian Chen, Yikun Chen, Tianliang Yao, Chao Li, Shuai Zhao, Guangjie Han,
Abstract summary: Resizing images is a common strategy to improve detection accuracy, particularly for small objects.<n>This paper proposes a Dynamic Pooling Network (DPNet) for tiny object detection to mitigate these issues.<n> Experiments on the TinyCOCO and TinyPerson datasets show that DPNet can save over 35% and 25% GFLOPs, respectively.
Score: 12.331699924062196
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In unmanned aerial systems, especially in complex environments, accurately detecting tiny objects is crucial. Resizing images is a common strategy to improve detection accuracy, particularly for small objects. However, simply enlarging images significantly increases computational costs and the number of negative samples, severely degrading detection performance and limiting its applicability. This paper proposes a Dynamic Pooling Network (DPNet) for tiny object detection to mitigate these issues. DPNet employs a flexible down-sampling strategy by introducing a factor (df) to relax the fixed downsampling process of the feature map to an adjustable one. Furthermore, we design a lightweight predictor to predict df for each input image, which is used to decrease the resolution of feature maps in the backbone. Thus, we achieve input-aware downsampling. We also design an Adaptive Normalization Module (ANM) to make a unified detector compatible with different dfs. A guidance loss supervises the predictor's training. DPNet dynamically allocates computing resources to trade off between detection accuracy and efficiency. Experiments on the TinyCOCO and TinyPerson datasets show that DPNet can save over 35% and 25% GFLOPs, respectively, while maintaining comparable detection performance. The code will be made publicly available.

Related papers

RFWNet: A Lightweight Remote Sensing Object Detector Integrating Multiscale Receptive Fields and Foreground Focus Mechanism [10.997183129304409]
This study proposes an efficient and lightweight RSOD algorithm integrating multiscale receptive fields and foreground focus mechanism.<n>The comprehensive experimental results demonstrate that RFWNet achieved 95.3% and 73.2% mean average precision.
arXiv Detail & Related papers (2025-03-01T16:02:15Z)
Scale-Invariant Object Detection by Adaptive Convolution with Unified Global-Local Context [3.061662434597098]
We propose an object detection model using a Switchable (adaptive) Atrous Convolutional Network (SAC-Net) based on the efficientDet model.<n>The proposed SAC-Net encapsulates the benefits of both low-level and high-level features to achieve improved performance on multi-scale object detection tasks.<n>Our experiments on benchmark datasets demonstrate that the proposed SAC-Net outperforms the state-of-the-art models by a significant margin in terms of accuracy.
arXiv Detail & Related papers (2024-09-17T10:08:37Z)
ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered.<n>Massive feature extraction computations are wasted on the non-target background area of images.<n>We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z)
DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors [0.669087470775851]
This paper introduces DyRA, a dynamic resolution adjustment network providing an image-specific scale factor for existing detectors. Loss function is devised to minimize the accuracy drop across contrasting objectives of different-sized objects for scaling.
arXiv Detail & Related papers (2023-11-28T07:52:41Z)
The Importance of Anti-Aliasing in Tiny Object Detection [0.0]
This paper applies an existing approach WaveCNet for anti-aliasing to tiny object detection. We modify the original WaveCNet to apply Wavelet Pooling layers, effectively suppressing aliasing. We also propose a bottom-heavy version of the backbone, which further improves the performance of tiny object detection.
arXiv Detail & Related papers (2023-10-22T08:02:01Z)
Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure. First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module. The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with other 17 state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection. We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution. It takes less than 2s to directly process a whole building consisting of more than 4500k points while detecting out almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection. We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
Embracing Single Stride 3D Object Detector with Sparse Transformer [63.179720817019096]
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases. Many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds. We propose Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network.
arXiv Detail & Related papers (2021-12-13T02:12:02Z)
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture. It reduces computations by separating objects from background using a very lite first-stage. Resulting image proposals are then processed in the second-stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs. In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations. High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.