Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors
- URL: http://arxiv.org/abs/2505.00044v1
- Date: Wed, 30 Apr 2025 01:18:33 GMT
- Title: Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors
- Authors: Richard Schmit
- Abstract summary: We propose a novel framework that enables small object representations to "borrow" discriminative features from larger, semantically richer instances within the same class. Our approach significantly boosts small object detection accuracy over baseline methods, offering a promising direction for robust object detection in complex visual environments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting small objects remains a significant challenge in single-shot object detectors due to the inherent trade-off between spatial resolution and semantic richness in convolutional feature maps. To address this issue, we propose a novel framework that enables small object representations to "borrow" discriminative features from larger, semantically richer instances within the same class. Our architecture introduces three key components: the Feature Matching Block (FMB) to identify semantically similar descriptors across layers, the Feature Representing Block (FRB) to generate enhanced shallow features through weighted aggregation, and the Feature Fusion Block (FFB) to refine feature maps by integrating original, borrowed, and context information. Built upon the SSD framework, our method improves the descriptive capacity of shallow layers while maintaining real-time detection performance. Experimental results demonstrate that our approach significantly boosts small object detection accuracy over baseline methods, offering a promising direction for robust object detection in complex visual environments.
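The three blocks named in the abstract can be pictured with a small NumPy sketch. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the similarity measure, the weighting scheme, or the fusion operator, so cosine similarity, a softmax with a hypothetical `temperature`, and a single linear projection (standing in for a 1x1 convolution) are used here. The function names, the shared channel width `C`, and the mean-pooled `context` are likewise hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row (descriptor) to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def match_features(shallow, deep):
    """FMB-like step (assumed): cosine similarity between every
    shallow descriptor and every deep descriptor. Shape: (Ns, Nd)."""
    return l2_normalize(shallow) @ l2_normalize(deep).T

def represent_features(similarity, deep, temperature=0.1):
    """FRB-like step (assumed): softmax over deep descriptors, then a
    weighted sum, giving one 'borrowed' feature per shallow descriptor."""
    logits = similarity / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ deep, weights

def fuse_features(shallow, borrowed, context, w):
    """FFB-like step (assumed): concatenate original, borrowed, and
    context features, project back to C channels, apply ReLU."""
    stacked = np.concatenate([shallow, borrowed, context], axis=1)
    return np.maximum(stacked @ w, 0.0)

rng = np.random.default_rng(0)
C = 8                                  # hypothetical shared channel width
shallow = rng.normal(size=(5, C))      # 5 descriptors from a shallow layer
deep = rng.normal(size=(3, C))         # 3 descriptors from a deeper layer
context = np.broadcast_to(shallow.mean(axis=0), shallow.shape)
w = rng.normal(size=(3 * C, C)) * 0.1  # untrained projection weights

sim = match_features(shallow, deep)
borrowed, weights = represent_features(sim, deep)
fused = fuse_features(shallow, borrowed, context, w)
print(fused.shape)  # (5, 8)
```

In a real SSD variant the shallow and deep layers usually have different channel counts, so a learned projection to a common width would be needed before matching; the sketch sidesteps that by assuming both already have `C` channels.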
Related papers
- Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult. Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions. We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z)
- Learning Spatial-Semantic Features for Robust Video Object Segmentation [108.045326229865]
We propose a robust video object segmentation framework that learns spatial-semantic features and discriminative object queries. The proposed method achieves state-of-the-art performance on benchmark data sets, including the DAVIS 2017 test (87.8%), YoutubeVOS 2019 (88.1%), MOSE val (74.0%), and LVOS test (73.0%).
arXiv Detail & Related papers (2024-07-10T15:36:00Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation [10.919956120261539]
High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas.
Objects of the same category within HRS images show significant differences in scale and shape across diverse geographical environments.
We propose a High-resolution remote sensing network (Hi-ResNet) with efficient network structure designs.
arXiv Detail & Related papers (2023-05-22T03:58:25Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- VIN: Voxel-based Implicit Network for Joint 3D Object Detection and Segmentation for Lidars [12.343333815270402]
A unified neural network structure is presented for joint 3D object detection and point cloud segmentation.
We leverage rich supervision from both detection and segmentation labels rather than using just one of them.
arXiv Detail & Related papers (2021-07-07T02:16:20Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- MultiResolution Attention Extractor for Small Object Detection [40.74232149130456]
Small objects are difficult to detect because of their low resolution and small size.
Inspired by the "attention" mechanism of human vision, we exploit two feature extraction methods to mine the most useful information about small objects.
arXiv Detail & Related papers (2020-06-10T16:47:56Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder [5.371825910267909]
We analyze how different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses that with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.