NETNet: Neighbor Erasing and Transferring Network for Better Single Shot
Object Detection
- URL: http://arxiv.org/abs/2001.06690v1
- Date: Sat, 18 Jan 2020 15:21:29 GMT
- Title: NETNet: Neighbor Erasing and Transferring Network for Better Single Shot
Object Detection
- Authors: Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao
- Abstract summary: We propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
A single-shot network called NETNet is constructed for scale-aware object detection.
- Score: 170.30694322460045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the advantages of real-time detection and improved performance,
single-shot detectors have gained great attention recently. To handle
complex scale variations, single-shot detectors make scale-aware predictions
based on multiple pyramid layers. However, the features in the pyramid are not
scale-aware enough, which limits the detection performance. Two common problems
in single-shot detectors caused by object scale variations can be observed: (1)
small objects are easily missed; (2) the salient part of a large object is
sometimes detected as an object. With this observation, we propose a new
Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid
features and explore scale-aware features. In NET, a Neighbor Erasing Module
(NEM) is designed to erase the salient features of large objects and emphasize
the features of small objects in shallow layers. A Neighbor Transferring Module
(NTM) is introduced to transfer the erased features and highlight large objects
in deep layers. With this mechanism, a single-shot network called NETNet is
constructed for scale-aware object detection. In addition, we propose to
aggregate nearest neighboring pyramid features to enhance our NET. NETNet
achieves 38.5% AP at a speed of 27 FPS and 32.0% AP at a speed of 55 FPS on the
MS COCO dataset. As a result, NETNet achieves a better trade-off for real-time and
accurate object detection.
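The erase-and-transfer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-channel feature maps, the sigmoid gate built from the upsampled deep layer, the 2x2 average pooling, and all function names (`neighbor_erase`, `neighbor_transfer`, `upsample2x`, `avgpool2x`) are assumptions chosen for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fm):
    """Nearest-neighbor 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def avgpool2x(fm):
    """2x2 average pooling; assumes even height and width."""
    return [
        [(fm[i][j] + fm[i][j + 1] + fm[i + 1][j] + fm[i + 1][j + 1]) / 4.0
         for j in range(0, len(fm[0]), 2)]
        for i in range(0, len(fm), 2)
    ]


def neighbor_erase(shallow, deep):
    """Hypothetical NEM-style step: a gate derived from the deeper layer marks
    where large objects respond; those responses are removed from the shallow map."""
    gate = [[sigmoid(v) for v in row] for row in upsample2x(deep)]
    erased = [[s * g for s, g in zip(srow, grow)]
              for srow, grow in zip(shallow, gate)]
    kept = [[s - e for s, e in zip(srow, erow)]
            for srow, erow in zip(shallow, erased)]
    return kept, erased


def neighbor_transfer(deep, erased):
    """Hypothetical NTM-style step: the erased responses are downsampled
    and added to the deeper map."""
    pooled = avgpool2x(erased)
    return [[d + p for d, p in zip(drow, prow)]
            for drow, prow in zip(deep, pooled)]


# Toy input: the deep 2x2 map responds strongly in its top-left cell only,
# standing in for a large object; the shallow 4x4 map responds uniformly.
deep = [[6.0, -6.0], [-6.0, -6.0]]
shallow = [[1.0] * 4 for _ in range(4)]

kept, erased = neighbor_erase(shallow, deep)
new_deep = neighbor_transfer(deep, erased)
```

In this toy input, the shallow response under the large-object region is almost entirely removed, the rest of the shallow map is almost unchanged, and the removed response strengthens the matching cell of the deep map.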
Related papers
- Scale-Invariant Object Detection by Adaptive Convolution with Unified Global-Local Context [3.061662434597098]
We propose an object detection model using a Switchable (adaptive) Atrous Convolutional Network (SAC-Net) based on the EfficientDet model.
The proposed SAC-Net encapsulates the benefits of both low-level and high-level features to achieve improved performance on multi-scale object detection tasks.
Our experiments on benchmark datasets demonstrate that the proposed SAC-Net outperforms the state-of-the-art models by a significant margin in terms of accuracy.
arXiv Detail & Related papers (2024-09-17T10:08:37Z)
- Visible and Clear: Finding Tiny Objects in Difference Map [50.54061010335082]
We introduce a self-reconstruction mechanism in the detection model, and discover the strong correlation between it and the tiny objects.
Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects.
We further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation more clear.
arXiv Detail & Related papers (2024-05-18T12:22:26Z)
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2s to directly process a whole building consisting of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
arXiv Detail & Related papers (2023-01-06T18:52:12Z)
- Enhanced Single-shot Detector for Small Object Detection in Remote Sensing Images [33.84369068593722]
We propose an image pyramid single-shot detector (IPSSD) for small-scale object detection.
In IPSSD, a single-shot detector is combined with an image pyramid network to extract semantically strong features for generating candidate regions.
The proposed network can enhance the small-scale features from a feature pyramid network.
arXiv Detail & Related papers (2022-05-12T07:35:07Z)
- Lightweight Salient Object Detection in Optical Remote Sensing Images via Feature Correlation [93.80710126516405]
We propose a novel lightweight ORSI-SOD solution, named CorrNet, to address these issues.
By reducing the parameters and computations of each component, CorrNet ends up having only 4.09M parameters and running with 21.09G FLOPs.
Experimental results on two public datasets demonstrate that our lightweight CorrNet achieves competitive or even better performance compared with 26 state-of-the-art methods.
arXiv Detail & Related papers (2022-01-20T08:28:01Z)
- Rethinking the Aligned and Misaligned Features in One-stage Object Detection [9.270523894683278]
One-stage object detectors rely on the point feature to predict the detection results.
We propose a simple and plug-in operator that could generate aligned and disentangled features for each task.
Based on the object-aligned and task-disentangled operator (OAT), we propose OAT-Net, which explicitly exploits point-set features for more accurate detection results.
arXiv Detail & Related papers (2021-08-27T08:40:37Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net differs from current models, which pursue only the most discriminative patches during training.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- PENet: Object Detection using Points Estimation in Aerial Images [9.33900415971554]
A novel network structure, Points Estimated Network (PENet), is proposed in this work to answer these challenges.
PENet uses a Mask Resampling Module (MRM) to augment the imbalanced datasets, a coarse anchor-free detector (CPEN) to effectively predict the center points of the small object clusters, and a fine anchor-free detector (FPEN) to locate the precise positions of the small objects.
Our experiments on the aerial datasets VisDrone and UAVDT showed that PENet achieved higher precision than existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-01-22T19:43:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.