Pixel-Semantic Revise of Position Learning A One-Stage Object Detector
with A Shared Encoder-Decoder
- URL: http://arxiv.org/abs/2001.01057v2
- Date: Tue, 29 Sep 2020 02:28:34 GMT
- Title: Pixel-Semantic Revise of Position Learning A One-Stage Object Detector
with A Shared Encoder-Decoder
- Authors: Qian Li, Nan Guo, Xiaochun Ye, Dongrui Fan, and Zhimin Tang
- Abstract summary: This work analyzes how different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses their cost with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
- Score: 5.371825910267909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, many methods have been proposed for object detection, but they
cannot detect objects adaptively from semantic features. In this work, based on
channel and spatial attention mechanisms, we mainly analyze how different
methods detect objects adaptively. Some state-of-the-art detectors combine
different feature pyramids with many mechanisms to enhance multi-level semantic
information. However, they incur a higher cost. This work addresses that with an
anchor-free detector that uses a shared encoder-decoder with an attention
mechanism to extract shared features. We take features of different levels from
the backbone (e.g., ResNet-50) as the basis features. Then, we feed the features
into a simple module, followed by a detector head to detect objects.
Meanwhile, we use the semantic features to revise geometric locations, so the
detector performs a pixel-semantic revision of position. More importantly, this
work analyzes the impact of different pooling strategies (e.g., mean, maximum, or
minimum) on multi-scale objects, and finds that minimum pooling better improves
detection performance on small objects. Compared with the state-of-the-art
MNC based on ResNet-101 on the standard MSCOCO 2014 baseline, our method
improves detection AP by 3.8%.
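The three pooling strategies named in the abstract (mean, maximum, minimum) can be illustrated with a minimal sketch. It is a generic illustration only, not the authors' implementation: the abstract does not state the window size, the stride, or where in the network pooling is applied, so the function `pool2d`, the 2x2 non-overlapping window, and the toy feature map are all assumptions made for this example.

```python
from statistics import mean
from typing import Callable, List, Sequence

Grid = List[List[float]]


def pool2d(feature: Sequence[Sequence[float]], size: int,
           reduce: Callable[[List[float]], float]) -> Grid:
    """Apply non-overlapping size x size pooling with the given reducer.

    `pool2d` is a hypothetical helper for illustration; rows and columns
    that do not fill a complete window are dropped.
    """
    rows, cols = len(feature), len(feature[0])
    out: Grid = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            window = [feature[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        out.append(out_row)
    return out


# Toy 4x4 feature map (values are made up for the example).
feature = [
    [1.0, 2.0, 0.0, 4.0],
    [3.0, 4.0, 2.0, 2.0],
    [5.0, 1.0, 8.0, 8.0],
    [1.0, 1.0, 8.0, 0.0],
]

mean_pooled = pool2d(feature, 2, mean)  # average response per window
max_pooled = pool2d(feature, 2, max)    # strongest response per window
min_pooled = pool2d(feature, 2, min)    # weakest response per window

print("mean:", mean_pooled)
print("max: ", max_pooled)
print("min: ", min_pooled)
```

The only difference between the three variants is the reducer passed in, which makes it easy to swap strategies when comparing their effect on objects of different scales.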
Related papers
- Visible and Clear: Finding Tiny Objects in Difference Map [50.54061010335082]
We introduce a self-reconstruction mechanism in the detection model, and discover the strong correlation between it and the tiny objects.
Specifically, we impose a reconstruction head in-between the neck of a detector, constructing a difference map of the reconstructed image and the input, which shows high sensitivity to tiny objects.
We further develop a Difference Map Guided Feature Enhancement (DGFE) module to make the tiny feature representation more clear.
arXiv Detail & Related papers (2024-05-18T12:22:26Z)
- Skipped Feature Pyramid Network with Grid Anchor for Object Detection [6.99246486061412]
We propose a skipped connection to obtain stronger semantics at each level of the feature pyramid.
In our method, the lower-level feature only connects with the feature at the highest level, making it more reasonable that each level is responsible for detecting objects with fixed scales.
arXiv Detail & Related papers (2023-10-22T23:27:05Z)
- Fast and Accurate Object Detection on Asymmetrical Receptive Field [0.0]
This article proposes methods for improving object detection accuracy from the perspective of changing receptive fields.
The structure of the head part of YOLOv5 is modified by adding asymmetrical pooling layers.
The performance of the new model is compared with that of the original YOLOv5 model and analyzed across several parameters.
arXiv Detail & Related papers (2023-03-15T23:59:18Z)
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
- Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
arXiv Detail & Related papers (2023-01-06T18:52:12Z)
- Rethinking the Detection Head Configuration for Traffic Object Detection [11.526701794026641]
We propose a lightweight traffic object detection network based on matching between detection head and object distribution.
The proposed model achieves more competitive performance than other models on BDD100K dataset and our proposed ETFOD-v2 dataset.
arXiv Detail & Related papers (2022-10-08T02:23:57Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net is different from the current models that during training only pursue the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- MultiResolution Attention Extractor for Small Object Detection [40.74232149130456]
Small objects are difficult to detect because of their low resolution and small size.
Inspired by the "attention" mechanism of human vision, we exploit two feature extraction methods to mine the most useful information of small objects.
arXiv Detail & Related papers (2020-06-10T16:47:56Z)
- Hit-Detector: Hierarchical Trinity Architecture Search for Object
Detection [67.84976857449263]
We propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components of an object detector.
We employ a novel scheme to automatically screen different sub search spaces for different components so as to perform the end-to-end search for each component efficiently.
Our searched architecture, namely Hit-Detector, achieves 41.4% mAP on COCO minival set with 27M parameters.
arXiv Detail & Related papers (2020-03-26T10:20:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.