Skipped Feature Pyramid Network with Grid Anchor for Object Detection
- URL: http://arxiv.org/abs/2310.14453v1
- Date: Sun, 22 Oct 2023 23:27:05 GMT
- Title: Skipped Feature Pyramid Network with Grid Anchor for Object Detection
- Authors: Li Pengfei, Wei Wei, Yan Yu, Zhu Rong, Zhou Liguo
- Abstract summary: We propose a skipped connection to obtain stronger semantics at each level of the feature pyramid.
In our method, the lower-level feature only connects with the feature at the highest level, making it more reasonable that each level is responsible for detecting objects with fixed scales.
- Score: 6.99246486061412
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: CNN-based object detection methods have achieved significant progress in
recent years. The classic structures of CNNs produce pyramid-like feature maps
due to the pooling or other re-scale operations. The feature maps in different
levels of the feature pyramid are used to detect objects with different scales.
For more accurate object detection, the highest-level feature, which has the
lowest resolution and contains the strongest semantics, is up-scaled and
connected with the lower-level features to enhance the semantics in the
lower-level features. However, the classic mode of feature connection combines
each lower-level feature with all the features above it, which may result in
semantic degradation. In this paper, we propose a skipped connection to obtain
stronger semantics at each level of the feature pyramid. In our method, the
lower-level feature only connects with the feature at the highest level, making
it more reasonable that each level is responsible for detecting objects with
fixed scales. In addition, we simplify the generation of anchors for bounding
box regression, which can further improve the accuracy of object detection.
Experiments on the MS COCO and Wider Face datasets demonstrate that our method
outperforms the state-of-the-art methods.
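The difference between the classic top-down connection and the skipped connection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel feature maps stored as nested lists, levels that halve in resolution at each step, nearest-neighbour upsampling, and element-wise addition as the fusion step; the abstract specifies none of these. The function names (`upsample_nearest`, `add_maps`, `classic_top_down`, `skipped_top_down`) are hypothetical.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D lists (assumed fusion op)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def classic_top_down(levels):
    """Classic mode: each level is fused with the already-fused level
    directly above it, so it accumulates every feature above it.
    `levels` is ordered from finest (index 0) to coarsest (last)."""
    merged = [None] * len(levels)
    merged[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        merged[i] = add_maps(levels[i], upsample_nearest(merged[i + 1], 2))
    return merged


def skipped_top_down(levels):
    """Skipped mode: each lower level is fused only with the
    highest-level (coarsest) feature, upsampled to its resolution."""
    n = len(levels)
    top = levels[-1]
    merged = []
    for i, fmap in enumerate(levels):
        if i == n - 1:
            merged.append(top)
        else:
            factor = 2 ** (n - 1 - i)  # assumes resolution halves per level
            merged.append(add_maps(fmap, upsample_nearest(top, factor)))
    return merged


# Toy pyramid: 4x4 of ones, 2x2 of tens, 1x1 of one hundred.
pyramid = [
    [[1] * 4 for _ in range(4)],
    [[10] * 2 for _ in range(2)],
    [[100]],
]
classic = classic_top_down(pyramid)
skipped = skipped_top_down(pyramid)
print(classic[0][0][0])  # 111: finest level mixes in both levels above it
print(skipped[0][0][0])  # 101: finest level mixes in only the top level
```

In the toy pyramid, the finest level of the classic variant carries contributions from both higher levels, while the skipped variant carries only the contribution of the top level, which is the property the abstract attributes to the skipped connection.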
Related papers
- AFPN: Asymptotic Feature Pyramid Network for Object Detection [16.86715579071991]
This paper proposes an asymptotic feature pyramid network (AFPN) to support direct interaction at non-adjacent levels.
AFPN is initiated by fusing two adjacent low-level features and incorporates higher-level features into the fusion process.
We incorporate the proposed AFPN into both two-stage and one-stage object detection frameworks and evaluate with the MS-COCO 2017 validation and test datasets.
arXiv Detail & Related papers (2023-06-28T07:58:49Z)
- Multistep feature aggregation framework for salient object detection [0.0]
We introduce a multistep feature aggregation framework for salient object detection.
It is composed of three modules, including the Diverse Reception (DR) module, multiscale interaction (MSI) module and Feature Enhancement (FE) module.
Experimental results on six benchmark datasets demonstrate that MSFA achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-11-12T16:13:16Z)
- Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention [68.7861229363712]
Hierarchical Inter-Level Attention (HILA) is an attention-based method that captures Bottom-Up and Top-Down Updates between features of different levels.
HILA extends hierarchical vision transformer architectures by adding local connections between features of higher and lower levels to the backbone encoder.
We show notable improvements in accuracy in semantic segmentation with fewer parameters and FLOPS.
arXiv Detail & Related papers (2022-07-05T15:47:31Z)
- Multi-patch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net).
MPFP-Net differs from current models, which during training pursue only the most discriminative patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z)
- Fine-Grained Dynamic Head for Object Detection [68.70628757217939]
We propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance.
Experiments demonstrate the effectiveness and efficiency of the proposed method on several state-of-the-art detection benchmarks.
arXiv Detail & Related papers (2020-12-07T08:16:32Z)
- Adaptive Linear Span Network for Object Skeleton Detection [56.78705071830965]
We propose adaptive linear span network (AdaLSN) to automatically configure and integrate scale-aware features for object skeleton detection.
AdaLSN substantiates its versatility by achieving a significantly better trade-off between accuracy and latency.
It also demonstrates general applicability to image-to-mask tasks such as edge detection and road extraction.
arXiv Detail & Related papers (2020-11-08T12:51:14Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The features distributed to each layer carry both semantics and salient details from all other layers simultaneously, and lose less important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- Pixel-Semantic Revise of Position Learning A One-Stage Object Detector with A Shared Encoder-Decoder [5.371825910267909]
We analyze how different methods detect objects adaptively.
Some state-of-the-art detectors combine different feature pyramids with many mechanisms to enhance multi-level semantic information.
This work addresses that with an anchor-free detector that uses a shared encoder-decoder with an attention mechanism.
arXiv Detail & Related papers (2020-01-04T08:55:00Z)