FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
- URL: http://arxiv.org/abs/2108.07058v2
- Date: Tue, 17 Aug 2021 13:11:34 GMT
- Title: FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
- Authors: Shihua Huang, Zhichao Lu, Ran Cheng, Cheng He
- Abstract summary: We propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled features.
We then integrate these two modules in a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN).
In particular, our FaPN achieves the state-of-the-art of 56.7% mIoU on ADE20K when integrated within Mask-Former.
- Score: 6.613724825924151
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in deep neural networks have led to remarkable
leaps forward in dense image prediction. However, the issue of feature
alignment remains neglected by most existing approaches for the sake of simplicity.
Direct pixel addition between upsampled and local features leads to feature
maps with misaligned contexts that, in turn, translate to mis-classifications
in prediction, especially on object boundaries. In this paper, we propose a
feature alignment module that learns transformation offsets of pixels to
contextually align upsampled higher-level features; and another feature
selection module to emphasize the lower-level features with rich spatial
details. We then integrate these two modules in a top-down pyramidal
architecture and present the Feature-aligned Pyramid Network (FaPN). Extensive
experimental evaluations on four dense prediction tasks and four datasets have
demonstrated the efficacy of FaPN, yielding an overall improvement of 1.2 - 2.6
points in AP / mIoU over FPN when paired with Faster / Mask R-CNN. In
particular, our FaPN achieves the state-of-the-art of 56.7% mIoU on ADE20K when
integrated within Mask-Former. The code is available from
https://github.com/EMI-Group/FaPN.
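The alignment idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`upsample2x`, `bilinear`, `align`, `fuse`) are hypothetical, the feature maps are single-channel 2-D lists, nearest-neighbour upsampling is an assumption, and the per-pixel offsets are supplied by hand, whereas the paper learns them. The feature selection module is not modelled here.

```python
import math

def upsample2x(coarse):
    """Nearest-neighbour 2x upsampling of a 2-D list (assumed upsampling mode)."""
    return [[coarse[y // 2][x // 2] for x in range(2 * len(coarse[0]))]
            for y in range(2 * len(coarse))]

def bilinear(fmap, y, x):
    """Sample fmap at a fractional (y, x) position, clamping to the border."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
    bot = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
    return top * (1 - wy) + bot * wy

def align(upsampled, offsets):
    """Resample every pixel at its own position plus a (dy, dx) offset.

    In FaPN the offsets are learned; here they are a hand-written input.
    """
    return [[bilinear(upsampled, y + offsets[y][x][0], x + offsets[y][x][1])
             for x in range(len(upsampled[0]))]
            for y in range(len(upsampled))]

def fuse(fine, coarse, offsets=None):
    """Top-down fusion: upsample the coarse map, optionally align, then add."""
    up = upsample2x(coarse)
    if offsets is not None:
        up = align(up, offsets)
    return [[f + u for f, u in zip(fine_row, up_row)]
            for fine_row, up_row in zip(fine, up)]

# Toy case: the coarse map places an object boundary between columns 1 and 2
# after upsampling, while the (hypothetical) true boundary lies between 0 and 1.
coarse = [[0.0, 1.0]]
fine = [[0.0] * 4 for _ in range(2)]
offsets = [[(0.0, 0.0)] * 4 for _ in range(2)]
for row in offsets:
    row[1] = (0.0, 1.0)  # column 1 borrows its value from column 2

print(fuse(fine, coarse))           # direct addition keeps the shifted boundary
print(fuse(fine, coarse, offsets))  # offsets move the boundary one pixel left
```

In this toy case the hand-picked offset shifts the upsampled boundary by one pixel before the addition, which is the kind of correction the abstract attributes to learned offsets near object boundaries.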
Related papers
- An Efficient MLP-based Point-guided Segmentation Network for Ore Images
with Ambiguous Boundary [12.258442550351178]
This paper proposes a lightweight framework based on Multi-Layer Perceptron (MLP), which focuses on solving the problem of edge burring.
Our approach achieves a remarkable processing speed of over 27 frames per second with a model size of only 73 MB.
Our method delivers a consistently high level of accuracy, with impressive performance scores of 60.4 and 48.9 in $AP_{50}^{box}$ and $AP_{50}^{mask}$, respectively.
arXiv Detail & Related papers (2024-02-27T10:09:29Z)
- CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration [9.57539651520755]
CoFiI2P is a novel I2P registration network that extracts correspondences in a coarse-to-fine manner.
In the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information.
In the fine matching module, point/pixel pairs are established with the guidance of super-point/super-pixel correspondences.
arXiv Detail & Related papers (2023-09-26T04:32:38Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- AFPN: Asymptotic Feature Pyramid Network for Object Detection [16.86715579071991]
This paper proposes an asymptotic feature pyramid network (AFPN) to support direct interaction at non-adjacent levels.
AFPN is initiated by fusing two adjacent low-level features and then incorporates higher-level features into the fusion process.
We incorporate the proposed AFPN into both two-stage and one-stage object detection frameworks and evaluate with the MS-COCO 2017 validation and test datasets.
arXiv Detail & Related papers (2023-06-28T07:58:49Z)
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art computation-accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for
Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves improvements of 2.0% and 1.4% mask AP when integrated into strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for
Image Recognition [123.59890802196797]
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition.
We construct convolutional layers inside a RepMLP during training and merge them into the FC for inference.
By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs.
arXiv Detail & Related papers (2021-05-05T06:17:40Z)
- Regularized Densely-connected Pyramid Network for Salient Instance
Segmentation [73.17802158095813]
We propose a new pipeline for end-to-end salient instance segmentation (SIS).
To better use the rich feature hierarchies in deep networks, we propose regularized dense connections.
A novel multi-level RoIAlign based decoder is introduced to adaptively aggregate multi-level features for better mask predictions.
arXiv Detail & Related papers (2020-08-28T00:13:30Z)
- ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid
Networks for Accurate Dense Pixel Matching [10.303618438296981]
Feature Pyramid Networks (FPN) have proven to be a suitable feature extractor for CNN-based dense matching tasks.
We present ResFPN -- a multi-resolution feature pyramid network with multiple residual skip connections.
In our ablation study, we demonstrate the effectiveness of our novel architecture with clearly higher accuracy than FPN.
arXiv Detail & Related papers (2020-06-22T13:31:31Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for
Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)