ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid
Networks for Accurate Dense Pixel Matching
- URL: http://arxiv.org/abs/2006.12235v1
- Date: Mon, 22 Jun 2020 13:31:31 GMT
- Title: ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid
Networks for Accurate Dense Pixel Matching
- Authors: Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier
Stricker
- Abstract summary: Feature Pyramid Networks (FPN) have proven to be a suitable feature extractor for CNN-based dense matching tasks.
We present ResFPN -- a multi-resolution feature pyramid network with multiple residual skip connections.
In our ablation study, we demonstrate the effectiveness of our novel architecture with clearly higher accuracy than FPN.
- Score: 10.303618438296981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dense pixel matching is required for many computer vision algorithms such as
disparity, optical flow or scene flow estimation. Feature Pyramid Networks
(FPN) have proven to be a suitable feature extractor for CNN-based dense
matching tasks. FPN generates well localized and semantically strong features
at multiple scales. However, the generic FPN is not utilizing its full
potential, due to its reasonable but limited localization accuracy. Thus, we
present ResFPN -- a multi-resolution feature pyramid network with multiple
residual skip connections, where at any scale, we leverage the information from
higher resolution maps for stronger and better localized features. In our
ablation study, we demonstrate the effectiveness of our novel architecture with
clearly higher accuracy than FPN. In addition, we verify the superior accuracy
of ResFPN in many different pixel matching applications on established datasets
like KITTI, Sintel, and FlyingThings3D.
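The abstract does not spell out the exact fusion rule, so the following is only a toy sketch of the general idea: a plain FPN top-down pass next to a variant in which every scale also receives residual skips from all higher-resolution maps. Everything here is an assumption made for illustration (single-channel maps stored as nested lists, average pooling, nearest-neighbour upsampling, plain addition, and the hypothetical function names); the actual ResFPN uses learned convolutions and may combine features differently.

```python
# Toy sketch only: single-channel 2D maps as nested lists, no learned weights.
# The fusion rule is an assumption, not the architecture from the paper.
# Map heights and widths must be divisible by 2 at every level.

def downsample2x(fmap):
    """Halve the resolution with 2x2 average pooling."""
    h, w = len(fmap), len(fmap[0])
    return [
        [(fmap[y][x] + fmap[y][x + 1] + fmap[y + 1][x] + fmap[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

def upsample2x(fmap):
    """Double the resolution by nearest-neighbour repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add(a, b):
    """Element-wise sum of two maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def build_encoder(image, levels):
    """Bottom-up pathway; index 0 is the full-resolution map."""
    maps = [image]
    for _ in range(levels - 1):
        maps.append(downsample2x(maps[-1]))
    return maps

def fpn_top_down(enc):
    """Plain FPN: P_l = C_l + up(P_{l+1}), from coarsest to finest."""
    pyr = [None] * len(enc)
    pyr[-1] = enc[-1]
    for l in range(len(enc) - 2, -1, -1):
        pyr[l] = add(enc[l], upsample2x(pyr[l + 1]))
    return pyr

def resfpn_like_top_down(enc):
    """Hypothetical variant: level l also adds every finer encoder map
    C_0..C_{l-1}, downsampled to its own resolution, as residual skips."""
    pyr = [None] * len(enc)
    for l in range(len(enc) - 1, -1, -1):
        fused = enc[l]
        for k in range(l):
            skip = enc[k]
            for _ in range(l - k):
                skip = downsample2x(skip)
            fused = add(fused, skip)
        if l < len(enc) - 1:
            fused = add(fused, upsample2x(pyr[l + 1]))
        pyr[l] = fused
    return pyr
```

Calling `build_encoder` on an 8x8 map with three levels and passing the result to both functions yields pyramids of the same shapes (8x8, 4x4, 2x2). The only difference is the extra higher-resolution terms added at the coarser levels.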
Related papers
- Learning Unified Representations for Multi-Resolution Face Recognition [0.7378853859331619]
Branch-to-Trunk network (BTNet) is a representation learning method for multi-resolution face recognition.
Our experiments demonstrate strong performance on face recognition benchmarks, both for multi-resolution identity matching and feature aggregation, with much less computation and parameter storage.
arXiv Detail & Related papers (2023-10-14T11:26:43Z)
- MultiRes-NetVLAD: Augmenting Place Recognition Training with
Low-Resolution Imagery [28.875236694573815]
We augment NetVLAD representation learning with low-resolution image pyramid encoding.
The resultant multi-resolution feature pyramid can be conveniently aggregated through VLAD into a single compact representation.
We show that the underlying learnt feature tensor can be combined with existing multi-scale approaches to improve their baseline performance.
arXiv Detail & Related papers (2022-02-18T11:53:01Z)
- Trident Pyramid Networks: The importance of processing at the feature
pyramid level for better object detection [50.008529403150206]
We present a new core architecture called Trident Pyramid Network (TPN).
TPN allows for a deeper design and for a better balance between communication-based processing and self-processing.
We show consistent improvements when using our TPN core on the object detection benchmark, outperforming the popular BiFPN baseline by 1.5 AP.
arXiv Detail & Related papers (2021-10-08T09:59:59Z)
- FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [6.613724825924151]
We propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled features.
We then integrate these two modules in a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN).
In particular, our FaPN achieves the state-of-the-art of 56.7% mIoU on ADE20K when integrated within Mask-Former.
arXiv Detail & Related papers (2021-08-16T12:52:42Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for
Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves an improvement of 2.0% and 1.4% mask AP when integrated into strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- Implicit Feature Pyramid Network for Object Detection [22.530998243247154]
We present an implicit feature pyramid network (i-FPN) for object detection.
We propose to use an implicit function, recently introduced in the deep equilibrium model (DEQ), to model the transformation of FPN.
Experimental results on the MS COCO dataset show that i-FPN can significantly boost detection performance compared to baseline detectors.
arXiv Detail & Related papers (2020-12-25T11:30:27Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in
Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- Dynamic Feature Pyramid Networks for Object Detection [40.24111664691307]
We introduce an inception FPN in which each layer contains convolution filters with different kernel sizes to enlarge the receptive field.
We propose a new dynamic FPN (DyFPN) which consists of multiple branches with different computational costs.
Experiments conducted on benchmarks demonstrate that the proposed DyFPN significantly improves performance with the optimal allocation of computation resources.
arXiv Detail & Related papers (2020-12-01T19:03:55Z)
- Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z)
- Temporal Pyramid Network for Action Recognition [129.12076009042622]
We propose a generic Temporal Pyramid Network (TPN) at the feature-level, which can be flexibly integrated into 2D or 3D backbone networks.
TPN shows consistent improvements over other challenging baselines on several action recognition datasets.
arXiv Detail & Related papers (2020-04-07T17:17:23Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for
Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)