Trident Pyramid Networks: The importance of processing at the feature
pyramid level for better object detection
- URL: http://arxiv.org/abs/2110.04004v1
- Date: Fri, 8 Oct 2021 09:59:59 GMT
- Title: Trident Pyramid Networks: The importance of processing at the feature
pyramid level for better object detection
- Authors: Cédric Picron, Tinne Tuytelaars
- Abstract summary: We present a new core architecture called Trident Pyramid Network (TPN).
TPN allows for a deeper design and for a better balance between communication-based processing and self-processing.
We show consistent improvements when using our TPN core on the COCO object detection benchmark, outperforming the popular BiFPN baseline by 1.5 AP.
- Score: 50.008529403150206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature pyramids have become ubiquitous in multi-scale computer vision tasks
such as object detection. Based on their importance, we divide a computer
vision network into three parts: a backbone (generating a feature pyramid), a
core (refining the feature pyramid) and a head (generating the final output).
Most existing networks operating on feature pyramids, named cores, are shallow
and mostly focus on communication-based processing in the form of top-down and
bottom-up operations. We present a new core architecture called Trident Pyramid
Network (TPN), that allows for a deeper design and for a better balance between
communication-based processing and self-processing. We show consistent
improvements when using our TPN core on the COCO object detection benchmark,
outperforming the popular BiFPN baseline by 1.5 AP. Additionally, we
empirically show that it is more beneficial to put additional computation into
the TPN core, rather than into the backbone, by outperforming a ResNet-101+FPN
baseline with our ResNet-50+TPN network by 1.7 AP, while operating under
similar computation budgets. This emphasizes the importance of performing
computation at the feature pyramid level in modern-day object detection
systems. Code will be released.
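The backbone / core / head split described in the abstract can be illustrated with a minimal Python sketch. This is a toy illustration, not the TPN architecture: the 1-D feature maps, the pooling and upsampling helpers, the ReLU used as self-processing, and the ordering of operations inside `core` are all assumptions made for the example. It only shows the difference between communication-based processing (top-down and bottom-up exchange between pyramid levels) and self-processing (each level refined independently).

```python
from typing import List

FeatureMap = List[float]    # toy 1-D feature map (assumption: real maps are 2-D tensors)
Pyramid = List[FeatureMap]  # index 0 is the finest (highest-resolution) level


def downsample(x: FeatureMap) -> FeatureMap:
    """Halve the resolution by averaging neighbouring pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x: FeatureMap) -> FeatureMap:
    """Double the resolution by repeating every value twice."""
    return [v for v in x for _ in range(2)]


def backbone(signal: FeatureMap, levels: int = 3) -> Pyramid:
    """Hypothetical backbone: generates a feature pyramid from the input."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def top_down(p: Pyramid) -> Pyramid:
    """Communication: coarser levels pass information to finer levels."""
    out = [list(m) for m in p]
    for i in range(len(out) - 2, -1, -1):
        out[i] = [a + b for a, b in zip(out[i], upsample(out[i + 1]))]
    return out


def bottom_up(p: Pyramid) -> Pyramid:
    """Communication: finer levels pass information to coarser levels."""
    out = [list(m) for m in p]
    for i in range(1, len(out)):
        out[i] = [a + b for a, b in zip(out[i], downsample(out[i - 1]))]
    return out


def self_process(p: Pyramid) -> Pyramid:
    """Self-processing: each level is refined on its own (placeholder ReLU)."""
    return [[max(0.0, v) for v in m] for m in p]


def core(p: Pyramid, num_layers: int = 2) -> Pyramid:
    """Hypothetical core: refines the pyramid; more layers give a deeper core."""
    for _ in range(num_layers):
        p = bottom_up(self_process(top_down(p)))
    return p


def head(p: Pyramid) -> List[float]:
    """Hypothetical head: produces one output per level (here, the max value)."""
    return [max(m) for m in p]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    refined = core(backbone(signal))
    print([len(m) for m in refined])  # level resolutions are preserved by the core
    print(head(refined))
```

In this sketch, spending more computation "at the feature pyramid level" corresponds to raising `num_layers` in `core` while leaving `backbone` unchanged.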
Related papers
- Active search and coverage using point-cloud reinforcement learning [50.741409008225766]
This paper presents an end-to-end deep reinforcement learning solution for target search and coverage.
We show that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the number of points.
We also show that multi-head attention for point clouds helps the agent learn faster but converges to the same outcome.
arXiv Detail & Related papers (2023-12-18T18:16:30Z)
- RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection [10.847953426161924]
We propose RCNet, which consists of Reverse Feature Pyramid (RevFP) and Cross-scale Shift Network (CSN).
RevFP utilizes local bidirectional feature fusion to simplify the bidirectional pyramid inference pipeline.
CSN directly propagates representations to both adjacent and non-adjacent levels to make multi-scale features more correlated.
arXiv Detail & Related papers (2021-10-23T04:00:25Z)
- Siamese Transformer Pyramid Networks for Real-Time UAV Tracking [3.0969191504482243]
We introduce the Siamese Transformer Pyramid Network (SiamTPN), which inherits the advantages from both CNN and Transformer architectures.
Experiments on both aerial and prevalent tracking benchmarks achieve competitive results while operating at high speed.
Our fastest tracker variant runs at over 30 Hz on a single CPU core and obtains an AUC score of 58.1% on the LaSOT dataset.
arXiv Detail & Related papers (2021-10-17T13:48:31Z)
- GraphFPN: Graph Feature Pyramid Network for Object Detection [44.481481251032264]
We propose graph feature pyramid networks that are capable of adapting their topological structures to varying intrinsic image structures.
The proposed graph feature pyramid network can enhance the multiscale features from a convolutional feature pyramid network.
We evaluate our graph feature pyramid network in the object detection task by integrating it into the Faster R-CNN algorithm.
arXiv Detail & Related papers (2021-08-02T01:19:38Z)
- P2T: Pyramid Pooling Transformer for Scene Understanding [62.41912463252468]
Using our pooling-based MHSA, we build a downstream-task-oriented transformer network, dubbed Pyramid Pooling Transformer (P2T).
arXiv Detail & Related papers (2021-06-22T18:28:52Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves improvements of 2.0% and 1.4% mask AP when integrated into strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- Implicit Feature Pyramid Network for Object Detection [22.530998243247154]
We present an implicit feature pyramid network (i-FPN) for object detection.
We propose to use an implicit function, recently introduced in the deep equilibrium model (DEQ), to model the transformation of FPN.
Experimental results on the MS COCO dataset show that i-FPN can significantly boost detection performance compared to baseline detectors.
arXiv Detail & Related papers (2020-12-25T11:30:27Z)
- Hierarchical Neural Architecture Search for Deep Stereo Matching [131.94481111956853]
We propose the first end-to-end hierarchical NAS framework for deep stereo matching.
Our framework incorporates task-specific human knowledge into the neural architecture search framework.
It ranks first in accuracy on the KITTI stereo 2012, 2015 and Middlebury benchmarks, as well as first on the SceneFlow dataset.
arXiv Detail & Related papers (2020-10-26T11:57:37Z)
- ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching [10.303618438296981]
Feature Pyramid Networks (FPN) have proven to be a suitable feature extractor for CNN-based dense matching tasks.
We present ResFPN -- a multi-resolution feature pyramid network with multiple residual skip connections.
In our ablation study, we demonstrate the effectiveness of our novel architecture with clearly higher accuracy than FPN.
arXiv Detail & Related papers (2020-06-22T13:31:31Z)
- Feature Pyramid Grids [140.11116687047058]
We present Feature Pyramid Grids (FPG), a deep multi-pathway feature pyramid.
FPG can improve single-pathway feature pyramid networks by significantly increasing their performance at similar computation cost.
arXiv Detail & Related papers (2020-04-07T17:59:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.