A novel Region of Interest Extraction Layer for Instance Segmentation
- URL: http://arxiv.org/abs/2004.13665v2
- Date: Thu, 1 Oct 2020 14:12:03 GMT
- Title: A novel Region of Interest Extraction Layer for Instance Segmentation
- Authors: Leonardo Rossi, Akbar Karimi, Andrea Prati
- Abstract summary: This paper is motivated by the need to overcome the limitations of existing RoI extractors.
The proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost the performance.
GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks.
- Score: 3.5493798890908104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given the wide adoption of deep neural network architectures for computer
vision tasks, many new applications are becoming increasingly feasible.
Among them, particular attention has recently been given to instance
segmentation, exploiting the results achievable by two-stage networks (such
as Mask R-CNN or Faster R-CNN), derived from R-CNN. In these complex
architectures, a crucial role is played by the Region of Interest (RoI)
extraction layer, devoted to extracting a coherent subset of features from a
single Feature Pyramid Network (FPN) layer attached on top of a backbone.
This paper is motivated by the need to overcome the limitations of existing
RoI extractors which select only one (the best) layer from FPN. Our intuition
is that all the layers of FPN retain useful information. Therefore, the
proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local
building blocks and attention mechanisms to boost the performance.
A comprehensive ablation study at component level is conducted to find the
best set of algorithms and parameters for the GRoIE layer. Moreover, GRoIE can
be integrated seamlessly with every two-stage architecture for both object
detection and instance segmentation tasks. Therefore, the improvements brought
about by the use of GRoIE in different state-of-the-art architectures are also
evaluated. The proposed layer yields up to a 1.1% AP improvement on
bounding box detection and a 1.7% AP improvement on instance segmentation.
The code is publicly available in a GitHub repository at
https://github.com/IMPLabUniPr/mmdetection/tree/groie_dev
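The contrast drawn in the abstract, between an extractor that picks a single FPN layer and one that uses all of them, can be illustrated with a minimal sketch. The level-selection rule is the scale heuristic commonly used with FPN; the element-wise sum is only an assumed stand-in for aggregation, since the abstract does not say how GRoIE combines layers, and the non-local and attention components are omitted. All function names and the `pooled` data layout are hypothetical.

```python
import math


def fpn_level(w, h, k_min=2, k_max=5, k0=4, canonical=224):
    """Map an RoI of size w x h to one pyramid level (common FPN heuristic)."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))  # clamp to the available levels


def single_level_extract(pooled, w, h):
    """Baseline: keep only the features pooled from the selected level."""
    return pooled[fpn_level(w, h)]


def all_level_extract(pooled):
    """Assumed aggregation: element-wise sum of the features from every level."""
    levels = sorted(pooled)
    return [sum(values) for values in zip(*(pooled[k] for k in levels))]


# Hypothetical RoI features already pooled to the same size at each level.
pooled = {2: [1.0, 0.0], 3: [0.5, 0.5], 4: [0.0, 2.0], 5: [1.0, 1.0]}

baseline = single_level_extract(pooled, 112, 112)  # uses level 3 only
aggregated = all_level_extract(pooled)             # uses levels 2 to 5
```

With the baseline, information held in levels 2, 4, and 5 never reaches the detection or mask head for this RoI; the aggregated variant keeps a contribution from each level, which is the intuition the abstract states.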
Related papers
- Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves an improvement of 2.0% and 1.4% mask AP when integrated into the strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.