Regularized Densely-connected Pyramid Network for Salient Instance
Segmentation
- URL: http://arxiv.org/abs/2008.12416v2
- Date: Fri, 12 Mar 2021 03:21:50 GMT
- Title: Regularized Densely-connected Pyramid Network for Salient Instance
Segmentation
- Authors: Yu-Huan Wu, Yun Liu, Le Zhang, Wang Gao, and Ming-Ming Cheng
- Abstract summary: We propose a new pipeline for end-to-end salient instance segmentation (SIS).
To better use the rich feature hierarchies in deep networks, we propose the regularized dense connections.
A novel multi-level RoIAlign based decoder is introduced to adaptively aggregate multi-level features for better mask predictions.
- Score: 73.17802158095813
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Much of the recent effort on salient object detection (SOD) has been
devoted to producing accurate saliency maps without being aware of their
instance labels. To address this, we propose a new pipeline for end-to-end salient
instance segmentation (SIS) that predicts a class-agnostic mask for each
detected salient instance. To better use the rich feature hierarchies in deep
networks and enhance the side predictions, we propose regularized dense
connections, which attentively promote informative features and suppress
non-informative ones from all feature pyramids. A novel multi-level RoIAlign
based decoder is introduced to adaptively aggregate multi-level features for
better mask predictions. Such strategies can be well encapsulated into the Mask
R-CNN pipeline. Extensive experiments on popular benchmarks demonstrate that
our design significantly outperforms existing state-of-the-art competitors by 6.3%
(58.6% vs. 52.3%) in terms of the AP metric. The code is available at
https://github.com/yuhuan-wu/RDPNet.
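To make the two main ideas concrete, here is a minimal PyTorch-style sketch of how attention-gated dense connections across pyramid levels and a multi-level RoIAlign mask decoder could be wired. Every module name, layer choice, and parameter below (e.g. `RegularizedDenseConnection`, the 1x1 sigmoid gates, `MultiLevelRoIAlignHead`) is an illustrative assumption made for this listing, not the authors' implementation; see https://github.com/yuhuan-wu/RDPNet for the actual code.

```python
# Hypothetical sketch, not the authors' code: attention-gated dense fusion of
# FPN levels plus a multi-level RoIAlign mask decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import roi_align


class RegularizedDenseConnection(nn.Module):
    """Fuse all pyramid levels into one target level, gating each source so
    that informative features are promoted and uninformative ones suppressed
    (one plausible reading of 'regularized dense connections')."""

    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        # One sigmoid gate per source level; a gate can shut a connection off.
        self.gates = nn.ModuleList(
            nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_levels))
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats, target_idx):
        target_size = feats[target_idx].shape[-2:]
        fused = torch.zeros_like(feats[target_idx])
        for gate, f in zip(self.gates, feats):
            f = F.interpolate(f, size=target_size, mode="bilinear",
                              align_corners=False)
            fused = fused + torch.sigmoid(gate(f)) * f
        return self.fuse(fused)


class MultiLevelRoIAlignHead(nn.Module):
    """Pool the same RoIs from every pyramid level, sum the crops, and
    predict one class-agnostic mask per RoI."""

    def __init__(self, channels: int, pool_size: int = 14):
        super().__init__()
        self.pool_size = pool_size
        self.mask_pred = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1))

    def forward(self, feats, rois, strides):
        # rois: (K, 5) tensor of (batch_index, x1, y1, x2, y2) boxes.
        pooled = 0.0
        for f, stride in zip(feats, strides):
            pooled = pooled + roi_align(
                f, rois, output_size=self.pool_size,
                spatial_scale=1.0 / stride, aligned=True)
        return self.mask_pred(pooled)  # (K, 1, pool_size, pool_size) logits
```

In RDPNet these ideas live inside the Mask R-CNN pipeline; the sketch only illustrates the data flow of gated dense fusion and multi-level RoI pooling.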
Related papers
- Bridge the Points: Graph-based Few-shot Segment Anything Semantically [79.1519244940518]
Recent advancements in pre-training techniques have enhanced the capabilities of vision foundation models.
Recent studies extend SAM to few-shot semantic segmentation (FSS).
We propose a simple yet effective approach based on graph analysis.
arXiv Detail & Related papers (2024-10-09T15:02:28Z)
- M$^3$Net: Multilevel, Mixed and Multistage Attention Network for Salient Object Detection [22.60675416709486]
M$^3$Net is an attention network for salient object detection.
A cross-attention approach achieves interaction between multilevel features (a rough sketch follows this entry).
A Mixed Attention Block models context at both global and local levels.
A multilevel supervision strategy optimizes the aggregated features stage by stage.
arXiv Detail & Related papers (2023-09-15T12:46:14Z)
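As a rough illustration of the cross-attention interaction mentioned in the M$^3$Net entry above, the sketch below lets a coarse, high-level feature map attend to a finer, low-level one. The module name `CrossLevelAttention`, the head count, and the tensor shapes are assumptions made for this listing, not the paper's design.

```python
# Hypothetical sketch of cross-attention between two pyramid levels
# (illustrative only, not the M^3Net implementation).
import torch
import torch.nn as nn


class CrossLevelAttention(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, high_level: torch.Tensor, low_level: torch.Tensor):
        # high_level: (B, C, h, w) queries; low_level: (B, C, H, W) keys/values.
        b, c, h, w = high_level.shape
        q = high_level.flatten(2).transpose(1, 2)   # (B, h*w, C)
        kv = low_level.flatten(2).transpose(1, 2)   # (B, H*W, C)
        out, _ = self.attn(q, kv, kv)               # queries attend to low level
        out = self.norm(out + q)                    # residual + layer norm
        return out.transpose(1, 2).reshape(b, c, h, w)


# Example: a 16x16 high-level map attends to a 64x64 low-level map.
attn = CrossLevelAttention(channels=256)
fused = attn(torch.randn(1, 256, 16, 16), torch.randn(1, 256, 64, 64))
```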
- Position-Guided Point Cloud Panoptic Segmentation Transformer [118.17651196656178]
This work begins by applying the query-based mask prediction paradigm to LiDAR-based point cloud segmentation and obtains a simple yet effective baseline.
We observe that instances in sparse point clouds are small relative to the whole scene and often share similar geometry while lacking distinctive appearance, properties that are rare in the image domain.
The method, named Position-guided Point cloud Panoptic segmentation transFormer (P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% on the SemanticKITTI and nuScenes benchmarks, respectively.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity [59.1823948436411]
We propose a novel approach for mask proposals, Generic Grouping Networks (GGNs).
Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows.
arXiv Detail & Related papers (2022-04-12T22:37:49Z)
- Pyramid Fusion Transformer for Semantic Segmentation [44.57867861592341]
We propose the Pyramid Fusion Transformer (PFT) for per-mask semantic segmentation with multi-scale features.
We achieve competitive performance on three widely used semantic segmentation datasets.
arXiv Detail & Related papers (2022-01-11T16:09:25Z)
- MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, for combining features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
- FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [6.613724825924151]
We propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled features (a rough sketch follows this entry).
We then integrate this module, together with a feature selection module, in a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN).
In particular, FaPN achieves state-of-the-art performance of 56.7% mIoU on ADE20K when integrated within MaskFormer.
arXiv Detail & Related papers (2021-08-16T12:52:42Z)
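The FaPN entry above aligns upsampled features with learned per-pixel offsets; below is a deformable-convolution sketch of that general idea. The module name `FeatureAlign`, the offset predictor, and the use of `torchvision.ops.DeformConv2d` are assumptions about how such alignment is commonly built, not FaPN's exact module.

```python
# Hypothetical sketch of offset-based feature alignment
# (inspired by, but not identical to, FaPN's feature alignment module).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d


class FeatureAlign(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Offsets are predicted from the concatenation of the upsampled
        # top-down feature and the lateral bottom-up feature.
        self.offset_pred = nn.Conv2d(2 * channels,
                                     2 * kernel_size * kernel_size,
                                     kernel_size=1)
        self.align = DeformConv2d(channels, channels, kernel_size,
                                  padding=kernel_size // 2)

    def forward(self, top_down: torch.Tensor, lateral: torch.Tensor):
        # Upsample the coarse top-down feature to the lateral resolution.
        up = F.interpolate(top_down, size=lateral.shape[-2:],
                           mode="bilinear", align_corners=False)
        offsets = self.offset_pred(torch.cat([up, lateral], dim=1))
        aligned = self.align(up, offsets)  # warp the upsampled feature
        return aligned + lateral           # fuse as in a standard FPN
```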
- Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence [14.79639149658596]
We propose a one-round end-to-end training approach for weakly supervised salient object detection via scribble annotations.
Our method achieves a new state-of-the-art performance on six benchmarks.
arXiv Detail & Related papers (2020-12-08T12:49:40Z)
- PointINS: Point-based Instance Segmentation [117.38579097923052]
Mask representation in instance segmentation with Point-of-Interest (PoI) features is challenging because learning a high-dimensional mask feature for each instance incurs a heavy computational burden.
We propose an instance-aware convolution, which decomposes this mask representation learning task into two tractable modules (a rough sketch follows this entry).
Along with instance-aware convolution, we propose PointINS, a simple and practical instance segmentation approach.
arXiv Detail & Related papers (2020-03-13T08:24:58Z)
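For the PointINS entry, an "instance-aware convolution" can be pictured as generating a small kernel from each point-of-interest feature and applying it to a shared feature map. The sketch below is a generic dynamic-convolution illustration under that assumption; the class name, shapes, and decomposition are not taken from the paper.

```python
# Hypothetical sketch of per-instance dynamic convolution
# (a generic stand-in for PointINS' instance-aware convolution).
import torch
import torch.nn as nn
import torch.nn.functional as F


class InstanceAwareConv(nn.Module):
    def __init__(self, poi_dim: int, feat_channels: int, kernel_size: int = 1):
        super().__init__()
        self.feat_channels = feat_channels
        self.kernel_size = kernel_size
        # Module 1: generate a lightweight, instance-specific kernel
        # from the point-of-interest (PoI) feature.
        self.kernel_gen = nn.Linear(
            poi_dim, feat_channels * kernel_size * kernel_size)

    def forward(self, shared_feat: torch.Tensor, poi_feats: torch.Tensor):
        # shared_feat: (1, C, H, W) mask feature shared by all instances.
        # poi_feats:   (N, poi_dim), one feature vector per detected instance.
        n = poi_feats.shape[0]
        kernels = self.kernel_gen(poi_feats).view(
            n, 1, self.feat_channels, self.kernel_size, self.kernel_size)
        masks = []
        # Module 2: apply each instance's kernel to the shared feature map.
        for k in kernels:
            masks.append(F.conv2d(shared_feat, k,
                                  padding=self.kernel_size // 2))
        return torch.cat(masks, dim=0)  # (N, 1, H, W) mask logits
```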
This list is automatically generated from the titles and abstracts of the papers on this site.