SE-PSNet: Silhouette-based Enhancement Feature for Panoptic Segmentation Network
- URL: http://arxiv.org/abs/2107.05093v1
- Date: Sun, 11 Jul 2021 17:20:32 GMT
- Title: SE-PSNet: Silhouette-based Enhancement Feature for Panoptic Segmentation Network
- Authors: Shuo-En Chang, Yi-Cheng Yang, En-Ting Lin, Pei-Yung Hsiao, Li-Chen Fu
- Abstract summary: We propose a solution to tackle the panoptic segmentation task.
The structure combines the bottom-up method and the top-down method.
The network mainly pays attention to the quality of the mask.
- Score: 5.353718408751182
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Panoptic segmentation combines semantic and instance segmentation: the goal is to classify each pixel and assign it the corresponding instance ID. In this work, we propose a solution to the panoptic segmentation task. The overall structure combines the bottom-up and top-down methods, so better performance can be achieved while execution speed is maintained. The network mainly pays attention to the quality of the mask. In previous work, uneven object contours appear frequently, resulting in low-quality predictions. Accordingly, we propose enhancement features and corresponding loss functions for the silhouettes of objects and backgrounds to improve the masks. Meanwhile, we use a newly proposed confidence score to resolve occlusions and make the network favor higher-quality masks as prediction results. To verify our approach, we conducted experiments on the COCO and Cityscapes datasets and obtained competitive results with fast inference time.
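The abstract's per-pixel goal (a class label plus an instance ID) can be illustrated with a minimal merge of semantic and instance predictions. This is only a sketch of the generic panoptic output format, not the paper's actual method; the function name, the score-ordered painting rule for occlusions, and all data shapes are illustrative assumptions.

```python
import numpy as np

def merge_panoptic(semantic, instances, stuff_classes):
    """Combine a semantic map and scored instance masks into a panoptic map.

    semantic:      (H, W) int array of per-pixel class labels.
    instances:     list of (mask, class_id, score), mask a (H, W) bool array.
    stuff_classes: set of class ids treated as amorphous background ("stuff").

    Returns an (H, W, 2) array: channel 0 holds the class id,
    channel 1 the instance id (0 for stuff pixels).
    """
    h, w = semantic.shape
    panoptic = np.zeros((h, w, 2), dtype=np.int32)
    # Stuff pixels keep their semantic label with instance id 0.
    for c in stuff_classes:
        panoptic[semantic == c, 0] = c
    # Paint instances in increasing score order so higher-confidence masks
    # overwrite lower-confidence ones where they overlap (a simple
    # occlusion-resolution rule).
    for inst_id, (mask, class_id, score) in enumerate(
            sorted(instances, key=lambda t: t[2]), start=1):
        panoptic[mask, 0] = class_id
        panoptic[mask, 1] = inst_id
    return panoptic
```

Painting in score order is one common way to decide overlaps; SE-PSNet's contribution is precisely a better confidence score for making this choice.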
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net).
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z) - Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z) - Simple Contrastive Graph Clustering [41.396185271303956]
We propose a Simple Contrastive Graph Clustering (SCGC) algorithm to improve the existing methods.
Our algorithm outperforms recent contrastive deep clustering competitors with at least a seven-fold speedup on average.
arXiv Detail & Related papers (2022-05-11T06:45:19Z) - PFENet++: Boosting Few-shot Semantic Segmentation with the Noise-filtered Context-aware Prior Mask [62.37727055343632]
We revisit the prior mask guidance proposed in "Prior Guided Feature Enrichment Network for Few-Shot Segmentation".
We propose the Context-aware Prior Mask (CAPM) that leverages additional nearby semantic cues for better locating the objects in query images.
We take one step further by incorporating a lightweight Noise Suppression Module (NSM) to screen out the unnecessary responses.
arXiv Detail & Related papers (2021-09-28T15:07:43Z) - Self-Supervised Visual Representations Learning by Contrastive Mask Prediction [129.25459808288025]
We propose a novel contrastive mask prediction (CMP) task for visual representation learning.
MaskCo contrasts region-level features instead of view-level features, which makes it possible to identify the positive sample without any assumptions.
We evaluate MaskCo on training datasets beyond ImageNet and compare its performance with MoCo V2.
arXiv Detail & Related papers (2021-08-18T02:50:33Z) - An Efficient Multitask Neural Network for Face Alignment, Head Pose Estimation and Face Tracking [9.39854778804018]
We propose an efficient multitask face alignment, face tracking and head pose estimation network (ATPN).
ATPN achieves improved performance compared to previous state-of-the-art methods while having fewer parameters and FLOPs.
arXiv Detail & Related papers (2021-03-13T04:41:15Z) - Spatiotemporal Graph Neural Network based Mask Reconstruction for Video Object Segmentation [70.97625552643493]
This paper addresses the task of segmenting class-agnostic objects in a semi-supervised setting.
We propose a novel graph neural network (TG-Net) which captures the local contexts by utilizing all proposals.
arXiv Detail & Related papers (2020-12-10T07:57:44Z) - CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z) - EPSNet: Efficient Panoptic Segmentation Network with Cross-layer Attention Fusion [5.815742965809424]
We propose an Efficient Panoptic Segmentation Network (EPSNet) to tackle the panoptic segmentation task with fast inference speed.
Basically, EPSNet generates masks based on simple linear combination of prototype masks and mask coefficients.
To enhance the quality of shared prototypes, we adopt a module called "cross-layer attention fusion module".
arXiv Detail & Related papers (2020-03-23T09:11:44Z)
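The EPSNet entry above describes instance masks generated as a simple linear combination of prototype masks and per-instance mask coefficients. A minimal NumPy sketch of that assembly step might look like the following; the function name, sigmoid activation, and threshold value are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Assemble instance masks as linear combinations of prototype masks.

    prototypes:   (K, H, W) array of K shared prototype masks.
    coefficients: (N, K) array, one K-dim coefficient vector per instance.

    Each instance mask is sigmoid(sum_k coeff[n, k] * prototype[k]),
    binarized at `threshold`.
    """
    k, h, w = prototypes.shape
    # (N, K) @ (K, H*W) -> (N, H*W): one linear combination per instance.
    logits = coefficients @ prototypes.reshape(k, h * w)
    masks = 1.0 / (1.0 + np.exp(-logits))  # elementwise sigmoid
    return masks.reshape(-1, h, w) > threshold
```

The appeal of this design is speed: mask assembly reduces to a single matrix multiplication over shared prototypes instead of running a separate mask head per instance.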
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.