Unifying Training and Inference for Panoptic Segmentation
- URL: http://arxiv.org/abs/2001.04982v2
- Date: Tue, 26 May 2020 18:44:52 GMT
- Title: Unifying Training and Inference for Panoptic Segmentation
- Authors: Qizhu Li, Xiaojuan Qi, Philip H.S. Torr
- Abstract summary: We present an end-to-end network to bridge the gap between training and inference for panoptic segmentation.
Our system sets new records on the popular street scene dataset, Cityscapes, achieving 61.4 PQ with a ResNet-50 backbone.
Our network flexibly works with and without object mask cues, performing competitively under both settings.
- Score: 111.44758195510838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an end-to-end network to bridge the gap between the training and
inference pipelines for panoptic segmentation, a task that seeks to partition an
image into semantic regions for "stuff" and object instances for "things". In
contrast to recent works, our network exploits a parametrised, yet lightweight
panoptic segmentation submodule, powered by an end-to-end learnt dense instance
affinity, to capture the probability that any pair of pixels belong to the same
instance. This panoptic submodule gives rise to a novel propagation mechanism
for panoptic logits and enables the network to output a coherent panoptic
segmentation map for both "stuff" and "thing" classes, without any
post-processing. Reaping the benefits of end-to-end training, our full system
sets new records on the popular street scene dataset, Cityscapes, achieving
61.4 PQ with a ResNet-50 backbone using only the fine annotations. On the
challenging COCO dataset, our ResNet-50-based network also delivers
state-of-the-art accuracy of 43.4 PQ. Moreover, our network flexibly works with
and without object mask cues, performing competitively under both settings,
which is of interest for applications with constrained computation budgets.
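The dense instance affinity described above estimates, for every pair of pixels, the probability that they belong to the same instance, and is used to propagate panoptic logits so the network outputs a coherent map without post-processing. The abstract does not give the exact operator, so the following is a minimal PyTorch sketch of one plausible reading: a dot-product affinity between learnt pixel embeddings, row-normalised and used to average logits across pixels. The function name, the embedding input, and the softmax normalisation are assumptions for illustration, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def propagate_panoptic_logits(logits, embeddings):
    """Hypothetical sketch: propagate panoptic logits with a dense pairwise
    affinity derived from learnt per-pixel instance embeddings.

    logits:     (B, C, H, W) initial panoptic logits
    embeddings: (B, D, H, W) per-pixel instance embeddings
    """
    b, c, h, w = logits.shape
    d = embeddings.shape[1]

    emb = embeddings.view(b, d, h * w).permute(0, 2, 1)   # (B, N, D), N = H*W
    logit_flat = logits.view(b, c, h * w)                  # (B, C, N)

    # Dense affinity: a score for every pixel pair, here a dot product of
    # embeddings followed by a row-wise softmax (one possible choice).
    affinity = torch.einsum('bid,bjd->bij', emb, emb)      # (B, N, N)
    affinity = F.softmax(affinity, dim=-1)

    # Each output pixel m receives an affinity-weighted average of the
    # logits at all source pixels n.
    propagated = torch.einsum('bcn,bmn->bcm', logit_flat, affinity)
    return propagated.view(b, c, h, w)

# Illustrative shapes only; a real system would compute the affinity on a
# downsampled map, since the N x N matrix grows quadratically with resolution.
out = propagate_panoptic_logits(torch.randn(1, 19, 32, 64),
                                torch.randn(1, 8, 32, 64))
```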
Related papers
- Early Fusion of Features for Semantic Segmentation [10.362589129094975]
This paper introduces a novel segmentation framework that integrates a classifier network with a reverse HRNet architecture for efficient image segmentation.
Our methodology is rigorously tested across several benchmark datasets including Mapillary Vistas, Cityscapes, CamVid, COCO, and PASCAL-VOC2012.
The results demonstrate the effectiveness of our proposed model in achieving high segmentation accuracy, indicating its potential for various applications in image analysis.
arXiv Detail & Related papers (2024-02-08T22:58:06Z)
- You Only Segment Once: Towards Real-Time Panoptic Segmentation [68.91492389185744]
YOSO is a real-time panoptic segmentation framework.
YOSO predicts masks via dynamic convolutions between panoptic kernels and image feature maps (see the kernel-based sketch after this list).
YOSO achieves 46.4 PQ, 45.6 FPS on COCO; 52.5 PQ, 22.6 FPS on Cityscapes; 38.0 PQ, 35.4 FPS on ADE20K.
arXiv Detail & Related papers (2023-03-26T07:55:35Z)
- Panoptic Segmentation Meets Remote Sensing [0.0]
Panoptic segmentation combines instance and semantic predictions, allowing the detection of "things" and "stuff" simultaneously.
This study aims to address panoptic segmentation in remote sensing and to increase its operability.
arXiv Detail & Related papers (2021-11-23T19:48:55Z)
- K-Net: Towards Unified Image Segmentation [78.32096542571257]
The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels (see the kernel-based sketch after this list).
K-Net can be trained in an end-to-end manner with bipartite matching, and its training and inference are naturally NMS-free and box-free.
arXiv Detail & Related papers (2021-06-28T17:18:21Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector (see the sketch after this list).
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
- Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning [86.45526827323954]
Weakly-supervised semantic segmentation is a challenging task as no pixel-wise label information is provided for training.
We propose an iterative algorithm to learn such pairwise relations.
We show that the proposed algorithm performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2020-02-19T10:32:03Z)
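The YOSO and K-Net entries above both describe predicting masks by applying a set of learnt (panoptic) kernels to the image feature map as dynamic convolutions. Neither summary gives the exact operator, so the sketch below shows only the generic form of that idea: each kernel acts as a dynamic 1x1 convolution, producing one mask-logit map per kernel. The function name and shapes are hypothetical.

```python
import torch

def kernel_mask_prediction(features, kernels):
    """Hypothetical sketch of kernel-based mask prediction.

    features: (B, C, H, W) image feature map
    kernels:  (B, K, C)    K learnt kernels per image
    returns:  (B, K, H, W) mask logits, one map per kernel
    """
    # Each kernel is applied as a dynamic 1x1 convolution:
    # masks[b, k, h, w] = sum_c kernels[b, k, c] * features[b, c, h, w]
    return torch.einsum('bkc,bchw->bkhw', kernels, features)

# Illustrative usage with random tensors.
masks = kernel_mask_prediction(torch.randn(2, 256, 64, 128),
                               torch.randn(2, 100, 256))   # (2, 100, 64, 128)
```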
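The Squeeze Reasoning entry describes squeezing the input feature into a channel-wise global vector and reasoning on that vector rather than propagating information over the spatial map. The summary does not specify the reasoning step, so the block below is only one way such a module could be wired (global average pooling, a small MLP, and channel-wise re-weighting); the class name and layer choices are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class SqueezeReasoningBlock(nn.Module):
    """Hypothetical sketch: squeeze spatial features to a channel-wise
    global vector, reason on it with a small MLP, and use the result to
    modulate the original feature map."""

    def __init__(self, channels: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        vec = x.mean(dim=(2, 3))                    # squeeze: (B, C, H, W) -> (B, C)
        weights = self.mlp(vec)[:, :, None, None]   # reason, then broadcast
        return x * weights                          # shape-preserving, plug-in block

# Illustrative usage: the block preserves the input shape, so it can be
# inserted between existing layers of a segmentation backbone.
y = SqueezeReasoningBlock(256)(torch.randn(1, 256, 32, 32))
```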
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.