Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2012.05007v1
- Date: Wed, 9 Dec 2020 12:40:13 GMT
- Title: Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation
- Authors: Xueyi Li, Tianfei Zhou, Jianwu Li, Yi Zhou, Zhaoxiang Zhang
- Abstract summary: This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
- Score: 49.90178055521207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Acquiring sufficient ground-truth supervision to train deep visual models has
been a bottleneck over the years due to the data-hungry nature of deep
learning. This is exacerbated in some structured prediction tasks, such as
semantic segmentation, which requires pixel-level annotations. This work
addresses weakly supervised semantic segmentation (WSSS), with the goal of
bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models
semantic dependencies in a group of images to estimate more reliable pseudo
ground-truths, which can be used for training more accurate segmentation
models. In particular, we devise a graph neural network (GNN) for group-wise
semantic mining, wherein input images are represented as graph nodes, and the
underlying relations between a pair of images are characterized by an efficient
co-attention mechanism. Moreover, in order to prevent the model from paying
excessive attention to common semantics only, we further propose a graph
dropout layer, encouraging the model to learn more accurate and complete object
responses. The whole network is end-to-end trainable by iterative message
passing, which propagates interaction cues over the images to progressively
improve the performance. We conduct experiments on the popular PASCAL VOC 2012
and COCO benchmarks, and our model yields state-of-the-art performance. Our
code is available at: https://github.com/Lixy1997/Group-WSSS.
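The abstract names the key components - images in a group treated as graph nodes, pairwise relations characterized by co-attention, a graph dropout layer over edges, and iterative message passing - but gives no implementation detail. The following is a minimal PyTorch sketch of that pipeline under stated assumptions: the module names, feature dimensions, attention form, and dropout rate are placeholders for illustration, not the authors' implementation (the linked repository contains that).
```python
import torch
import torch.nn as nn

class CoAttentionMessagePassing(nn.Module):
    """Minimal sketch: images in a group are graph nodes; pairwise
    co-attention produces messages that refine each node's feature map."""

    def __init__(self, channels: int, drop_edge_prob: float = 0.3):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.drop_edge_prob = drop_edge_prob  # "graph dropout": randomly drop edges

    def co_attend(self, fa: torch.Tensor, fb: torch.Tensor) -> torch.Tensor:
        # fa, fb: (C, H, W) feature maps of two images (two graph nodes).
        c, h, w = fa.shape
        qa = self.proj(fa.unsqueeze(0)).flatten(2).squeeze(0)     # (C, HW)
        kb = self.proj(fb.unsqueeze(0)).flatten(2).squeeze(0)     # (C, HW)
        affinity = torch.softmax(qa.t() @ kb / c ** 0.5, dim=-1)  # (HW, HW)
        # Message to node a: node b's features re-assembled under the affinity.
        msg = (affinity @ kb.t()).t().reshape(c, h, w)
        return msg

    def forward(self, feats: torch.Tensor, steps: int = 2) -> torch.Tensor:
        # feats: (N, C, H, W) features of a group of N images.
        n = feats.size(0)
        for _ in range(steps):  # iterative message passing over the group graph
            updated = []
            for i in range(n):
                msgs = []
                for j in range(n):
                    if i == j:
                        continue
                    # Graph dropout: randomly remove the edge (i, j) at training time.
                    if self.training and torch.rand(()) < self.drop_edge_prob:
                        continue
                    msgs.append(self.co_attend(feats[i], feats[j]))
                agg = torch.stack(msgs).mean(0) if msgs else torch.zeros_like(feats[i])
                fused = self.fuse(torch.cat([feats[i], agg], dim=0).unsqueeze(0))
                updated.append(fused.squeeze(0))
            feats = torch.stack(updated)
        return feats  # refined features from which pseudo ground-truths are derived


# Toy usage: a group of 4 images encoded to 256-channel feature maps.
group = torch.randn(4, 256, 32, 32)
refined = CoAttentionMessagePassing(256)(group)
print(refined.shape)  # torch.Size([4, 256, 32, 32])
```
In this sketch, graph dropout removes whole image-to-image edges at random during training, so a node cannot rely only on messages about the most common shared semantics; turning the refined features into class activation maps and pseudo ground-truths is not shown.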
Related papers
- Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
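As a rough illustration of the pretext task summarized above, the sketch below encodes patches from a single image, hides part of the reference set, and asks a small head to predict the query patch's grid position. The encoder, head, dimensions, and masking scheme are placeholders, not the paper's transformer design.
```python
import torch
import torch.nn as nn

class RelativeLocationPretext(nn.Module):
    """Loose sketch of a location-prediction pretext task: given a query patch
    and a (partially masked) set of reference patches from the same image,
    classify which grid position the query comes from."""

    def __init__(self, patch_dim: int, grid_cells: int, mask_ratio: float = 0.5):
        super().__init__()
        self.encoder = nn.Linear(patch_dim, 128)   # stand-in for a patch encoder
        self.head = nn.Linear(256, grid_cells)     # query embedding + reference context
        self.mask_ratio = mask_ratio

    def forward(self, patches: torch.Tensor, query_idx: int) -> torch.Tensor:
        # patches: (P, D) flattened patches from one image, P = grid_cells.
        z = self.encoder(patches)                          # (P, 128)
        keep = torch.rand(z.size(0)) > self.mask_ratio     # mask a subset of references
        keep[query_idx] = False                            # the query never sees itself
        context = z[keep].mean(0) if keep.any() else torch.zeros_like(z[0])
        logits = self.head(torch.cat([z[query_idx], context]))
        return logits                                      # scores over grid positions


# Toy usage on a 4x4 patch grid with 48-dimensional flattened patches.
model = RelativeLocationPretext(patch_dim=48, grid_cells=16)
patches = torch.randn(16, 48)
loss = nn.functional.cross_entropy(model(patches, query_idx=5).unsqueeze(0),
                                   torch.tensor([5]))
print(loss.item())
```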
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
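The summary above hinges on one trade-off: message passing over a fully-connected graph of all spatial positions is quadratic in the number of positions, so each node should exchange messages with only a few others. The sketch below shows just that sparsity idea using uniform random neighbour sampling; the paper's dynamic, content-dependent node sampling and predicted filter weights are not reproduced.
```python
import torch
import torch.nn as nn

class SparseMessagePassing(nn.Module):
    """Sketch of the sparsity idea: instead of a fully-connected graph over all
    H*W positions, each position aggregates messages from a small set of K
    positions, so the cost scales with K rather than with H*W."""

    def __init__(self, channels: int, k: int = 8):
        super().__init__()
        self.k = k
        self.message = nn.Linear(channels, channels)
        self.update = nn.Linear(2 * channels, channels)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W)
        n, c, h, w = feat.shape
        x = feat.flatten(2).transpose(1, 2)                 # (N, HW, C) node features
        idx = torch.randint(0, h * w, (n, h * w, self.k))   # K neighbours per node
        batch = torch.arange(n).view(n, 1, 1)
        neigh = x[batch, idx]                               # (N, HW, K, C)
        msg = self.message(neigh).mean(2)                   # aggregate K messages
        out = self.update(torch.cat([x, msg], dim=-1))      # (N, HW, C)
        return out.transpose(1, 2).reshape(n, c, h, w)


# Toy usage on a small feature map.
y = SparseMessagePassing(64, k=8)(torch.randn(2, 64, 16, 16))
print(y.shape)  # torch.Size([2, 64, 16, 16])
```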
- Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method that enables end-to-end pre-training of image segmentation models on classification datasets.
The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse.
Experiment results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
- Maximize the Exploration of Congeneric Semantics for Weakly Supervised Semantic Segmentation [27.155133686127474]
We construct a graph neural network (P-GNN) based on the self-detected patches from different images that contain the same class labels.
We conduct experiments on the popular PASCAL VOC 2012 benchmarks, and our model yields state-of-the-art performance.
arXiv Detail & Related papers (2021-10-08T08:59:16Z)
- SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation [23.623276007011373]
We propose a module that learns a long-range dependency graph directly from the image and uses it to propagate contextual information efficiently.
The module is optimised via a novel adaptive diagonal enhancement method and a variational lower bound.
When incorporated into a neural network (SCG-Net), semantic segmentation is performed end-to-end with competitive performance.
arXiv Detail & Related papers (2020-09-03T12:13:09Z)
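As a loose sketch of the self-constructing idea, the code below predicts node embeddings from the feature map, builds an adjacency matrix from their inner products, and runs one GCN-style propagation step. The variational lower bound and adaptive diagonal enhancement mentioned above are omitted, and all dimensions are assumptions.
```python
import torch
import torch.nn as nn

class SelfConstructingGraph(nn.Module):
    """Sketch: node embeddings are predicted from the image features, an
    adjacency matrix is built from their inner products, and contextual
    information is propagated with one GCN-style step."""

    def __init__(self, channels: int, embed_dim: int = 32):
        super().__init__()
        self.embed = nn.Linear(channels, embed_dim)   # node embeddings -> adjacency
        self.gcn = nn.Linear(channels, channels)      # one propagation step

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W); every spatial position is a graph node.
        n, c, h, w = feat.shape
        x = feat.flatten(2).transpose(1, 2)            # (N, HW, C)
        z = self.embed(x)                              # (N, HW, E)
        adj = torch.sigmoid(z @ z.transpose(1, 2))     # (N, HW, HW) learned adjacency
        adj = adj / adj.sum(-1, keepdim=True)          # row-normalise
        out = torch.relu(self.gcn(adj @ x))            # propagate context over the graph
        return out.transpose(1, 2).reshape(n, c, h, w)


# Toy usage on a small feature map.
y = SelfConstructingGraph(64)(torch.randn(2, 64, 16, 16))
print(y.shape)  # torch.Size([2, 64, 16, 16])
```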
- Pairwise Relation Learning for Semi-supervised Gland Segmentation [90.45303394358493]
We propose a pairwise relation-based semi-supervised (PRS2) model for gland segmentation on histology images.
This model consists of a segmentation network (S-Net) and a pairwise relation network (PR-Net).
We evaluate our model against five recent methods on the GlaS dataset and three recent methods on the CRAG dataset.
arXiv Detail & Related papers (2020-08-06T15:02:38Z)
- CRNet: Cross-Reference Networks for Few-Shot Segmentation [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
With a cross-reference mechanism, our network can better find the co-occurrent objects in the two images.
Experiments on the PASCAL VOC 2012 dataset show that our network achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-03-24T04:55:43Z)
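One plausible reading of the cross-reference mechanism is a channel-wise gate computed from both images, so that only channels responding in both feature maps (the co-occurrent objects) are emphasised. The sketch below illustrates that reading; it is not the exact CRNet module.
```python
import torch
import torch.nn as nn

class CrossReference(nn.Module):
    """Sketch of a cross-reference style module: per-channel statistics of the
    two feature maps are combined into a common importance vector, which then
    reweights both maps towards channels that fire in both images."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc_a = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())
        self.fc_b = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, fa: torch.Tensor, fb: torch.Tensor):
        # fa, fb: (N, C, H, W) features of the two images.
        ga = self.fc_a(fa.mean(dim=(2, 3)))             # (N, C) channel descriptor of A
        gb = self.fc_b(fb.mean(dim=(2, 3)))             # (N, C) channel descriptor of B
        common = (ga * gb).unsqueeze(-1).unsqueeze(-1)  # high only where both respond
        return fa * common, fb * common                 # reweighted co-occurrent features


# Toy usage with a support/query pair.
fa, fb = torch.randn(1, 256, 32, 32), torch.randn(1, 256, 32, 32)
out_a, out_b = CrossReference(256)(fa, fb)
print(out_a.shape, out_b.shape)
```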
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.