Affinity Attention Graph Neural Network for Weakly Supervised Semantic
Segmentation
- URL: http://arxiv.org/abs/2106.04054v1
- Date: Tue, 8 Jun 2021 02:19:21 GMT
- Title: Affinity Attention Graph Neural Network for Weakly Supervised Semantic
Segmentation
- Authors: Bingfeng Zhang, Jimin Xiao, Jianbo Jiao, Yunchao Wei, Yao Zhao
- Abstract summary: We propose Affinity Attention Graph Neural Network ($A^2$GNN) for weakly supervised semantic segmentation.
Our approach achieves new state-of-the-art performance on the Pascal VOC 2012 dataset.
- Score: 86.44301443789763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly supervised semantic segmentation is receiving great attention due to
its low human annotation cost. In this paper, we aim to tackle bounding box
supervised semantic segmentation, i.e., training accurate semantic segmentation
models using bounding box annotations as supervision. To this end, we propose
Affinity Attention Graph Neural Network ($A^2$GNN). Following previous
practices, we first generate pseudo semantic-aware seeds, which are then formed
into semantic graphs based on our newly proposed affinity Convolutional Neural
Network (CNN). Then the built graphs are input to our $A^2$GNN, in which an
affinity attention layer is designed to acquire the short- and long-distance
information from soft graph edges to accurately propagate semantic labels from
the confident seeds to the unlabeled pixels. However, to guarantee the
precision of the seeds, we only adopt a limited number of confident pixel seed
labels for $A^2$GNN, which may lead to insufficient supervision for training.
To alleviate this issue, we further introduce a new loss function and a
consistency-checking mechanism to leverage the bounding box constraint, so that
more reliable guidance can be included for the model optimization. Experiments
show that our approach achieves new state-of-the-art performance on the Pascal VOC
2012 dataset (val: 76.5%, test: 75.2%). More importantly, our approach can
be readily applied to the bounding box supervised instance segmentation task or
other weakly supervised semantic segmentation tasks, with state-of-the-art or
comparable performance among almost all weakly supervised tasks on the PASCAL VOC and
COCO datasets. Our source code will be available at
https://github.com/zbf1991/A2GNN.
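The label-propagation step described above lends itself to a short sketch. The following is a minimal, hypothetical PyTorch illustration of an attention layer that scores soft graph edges and spreads seed label probabilities to unlabeled nodes; the class name, tensor shapes, and the edge-scoring MLP are assumptions made for illustration, not the released $A^2$GNN implementation (see the repository above for the actual code).

```python
# Hypothetical sketch of an affinity-attention propagation layer (not the
# authors' code): node features and soft edge affinities produce an attention
# weight per edge, which propagates label probabilities from confident seeds
# to unlabeled nodes.
import torch
import torch.nn as nn


class AffinityAttentionLayer(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Scores an edge from its two endpoint features plus the soft affinity.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + 1, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x, edge_index, edge_affinity, labels):
        # x:             (N, feat_dim) node (pixel/superpixel) features
        # edge_index:    (2, E) source/target node indices
        # edge_affinity: (E,)  soft affinities, assumed given by an affinity CNN
        # labels:        (N, C) label probabilities; seeds are one-hot,
        #                unlabeled nodes can start as uniform
        src, dst = edge_index
        edge_feat = torch.cat([x[src], x[dst], edge_affinity.unsqueeze(-1)], dim=-1)
        score = self.edge_mlp(edge_feat).squeeze(-1)            # (E,)

        # Normalize scores over the incoming edges of each target node,
        # gated by the soft affinities.
        score = score - score.max()                             # numerical stability
        alpha = torch.exp(score) * edge_affinity
        denom = torch.zeros(x.size(0), device=x.device).index_add_(0, dst, alpha) + 1e-8
        alpha = alpha / denom[dst]                               # (E,) attention weights

        # Propagate label probabilities along the weighted edges.
        out = torch.zeros_like(labels)
        out.index_add_(0, dst, alpha.unsqueeze(-1) * labels[src])
        return out


# Toy usage: 5 nodes, 2 classes, a small chain graph.
if __name__ == "__main__":
    x = torch.randn(5, 16)
    edge_index = torch.tensor([[0, 1, 1, 2, 3], [1, 0, 2, 3, 4]])
    edge_affinity = torch.rand(5)
    labels = torch.full((5, 2), 0.5)
    labels[0] = torch.tensor([1.0, 0.0])    # one confident seed
    layer = AffinityAttentionLayer(16)
    print(layer(x, edge_index, edge_affinity, labels).shape)  # torch.Size([5, 2])
```

Gating the attention weights with the soft affinities, rather than using the affinities directly, is one plausible way to let both short- and long-distance evidence shape the propagation; the released code may handle this differently.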
Related papers
- Edge-aware Plug-and-play Scheme for Semantic Segmentation [4.297988192695948]
The experimental results indicate that the proposed method can be seamlessly integrated into any state-of-the-art (SOTA) model with zero modification.
arXiv Detail & Related papers (2023-03-18T02:17:37Z) - LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with the fully supervised counterpart trained on 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised
Semantic Segmentation [27.50216933606052]
We address the problem of weakly-supervised semantic segmentation using bounding box annotations.
Background regions are, in part, perceptually consistent within an image, and this can be leveraged to discriminate between foreground and background regions inside object bounding boxes.
We introduce a noise-aware loss (NAL) that makes the networks less susceptible to incorrect labels.
arXiv Detail & Related papers (2021-04-02T06:38:41Z) - Scribble-Supervised Semantic Segmentation by Uncertainty Reduction on
Neural Representation and Self-Supervision on Neural Eigenspace [21.321005898976253]
Scribble-supervised semantic segmentation has gained much attention recently for its promising performance without high-quality annotations.
This work aims to achieve semantic segmentation by scribble annotations directly without extra information and other limitations.
We propose holistic operations, including entropy minimization and a network-embedded random walk on the neural representation, to reduce uncertainty.
arXiv Detail & Related papers (2021-02-19T12:33:57Z) - Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z) - Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z) - Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.