Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly
Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2105.08965v1
- Date: Wed, 19 May 2021 07:31:11 GMT
- Title: Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly
Supervised Semantic Segmentation
- Authors: Seungho Lee, Minhyun Lee, Jongwuk Lee and Hyunjung Shim
- Abstract summary: Explicit Pseudo-pixel Supervision (EPS) learns from pixel-level feedback by combining two weak supervisions.
We devise a joint training strategy to fully utilize the complementary relationship between the two types of information.
Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks.
- Score: 16.560870740946275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing studies in weakly-supervised semantic segmentation (WSSS) using
image-level weak supervision have several limitations: sparse object coverage,
inaccurate object boundaries, and co-occurring pixels from non-target objects.
To overcome these challenges, we propose a novel framework, namely Explicit
Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by
combining two forms of weak supervision: the image-level label provides the object
identity via the localization map, and the saliency map from an off-the-shelf
saliency detection model offers rich boundaries. We devise a joint training
strategy to fully utilize the complementary relationship between the two.
Our method can obtain accurate object boundaries and discard
co-occurring pixels, thereby significantly improving the quality of
pseudo-masks. Experimental results show that the proposed method remarkably
outperforms existing methods by resolving key challenges of WSSS and achieves
the new state-of-the-art performance on both PASCAL VOC 2012 and MS COCO 2014
datasets.
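The abstract describes the core idea at a high level: the localization map derived from the image-level label names the object, while an off-the-shelf saliency map delineates its extent, and the two are fused into pseudo-pixel supervision. The snippet below is a minimal sketch of one way such a fusion could look; it is not the authors' released code, and the array shapes, thresholds, and the `fuse_cam_and_saliency` helper are illustrative assumptions.

```python
import numpy as np

IGNORE = 255  # common segmentation convention for ambiguous pixels


def fuse_cam_and_saliency(cams, image_labels, saliency,
                          sal_thresh=0.5, cam_thresh=0.3):
    """Fuse per-class localization maps with a saliency map into a pseudo-mask.

    cams:         float array [C, H, W], localization maps normalized to [0, 1]
    image_labels: iterable of class indices present in the image-level label
    saliency:     float array [H, W] in [0, 1] from an off-the-shelf detector
    Returns an int array [H, W]: 0 = background, c+1 = class c, 255 = ignore.
    """
    num_classes, h, w = cams.shape
    mask = np.zeros((h, w), dtype=np.int64)          # start as background

    salient = saliency >= sal_thresh                  # boundary-aware foreground
    # Restrict the per-pixel argmax to classes named by the image-level label.
    present = np.zeros(num_classes, dtype=bool)
    present[list(image_labels)] = True
    restricted = np.where(present[:, None, None], cams, -np.inf)

    best_class = restricted.argmax(axis=0)            # [H, W]
    best_score = restricted.max(axis=0)               # [H, W]

    confident = best_score >= cam_thresh
    mask[salient & confident] = best_class[salient & confident] + 1
    # Salient pixels no CAM claims stay ambiguous (e.g. co-occurring objects).
    mask[salient & ~confident] = IGNORE
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cams = rng.random((20, 8, 8))                     # 20 PASCAL VOC classes
    saliency = rng.random((8, 8))
    print(fuse_cam_and_saliency(cams, image_labels=[6, 18], saliency=saliency))
```

Note that EPS trains the localization maps jointly against the saliency map rather than applying fixed thresholds; the hard thresholds here only make the fusion idea concrete.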
Related papers
- Pixel-Level Clustering Network for Unsupervised Image Segmentation [3.69853388955692]
We present a pixel-level clustering framework for segmenting images into regions without using ground truth annotations.
We also propose a training strategy that utilizes intra-consistency within each superpixel, inter-similarity/dissimilarity between neighboring superpixels, and structural similarity between images.
arXiv Detail & Related papers (2023-10-24T23:06:29Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
- Pointly-Supervised Panoptic Segmentation [106.68888377104886]
We propose a new approach to applying point-level annotations for weakly-supervised panoptic segmentation.
Instead of the dense pixel-level labels used by fully supervised methods, point-level labels provide only a single point for each target as supervision.
We formulate the problem in an end-to-end framework by simultaneously generating panoptic pseudo-masks from point-level labels and learning from them.
arXiv Detail & Related papers (2022-10-25T12:03:51Z)
- Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast [43.40192909920495]
Cross-view feature semantic consistency and intra-class compactness / inter-class dispersion are explored.
We propose two novel pixel-to-prototype contrast regularization terms that are conducted across different views and within each single view of an image.
Our method can be seamlessly incorporated into existing WSSS models without any changes to the base network.
arXiv Detail & Related papers (2021-10-14T01:44:57Z)
- Mixed-supervised segmentation: Confidence maximization helps knowledge distillation [24.892332859630518]
In this work, we propose a dual-branch architecture for deep neural networks.
The upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch.
We show that the synergy between the entropy and KL divergence terms yields substantial improvements in performance; a minimal sketch of these two terms appears after this list.
arXiv Detail & Related papers (2021-09-21T20:06:13Z)
- Semi-supervised Semantic Segmentation with Directional Context-aware Consistency [66.49995436833667]
We focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images.
A preferred high-level representation should capture the contextual information while not losing self-awareness.
We present the Directional Contrastive Loss (DC Loss) to accomplish the consistency in a pixel-to-pixel manner.
arXiv Detail & Related papers (2021-06-27T03:42:40Z)
- A Weakly-Supervised Semantic Segmentation Approach based on the Centroid Loss: Application to Quality Control and Inspection [6.101839518775968]
We propose and assess a new weakly-supervised semantic segmentation approach making use of a novel loss function.
The performance of the approach is evaluated against datasets from two different industry-related case studies.
arXiv Detail & Related papers (2020-10-26T09:08:21Z)
- Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation [93.83369981759996]
We propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap.
Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation.
We propose consistency regularization on predicted CAMs from various transformed images to provide self-supervision for network learning; a minimal sketch of this consistency term appears after this list.
arXiv Detail & Related papers (2020-04-09T14:57:57Z)
- Semi-Supervised StyleGAN for Disentanglement Learning [79.01988132442064]
Current disentanglement methods face several inherent limitations.
We design new architectures and loss functions based on StyleGAN for semi-supervised high-resolution disentanglement learning.
arXiv Detail & Related papers (2020-03-06T22:54:46Z)
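The mixed-supervised distillation entry above mentions a synergy between a KL-divergence term and an entropy term in the student branch. Below is a minimal sketch of those two per-pixel loss terms, assuming PyTorch logit tensors; the `distillation_terms` helper, the tensor shapes, and the 0.1 weight are illustrative assumptions rather than that paper's actual configuration.

```python
import torch
import torch.nn.functional as F


def distillation_terms(student_logits, teacher_logits, eps=1e-8):
    """KL(teacher || student) plus the Shannon entropy of the student predictions.

    Both inputs are per-pixel logits of shape [B, C, H, W]; the weighting and
    architecture of the original paper are not reproduced here.
    """
    p_teacher = F.softmax(teacher_logits, dim=1)
    log_p_student = F.log_softmax(student_logits, dim=1)

    # Knowledge-distillation term: match the student to the teacher branch.
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    # Confidence-maximization term: low entropy means confident predictions.
    p_student = log_p_student.exp()
    entropy = -(p_student * (p_student + eps).log()).sum(dim=1).mean()
    return kl, entropy


if __name__ == "__main__":
    student = torch.randn(2, 21, 16, 16, requires_grad=True)
    teacher = torch.randn(2, 21, 16, 16)
    kl, ent = distillation_terms(student, teacher)
    loss = kl + 0.1 * ent          # 0.1 is an arbitrary illustrative weight
    loss.backward()
    print(float(kl), float(ent))
```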
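The SEAM entry above proposes consistency regularization between CAMs predicted from transformed images. The sketch below illustrates that idea with a horizontal flip and an L1 consistency term; `TinyCAMNet` is a stand-in model invented for the example, not the SEAM architecture, and the choice of transform and loss is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyCAMNet(nn.Module):
    """Stand-in classifier whose last conv layer doubles as a CAM head."""

    def __init__(self, num_classes=20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, num_classes, 1),      # per-class activation maps
        )

    def forward(self, x):
        cam = self.features(x)                  # [B, C, H, W]
        logits = cam.mean(dim=(2, 3))           # global average pooling
        return logits, cam


def equivariance_consistency_loss(model, images):
    """Encourage CAM(flip(x)) to match flip(CAM(x)) for a horizontal flip."""
    flipped = torch.flip(images, dims=[3])       # flip along the width axis
    _, cam = model(images)
    _, cam_flipped = model(flipped)
    return F.l1_loss(torch.flip(cam, dims=[3]), cam_flipped)


if __name__ == "__main__":
    model = TinyCAMNet()
    images = torch.randn(2, 3, 32, 32)
    loss = equivariance_consistency_loss(model, images)
    loss.backward()
    print(float(loss))
```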