Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and
Semi-Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2203.10278v1
- Date: Sat, 19 Mar 2022 09:19:55 GMT
- Title: Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and
Semi-Supervised Semantic Segmentation
- Authors: Junwen Pan, Pengfei Zhu, Kaihua Zhang, Bing Cao, Yu Wang, Dingwen
Zhang, Junwei Han, Qinghua Hu
- Abstract summary: This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision: it simultaneously predicts several attentive low-rank (LR) representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that SLRNet outperforms both state-of-the-art WSSS and SSSS methods under a variety of settings.
- Score: 119.009033745244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation with limited annotations, such as weakly supervised
semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS),
is a challenging task that has attracted much attention recently. Most leading
WSSS methods employ a sophisticated multi-stage training strategy to estimate
pseudo-labels that are as precise as possible, but they suffer from high model
complexity. In contrast, there exists another research line that trains a
single network with image-level labels in one training cycle. However, such a
single-stage strategy often performs poorly because of the compounding effect
caused by inaccurate pseudo-label estimation. To address this issue, this paper
presents a Self-supervised Low-Rank Network (SLRNet) for single-stage WSSS and
SSSS. The SLRNet uses cross-view self-supervision, that is, it simultaneously
predicts several complementary attentive LR representations from different
views of an image to learn precise pseudo-labels. Specifically, we reformulate
the LR representation learning as a collective matrix factorization problem and
optimize it jointly with the network learning in an end-to-end manner. The
resulting LR representation discards noisy information while capturing stable
semantics across different views, making it robust to input variations,
thereby reducing overfitting to self-supervision errors. The SLRNet can provide
a unified single-stage framework for various label-efficient semantic
segmentation settings: 1) WSSS with image-level labels, 2) SSSS with a small
set of pixel-level labels, and 3) SSSS with a small set of pixel-level labels
plus a large set of image-level labels. Extensive experiments on the Pascal VOC
2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both
state-of-the-art WSSS and SSSS methods under a variety of settings,
demonstrating its generalizability and efficacy.
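To make the collective matrix factorization and cross-view self-supervision described above concrete, the following is a minimal PyTorch sketch. The learnable shared dictionary, the unrolled soft-assignment updates used in place of a closed-form factorization, and the symmetric pseudo-label consistency term are illustrative assumptions; this is not the authors' implementation.

    # Minimal sketch (assumptions: PyTorch, a learnable shared dictionary,
    # soft-assignment updates instead of a closed-form factorization).
    import torch
    import torch.nn.functional as F

    def collective_low_rank(feats_a, feats_b, dictionary, n_iter=3):
        """Jointly factorize features from two views with a shared dictionary.

        feats_a, feats_b: (B, C, H, W) feature maps from two augmented views.
        dictionary:       (K, C) learnable shared basis with K << H*W.
        Returns rank-K reconstructions of both views.
        """
        B, C, H, W = feats_a.shape
        # Flatten spatial dimensions and concatenate the two views: (B, C, 2*H*W).
        X = torch.cat([feats_a.flatten(2), feats_b.flatten(2)], dim=2)
        D = dictionary.unsqueeze(0).expand(B, -1, -1)              # (B, K, C)
        for _ in range(n_iter):
            # Soft-assign every pixel of both views to the K basis vectors.
            A = torch.softmax(torch.matmul(D, X), dim=1)           # (B, K, 2*H*W)
            # Re-estimate the shared basis from the assigned pixels.
            D = F.normalize(torch.matmul(A, X.transpose(1, 2)), dim=2)
        # Low-rank reconstruction X_hat = D^T A, shared across both views.
        X_hat = torch.matmul(D.transpose(1, 2), A)                 # (B, C, 2*H*W)
        rec_a, rec_b = X_hat.split(H * W, dim=2)
        return rec_a.reshape(B, C, H, W), rec_b.reshape(B, C, H, W)

    def cross_view_consistency(logits_a, logits_b):
        """Each view's hard pseudo-label supervises the other view's prediction."""
        pseudo_a = logits_a.argmax(dim=1).detach()
        pseudo_b = logits_b.argmax(dim=1).detach()
        return 0.5 * (F.cross_entropy(logits_b, pseudo_a) +
                      F.cross_entropy(logits_a, pseudo_b))

Because every update step is an ordinary differentiable tensor operation, the factorization can be optimized jointly with the backbone, matching the end-to-end training described in the abstract.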
Related papers
- Multi-Label Self-Supervised Learning with Scene Images [21.549234013998255]
This paper shows that quality image representations can be learned by treating scene/multi-label image SSL simply as a multi-label classification problem.
The proposed method is named Multi-Label Self-supervised learning (MLS).
arXiv Detail & Related papers (2023-08-07T04:04:22Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model reaches mIoU of 69.5% and 70.2% on the PASCAL VOC 2012 validation and test sets, respectively.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models [54.49581189337848]
We propose a method to enable end-to-end pre-training of image segmentation models on classification datasets.
The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse.
Experiment results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach that leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- Exploring Smoothness and Class-Separation for Semi-supervised Medical Image Segmentation [39.068698033394064]
We propose the SS-Net for semi-supervised medical image segmentation tasks.
Pixel-level smoothness forces the model to generate invariant results under adversarial perturbations.
The inter-class separation constrains individual class features to approach their corresponding high-quality prototypes (a generic sketch of these two regularizers appears after this list).
arXiv Detail & Related papers (2022-03-02T08:38:09Z)
- Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images.
We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z)
- Semi-supervised Semantic Segmentation with Directional Context-aware Consistency [66.49995436833667]
We focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images.
A preferred high-level representation should capture the contextual information while not losing self-awareness.
We present the Directional Contrastive Loss (DC Loss) to enforce consistency in a pixel-to-pixel manner.
arXiv Detail & Related papers (2021-06-27T03:42:40Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- Remote Sensing Image Scene Classification with Self-Supervised Paradigm under Limited Labeled Samples [11.025191332244919]
We introduce a new self-supervised learning (SSL) mechanism to obtain a high-performance pre-training model for remote sensing image (RSI) scene classification from large amounts of unlabeled data.
Experiments on three commonly used RSI scene classification datasets demonstrate that this new learning paradigm outperforms the traditionally dominant ImageNet pre-trained models.
The insights distilled from our studies can help to foster the development of SSL in the remote sensing community.
arXiv Detail & Related papers (2020-10-02T09:27:19Z)
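As referenced in the SS-Net entry above, the following is a generic sketch of the two regularizers it mentions: pixel-level smoothness under an adversarial perturbation and inter-class separation toward class prototypes. The entropy-based perturbation, the loss forms, and the prototype handling are illustrative assumptions, not code from the SS-Net paper.

    # Generic sketch (assumptions: PyTorch, an entropy-based adversarial
    # perturbation and cosine distance to class prototypes; not SS-Net code).
    import torch
    import torch.nn.functional as F

    def smoothness_loss(model, images, eps=1e-2):
        """Predictions should be invariant under a small adversarial perturbation."""
        images = images.clone().requires_grad_(True)
        logits = model(images)
        probs = F.softmax(logits, dim=1)
        # Perturb the input along the gradient of the prediction entropy.
        entropy = -(probs * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
        grad, = torch.autograd.grad(entropy, images)
        adv_logits = model((images + eps * grad.sign()).detach())
        return F.mse_loss(F.softmax(adv_logits, dim=1), probs.detach())

    def separation_loss(pixel_feats, pseudo_labels, prototypes):
        """Pull each pixel feature toward the prototype of its (pseudo) class.

        pixel_feats: (N, D), pseudo_labels: (N,) class ids,
        prototypes:  (num_classes, D) class prototypes.
        """
        targets = F.normalize(prototypes, dim=1)[pseudo_labels]    # (N, D)
        return (1.0 - F.cosine_similarity(pixel_feats, targets, dim=1)).mean()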