Fully Self-Supervised Learning for Semantic Segmentation
- URL: http://arxiv.org/abs/2202.11981v1
- Date: Thu, 24 Feb 2022 09:38:22 GMT
- Title: Fully Self-Supervised Learning for Semantic Segmentation
- Authors: Yuan Wang, Wei Zhuo, Yucong Li, Zhi Wang, Qi Ju, Wenwu Zhu
- Abstract summary: We present a fully self-supervised framework for semantic segmentation (FS^4).
We propose a bootstrapped training scheme for semantic segmentation that fully leverages global semantic knowledge for self-supervision.
We evaluate our method on the large-scale COCO-Stuff dataset and achieve a 7.19 mIoU improvement on both things and stuff objects.
- Score: 46.6602159197283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present a fully self-supervised framework for semantic
segmentation (FS^4). A fully bootstrapped strategy for semantic segmentation,
which avoids the huge annotation effort, is crucial for building customized
models end-to-end for open-world domains, and this capability is eagerly needed
in realistic scenarios. Even though recent self-supervised semantic segmentation
methods have made great progress, these works heavily depend on a
fully-supervised pretrained model, making a fully self-supervised pipeline
impossible. To solve this problem, we propose a bootstrapped training scheme for
semantic segmentation that fully leverages global semantic knowledge for
self-supervision through our proposed PGG strategy and CAE module. In
particular, we perform pixel clustering and assignment to supervise the
segmentation. To prevent the clustering from degenerating, we propose 1) a
pyramid-global-guided (PGG) training strategy that supervises the learning with
pyramid image/patch-level pseudo labels generated by grouping unsupervised
features; the stable global and pyramid semantic pseudo labels keep the
segmentation from learning too many clutter regions or degrading to a single
background region; and 2) a context-aware embedding (CAE) module that generates
a global feature embedding for each pixel by aggregating, in a non-trivial way,
neighbors that are close in both space and appearance. We evaluate our method on
the large-scale COCO-Stuff dataset and achieve a 7.19 mIoU improvement on both
things and stuff objects.
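The core of the self-supervision described above is assigning each pixel a pseudo-label by clustering its features. The following is a minimal k-means sketch of that idea, not the paper's actual procedure: the function name, the farthest-point initialization, and the `(H, W, C)` feature layout are assumptions made for illustration.

```python
import numpy as np

def cluster_pixel_pseudo_labels(features, k=4, iters=10):
    """Cluster per-pixel features into k groups; return an (H, W) pseudo-label map.

    features: (H, W, C) array of per-pixel embeddings from an unsupervised backbone.
    """
    h, w, c = features.shape
    x = features.reshape(-1, c).astype(np.float64)

    # Farthest-point initialization keeps the starting centers well separated,
    # which helps avoid degenerate all-one-cluster solutions.
    centers = [x[0]]
    for _ in range(k - 1):
        dists = np.min([((x - ctr) ** 2).sum(1) for ctr in centers], axis=0)
        centers.append(x[dists.argmax()])
    centers = np.stack(centers)

    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Assign every pixel to its nearest center (squared Euclidean distance).
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Move each center to the mean of its assigned pixels;
        # an emptied cluster keeps its previous center.
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(0)
    return labels.reshape(h, w)
```

In the paper's setting such cluster assignments would serve as segmentation supervision, with the PGG pseudo labels constraining the clustering at the image and patch level.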
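The CAE module's notion of aggregating neighbors close in both space and appearance can be illustrated with a bilateral-filter-style smoothing over pixel embeddings. This is only a sketch of the general idea under Gaussian-affinity assumptions; the paper's actual module is learned, and the dense pairwise computation here is only practical for small feature maps.

```python
import numpy as np

def context_aware_embedding(features, sigma_s=2.0, sigma_a=0.5):
    """Smooth each pixel embedding using neighbors close in space and appearance.

    Each output pixel is a weighted mean of all pixels, with
    weight = exp(-spatial_dist^2 / 2*sigma_s^2) * exp(-feature_dist^2 / 2*sigma_a^2).
    features: (H, W, C) array; returns an array of the same shape.
    """
    h, w, c = features.shape
    x = features.reshape(-1, c).astype(np.float64)
    ys, xs = np.mgrid[:h, :w]
    pos = np.stack([ys.ravel(), xs.ravel()], 1).astype(np.float64)

    # Pairwise squared distances in space and in feature (appearance) space.
    ds = ((pos[:, None] - pos[None]) ** 2).sum(-1)
    da = ((x[:, None] - x[None]) ** 2).sum(-1)

    # Joint affinity: a pixel contributes only if it is near AND looks similar.
    wgt = np.exp(-ds / (2 * sigma_s**2)) * np.exp(-da / (2 * sigma_a**2))
    wgt /= wgt.sum(1, keepdims=True)
    return (wgt @ x).reshape(h, w, c)
```

Because the weights are normalized, a region of identical embeddings is left unchanged, while embeddings near a boundary are pulled toward spatially close pixels of similar appearance rather than across the boundary.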
Related papers
- Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach [7.012760526318993]
Weakly-Supervised Semantic Segmentation (WSSS) offers a cost-efficient workaround to extensive labeling.
Existing WSSS methods have difficulty learning object boundaries, leading to poor segmentation results.
We propose a novel and effective framework that addresses these issues by leveraging visual foundation models inside the bounding box.
arXiv Detail & Related papers (2024-05-10T16:42:25Z) - SOHES: Self-supervised Open-world Hierarchical Entity Segmentation [82.45303116125021]
This work presents Self-supervised Open-world Hierarchical Entities (SOHES), a novel approach that eliminates the need for human annotations.
We produce abundant high-quality pseudo-labels through visual feature clustering, and rectify the noise in pseudo-labels via a teacher-student mutual-learning procedure.
Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation.
arXiv Detail & Related papers (2024-04-18T17:59:46Z) - OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation [69.37484603556307]
Unsupervised Semantic Segmentation (USS) involves segmenting images without relying on predefined labels.
We introduce a novel approach called Optimally Matched Hierarchy (OMH) to simultaneously address the above issues.
Our OMH yields better unsupervised segmentation performance compared to existing USS methods.
arXiv Detail & Related papers (2024-03-11T09:46:41Z) - Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks via leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z) - A Lightweight Clustering Framework for Unsupervised Semantic
Segmentation [28.907274978550493]
Unsupervised semantic segmentation aims to categorize each pixel in an image into a corresponding class without the use of annotated data.
We propose a lightweight clustering framework for unsupervised semantic segmentation.
Our framework achieves state-of-the-art results on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-11-30T15:33:42Z) - Global and Local Features through Gaussian Mixture Models on Image
Semantic Segmentation [0.38073142980732994]
We propose an internal structure for the feature representations while extracting a global representation that supports the former.
During training, we predict a Gaussian Mixture Model from the data, which, merged with the skip connections and the decoding stage, helps avoid wrong inductive biases.
Our results show that we can improve semantic segmentation by providing both learning representations (global and local) with a clustering behavior and combining them.
arXiv Detail & Related papers (2022-07-19T10:10:49Z) - TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic
Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to build new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z) - Labels4Free: Unsupervised Segmentation using StyleGAN [40.39780497423365]
We propose an unsupervised segmentation framework for StyleGAN generated objects.
We report comparable results against state-of-the-art supervised segmentation networks.
arXiv Detail & Related papers (2021-03-27T18:59:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.