Segmentation in Style: Unsupervised Semantic Image Segmentation with
Stylegan and CLIP
- URL: http://arxiv.org/abs/2107.12518v1
- Date: Mon, 26 Jul 2021 23:48:34 GMT
- Title: Segmentation in Style: Unsupervised Semantic Image Segmentation with
Stylegan and CLIP
- Authors: Daniil Pakhomov, Sanchit Hira, Narayani Wagle, Kemar E. Green, Nassir
Navab
- Abstract summary: We introduce a method that automatically segments images into semantically meaningful regions without human supervision.
Derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets.
We test our method on publicly available datasets and show state-of-the-art results.
- Score: 39.0946507389324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a method that automatically segments images into
semantically meaningful regions without human supervision. Derived regions are
consistent across different images and coincide with human-defined semantic
classes on some datasets. In cases where semantic regions might be hard for a
human to define and consistently label, our method is still able to find
meaningful and consistent semantic classes. In our work, we use a pretrained
StyleGAN2~\cite{karras2020analyzing} generative model: clustering in the
feature space of the generative model allows us to discover semantic classes.
Once classes are discovered, a synthetic dataset with generated images and
corresponding segmentation masks can be created. After that, a segmentation
model is trained on the synthetic dataset and is able to generalize to real
images. Additionally, by using CLIP~\cite{radford2021learning} we are able to
use prompts defined in natural language to discover desired semantic classes.
We test our method on publicly available datasets and show state-of-the-art
results.
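The pipeline the abstract describes can be sketched in a few lines. The
following is a minimal illustration, assuming a hypothetical
`get_stylegan_features` helper for the generator activations, with k-means
clustering and masked-image CLIP scoring standing in for the paper's exact
procedure:

```python
# Minimal sketch: discover semantic classes by clustering per-pixel features
# of a pretrained StyleGAN2 generator, then rank clusters against a CLIP text
# prompt. Requires torch, scikit-learn, pillow, and OpenAI CLIP
# (pip install git+https://github.com/openai/CLIP.git).
import numpy as np
import torch
import clip
from PIL import Image
from sklearn.cluster import KMeans

def get_stylegan_features(generator, z):
    """Hypothetical helper: run the generator on latent batch `z` and return
    (images [N, H, W, 3] uint8, features [N, H, W, C] float)."""
    raise NotImplementedError

def discover_classes(features, n_classes=8):
    """Cluster per-pixel generator features; cluster ids act as pseudo-masks."""
    n, h, w, c = features.shape
    km = KMeans(n_clusters=n_classes, n_init=10).fit(features.reshape(-1, c))
    return km.labels_.reshape(n, h, w)

@torch.no_grad()
def rank_clusters_by_prompt(images, masks, n_classes, prompt, device="cpu"):
    """Score each discovered cluster against a natural-language prompt via
    CLIP similarity of images with everything outside the cluster blacked
    out (a crude stand-in for the paper's CLIP-guided discovery)."""
    model, preprocess = clip.load("ViT-B/32", device=device)
    text = model.encode_text(clip.tokenize([prompt]).to(device))
    text = text / text.norm(dim=-1, keepdim=True)
    scores = torch.zeros(n_classes)
    for img, mask in zip(images, masks):
        for k in range(n_classes):
            masked = img * (mask == k)[..., None].astype(np.uint8)
            x = preprocess(Image.fromarray(masked)).unsqueeze(0).to(device)
            feat = model.encode_image(x)
            feat = feat / feat.norm(dim=-1, keepdim=True)
            scores[k] += (feat @ text.T).item()
    return scores  # the top-scoring cluster best matches the prompt
```

A standard segmentation network trained on the resulting image/pseudo-mask
pairs would then supply predictions on real images.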
Related papers
- Vocabulary-free Image Classification and Semantic Segmentation [71.78089106671581]
We introduce the Vocabulary-free Image Classification (VIC) task, which aims to assign a class from an unconstrained language-induced semantic space to an input image without needing a known vocabulary.
VIC is challenging due to the vastness of the semantic space, which contains millions of concepts, including fine-grained categories.
We propose Category Search from External Databases (CaSED), a training-free method that leverages a pre-trained vision-language model and an external database.
arXiv Detail & Related papers (2024-04-16T19:27:21Z)
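The CaSED summary above suggests a retrieve-mine-rescore loop; the sketch
below is a hedged approximation, where `caption_db`, `caption_feats`, and
the naive word-split candidate mining are my assumptions rather than the
paper's actual components:

```python
# Hedged sketch of a CaSED-style training-free pipeline using OpenAI CLIP.
import torch
import clip
from PIL import Image

@torch.no_grad()
def vocabulary_free_classify(image_path, caption_db, caption_feats, device="cpu"):
    """caption_db: list of caption strings from an external database;
    caption_feats: [M, D] precomputed, L2-normalized CLIP text features."""
    model, preprocess = clip.load("ViT-B/32", device=device)
    img = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    img_feat = model.encode_image(img)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)

    # 1) Retrieve the captions closest to the image in CLIP space.
    sims = (img_feat @ caption_feats.T).squeeze(0)
    top = sims.topk(min(10, len(caption_db))).indices.tolist()

    # 2) Mine candidate class names from the retrieved captions
    #    (a naive word split stands in for proper noun extraction).
    candidates = sorted({w.lower().strip(".,") for i in top
                         for w in caption_db[i].split() if len(w) > 3})

    # 3) Re-score candidates against the image and return the best match.
    text = model.encode_text(clip.tokenize(candidates).to(device))
    text = text / text.norm(dim=-1, keepdim=True)
    best = (img_feat @ text.T).squeeze(0).argmax().item()
    return candidates[best]
```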
- SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation [36.41778553250247]
Weakly-Supervised Semantic Segmentation (WSSS) aims to train segmentation models using image data with only image-level supervision.
We propose a Semantic Prompt Learning for WSSS (SemPLeS) framework, which learns to effectively prompt the CLIP latent space.
SemPLeS can perform better semantic alignment between object regions and the associated class labels.
arXiv Detail & Related papers (2024-01-22T09:41:05Z)
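The prompt-learning idea behind SemPLeS can be illustrated with the generic
CoOp-style mechanism below; this is a minimal sketch of learnable context
vectors in CLIP's token-embedding space, not SemPLeS's actual objective:

```python
# Hedged sketch of learnable prompts: a few trainable context vectors are
# prepended to frozen CLIP token embeddings of a class name, so optimizing
# them steers where the class description lands in CLIP's latent space.
import torch
import torch.nn as nn

class LearnablePrompt(nn.Module):
    def __init__(self, n_ctx=16, dim=512):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)

    def forward(self, class_token_embeddings):
        # class_token_embeddings: [n_tokens, dim], kept frozen.
        return torch.cat([self.ctx, class_token_embeddings], dim=0)
```

Only `self.ctx` is trained while CLIP stays frozen, which is what lets such
prompts be learned from limited supervision.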
- Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation [13.001629605405954]
We study universal zero-shot segmentation in this work to achieve panoptic, instance, and semantic segmentation for novel categories without any training samples.
We introduce a generative model to synthesize features for unseen categories, which links semantic and visual spaces.
The proposed approach achieves impressive state-of-the-art performance on zero-shot panoptic segmentation, instance segmentation, and semantic segmentation.
arXiv Detail & Related papers (2023-06-19T17:59:16Z)
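The feature-synthesis idea above follows a common zero-shot recipe. The
sketch below shows the general shape under my assumptions about dimensions
and architecture, not the paper's actual model:

```python
# Hedged sketch: a generator maps a class's semantic embedding plus noise to
# visual features, so a classifier can be trained even for categories that
# have no training images.
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    def __init__(self, sem_dim=300, noise_dim=64, feat_dim=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(sem_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, feat_dim),
        )

    def forward(self, sem, n_samples=16):
        # sem: [sem_dim] embedding (e.g. word2vec) of an unseen class name.
        z = torch.randn(n_samples, self.noise_dim)
        return self.net(torch.cat([sem.expand(n_samples, -1), z], dim=1))
```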
- Topological Semantic Mapping by Consolidation of Deep Visual Features [0.0]
This work introduces a topological semantic mapping method that uses deep visual features extracted by a CNN (GoogLeNet) from 2D images captured in multiple views of the environment as the robot operates.
The experiments, performed using a real-world indoor dataset, showed that the method is able to consolidate the visual features of regions and use them to recognize objects and place categories as semantic properties.
arXiv Detail & Related papers (2021-06-24T01:10:03Z)
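A minimal sketch of what the per-region consolidation could look like,
assuming an incremental mean over features from successive views (the
aggregation choice is my assumption, not the paper's stated method):

```python
# One node of the topological map holds a consolidated visual descriptor.
import numpy as np

class RegionNode:
    def __init__(self, dim=1024):
        self.mean = np.zeros(dim, dtype=np.float64)
        self.count = 0

    def consolidate(self, feat):
        # Incremental mean over CNN features from new views of the region;
        # the result is later matched against object/place prototypes.
        self.count += 1
        self.mean += (feat - self.mean) / self.count
```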
- Remote Sensing Images Semantic Segmentation with General Remote Sensing Vision Model via a Self-Supervised Contrastive Learning Method [13.479068312825781]
We propose Global style and Local matching Contrastive Learning Network (GLCNet) for remote sensing semantic segmentation.
Specifically, the global style contrastive module is used to learn a better image-level representation.
The local features matching contrastive module is designed to learn representations of local regions, which benefits semantic segmentation.
arXiv Detail & Related papers (2021-06-20T03:03:40Z)
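Both GLCNet modules are built on contrastive learning. The sketch below
shows a standard InfoNCE loss of the kind such modules typically use,
without reproducing the paper's module wiring:

```python
# Standard InfoNCE: row i of `a` and row i of `b` (two augmented views of
# the same sample) form the positive pair; all other rows are negatives.
import torch
import torch.nn.functional as F

def info_nce(a, b, temperature=0.1):
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.T / temperature                      # [N, N] similarities
    targets = torch.arange(a.size(0), device=a.device)  # positives on diagonal
    return F.cross_entropy(logits, targets)
```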
- Adversarial Semantic Hallucination for Domain Generalized Semantic Segmentation [50.14933487082085]
We propose an adversarial hallucination approach, which combines a class-wise hallucination module and a semantic segmentation module.
Experiments comparing with state-of-the-art domain adaptation methods demonstrate the efficacy of the proposed method when no target domain data are available for training.
arXiv Detail & Related papers (2021-06-08T07:07:45Z)
- A Closer Look at Self-training for Zero-Label Semantic Segmentation [53.4488444382874]
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning.
Prior zero-label semantic segmentation works approach this task by learning visual-semantic embeddings or generative models.
We propose a consistency regularizer to filter out noisy pseudo-labels by taking the intersection of the pseudo-labels generated from different augmentations of the same image.
arXiv Detail & Related papers (2021-04-21T14:34:33Z)
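The intersection-based filter described above is easy to make concrete. The
following is a minimal sketch, with array shapes and the ignore-label
convention as my assumptions:

```python
# Keep a pseudo-label only where all augmented views agree; disagreeing
# pixels are marked with the ignore index and excluded from training.
import numpy as np

def intersect_pseudo_labels(preds, ignore_index=255):
    """preds: list of [H, W] int arrays, pseudo-labels predicted for
    different augmentations of the same image, mapped back to a common
    frame."""
    out = preds[0].copy()
    for p in preds[1:]:
        out[out != p] = ignore_index
    return out
```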
- Exploring Cross-Image Pixel Contrast for Semantic Segmentation [130.22216825377618]
We propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.
The core idea is to enforce pixel embeddings belonging to the same semantic class to be more similar than embeddings from different classes.
Our method can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.
arXiv Detail & Related papers (2021-01-28T11:35:32Z)
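A supervised pixel-wise contrastive loss of the kind described can be
sketched as follows; pixel sampling and the paper's memory bank are
omitted, and the details are my approximation:

```python
# SupCon-style loss over sampled pixels: same-class pixels are pulled
# together, different-class pixels pushed apart.
import torch
import torch.nn.functional as F

def pixel_contrast_loss(emb, labels, temperature=0.1):
    """emb: [P, D] embeddings of sampled pixels; labels: [P] class ids."""
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.T / temperature                      # [P, P] similarities
    eye = torch.eye(len(labels), device=emb.device)
    pos = (labels[:, None] == labels[None, :]).float() - eye  # drop self-pairs
    # log-probability of each pair under a softmax over all non-self pairs
    logits = sim - sim.max(dim=1, keepdim=True).values.detach()
    log_prob = logits - torch.log(((1 - eye) * logits.exp()).sum(1, keepdim=True))
    denom = pos.sum(1).clamp(min=1)
    return -((pos * log_prob).sum(1) / denom).mean()
```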
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that the availability of such external semantic information, in conjunction with the visual semantics of images, boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
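Entailment cone embeddings themselves are involved, but label-hierarchy
injection can be illustrated with a simpler stand-in: an auxiliary loss on
parent classes. This is a generic substitute, not the paper's method:

```python
# Supervise the parent class alongside the leaf class so hierarchy
# knowledge shapes the classifier.
import torch
import torch.nn.functional as F

def hierarchical_loss(logits, labels, parent_of, parent_logits):
    """logits: [B, C] over leaf classes; parent_logits: [B, P] over parents;
    parent_of: [C] long tensor mapping each leaf class to its parent id."""
    leaf_loss = F.cross_entropy(logits, labels)
    parent_loss = F.cross_entropy(parent_logits, parent_of[labels])
    return leaf_loss + 0.5 * parent_loss  # the weight is an arbitrary choice
```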
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.