GANSeg: Learning to Segment by Unsupervised Hierarchical Image
Generation
- URL: http://arxiv.org/abs/2112.01036v1
- Date: Thu, 2 Dec 2021 07:57:56 GMT
- Title: GANSeg: Learning to Segment by Unsupervised Hierarchical Image
Generation
- Authors: Xingzhe He, Bastian Wandt, Helge Rhodin
- Abstract summary: We propose a GAN-based approach that generates images conditioned on latent masks.
We show that such mask-conditioned image generation can be learned faithfully when conditioning the masks in a hierarchical manner.
It also lets us generate image-mask pairs for training a segmentation network, which outperforms the state-of-the-art unsupervised segmentation methods on established benchmarks.
- Score: 16.900404701997502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segmenting an image into its parts is a common preprocessing step
for high-level vision tasks such as image editing. However, annotating masks
for supervised
training is expensive. Weakly-supervised and unsupervised methods exist, but
they depend on comparing pairs of images, such as multiple views, video
frames, or transformed copies of a single image, which limits their
applicability. To address this, we propose a GAN-based approach that
generates images conditioned on latent masks, removing the need for the full
or weak annotations required by previous approaches. We show that such mask-conditioned
image generation can be learned faithfully when conditioning the masks in a
hierarchical manner on latent keypoints that define the position of parts
explicitly. Without requiring supervision of masks or points, this strategy
increases robustness to changes in viewpoint and object position. It also lets us
generate image-mask pairs for training a segmentation network, which
outperforms the state-of-the-art unsupervised segmentation methods on
established benchmarks.
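The abstract describes conditioning masks hierarchically on latent keypoints, but gives no code. As one illustrative, hypothetical sketch of that idea, latent keypoints can be turned into soft part masks by placing a Gaussian heatmap at each keypoint and taking a per-pixel softmax across parts (the `sigma` value and fixed background logit here are illustrative assumptions, not values from the paper):

```python
import numpy as np

def keypoints_to_soft_masks(keypoints, height, width, sigma=0.1):
    """Turn K latent keypoints (coordinates in [-1, 1]^2) into K soft part
    masks plus a background mask: each keypoint contributes an isotropic
    Gaussian logit map, and a softmax across parts makes the masks sum to
    one at every pixel."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")  # (H, W)

    logits = []
    for (kx, ky) in keypoints:
        sq_dist = (grid_x - kx) ** 2 + (grid_y - ky) ** 2
        logits.append(-sq_dist / (2.0 * sigma ** 2))
    logits.append(np.full((height, width), -4.0))  # fixed background logit

    logits = np.stack(logits, axis=0)                # (K+1, H, W)
    exp = np.exp(logits - logits.max(axis=0, keepdims=True))
    return exp / exp.sum(axis=0, keepdims=True)      # soft masks, sum to 1

masks = keypoints_to_soft_masks([(-0.5, -0.5), (0.5, 0.5)], 64, 64)
```

Because each mask is tied to a keypoint position, moving a keypoint moves its part mask with it, which is one way to read the paper's claim of robustness to viewpoint and object position changes.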
Related papers
- Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation [38.55611683982936]
We introduce a novel class-wise masked image modeling that independently reconstructs different image regions according to their respective classes.
We develop a feature aggregation strategy that minimizes the distances between features corresponding to the masked and visible parts within the same class.
In semantic space, we explore the application of masked image modeling to enhance regularization.
arXiv Detail & Related papers (2024-11-13T16:42:07Z)
- Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision [87.15580604023555]
Unpair-Seg is a novel weakly-supervised open-vocabulary segmentation framework.
It learns from unpaired image-mask and image-text pairs, which can be independently and efficiently collected.
It achieves 14.6% and 19.5% mIoU on the ADE-847 and PASCAL Context-459 datasets.
arXiv Detail & Related papers (2024-02-14T06:01:44Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
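The AutoMAE summary above mentions Gumbel-Softmax as the link between the adversarially-trained mask generator and the masked-modeling objective. As a minimal, assumption-laden sketch (the per-patch keep/mask parameterization and temperature are illustrative, not taken from the paper), differentiable mask sampling with Gumbel noise looks like this:

```python
import numpy as np

def gumbel_softmax_mask(logits, tau=0.5, seed=None):
    """Sample a soft, differentiable masking decision per patch.
    `logits` has shape (num_patches, 2) for the (keep, mask) choice.
    Adding Gumbel noise and applying a temperature softmax yields an
    almost-one-hot sample that still admits gradients w.r.t. the logits."""
    rng = np.random.default_rng(seed)
    gumbel = -np.log(-np.log(rng.uniform(1e-9, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

probs = gumbel_softmax_mask(np.zeros((16, 2)), tau=0.5, seed=0)
soft_mask = probs[:, 1]  # soft probability that each patch is masked
```

In a trained system the logits would come from the mask generator network; here they are zeros purely to make the sketch runnable.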
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- MaskSketch: Unpaired Structure-guided Masked Image Generation [56.88038469743742]
MaskSketch is an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling.
We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image.
Our results show that MaskSketch achieves high image realism and fidelity to the guiding structure.
arXiv Detail & Related papers (2023-02-10T20:27:02Z)
- Differentiable Soft-Masked Attention [115.5770357189209]
"Differentiable Soft-Masked Attention" is applied to the task of weakly-supervised video object segmentation.
We develop a transformer-based network for the task, which can additionally benefit from cycle-consistency training on a video with just one annotated frame.
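The core mechanism named in the title can be sketched in a few lines. This is a hypothetical, simplified single-head version (the log-space additive formulation is an assumption about how a soft mask can enter attention differentiably, not the paper's exact equations):

```python
import numpy as np

def soft_masked_attention(q, k, v, soft_mask):
    """Scaled dot-product attention where each key position carries a soft
    mask value in [0, 1]: the mask enters the scores additively in log-space,
    so a mask of 1 leaves a score unchanged, a mask near 0 suppresses that
    key, and the whole operation stays differentiable in the mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])                   # (Nq, Nk)
    scores = scores + np.log(np.clip(soft_mask, 1e-9, 1.0))   # broadcast (Nk,)
    scores = scores - scores.max(axis=-1, keepdims=True)      # stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Unlike a hard binary mask, the soft mask here can be produced by another network and trained end-to-end, which is the point of making the masking differentiable.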
arXiv Detail & Related papers (2022-06-01T02:05:13Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling [61.03262873980619]
Open-vocabulary instance segmentation aims at segmenting novel classes without mask annotations.
We propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.
Our framework is capable of labeling novel classes in captions via their word semantics to self-train a student model.
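The alignment of caption word semantics with mask visual features described above can be illustrated with a toy sketch. Everything here is an illustrative assumption (feature dimensions, the cosine-similarity criterion, the rejection threshold); the paper's actual framework is more involved:

```python
import numpy as np

def pseudo_label_masks(mask_features, word_embeddings, threshold=0.5):
    """Assign each candidate object mask the caption word whose embedding is
    most cosine-similar to the mask's visual feature; masks whose best
    similarity falls below the threshold are left unlabeled (-1)."""
    f = mask_features / np.linalg.norm(mask_features, axis=1, keepdims=True)
    w = word_embeddings / np.linalg.norm(word_embeddings, axis=1, keepdims=True)
    sims = f @ w.T                                   # (num_masks, num_words)
    best = sims.argmax(axis=1)
    return np.where(sims.max(axis=1) >= threshold, best, -1)
```

The resulting pseudo labels could then serve as targets for self-training a student segmentation model, matching the self-training step mentioned in the summary.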
arXiv Detail & Related papers (2021-11-24T18:50:47Z)
- Few-shot Semantic Image Synthesis Using StyleGAN Prior [8.528384027684192]
We present a training strategy that performs pseudo labeling of semantic masks using the StyleGAN prior.
Our key idea is to construct a simple mapping between the StyleGAN feature and each semantic class from a few examples of semantic masks.
Although the pseudo semantic masks might be too coarse for previous approaches that require pixel-aligned masks, our framework can synthesize high-quality images from not only dense semantic masks but also sparse inputs such as landmarks and scribbles.
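The "simple mapping between the StyleGAN feature and each semantic class from a few examples" can be pictured as nearest-centroid classification in feature space. This is a guess at the simplest such mapping, offered as a sketch only (the feature shapes and the Euclidean metric are assumptions, not details from the paper):

```python
import numpy as np

def fit_class_centroids(features, labels, num_classes):
    """From a few labeled feature vectors, compute one centroid per
    semantic class in the (assumed) generator feature space."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def pseudo_mask(feature_map, centroids):
    """Label every spatial feature with its nearest class centroid,
    yielding a coarse pseudo semantic mask of shape (H, W)."""
    h, w, d = feature_map.shape
    flat = feature_map.reshape(-1, d)
    d2 = ((flat[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1).reshape(h, w)
```

Such centroid-based pseudo masks would indeed be coarse, which is consistent with the summary's point that they suit this framework but not pixel-aligned approaches.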
arXiv Detail & Related papers (2021-03-27T11:04:22Z)
- Automatic Image Labelling at Pixel Level [21.59653873040243]
We propose an interesting learning approach to generate pixel-level image labellings automatically.
A Guided Filter Network (GFN) is first developed to learn the segmentation knowledge from a source domain.
GFN then transfers such segmentation knowledge to generate coarse object masks in the target domain.
arXiv Detail & Related papers (2020-07-15T00:34:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.