Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for
Pixel-Level Semantic Segmentation
- URL: http://arxiv.org/abs/2309.14303v4
- Date: Mon, 13 Nov 2023 05:11:52 GMT
- Title: Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for
Pixel-Level Semantic Segmentation
- Authors: Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen
- Abstract summary: We propose a novel method for generating pixel-level semantic segmentation labels using the text-to-image generative model Stable Diffusion.
By utilizing the text prompts, cross-attention, and self-attention of SD, we introduce three new techniques: class-prompt appending, class-prompt cross-attention, and self-attention exponentiation.
These techniques enable us to generate segmentation maps corresponding to synthetic images.
- Score: 6.82236459614491
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Preparing training data for deep vision models is a labor-intensive task. To
address this, generative models have emerged as an effective solution for
generating synthetic data. While current generative models produce image-level
category labels, we propose a novel method for generating pixel-level semantic
segmentation labels using the text-to-image generative model Stable Diffusion
(SD). By utilizing the text prompts, cross-attention, and self-attention of SD,
we introduce three new techniques: class-prompt appending, class-prompt
cross-attention, and self-attention exponentiation. These techniques enable us
to generate segmentation maps corresponding to synthetic images. These maps
serve as pseudo-labels for training semantic segmenters, eliminating the need
for labor-intensive pixel-wise annotation. To account for the imperfections in
our pseudo-labels, we incorporate uncertainty regions into the segmentation,
allowing us to disregard loss from those regions. We conduct evaluations on two
datasets, PASCAL VOC and MSCOCO, and our approach significantly outperforms
concurrent work. Our benchmarks and code will be released at
https://github.com/VinAIResearch/Dataset-Diffusion
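The pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the prompt separator, the exponent `tau`, the two thresholds `low` and `high`, and the ignore value 255 are all assumptions chosen for illustration. The sketch appends class names to a caption, refines a cross-attention map by multiplying it with a power of the self-attention matrix, marks mid-confidence pixels as uncertain, and leaves those pixels out of the loss.

```python
import numpy as np

IGNORE = 255  # assumed "uncertain" label; a common convention, not taken from the paper


def append_class_prompt(caption, class_names):
    """Class-prompt appending: add the target class names to the caption."""
    # The "; " separator is an illustrative choice.
    return caption + "; " + " ".join(class_names)


def refine_cross_attention(self_attn, cross_attn, tau=2):
    """Self-attention exponentiation: spread class scores using self_attn ** tau.

    self_attn:  (P, P) pixel-to-pixel attention, P = H * W
    cross_attn: (P, C) pixel-to-class-token attention
    """
    return np.linalg.matrix_power(self_attn, tau) @ cross_attn


def to_pseudo_label(scores, low=0.3, high=0.6):
    """Map (P, C) scores to labels: 0 background, 1..C class, IGNORE uncertain."""
    # Min-max normalise each class map so one pair of thresholds fits all classes.
    lo = scores.min(axis=0)
    span = np.maximum(scores.max(axis=0) - lo, 1e-8)
    norm = (scores - lo) / span
    best = norm.max(axis=1)
    label = norm.argmax(axis=1) + 1
    label[best < low] = 0
    label[(best >= low) & (best < high)] = IGNORE
    return label


def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over pixels whose label is not IGNORE."""
    keep = labels != IGNORE
    if not keep.any():
        return 0.0
    z = logits[keep] - logits[keep].max(axis=1, keepdims=True)
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(keep.sum()), labels[keep]].mean())


# Toy usage: 4 pixels, 1 class, identity self-attention (no propagation).
cross = np.array([[0.0], [0.5], [1.0], [0.1]])
labels = to_pseudo_label(refine_cross_attention(np.eye(4), cross))
print(labels.tolist())  # [0, 255, 1, 0]
```

In the toy usage, the pixel with score 0.5 falls between the two thresholds and is labelled 255, so `masked_cross_entropy` skips it regardless of what the segmenter predicts there.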
Related papers
- Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance [1.2923961938782627]
We introduce an effective data augmentation method for semantic segmentation using the Controllable Diffusion Model.
Our proposed method includes efficient prompt generation using Class-Prompt Appending and Visual Prior Combination.
We evaluated our method on the PASCAL VOC datasets and found it highly effective for synthesizing images for semantic segmentation.
arXiv Detail & Related papers (2024-09-09T19:01:14Z)
- Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets [51.74296438621836]
We introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels.
The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation.
Scribbles for All provides scribble labels for several popular segmentation datasets and provides an algorithm to automatically generate scribble labels for any dataset with dense annotations.
arXiv Detail & Related papers (2024-08-22T15:29:08Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation [73.89509052503222]
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch.
We observe that the correlation maps not only enable clustering pixels of the same category easily but also contain good shape information.
We propose to conduct pixel propagation by modeling the pairwise similarities of pixels to spread the high-confidence pixels and dig out more.
Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps.
arXiv Detail & Related papers (2023-06-07T10:02:29Z)
- HandsOff: Labeled Dataset Generation With No Additional Human Annotations [13.11411442720668]
We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels.
Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation.
We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes.
arXiv Detail & Related papers (2022-12-24T03:37:02Z)
- A Closer Look at Self-training for Zero-Label Semantic Segmentation [53.4488444382874]
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning.
Prior zero-label semantic segmentation works approach this task by learning visual-semantic embeddings or generative models.
We propose a consistency regularizer to filter out noisy pseudo-labels by taking the intersections of the pseudo-labels generated from different augmentations of the same image.
arXiv Detail & Related papers (2021-04-21T14:34:33Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.