BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations
- URL: http://arxiv.org/abs/2201.04684v1
- Date: Wed, 12 Jan 2022 20:28:34 GMT
- Title: BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations
- Authors: Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Kreis, Adela Barriuso,
Sanja Fidler, Antonio Torralba
- Abstract summary: We synthesize a large labeled dataset via a generative adversarial network (GAN).
We take image samples from the class-conditional generative model BigGAN trained on ImageNet, and manually annotate 5 images per class, for all 1k classes.
We create a new ImageNet benchmark by labeling an additional set of 8k real images and evaluate segmentation performance in a variety of settings.
- Score: 89.42397034542189
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Annotating images with pixel-wise labels is a time-consuming and costly
process. Recently, DatasetGAN showcased a promising alternative - to synthesize
a large labeled dataset via a generative adversarial network (GAN) by
exploiting a small set of manually labeled, GAN-generated images. Here, we
scale DatasetGAN to the class diversity of ImageNet. We take image samples
from the class-conditional generative model BigGAN trained on ImageNet, and
manually annotate 5 images per class, for all 1k classes. By training an
effective feature segmentation architecture on top of BigGAN, we turn BigGAN
into a labeled dataset generator. We further show that VQGAN can similarly
serve as a dataset generator, leveraging the already annotated data. We create
a new ImageNet benchmark by labeling an additional set of 8k real images and
evaluate segmentation performance in a variety of settings. Through an
extensive ablation study we show big gains in leveraging a large generated
dataset to train different supervised and self-supervised backbone models on
pixel-wise tasks. Furthermore, we demonstrate that using our synthesized
datasets for pre-training leads to improvements over standard ImageNet
pre-training on several downstream datasets, such as PASCAL-VOC, MS-COCO,
Cityscapes and chest X-ray, and across tasks (detection, segmentation). Our
benchmark will be made public, and we will maintain a leaderboard for this
challenging task. Project Page: https://nv-tlabs.github.io/big-datasetgan/
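
The recipe the abstract describes is compact enough to sketch. Below is a minimal, hypothetical PyTorch illustration of the DatasetGAN-style pipeline: freeze a class-conditional generator, upsample and concatenate its intermediate feature maps, and train a small per-pixel classifier on the handful of manually annotated generated images. `ToyGenerator` is an invented stand-in, not BigGAN; all names and sizes here are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyGenerator(nn.Module):
    """Hypothetical stand-in for a class-conditional GAN (e.g. BigGAN).
    Returns the image plus intermediate feature maps."""
    def __init__(self, z_dim=128, n_classes=1000):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)
        self.fc = nn.Linear(2 * z_dim, 256 * 4 * 4)
        self.up1 = nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1)  # 4 -> 8
        self.up2 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)   # 8 -> 16
        self.to_rgb = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, z, y):
        h = self.fc(torch.cat([z, self.embed(y)], dim=1)).view(-1, 256, 4, 4)
        f1 = F.relu(self.up1(h))
        f2 = F.relu(self.up2(f1))
        return torch.tanh(self.to_rgb(f2)), [f1, f2]

class FeatureSegHead(nn.Module):
    """Per-pixel classifier over upsampled, concatenated generator features."""
    def __init__(self, feat_dims=(128, 64), n_seg_classes=2, out_size=16):
        super().__init__()
        self.out_size = out_size
        self.classifier = nn.Sequential(
            nn.Conv2d(sum(feat_dims), 64, 1), nn.ReLU(),
            nn.Conv2d(64, n_seg_classes, 1),
        )

    def forward(self, feats):
        feats = [F.interpolate(f, self.out_size, mode="bilinear",
                               align_corners=False) for f in feats]
        return self.classifier(torch.cat(feats, dim=1))

G = ToyGenerator()
head = FeatureSegHead()
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

# Training: the generator stays frozen; only the head is fit to the few
# manually annotated generated images (5 per class in the paper).
z, y = torch.randn(4, 128), torch.randint(0, 1000, (4,))
_, feats = G(z, y)
manual_masks = torch.randint(0, 2, (4, 16, 16))  # placeholder annotations
opt.zero_grad()
loss = F.cross_entropy(head([f.detach() for f in feats]), manual_masks)
loss.backward(); opt.step()
```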
Related papers
- GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition [37.02054260449195]
Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image.
We present the first fully graph-convolutional model, the Group K-Nearest Neighbor based Graph Convolutional Network (GKGNet).
Our experiments demonstrate that GKGNet achieves state-of-the-art performance with significantly lower computational costs.
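As a loose, hypothetical illustration of kNN-graph message passing over image tokens (GKGNet's actual group-wise token design and architecture differ; everything below is a toy):

```python
import torch
import torch.nn.functional as F

def knn_graph_conv(x, weight, k=4):
    """One graph-convolution step over a k-nearest-neighbour graph.
    x: (N, D) node features; neighbours are found by feature distance,
    aggregated by mean, then linearly transformed."""
    dist = torch.cdist(x, x)                              # (N, N) distances
    idx = dist.topk(k + 1, largest=False).indices[:, 1:]  # skip self
    neighbours = x[idx].mean(dim=1)                       # (N, D) aggregate
    return F.relu((x + neighbours) @ weight)              # residual + transform

# Toy usage: 16 patch tokens of dim 32 scored against 10 labels.
tokens = torch.randn(16, 32)
w = torch.randn(32, 32) * 0.1
h = knn_graph_conv(tokens, w)
logits = h.mean(dim=0) @ torch.randn(32, 10) * 0.1  # multi-label scores
probs = torch.sigmoid(logits)  # independent per-label probabilities
```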
arXiv Detail & Related papers (2023-08-28T07:50:04Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
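A generic sketch of that decoding step (this is not the DatasetDM code; the tiny UNet and all names are stand-ins for a pre-trained diffusion denoiser): forward hooks harvest intermediate UNet activations on a lightly noised image, and a small decoder maps them to per-pixel labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    """Hypothetical stand-in for a pre-trained denoising UNet."""
    def __init__(self):
        super().__init__()
        self.down = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.mid = nn.Conv2d(32, 32, 3, padding=1)
        self.up = nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1)

    def forward(self, x_noisy, t):  # timestep t is ignored in this toy
        h = torch.relu(self.down(x_noisy))
        h = torch.relu(self.mid(h))
        return self.up(h)

unet, captured = TinyUNet(), []
# Forward hooks harvest intermediate activations during one denoising pass.
for layer in (unet.down, unet.mid):
    layer.register_forward_hook(lambda m, i, o: captured.append(o))

img = torch.randn(1, 3, 64, 64)
noisy = img + 0.1 * torch.randn_like(img)   # lightly noised input
_ = unet(noisy, t=torch.tensor([10]))

# A small decoder maps the captured features to per-pixel labels.
decoder = nn.Conv2d(sum(f.shape[1] for f in captured), 21, 1)
feats = torch.cat([F.interpolate(f, 64) for f in captured], dim=1)
masks = decoder(feats).argmax(dim=1)        # (1, 64, 64) pseudo-annotation
```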
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [117.41383937100751]
Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets.
We show how the GAN latent code can be decoded to produce a semantic segmentation of the image.
These generated datasets can then be used for training any computer vision architecture just as real datasets are.
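Once such a feature-segmentation head is trained, turning the generator into a labeled-data factory reduces to a sampling loop. A minimal sketch, assuming any frozen generator that returns an image plus intermediate features (e.g. the toy `G` and `head` from the sketch after the abstract):

```python
import torch

@torch.no_grad()
def synthesize_labeled_dataset(G, head, n_samples, n_classes=1000, z_dim=128):
    """Sample (image, mask) pairs from a frozen generator + trained head."""
    images, masks = [], []
    for _ in range(n_samples):
        z = torch.randn(1, z_dim)
        y = torch.randint(0, n_classes, (1,))
        img, feats = G(z, y)                 # image + generator features
        mask = head(feats).argmax(dim=1)     # per-pixel label prediction
        images.append(img); masks.append(mask)
    return torch.cat(images), torch.cat(masks)

# e.g. imgs, lbls = synthesize_labeled_dataset(G, head, n_samples=100)
```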
arXiv Detail & Related papers (2021-04-13T20:08:29Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
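A toy sketch of the joint image-label generation idea (not the paper's semi-supervised training setup, which also exploits large unlabeled sets; all modules and sizes here are invented): a single latent sample drives two heads on a shared trunk, so the image and its pixel-wise label map come out aligned.

```python
import torch
import torch.nn as nn

class JointImageLabelGenerator(nn.Module):
    """Toy generator with two heads sharing a trunk: one emits the image,
    the other an aligned per-pixel label map, so a single z sample yields
    a coherent (image, mask) pair."""
    def __init__(self, z_dim=64, n_seg_classes=5):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(z_dim, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.image_head = nn.Conv2d(32, 3, 3, padding=1)
        self.label_head = nn.Conv2d(32, n_seg_classes, 1)

    def forward(self, z):
        h = self.trunk(z)
        return torch.tanh(self.image_head(h)), self.label_head(h)

gen = JointImageLabelGenerator()
img, label_logits = gen(torch.randn(2, 64))
mask = label_logits.argmax(dim=1)  # (2, 16, 16) sampled jointly with img
```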
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Learning High-Resolution Domain-Specific Representations with a GAN Generator [5.8720142291102135]
We show that the representations learnt by a GAN generator can be easily projected onto a semantic segmentation map using a lightweight decoder.
We propose the LayerMatch scheme for approximating the representation of a GAN generator, which can be used for unsupervised domain-specific pretraining.
We find that using a LayerMatch-pretrained backbone leads to superior accuracy compared to standard supervised pretraining on ImageNet.
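The matching idea can be sketched generically: train a randomly initialized backbone to regress a generator's internal activations on generated images, with no labels involved. The tensors below stand in for a real generator's outputs; this is an assumption-laden toy, not the paper's LayerMatch implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(               # encoder to be pretrained
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1),
)
opt = torch.optim.Adam(backbone.parameters(), lr=1e-3)

# Stand-ins for a frozen generator's outputs on one sampled batch:
img = torch.randn(4, 3, 16, 16)         # GAN-generated images
target = torch.randn(4, 128, 8, 8)      # generator's internal activations

opt.zero_grad()
pred = backbone(img)                    # (4, 128, 8, 8) backbone features
loss = F.mse_loss(pred, target)         # match the generator's activations
loss.backward(); opt.step()             # no labels needed anywhere
```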
arXiv Detail & Related papers (2020-06-18T11:57:18Z)
- From ImageNet to Image Classification: Contextualizing Progress on Benchmarks [99.19183528305598]
We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset.
Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for.
arXiv Detail & Related papers (2020-05-22T17:39:16Z)
- Semantically Multi-modal Image Synthesis [58.87967932525891]
We focus on the semantically multi-modal image synthesis (SMIS) task, namely, generating multi-modal images at the semantic level.
We propose a novel Group Decreasing Network (GroupDNet) that leverages group convolutions in the generator and progressively decreases the group numbers of the convolutions in the decoder.
GroupDNet offers much more controllability in translating semantic labels to natural images and produces plausible, high-quality results on datasets with many classes.
arXiv Detail & Related papers (2020-03-28T04:03:46Z)
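
As a loose illustration of the decreasing-groups idea in GroupDNet above, here is a toy PyTorch decoder whose transposed convolutions use progressively fewer groups (8 -> 4 -> 1); the real GroupDNet's architecture and conditioning differ, and all sizes here are invented:

```python
import torch
import torch.nn as nn

class DecreasingGroupDecoder(nn.Module):
    """Toy decoder in the spirit of GroupDNet: successive transposed
    convolutions use progressively fewer groups (8 -> 4 -> 1), so early
    layers process semantic groups separately and later layers mix them."""
    def __init__(self, in_ch=64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.ConvTranspose2d(in_ch, 64, 4, stride=2, padding=1, groups=8),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1, groups=4),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1, groups=1),
            nn.Tanh(),
        )

    def forward(self, x):
        return self.layers(x)

dec = DecreasingGroupDecoder()
out = dec(torch.randn(1, 64, 8, 8))  # -> (1, 3, 64, 64) image
```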
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and accepts no responsibility for any consequences of its use.