PetsGAN: Rethinking Priors for Single Image Generation
- URL: http://arxiv.org/abs/2203.01488v1
- Date: Thu, 3 Mar 2022 02:31:50 GMT
- Title: PetsGAN: Rethinking Priors for Single Image Generation
- Authors: Zicheng Zhang, Yinglu Liu, Congying Han, Hailin Shi, Tiande Guo, Bowen
Zhou
- Abstract summary: SinGAN builds a pyramid of GANs to progressively learn the internal patch distribution of the single image.
Due to the lack of high-level information, SinGAN cannot handle object images well.
Our method gets rid of the time-consuming progressive training scheme and can be trained end-to-end.
- Score: 29.213169685303495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single image generation (SIG), the task of generating diverse samples
that share the visual content of a given single image, was first introduced by
SinGAN, which builds a pyramid of GANs to progressively learn the internal patch
distribution of the single image. It also shows great potential in a wide range
of image manipulation tasks. However, the SinGAN paradigm has limitations in
generation quality and training time. First, due to the lack of high-level
information, SinGAN cannot handle object images as well as it handles scene and
texture images. Second, the separate progressive training scheme is
time-consuming and prone to artifact accumulation. To tackle these problems, we
dig into the SIG problem and improve SinGAN by fully utilizing internal and
external priors. The main contributions of this paper are: 1) We introduce a
regularized latent variable model for SIG. To the best of our knowledge, this is
the first work to give a clear formulation and optimization goal for SIG, and
all existing SIG methods can be regarded as special cases of this model. 2) We
design a novel Prior-based end-to-end training GAN (PetsGAN) to overcome the
problems of SinGAN. Our method dispenses with the time-consuming progressive
training scheme and can be trained end-to-end. 3) We conduct extensive
qualitative and quantitative experiments that demonstrate the superiority of our
method in generated image quality, diversity, and training speed. Moreover, we
apply our method to other image manipulation tasks (e.g., style transfer,
harmonization), and the results further confirm its effectiveness and
efficiency.
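To make contribution 1) concrete, the following is a minimal, hypothetical sketch of a
regularized latent-variable objective for SIG in PyTorch: an adversarial term pushes
patches of generated samples toward the internal patch distribution of the single
training image, while a reconstruction term on a fixed latent code acts as the
regularizer. The function name, loss weight, and hinge-style generator term are
illustrative assumptions, not PetsGAN's exact formulation.

    import torch
    import torch.nn.functional as F

    def sig_generator_loss(G, D, x, z_rec, lambda_rec=10.0):
        # Hypothetical regularized latent-variable objective for SIG.
        # G maps a latent code to an image; D is a patch discriminator
        # trained on crops of the single training image x; z_rec is a
        # fixed latent code reserved for reconstructing x.
        z = torch.randn_like(z_rec)        # random code -> diverse sample
        fake = G(z)
        adv = -D(fake).mean()              # patches of G(z) should fool D
        rec = F.mse_loss(G(z_rec), x)      # fixed code must reconstruct x
        return adv + lambda_rec * rec      # reconstruction term regularizes G

Under this reading, SinGAN roughly corresponds to applying such an objective separately
at every scale of an image pyramid, whereas PetsGAN trains a single end-to-end model
with additional internal and external priors; the paper should be consulted for the
precise regularizers.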
Related papers
- Unified Autoregressive Visual Generation and Understanding with Continuous Tokens [52.21981295470491]
We present UniFluid, a unified autoregressive framework for joint visual generation and understanding.
Our unified autoregressive architecture processes multimodal image and text inputs, generating discrete tokens for text and continuous tokens for images.
We find that, although there is an inherent trade-off between the image generation and understanding tasks, a carefully tuned training recipe enables them to improve each other.
arXiv Detail & Related papers (2025-03-17T17:58:30Z)
- EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance [20.430259028981094]
Zero-shot personalized image generation models aim to produce images that align with both a given text prompt and subject image.
Existing methods often struggle to capture fine-grained subject details and frequently prioritize one form of guidance over the other.
We introduce a new approach, EZIGen, whose two main components are precise subject encoding, which leverages a fixed pre-trained Diffusion UNet itself as the subject encoder, and decoupled guidance.
arXiv Detail & Related papers (2024-09-12T14:44:45Z)
- TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.
We propose TcGAN, a novel structure-preserving method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z)
- A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration [72.17890189820665]
Generative adversarial networks (GANs) have drawn enormous attention due to the simple yet effective training mechanism and superior image generation quality.
Recent GAN models have greatly narrowed the gaps between the generated images and the real ones.
Many recent works show emerging interest in exploiting the well-disentangled latent space and learned priors of pre-trained GAN models.
arXiv Detail & Related papers (2022-07-21T05:05:58Z)
- FewGAN: Generating from the Joint Distribution of a Few Images [95.6635227371479]
We introduce FewGAN, a generative model for generating novel, high-quality and diverse images.
FewGAN is a hierarchical patch-GAN that applies quantization at the first coarse scale, followed by a pyramid of residual fully convolutional GANs at finer scales.
In an extensive set of experiments, it is shown that FewGAN outperforms baselines both quantitatively and qualitatively.
arXiv Detail & Related papers (2022-07-18T07:11:28Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- ExSinGAN: Learning an Explainable Generative Model from a Single Image [0.0]
We propose a hierarchical framework that simplifies the learning of the intricate conditional distributions through the successive learning of the distributions of structure, semantics, and texture.
We design ExSinGAN composed of three cascaded GANs for learning an explainable generative model from a given image.
ExSinGAN is learned not only from the internal patches of the given image, as previous works did, but also from an external prior obtained by the GAN inversion technique.
arXiv Detail & Related papers (2021-05-16T04:38:46Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- IMAGINE: Image Synthesis by Image-Guided Model Inversion [79.4691654458141]
We introduce an inversion-based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images.
We leverage the knowledge of image semantics from a pre-trained classifier to achieve plausible generations.
IMAGINE enables the synthesis procedure to simultaneously 1) enforce semantic specificity constraints during the synthesis, 2) produce realistic images without generator training, and 3) give users intuitive control over the generation process.
arXiv Detail & Related papers (2021-04-13T02:00:24Z)
- Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly [114.81028176850404]
Training generative adversarial networks (GANs) with limited data generally results in deteriorated performance and collapsed models.
We decompose the data-hungry GAN training into two sequential sub-problems.
Such a coordinated framework enables us to focus on lower-complexity and more data-efficient sub-problems.
arXiv Detail & Related papers (2021-02-28T05:20:29Z)
- Blind Motion Deblurring through SinGAN Architecture [21.104218472462907]
Blind motion deblurring involves reconstructing a sharp image from a blurry observation.
SinGAN is an unconditional generative model that can be learned from a single natural image.
arXiv Detail & Related papers (2020-11-07T06:09:16Z)
- InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning [39.316605441868944]
Generative Adversarial Networks (GANs) are fundamental to many generative modelling applications.
We propose a principled framework to simultaneously mitigate two fundamental issues in GANs: catastrophic forgetting of the discriminator and mode collapse of the generator.
Our approach significantly stabilizes GAN training and improves GAN performance for image synthesis across five datasets.
arXiv Detail & Related papers (2020-07-09T06:56:11Z)
- Training End-to-end Single Image Generators without GANs [27.393821783237186]
AugurOne is a novel approach for training single image generative models.
Our approach trains an upscaling neural network using non-affine augmentations of the (single) input image.
A compact latent space is jointly learned allowing for controlled image synthesis.
arXiv Detail & Related papers (2020-04-07T17:58:03Z)