ExSinGAN: Learning an Explainable Generative Model from a Single Image
- URL: http://arxiv.org/abs/2105.07350v1
- Date: Sun, 16 May 2021 04:38:46 GMT
- Title: ExSinGAN: Learning an Explainable Generative Model from a Single Image
- Authors: ZiCheng Zhang, CongYing Han, TianDe Guo
- Abstract summary: We propose a hierarchical framework that simplifies the learning of the intricate conditional distribution through successive learning of the distributions of structure, semantics, and texture.
We design ExSinGAN, composed of three cascaded GANs, for learning an explainable generative model from a given image.
ExSinGAN is learned not only from the internal patches of the given image, as in previous works, but also from an external prior obtained by the GAN inversion technique.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating images from a single sample, as a newly developing branch of image
synthesis, has attracted extensive attention. In this paper, we formulate this
problem as sampling from the conditional distribution of a single image, and
propose a hierarchical framework that simplifies the learning of the intricate
conditional distribution through successive learning of the distributions of
structure, semantics, and texture, making the process of learning and
generation comprehensible. On this basis, we design ExSinGAN, composed of three
cascaded GANs, for learning an explainable generative model from a given image,
where the cascaded GANs model the distributions of structure, semantics, and
texture successively. ExSinGAN is learned not only from the internal patches of
the given image, as in previous works, but also from an external prior obtained
by the GAN inversion technique. Benefiting from this combination of internal
and external information, ExSinGAN has a more powerful generation capability
and competitive generalization on image manipulation tasks compared with prior
works.
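To make the hierarchy concrete, below is a minimal, hypothetical sketch (in PyTorch) of how three cascaded generators could successively refine a sample from structure to semantics to texture; the module design, resolutions, and noise injection are illustrative assumptions, not ExSinGAN's actual architecture.

```python
# Hypothetical sketch of a three-stage cascaded generation pipeline
# (structure -> semantics -> texture). Not the official ExSinGAN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefineStage(nn.Module):
    """One stage: takes an upsampled coarse image plus a noise map and
    predicts a residual correction at the current resolution."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + 1, channels, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(channels, channels, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, coarse, noise):
        return torch.tanh(coarse + self.body(torch.cat([coarse, noise], dim=1)))

# Three cascaded stages, each trained with its own GAN objective:
# structure (global layout), semantics (object-level content), texture (details).
structure_g, semantic_g, texture_g = RefineStage(), RefineStage(), RefineStage()

def sample(structure_prior, sizes=(32, 64, 128)):
    """Generate an image by refining through the cascade. `structure_prior`
    stands in for a coarse map obtained, e.g., by inverting the training
    image into a pretrained GAN (the external prior)."""
    x = structure_prior
    for g, s in zip((structure_g, semantic_g, texture_g), sizes):
        x = F.interpolate(x, size=s, mode="bilinear", align_corners=False)
        noise = torch.randn(x.shape[0], 1, s, s)
        x = g(x, noise)
    return x

print(sample(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 3, 128, 128])
```

The design point mirrored here is that each stage only has to model a conditional refinement at one level of abstraction, which is what makes the overall conditional distribution easier to learn.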
Related papers
- d-Sketch: Improving Visual Fidelity of Sketch-to-Image Translation with Pretrained Latent Diffusion Models without Retraining [18.73832646369506]
We introduce a technique for sketch-to-image translation by exploiting the feature generalization capabilities of a large-scale diffusion model without retraining.
Experimental results demonstrate that the proposed method outperforms the existing techniques in qualitative and quantitative benchmarks.
arXiv Detail & Related papers (2025-02-19T11:54:45Z) - Mechanisms of Generative Image-to-Image Translation Networks [1.602820210496921]
We propose a streamlined image-to-image translation network with a simpler architecture compared to existing models.
We show that adversarial training for GAN models yields results comparable to those of existing methods without additional complex loss penalties.
arXiv Detail & Related papers (2024-11-15T17:17:46Z) - SODA: Bottleneck Diffusion Models for Representation Learning [75.7331354734152]
We introduce SODA, a self-supervised diffusion model, designed for representation learning.
The model incorporates an image encoder, which distills a source view into a compact representation that guides the generation of related novel views.
We show that by imposing a tight bottleneck between the encoder and a denoising decoder, we can turn diffusion models into strong representation learners.
arXiv Detail & Related papers (2023-11-29T18:53:34Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual
Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.
We propose TcGAN, a novel structure-preserved method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z) - FewGAN: Generating from the Joint Distribution of a Few Images [95.6635227371479]
We introduce FewGAN, a generative model for generating novel, high-quality and diverse images.
FewGAN is a hierarchical patch-GAN that applies quantization at the first coarse scale, followed by a pyramid of residual fully convolutional GANs at finer scales (a generic sketch of such a patch-GAN pyramid appears after this list).
In an extensive set of experiments, it is shown that FewGAN outperforms baselines both quantitatively and qualitatively.
arXiv Detail & Related papers (2022-07-18T07:11:28Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - IMAGINE: Image Synthesis by Image-Guided Model Inversion [79.4691654458141]
We introduce an inversion based method, denoted as IMAge-Guided model INvErsion (IMAGINE), to generate high-quality and diverse images.
We leverage the knowledge of image semantics from a pre-trained classifier to achieve plausible generations.
IMAGINE enables the synthesis procedure to simultaneously 1) enforce semantic specificity constraints during the synthesis, 2) produce realistic images without generator training, and 3) give users intuitive control over the generation process.
arXiv Detail & Related papers (2021-04-13T02:00:24Z) - MOGAN: Morphologic-structure-aware Generative Learning from a Single
Image [59.59698650663925]
Recently proposed generative models can be trained on only a single image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances.
Our approach focuses on internal features including the maintenance of rational structures and variation on appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z) - High-Fidelity Synthesis with Disentangled Representation [60.19657080953252]
We propose an Information-Distillation Generative Adversarial Network (ID-GAN) for disentanglement learning and high-fidelity synthesis.
Our method learns disentangled representation using VAE-based models, and distills the learned representation with an additional nuisance variable to the separate GAN-based generator for high-fidelity synthesis.
Despite its simplicity, we show that the proposed method is highly effective, achieving image generation quality comparable to state-of-the-art methods while using the disentangled representation.
arXiv Detail & Related papers (2020-01-13T14:39:40Z)
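Several of the single-image models listed above (ExSinGAN, TcGAN, FewGAN, MOGAN) build on the idea of training a pyramid of patch GANs on the internal patches of one image, one scale at a time. The sketch below is a minimal, hypothetical illustration of such a multi-scale training loop; the tiny networks, hinge/reconstruction losses, and schedule are assumptions for illustration, not any of the papers' actual implementations.

```python
# Hypothetical single-image, multi-scale patch-GAN training loop
# (SinGAN-style pyramid). Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_pyramid(image, num_scales=4, factor=0.75):
    """Downsample the single training image into a coarse-to-fine pyramid."""
    sizes = [max(8, int(image.shape[-1] * factor ** i)) for i in range(num_scales)]
    return [F.interpolate(image, size=s, mode="bilinear", align_corners=False)
            for s in sorted(sizes)]  # coarsest first

def tiny_net(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.LeakyReLU(0.2),
                         nn.Conv2d(32, out_ch, 3, padding=1))

image = torch.rand(1, 3, 128, 128)            # the single training image
pyramid = make_pyramid(image)
prev = torch.zeros_like(pyramid[0])           # coarsest scale starts from zeros

for real in pyramid:
    G, D = tiny_net(3, 3), tiny_net(3, 1)     # scale-specific generator / patch critic
    opt_g = torch.optim.Adam(G.parameters(), lr=5e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=5e-4)
    prev = F.interpolate(prev, size=real.shape[-2:], mode="bilinear",
                         align_corners=False)
    for _ in range(50):                       # a few steps per scale, for brevity
        fake = prev + G(prev + torch.randn_like(prev))   # residual generation
        # critic: hinge loss on real vs. fake internal patches (per-pixel logits)
        d_loss = F.relu(1 - D(real)).mean() + F.relu(1 + D(fake.detach())).mean()
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()
        # generator: fool the critic and stay close to the training image
        g_loss = -D(fake).mean() + 10 * F.mse_loss(fake, real)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
    prev = fake.detach()                      # feed the result to the next scale
```

In practice these works add refinements on top of this loop (for example gradient penalties, a fixed reconstruction noise path, or, in ExSinGAN, an external structural prior from GAN inversion), but the coarse-to-fine internal-patch objective is the common core.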