AVAE: Adversarial Variational Auto Encoder
- URL: http://arxiv.org/abs/2012.11551v1
- Date: Mon, 21 Dec 2020 18:29:56 GMT
- Title: AVAE: Adversarial Variational Auto Encoder
- Authors: Antoine Plumerault, Hervé Le Borgne, Céline Hudelot
- Abstract summary: We introduce a new framework that combines VAE and GAN in a novel and complementary way to produce an auto-encoding model.
We evaluate our approach both qualitatively and quantitatively on five image datasets.
- Score: 2.1485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Among the wide variety of image generative models, two models stand out:
Variational Auto Encoders (VAE) and Generative Adversarial Networks (GAN). GANs
can produce realistic images, but they suffer from mode collapse and do not
provide simple ways to get the latent representation of an image. On the other
hand, VAEs do not have these problems, but they often generate images less
realistic than GANs. In this article, we explain that this lack of realism is
partially due to a common underestimation of the natural image manifold
dimensionality. To solve this issue we introduce a new framework that combines
VAE and GAN in a novel and complementary way to produce an auto-encoding model
that keeps the properties of VAEs while generating images of GAN quality. We evaluate
our approach both qualitatively and quantitatively on five image datasets.

Related papers
        - Can We Generate Realistic Hands Only Using Convolution? [0.0]
 Image generative models can't recreate intricate geometric features, such as those present in human hands and fingers.
In this paper, we demonstrate how this problem can be mitigated by augmenting the geometric capabilities of convolution layers.
We show that this drastically improves the quality of hand and face images generated by GANs and Variational AutoEncoders (VAEs).
 arXiv (2024-01-03T19:27:20Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
 We propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
 arXiv (2023-12-20T09:39:19Z)
- A Bayesian Non-parametric Approach to Generative Models: Integrating Variational Autoencoder and Generative Adversarial Networks using Wasserstein and Maximum Mean Discrepancy [2.966338139852619]
 Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models.
We employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs.
By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our novel model achieves superior performance in various generative tasks.
 arXiv (2023-08-27T08:58:31Z)
- In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing [28.790900756506833]
 3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts.
GAN inversion is a crucial step that seeks the latent code to reconstruct input images or videos, subsequently enabling diverse editing tasks through manipulation of this latent code.
We address this issue by explicitly modeling OOD objects from the input in 3D-aware GANs.
 arXiv (2023-02-09T18:59:56Z)
- NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation [66.0838349951456]
 NeRF-based generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry.
We propose a universal method to surgically fine-tune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects from only a single image.
 arXiv (2022-11-30T18:36:45Z)
- DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder [73.1010640692609]
 We propose a VQ-VAE architecture model with a diffusion decoder (DiVAE) to work as the reconstructing component in image synthesis.
Our model achieves state-of-the-art results and, in particular, generates more photorealistic images.
 arXiv (2022-06-01T10:39:12Z)
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
 We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly do semantic segmentation, content reconstruction, along with a coarse-to-fine grained adversarial reasoning.
 arXiv (2021-12-09T18:59:21Z)
- InvGAN: Invertible GANs [88.58338626299837]
 InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
 arXiv (2021-12-08T21:39:00Z)
- Global Context with Discrete Diffusion in Vector Quantised Modelling for Image Generation [19.156223720614186]
 The integration of the Vector Quantised Variational AutoEncoder (VQ-VAE) with autoregressive models as the generation part has yielded high-quality results on image generation.
We show that with the help of a content-rich discrete visual codebook from VQ-VAE, the discrete diffusion model can also generate high fidelity images with global context.
 arXiv (2021-12-03T09:09:34Z)
- Ensembling with Deep Generative Views [72.70801582346344]
 Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
 arXiv (2021-04-29T17:58:35Z)
- Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
 Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate both the advantages of them.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over the current state of the art.
 arXiv (2020-08-25T03:30:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.