Semantically-aware Mask CycleGAN for Translating Artistic Portraits to
Photo-realistic Visualizations
- URL: http://arxiv.org/abs/2306.06577v1
- Date: Sun, 11 Jun 2023 03:58:09 GMT
- Title: Semantically-aware Mask CycleGAN for Translating Artistic Portraits to
Photo-realistic Visualizations
- Authors: Zhuohao Yin
- Abstract summary: I propose the Semantic-aware Mask CycleGAN (SMCycleGAN) architecture, which can translate artistic portraits to photo-realistic visualizations.
This model can generate realistic human portraits by feeding the discriminators semantically masked fake samples.
Experiments have shown that SMCycleGAN generates images with significantly increased realism and minimal loss of content representations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-to-image translation (I2I) is defined as a computer vision task where
the aim is to transfer images in a source domain to a target domain with
minimal loss or alteration of the content representations. Major progress has
been made since I2I was proposed with the invention of a variety of
revolutionary generative models. Among them, GAN-based models perform
exceptionally well as they are mostly tailor-made for specific domains or
tasks. However, few works have proposed a method tailor-made for the artistic
domain. In this project, I propose the Semantic-aware Mask CycleGAN
(SMCycleGAN) architecture which can translate artistic portraits to
photo-realistic visualizations. This model can generate realistic human
portraits by feeding the discriminators semantically masked fake samples, thus
forcing them to make discriminative decisions with partial information so
that the generators can be optimized to synthesize more realistic human
portraits instead of increasing the similarity of other irrelevant components,
such as the background. Experiments have shown that SMCycleGAN generates
images with significantly increased realism and minimal loss of content
representations.
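To make the masking mechanism concrete, below is a minimal PyTorch-style sketch of one generator update on the art-to-photo path. The function and tensor names, the use of an off-the-shelf face parser to obtain the portrait mask, the LSGAN-style adversarial loss, and the loss weight are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def generator_step(G_art2photo, G_photo2art, D_photo, art_img, portrait_mask,
                   lambda_cyc=10.0):
    """Sketch of one SMCycleGAN-style generator update (art -> photo direction).

    art_img:       batch of artistic portraits, shape (N, 3, H, W)
    portrait_mask: binary mask (N, 1, H, W) marking the human-portrait region,
                   e.g. produced by an off-the-shelf face/human parser
                   (an assumption; the paper's mask source may differ).
    """
    fake_photo = G_art2photo(art_img)

    # Core idea: the discriminator only sees the semantically relevant region,
    # so its real/fake decision cannot rely on irrelevant content such as the
    # background, and the generator's gradient focuses on the portrait itself.
    masked_fake = fake_photo * portrait_mask
    pred = D_photo(masked_fake)

    # Least-squares adversarial loss (an assumed choice; CycleGAN uses LSGAN).
    adv_loss = F.mse_loss(pred, torch.ones_like(pred))

    # Standard cycle-consistency term keeps content representations intact.
    cyc_loss = F.l1_loss(G_photo2art(fake_photo), art_img)

    return adv_loss + lambda_cyc * cyc_loss
```

During discriminator updates, the same mask would be applied to both real and fake samples so that the two inputs remain comparable.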
Related papers
- FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models [14.596090302381647]
This paper studies photorealism enhancement of rendered images, leveraging generative power from diffusion models on the controlled basis of rendering.
We introduce a novel framework to translate rendered images into their realistic counterparts, which consists of two stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG).
arXiv Detail & Related papers (2024-10-18T12:48:22Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables generating highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis [94.76988562653845]
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
Current state-of-the-art approaches, however, still struggle to generate realistic objects in images at various scales.
We propose a Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the conditioning of spatially-adaptive normalization blocks at all scales jointly.
arXiv Detail & Related papers (2022-10-08T18:45:44Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior arts in extensive experiments, producing realistic portraits with vivid expression in natural lighting when viewed in free viewpoint.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning [77.27821665339492]
CtlGAN is a new few-shot artistic portraits generation model with a novel contrastive transfer learning strategy.
We adapt a pretrained StyleGAN in the source domain to a target artistic domain with no more than 10 artistic faces.
We propose a new encoder which embeds real faces into Z+ space, and a dual-path training strategy to better cope with the adapted decoder.
arXiv Detail & Related papers (2022-03-16T13:28:17Z)
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation and content reconstruction, along with coarse-to-fine-grained adversarial reasoning.
arXiv Detail & Related papers (2021-12-09T18:59:21Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid [102.24539566851809]
Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
arXiv Detail & Related papers (2021-12-08T04:33:33Z)
- Inverting Generative Adversarial Renderer for Face Reconstruction [58.45125455811038]
In this work, we introduce a novel Generative Adversarial Renderer (GAR).
Instead of relying on graphics rules, GAR learns to model complicated real-world images and is thus capable of producing realistic results.
Our method achieves state-of-the-art performance on multiple face reconstruction tasks.
arXiv Detail & Related papers (2021-05-06T04:16:06Z)
- High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images [10.03187850132035]
We propose an algorithm that matches a non-photorealistic, synthetically generated image to a latent vector of a pretrained StyleGAN2 model.
In contrast to most previous work, we require no synthetic training data.
This is the first algorithm of its kind to work at a resolution of 1K and represents a significant leap forward in visual realism.
arXiv Detail & Related papers (2020-06-26T15:00:04Z)
- CONFIG: Controllable Neural Face Image Generation [10.443563719622645]
ConfigNet is a neural face model that allows for controlling individual aspects of output images in meaningful ways.
Our novel method uses synthetic data to factorize the latent space into elements that correspond to the inputs of a traditional rendering pipeline.
arXiv Detail & Related papers (2020-05-06T09:19:46Z)