cGANs for Cartoon to Real-life Images
- URL: http://arxiv.org/abs/2101.09793v1
- Date: Sun, 24 Jan 2021 20:26:31 GMT
- Title: cGANs for Cartoon to Real-life Images
- Authors: Pranjal Singh Rajput, Kanya Satis, Sonnya Dellarosa, Wenxuan Huang,
Obinna Agba
- Abstract summary: The project aims to evaluate the robustness of the Pix2Pix model by applying it to datasets consisting of cartoonized images.
It should be possible to train the network to generate real-life images from the cartoonized images.
- Score: 0.4724825031148411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-to-image translation is a learning task that establishes a
visual mapping between an input and an output image. The task has several
variations, differentiated by the purpose of the translation, such as
synthetic-to-real translation, photo-to-caricature translation, and many
others. The problem has been tackled with traditional computer vision methods
and, more recently, with deep learning approaches. One popular and effective
approach is the conditional generative adversarial network (cGAN), which is
adapted to perform image-to-image translation tasks with, typically, two
networks: a generator and a discriminator. This project aims to evaluate the
robustness of the Pix2Pix model by applying it to datasets consisting of
cartoonized images. Using Pix2Pix, it should be possible to train the network
to generate real-life images from the cartoonized images.
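To make the generator/discriminator setup concrete, below is a minimal sketch of a Pix2Pix-style cGAN training step in PyTorch, following the published objective G* = arg min_G max_D L_cGAN(G, D) + lambda * L_L1(G). The real Pix2Pix uses a U-Net generator and a PatchGAN discriminator; the tiny convolutional stacks and random tensors here are illustrative stand-ins only, not the authors' implementation.

```python
# Minimal sketch of a Pix2Pix-style conditional GAN training step (PyTorch).
# NOTE: the real Pix2Pix uses a U-Net generator and a PatchGAN discriminator;
# these tiny conv stacks are simplified stand-ins for illustration.
import torch
import torch.nn as nn

# Generator: cartoonized image (3 channels) -> synthesized real-life image.
generator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, kernel_size=3, padding=1), nn.Tanh(),
)

# Discriminator: scores the (input, output) pair, so it takes 6 channels.
discriminator = nn.Sequential(
    nn.Conv2d(6, 64, kernel_size=4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, kernel_size=4, stride=2, padding=1),  # patch logits
)

adv_loss = nn.BCEWithLogitsLoss()
l1_loss = nn.L1Loss()
lambda_l1 = 100.0  # L1 weight, the value used in the Pix2Pix paper

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

# Random tensors standing in for a batch of (cartoonized, real-life) pairs.
cartoon = torch.randn(4, 3, 64, 64)
real = torch.randn(4, 3, 64, 64)

# Discriminator step: push real pairs toward 1 and fake pairs toward 0.
fake = generator(cartoon)
d_real = discriminator(torch.cat([cartoon, real], dim=1))
d_fake = discriminator(torch.cat([cartoon, fake.detach()], dim=1))
loss_d = (adv_loss(d_real, torch.ones_like(d_real))
          + adv_loss(d_fake, torch.zeros_like(d_fake)))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: fool the discriminator while staying close to ground truth.
d_fake = discriminator(torch.cat([cartoon, fake], dim=1))
loss_g = adv_loss(d_fake, torch.ones_like(d_fake)) + lambda_l1 * l1_loss(fake, real)
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```

The conditioning is what distinguishes a cGAN from an unconditional GAN: the discriminator judges the generated image jointly with the cartoonized input it was produced from, so the generator cannot ignore its input.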
Related papers
- Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation [81.45400849638347]
In-image machine translation (IIMT) aims to translate an image containing text in the source language into an image containing the translation in the target language.
In this paper, we propose an end-to-end IIMT model consisting of four modules.
Our model achieves performance competitive with cascaded models using only 70.9% of their parameters, and significantly outperforms the pixel-level end-to-end IIMT model.
arXiv Detail & Related papers (2024-07-03T08:15:39Z)
- MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data [50.94623170336122]
We bootstrap a multimodal dataset by extracting semantically meaningful image crops corresponding to words in the captions of synthetically generated and publicly available text-image data.
Our model, MUMU, is composed of a vision-language model encoder with a diffusion decoder and is trained on a single 8xH100 GPU node.
arXiv Detail & Related papers (2024-06-26T23:21:42Z)
- Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation [4.767259403145913]
This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images.
We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images.
arXiv Detail & Related papers (2024-04-30T05:11:32Z)
- High-Resolution Image Translation Model Based on Grayscale Redefinition [3.6996084306161277]
We propose an innovative method for image translation between different domains.
For high-resolution image translation tasks, we use a grayscale adjustment method to achieve pixel-level translation.
For other tasks, we utilize the Pix2PixHD model with a coarse-to-fine generator, a multi-scale discriminator, and an improved loss to enhance image translation performance.
arXiv Detail & Related papers (2024-03-26T12:21:47Z)
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
For more realistic and diverse image generation, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Font Completion and Manipulation by Cycling Between Multi-Modality Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, using a graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph between an image encoder and an image decoder.
Our model generates better results than both an image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z)
- toon2real: Translating Cartoon Images to Realistic Images [1.4419517737536707]
We apply several state-of-the-art models to perform this task; however, they fail to produce good-quality translations.
We propose a method based on CycleGAN model for image translation from cartoon domain to photo-realistic domain.
We present our experimental results and show that our proposed model achieves the lowest Fréchet Inception Distance score and better results than another state-of-the-art technique, UNIT.
arXiv Detail & Related papers (2021-02-01T20:22:05Z)
- Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371]
Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024×1024-resolution unpaired image translation.
arXiv Detail & Related papers (2020-12-01T17:18:58Z)
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
- Generating Embroidery Patterns Using Image-to-Image Translation [2.055949720959582]
We propose two machine learning techniques to solve the embroidery image-to-image translation task.
Our goal is to generate a preview image which looks similar to an embroidered image, from a user-uploaded image.
Empirical results show that these techniques successfully generate an approximate preview of an embroidered version of a user image.
arXiv Detail & Related papers (2020-03-05T20:32:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.