Related papers: SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation

URL: http://arxiv.org/abs/2311.03866v1
Date: Tue, 7 Nov 2023 10:29:16 GMT
Title: SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation
Authors: Iman Abbasnejad, Fabio Zambetta, Flora Salim, Timothy Wiley, Jeffrey Chan, Russell Gallagher, Ehsan Abbasnejad
Abstract summary: SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images. For more realistic and diverse image generation, we introduce a style reference image. We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
Score: 18.93434486338439
License: http://creativecommons.org/licenses/by/4.0/
Abstract: SCONE-GAN presents an end-to-end image translation method, which is shown to be effective for learning to generate realistic and diverse scenery images. Most current image-to-image translation approaches are devised as two mappings: a translation from the source to the target domain and another to represent its inverse. While successful in many applications, these approaches may suffer from generating trivial solutions with limited diversity. That is because these methods learn more frequent associations rather than the scene structures. To mitigate the problem, we propose SCONE-GAN, which utilises graph convolutional networks to learn object dependencies, maintain the image structure and preserve its semantics while transferring images into the target domain. For more realistic and diverse image generation, we introduce a style reference image. We require the model to maximize the mutual information between the style image and the output. The proposed method explicitly maximizes the mutual information between the related patches, thus encouraging the generator to produce more diverse images. We validate the proposed algorithm for image-to-image translation and stylizing outdoor images. Both qualitative and quantitative results demonstrate the effectiveness of our approach on four datasets.
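
The abstract's idea of maximizing mutual information between related patches is commonly realised with an InfoNCE-style contrastive loss, which lower-bounds mutual information. The listing does not give the paper's actual loss, so the following is only a generic sketch under that assumption. The function name `patch_info_nce`, the temperature value, and the use of plain NumPy feature arrays are all hypothetical choices made for illustration.

```python
import numpy as np

def patch_info_nce(query, keys, temperature=0.07):
    """Generic InfoNCE loss over N patch embeddings (illustrative only).

    query: (N, D) array of features for patches of the generated image.
    keys:  (N, D) array of features for reference patches; keys[i] is
           treated as the positive for query[i], all other rows as negatives.
    """
    # L2-normalise so the dot product is a cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Pairwise similarities, scaled by the temperature: shape (N, N).
    logits = (q @ k.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the diagonal (matching patch) as the target.
    return float(-np.mean(np.diag(log_prob)))

# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
loss_aligned = patch_info_nce(feats, feats)
loss_shuffled = patch_info_nce(feats, np.roll(feats, 1, axis=0))
```

In this toy setup the loss is lower when each query is paired with its matching patch than when the pairing is shifted, which is the behaviour a generator trained against such a loss would be pushed towards.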

Related papers

I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples. We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality [50.48859793121308]
Contrastively trained vision-language models have achieved remarkable progress in vision and language representation learning. Recent research has highlighted severe limitations in their ability to perform compositional reasoning over objects, attributes, and relations.
arXiv Detail & Related papers (2023-05-23T08:28:38Z)
Domain Agnostic Image-to-image Translation using Low-Resolution Conditioning [6.470760375991825]
We propose a domain-agnostic i2i method for fine-grained problems, where the domains are related. We present a novel approach that relies on training the generative model to produce images that share distinctive information of the associated source image. We validate our method on the CelebA-HQ and AFHQ datasets by demonstrating improvements in terms of visual quality.
arXiv Detail & Related papers (2023-05-08T19:58:49Z)
Multi-cropping Contrastive Learning and Domain Consistency for Unsupervised Image-to-Image Translation [5.562419999563734]
We propose a novel unsupervised image-to-image translation framework based on multi-cropping contrastive learning and domain consistency, called MCDUT. In many image-to-image translation tasks, our method achieves state-of-the-art results, and its advantages have been demonstrated through comparative experiments and ablation studies.
arXiv Detail & Related papers (2023-04-24T16:20:28Z)
Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework. We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance. We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis [68.3787368024951]
We propose a novel approach for multi-modal Image-to-image (I2I) translation. We learn a latent embedding, jointly with the generator, that models the variability of the output domain. Specifically, we pre-train a generic style encoder using a novel proxy task to learn an embedding of images, from arbitrary domains, into a low-dimensional style latent space.
arXiv Detail & Related papers (2021-04-14T19:58:24Z)
Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations. By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation [59.73535607392732]
Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another. We propose the use of an image retrieval system to assist the image-to-image translation task.
arXiv Detail & Related papers (2020-08-11T20:11:53Z)
Multimodal Image-to-Image Translation via Mutual Information Estimation and Maximization [16.54980086211836]
Multimodal image-to-image translation (I2IT) aims to learn a conditional distribution that explores multiple possible images in the target domain given an input image in the source domain. Conditional generative adversarial networks (cGANs) are often adopted for modeling such a conditional distribution. We propose a method that explicitly estimates and maximizes the mutual information between the latent code and the output image in cGANs.
arXiv Detail & Related papers (2020-08-08T14:09:23Z)
Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis [194.1452124186117]
We propose a novel ECGAN for the challenging semantic image synthesis task. Our ECGAN achieves significantly better results than state-of-the-art methods.
arXiv Detail & Related papers (2020-03-31T01:23:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.