Vit-GAN: Image-to-image Translation with Vision Transformers and Conditional GANs
- URL: http://arxiv.org/abs/2110.09305v1
- Date: Mon, 11 Oct 2021 18:09:16 GMT
- Title: Vit-GAN: Image-to-image Translation with Vision Transformers and Conditional GANs
- Authors: Yiğit Gündüç
- Abstract summary: In this paper, we have developed a general-purpose architecture, Vit-GAN, capable of performing most image-to-image translation tasks.
The obtained results are observed to be more realistic than those produced by commonly used architectures.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we have developed a general-purpose architecture, Vit-GAN,
capable of performing most image-to-image translation tasks, from semantic image
segmentation to single-image depth perception. This paper is a follow-up to, and an
extension of, the generator-based model of [1], whose results were very promising and
opened the possibility of further improvement with an adversarial architecture. We used
a unique vision-transformer-based generator architecture and conditional GANs (cGANs)
with a Markovian discriminator (PatchGAN) (https://github.com/YigitGunduc/vit-gan). In
the present work, we use images as conditioning arguments. The obtained results are
observed to be more realistic than those produced by commonly used architectures.
Related papers
- Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval [50.72924579220149]
Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification.
Current techniques rely on supervised learning for CIR models, using labeled triplets of a reference image, text, and target image.
We propose a new semi-supervised CIR approach where we search for a reference and its related target images in auxiliary data.
arXiv Detail & Related papers (2024-04-23T21:00:22Z)
- EGAIN: Extended GAn INversion [5.602947425285195]
Generative Adversarial Networks (GANs) have witnessed significant advances in recent years.
Recent GANs have proven to encode features in a disentangled latent space.
GAN inversion opens the door for the manipulation of facial semantics of real face images.
arXiv Detail & Related papers (2023-12-22T23:25:17Z)
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization that keep the inverted code in the native latent space of the pre-trained GAN model (a generic sketch of the underlying inversion objective appears after this list).
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Guided Image-to-Image Translation by Discriminator-Generator Communication [71.86347329356244]
The goal of Image-to-image (I2I) translation is to transfer an image from a source domain to a target domain.
One major branch of this research is to formulate I2I translation based on Generative Adversarial Networks (GANs).
arXiv Detail & Related papers (2023-03-07T02:29:36Z)
- GH-Feat: Learning Versatile Generative Hierarchical Features from GANs [61.208757845344074]
We show that a generative feature learned from image synthesis exhibits great potential in solving a wide range of computer vision tasks.
We first train an encoder by considering the pretrained StyleGAN generator as a learned loss function.
The visual features produced by our encoder, termed as Generative Hierarchical Features (GH-Feat), highly align with the layer-wise GAN representations.
arXiv Detail & Related papers (2023-01-12T21:59:46Z)
- Text to Image Synthesis using Stacked Conditional Variational Autoencoders and Conditional Generative Adversarial Networks [0.0]
Current text-to-image synthesis approaches fall short of producing a high-resolution image that represents a text descriptor.
This study uses Conditional VAEs as an initial generator to produce a high-level sketch of the text descriptor.
The proposed architecture benefits from a conditioning augmentation and a residual block on the Conditional GAN network to achieve the results.
arXiv Detail & Related papers (2022-07-06T13:43:56Z)
- UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation [7.998209482848582]
Image-to-image translation has broad applications in art, design, and scientific simulations.
This work examines if equipping CycleGAN with a vision transformer (ViT) and employing advanced generative adversarial network (GAN) training techniques can achieve better performance.
arXiv Detail & Related papers (2022-03-04T20:27:16Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- One Model to Reconstruct Them All: A Novel Way to Use the Stochastic Noise in StyleGAN [10.810541849249821]
We present a novel StyleGAN-based autoencoder architecture, which can reconstruct images with very high quality across several data domains.
Our proposed architecture can handle up to 40 images per second on a single GPU, which is approximately 28x faster than previous approaches.
arXiv Detail & Related papers (2020-10-21T16:24:07Z)
- Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
Current convention is to approach this task with cycle-consistent GANs.
We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
arXiv Detail & Related papers (2020-06-23T19:52:23Z)
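Several of the related papers above (EGAIN, In-Domain GAN Inversion, InvGAN) revolve around GAN inversion: recovering a latent code whose generated image reproduces a given real image. The sketch below, referenced from the In-Domain GAN Inversion entry, shows only the generic optimization-based formulation; the toy generator, latent size, loss weights, and step count are placeholder assumptions rather than any paper's actual setup (in-domain inversion additionally uses a domain-guided encoder and domain-specific regularizers).

```python
# Generic GAN-inversion sketch (assumed, simplified setup): given a frozen,
# pre-trained generator G and a real image x, optimize a latent code z so
# that G(z) reconstructs x.
import torch
import torch.nn as nn

LATENT_DIM = 128  # placeholder latent size


class ToyGenerator(nn.Module):
    """Stand-in for a pre-trained generator; maps a latent vector to a
    32x32 RGB image. A real pipeline would load StyleGAN or similar weights."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, 3 * 32 * 32), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)


def invert(generator, target, steps=200, lr=0.05, prior_weight=1e-3):
    """Optimize a latent code so the generator output matches `target`."""
    generator.eval()
    for p in generator.parameters():          # keep the generator frozen
        p.requires_grad_(False)

    z = torch.zeros(target.size(0), LATENT_DIM, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = generator(z)
        loss = nn.functional.mse_loss(recon, target) \
            + prior_weight * z.pow(2).mean()  # weak Gaussian prior on z
        loss.backward()
        opt.step()
    return z.detach(), loss.item()


if __name__ == "__main__":
    G = ToyGenerator()
    real = torch.rand(1, 3, 32, 32) * 2 - 1   # stand-in "real image" in [-1, 1]
    z_hat, final_loss = invert(G, real)
    print(z_hat.shape, final_loss)
```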