Transformer-based Generative Adversarial Networks in Computer Vision: A
Comprehensive Survey
- URL: http://arxiv.org/abs/2302.08641v1
- Date: Fri, 17 Feb 2023 01:13:58 GMT
- Authors: Shiv Ram Dubey, Satish Kumar Singh
- Abstract summary: Generative Adversarial Networks (GANs) have been very successful at synthesizing images that follow the distribution of a given dataset.
Recent works have tried to exploit Transformers in the GAN framework for image/video synthesis.
This paper presents a comprehensive survey of the developments and advancements in GANs utilizing Transformer networks for computer vision applications.
- Score: 26.114550071165628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have been very successful
at synthesizing images that follow the distribution of a given dataset, and
the images they generate are highly realistic. GANs have shown potential in
several computer vision applications, including image generation,
image-to-image translation, and video synthesis. Conventionally, the
generator network is the backbone of a GAN and produces the samples, while
the discriminator network facilitates the training of the generator. The
discriminator is usually a Convolutional Neural Network (CNN), whereas the
generator is typically either an upsampling CNN for image generation or an
Encoder-Decoder network for image-to-image translation. Convolution-based
networks exploit local relationships within a layer, so deep stacks of
layers are required to extract abstract features; as a result, CNNs struggle
to capture global relationships in the feature space. Recently developed
Transformer networks, by contrast, can model global relationships at every
layer and have delivered substantial performance improvements on several
computer vision problems. Motivated by the success of both Transformer
networks and GANs, recent works have integrated Transformers into the GAN
framework for image and video synthesis. This paper presents a comprehensive
survey of the developments and advancements in GANs that utilize Transformer
networks for computer vision applications. Performance comparisons on
benchmark datasets are also performed and analyzed for several applications.
This survey will help the deep learning and computer vision community
understand the research trends and gaps related to Transformer-based GANs
and develop advanced GAN architectures that exploit both global and local
relationships for different applications.
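The global-versus-local contrast described in the abstract can be made concrete with a minimal sketch (illustrative only, not any specific architecture from the survey): a single self-attention layer mixes information from every position with every other position in one step, whereas a convolution output depends only on a small local window.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head self-attention with identity projections: every output
    position is a weighted mix of ALL input positions."""
    scores = x @ x.T / np.sqrt(x.shape[-1])   # (seq, seq) pairwise affinities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ x, weights

def conv1d_valid(x, kernel):
    """1-D convolution: each output depends only on a local window of x."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

rng = np.random.default_rng(0)
tokens = rng.standard_normal((6, 4))          # 6 positions, 4 features each
_, attn = self_attention(tokens)
# Global relationship: every position attends to every other position with
# non-zero weight already in a single layer.
print((attn > 0).all())                       # True
```

A convolutional network needs many stacked layers before distant positions interact, which is exactly the limitation the abstract attributes to CNNs.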
Related papers
- SRTransGAN: Image Super-Resolution using Transformer based Generative
Adversarial Network [16.243363392717434]
We propose a transformer-based encoder-decoder network as a generator to produce 2x and 4x super-resolved images.
The proposed SRTransGAN outperforms existing methods by 4.38% on average in terms of PSNR and SSIM scores.
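PSNR, one of the two metrics the reported 4.38% average improvement is computed over, follows the standard definition below (a generic sketch, not SRTransGAN's evaluation code):

```python
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher means closer to the reference."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 128.0)
noisy = ref + 10.0                   # constant error of 10 gray levels
print(round(psnr(ref, noisy), 2))    # MSE = 100 -> 10*log10(65025/100) ~ 28.13
```

SSIM is the other half of the average; in practice both metrics are usually computed with library implementations such as scikit-image's.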
arXiv Detail & Related papers (2023-12-04T16:22:39Z)
- NAR-Former V2: Rethinking Transformer for Universal Neural Network
Representation Learning [25.197394237526865]
We propose a modified Transformer-based universal neural network representation learning model NAR-Former V2.
Specifically, we take the network as a graph and design a straightforward tokenizer to encode the network into a sequence.
We incorporate the inductive representation learning capability of GNNs into the Transformer, enabling the Transformer to generalize better when encountering unseen architectures.
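A toy version of the "take the network as a graph and encode it into a sequence" idea might look as follows (a hypothetical tokenizer for illustration; the names and token format are assumptions, not NAR-Former V2's actual design):

```python
def tokenize_network(nodes, edges):
    """Encode a small network graph as a flat token sequence.
    nodes: list of op names in topological order;
    edges: set of (src, dst) index pairs.
    Returns one (op, predecessor-indices) token per node."""
    tokens = []
    for dst, op in enumerate(nodes):
        preds = sorted(src for src, d in edges if d == dst)
        tokens.append((op, tuple(preds)))
    return tokens

# A tiny residual cell: input feeds two convs whose outputs are added.
net = ["input", "conv3x3", "conv1x1", "add", "output"]
links = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)}
print(tokenize_network(net, links))
# [('input', ()), ('conv3x3', (0,)), ('conv1x1', (0,)), ('add', (1, 2)), ('output', (3,))]
```

The resulting sequence can then be consumed by a standard Transformer encoder, which is the general strategy the summary describes.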
arXiv Detail & Related papers (2023-06-19T09:11:04Z)
- Distilling Representations from GAN Generator via Squeeze and Span [55.76208869775715]
We propose to distill knowledge from GAN generators by squeezing and spanning their representations.
We span the distilled representation from the synthetic domain to the real domain by also using real training data, remedying the mode collapse of GANs.
arXiv Detail & Related papers (2022-11-06T01:10:28Z)
- Transformer-based SAR Image Despeckling [53.99620005035804]
We introduce a transformer-based network for SAR image despeckling.
The proposed despeckling network comprises a transformer-based encoder, which allows the network to learn global dependencies between different image regions.
Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods.
arXiv Detail & Related papers (2022-01-23T20:09:01Z)
- The Nuts and Bolts of Adopting Transformer in GANs [124.30856952272913]
We investigate the properties of Transformer in the generative adversarial network (GAN) framework for high-fidelity image synthesis.
Our study leads to a new alternative design of Transformers in GAN, a convolutional neural network (CNN)-free generator termed as STrans-G.
arXiv Detail & Related papers (2021-10-25T17:01:29Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is an Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
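The patch tokenization step can be sketched as follows (image size and patch size are assumptions for illustration, not Swin-Unet's implementation):

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patch
    tokens of shape (num_patches, patch*patch*C)."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)           # group patch rows and columns
    return x.reshape(-1, patch * patch * c)

img = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = patchify(img, 4)                    # 4x4 patches, as in ViT-style models
print(tokens.shape)                          # (3136, 48)
```

Each row of `tokens` is one patch, and the sequence of rows is what the Transformer encoder consumes.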
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- Generative Adversarial Networks (GANs) in Networking: A Comprehensive
Survey & Evaluation [5.196831100533835]
Generative Adversarial Networks (GANs) constitute an extensively researched machine learning sub-field.
GANs are typically used to generate or transform synthetic images.
In this paper, we demonstrate how this branch of machine learning can benefit multiple aspects of computer and communication networks.
arXiv Detail & Related papers (2021-05-10T08:28:36Z)
- Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline.
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
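A shape-level sketch of the difference between a conventional scalar-output discriminator and a U-Net style discriminator with an additional per-pixel head (the mean-based "scores" are placeholders for illustration, not a trained network):

```python
import numpy as np

def scalar_discriminator(batch):
    """Conventional discriminator: one realism score per image."""
    return batch.mean(axis=(1, 2, 3))            # shape (N,)

def unet_discriminator(batch):
    """U-Net style discriminator: a global score from the encoder head
    plus a per-pixel realism map from the decoder head."""
    global_score = batch.mean(axis=(1, 2, 3))    # shape (N,)
    pixel_map = batch.mean(axis=3)               # shape (N, H, W)
    return global_score, pixel_map

fake = np.random.default_rng(1).random((2, 64, 64, 3))
g, p = unet_discriminator(fake)
print(g.shape, p.shape)                          # (2,) (2, 64, 64)
```

The per-pixel map is what lets the generator receive localized feedback on which image regions look unrealistic, rather than a single image-level verdict.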
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.