SDA-GAN: Unsupervised Image Translation Using Spectral Domain
Attention-Guided Generative Adversarial Network
- URL: http://arxiv.org/abs/2110.02873v1
- Date: Wed, 6 Oct 2021 15:59:43 GMT
- Title: SDA-GAN: Unsupervised Image Translation Using Spectral Domain
Attention-Guided Generative Adversarial Network
- Authors: Qizhou Wang, Maksim Makarenko
- Abstract summary: This work introduced a novel GAN architecture for unsupervised image translation on the task of face style transform.
A spectral attention-based mechanism is embedded into the design along with spatial attention on the image contents.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This work introduced a novel GAN architecture for unsupervised image
translation on the task of face style transform. A spectral attention-based
mechanism is embedded into the design along with spatial attention on the image
contents. We proved that neural network has the potential of learning complex
transformations such as Fourier transform, within considerable computational
cost. The model is trained and tested in comparison to the baseline model,
which only uses spatial attention. The performance improvement of our approach
is significant especially when the source and target domain include different
complexity (reduced FID to 49.18 from 142.84). In the translation process, a
spectra filling effect was introduced due to the implementation of FFT and
spectral attention. Another style transfer task and real-world object
translation are also studied in this paper.
Related papers
- IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions [26.09373405194564]
We present an efficient image processing transformer architecture with hierarchical attentions, called IPTV2.
We adopt a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields.
Our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, deraining and obtains much better trade-off for performance and computational complexity than previous methods.
arXiv Detail & Related papers (2024-03-31T10:01:20Z) - Spectrum Translation for Refinement of Image Generation (STIG) Based on
Contrastive Learning and Spectral Filter Profile [15.5188527312094]
We propose a framework to mitigate the disparity in frequency domain of the generated images.
This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning.
We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG.
arXiv Detail & Related papers (2024-03-08T06:39:24Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided domain-regularized and a encoder to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - ESSAformer: Efficient Transformer for Hyperspectral Image
Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z) - Spectral Normalization and Dual Contrastive Regularization for
Image-to-Image Translation [9.029227024451506]
We propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization.
We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results prove that our method achieves SOTA in multiple tasks.
arXiv Detail & Related papers (2023-04-22T05:22:24Z) - TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual
Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted world wide attention.
We propose a novel structure-preserved method TcGAN with individual vision transformer to overcome the shortcomings of the existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z) - Demystify Transformers & Convolutions in Modern Image Deep Networks [82.32018252867277]
This paper aims to identify the real gains of popular convolution and attention operators through a detailed study.
We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach.
Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs.
arXiv Detail & Related papers (2022-11-10T18:59:43Z) - Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging as it does not follow such a local property.
We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z) - Transformer-based SAR Image Despeckling [53.99620005035804]
We introduce a transformer-based network for SAR image despeckling.
The proposed despeckling network comprises of a transformer-based encoder which allows the network to learn global dependencies between different image regions.
Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods.
arXiv Detail & Related papers (2022-01-23T20:09:01Z) - Investigating Expressiveness of Transformer in Spectral Domain for
Graphs [6.092217185687028]
We study and prove the link between the spatial and spectral domain in the realm of the transformer.
We propose FeTA, a framework that aims to perform attention over the entire graph spectrum analogous to the attention in spatial space.
arXiv Detail & Related papers (2022-01-23T18:03:22Z) - SPIN: Structure-Preserving Inner Offset Network for Scene Text
Recognition [48.676064155070556]
Arbitrary text appearance poses a great challenge in scene text recognition tasks.
We introduce a new learnable geometric-unrelated module, the Structure-Preserving Inner Offset Network (SPIN)
SPIN allows the color manipulation of source data within the network.
arXiv Detail & Related papers (2020-05-27T01:47:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.