Spectral Normalization and Dual Contrastive Regularization for
Image-to-Image Translation
- URL: http://arxiv.org/abs/2304.11319v3
- Date: Sat, 9 Mar 2024 11:13:05 GMT
- Title: Spectral Normalization and Dual Contrastive Regularization for
Image-to-Image Translation
- Authors: Chen Zhao, Wei-Ling Cai, Zheng Yuan
- Abstract summary: We propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization.
We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results show that our method achieves SOTA performance on multiple tasks.
- Score: 9.029227024451506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing image-to-image (I2I) translation methods achieve
state-of-the-art performance by incorporating patch-wise contrastive learning
into Generative Adversarial Networks. However, patch-wise contrastive learning
focuses only on local content similarity and neglects the global structure
constraint, which degrades the quality of the generated images. In this paper,
we propose a new unpaired I2I translation framework based on dual contrastive
regularization and spectral normalization, namely SN-DCR. To maintain
consistency of the global structure and texture, we design dual contrastive
regularization terms that operate in two different deep feature spaces. To
improve the global structure of the generated images, we formulate a semantic
contrastive loss that pulls the global semantic structure of a generated image
toward that of real images from the target domain in a semantic feature space.
We use Gram matrices to extract the texture style of images and, analogously,
design a style contrastive loss to improve the global texture of the generated
images. Moreover, to enhance the stability of the model, we employ spectrally
normalized convolutional layers in the design of our generator. We conduct
comprehensive experiments to evaluate the effectiveness of SN-DCR, and the
results show that our method achieves state-of-the-art performance on multiple
tasks.
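As a concrete illustration of the pieces named above, here is a minimal PyTorch sketch of a Gram-matrix style contrastive loss, a generic InfoNCE term (which, applied to pooled semantic features, plays the role of the semantic contrastive loss), and a spectrally normalized generator convolution. Function names, shapes, the temperature, and the choice of feature extractor are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of SN-DCR's building blocks; hyperparameters, shapes, and
# names are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import spectral_norm


def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    """Gram matrix of a (B, C, H, W) feature map; a global texture/style descriptor."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # (B, C, C)


def info_nce(query, positive, negatives, tau: float = 0.07) -> torch.Tensor:
    """InfoNCE: pull `query` toward `positive`, push it away from `negatives`.

    query, positive: (B, D); negatives: (B, K, D). Applied to pooled semantic
    features this plays the role of the semantic contrastive loss; the exact
    encoder and layer SN-DCR uses are not specified here.
    """
    q = F.normalize(query, dim=-1)
    pos = F.normalize(positive, dim=-1)
    neg = F.normalize(negatives, dim=-1)
    l_pos = (q * pos).sum(-1, keepdim=True)      # (B, 1) positive similarity
    l_neg = torch.einsum("bd,bkd->bk", q, neg)   # (B, K) negative similarities
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)       # positive sits at index 0


def style_contrastive_loss(fake_feat, target_feat, negative_feats):
    """Contrast Gram matrices so a fake image's global texture matches the target domain."""
    q = gram_matrix(fake_feat).flatten(1)
    pos = gram_matrix(target_feat).flatten(1)
    neg = torch.stack([gram_matrix(n).flatten(1) for n in negative_feats], dim=1)
    return info_nce(q, pos, neg)


# Spectral normalization applied to a generator convolution; this is the
# standard torch.nn.utils.spectral_norm wrapper.
sn_conv = spectral_norm(nn.Conv2d(64, 64, kernel_size=3, padding=1))
```

Constraining the spectral norm bounds each layer's Lipschitz constant, which is the usual rationale for the training stability the abstract mentions.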
Related papers
- IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions [26.09373405194564]
We present an efficient image processing transformer architecture with hierarchical attentions, called IPT-V2.
We adopt a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields.
Our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, and deraining, and obtains a much better trade-off between performance and computational complexity than previous methods.
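As a hedged sketch of how local versus global token interactions are commonly realized, the partitioning below groups tokens into local windows versus image-spanning grids; whether FCSA/GGSA in IPT-V2 use exactly this scheme is an assumption.

```python
# Hypothetical illustration of local-window vs. global-grid token grouping;
# the actual IPT-V2 attention may differ.
import torch


def window_partition(x: torch.Tensor, w: int) -> torch.Tensor:
    """(B, H, W, C) -> (B*H/w*W/w, w*w, C): tokens attend within local windows."""
    b, h, wd, c = x.shape
    x = x.view(b, h // w, w, wd // w, w, c)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, w * w, c)


def grid_partition(x: torch.Tensor, g: int) -> torch.Tensor:
    """(B, H, W, C) -> (B*H/g*W/g, g*g, C): each group is strided across the
    whole image, so attention within a group has a global receptive field."""
    b, h, wd, c = x.shape
    x = x.view(b, g, h // g, g, wd // g, c)
    return x.permute(0, 2, 4, 1, 3, 5).reshape(-1, g * g, c)
```

Multi-head self-attention would then run independently within each returned group.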
arXiv Detail & Related papers (2024-03-31T10:01:20Z)
- Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration [6.047146237332764]
The current benchmark for evaluating a model's global shape bias is a set of style-transferred images.
We show that networks trained with style-transferred images indeed learn to ignore style, but their shape bias arises primarily from local detail.
arXiv Detail & Related papers (2023-10-11T15:00:11Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
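The summary does not give DWT's exact formulation, so the following is only a hedged sketch of one plausible distance weighting: standard scaled dot-product attention whose logits are penalized by the spatial distance between patch centers, so nearby components weigh more.

```python
# Hypothetical distance-weighted self-attention; the bias term and its scale
# are assumptions for illustration.
import torch
import torch.nn.functional as F


def distance_weighted_attention(q, k, v, coords, alpha: float = 1.0):
    """q, k, v: (B, N, D); coords: (N, 2) patch-center positions in pixels."""
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d ** 0.5  # (B, N, N) scaled dot-product
    dist = torch.cdist(coords, coords)           # (N, N) pairwise distances
    logits = logits - alpha * dist               # closer patches get larger weight
    return F.softmax(logits, dim=-1) @ v
```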
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.
We propose TcGAN, a novel structure-preserving method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated on the quality of their semantic results.
We propose a new training protocol based on three specific losses that help a translation network learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network [92.01145655155374]
We present an unsupervised image enhancement generative network (UEGAN).
It learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner.
Results show that the proposed model effectively improves the aesthetic quality of images.
arXiv Detail & Related papers (2020-11-19T07:37:31Z)
- Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators? [100.60938767993088]
We propose a lightweight optimization-based algorithm that can adapt to arbitrary input images and render natural translation effects under flexible objectives.
We verify the performance of the proposed framework in facial attribute editing on high-resolution images, where both photo-realism and consistency are required.
arXiv Detail & Related papers (2020-12-30T03:22:46Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
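A minimal sketch of the idea behind a U-Net based discriminator: the encoder path yields an image-level real/fake score while the decoder path yields per-pixel scores, which is what provides the detailed per-pixel feedback described above. Depths and channel widths are illustrative assumptions.

```python
# Hedged two-headed U-Net discriminator sketch; sizes are illustrative.
import torch
import torch.nn as nn


class UNetDiscriminator(nn.Module):
    """Emits a global real/fake logit and a per-pixel logit map."""

    def __init__(self, in_ch: int = 3, base: int = 64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.global_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(base * 2, 1)
        )
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1), nn.LeakyReLU(0.2)
        )
        self.pixel_head = nn.ConvTranspose2d(base * 2, 1, 4, 2, 1)

    def forward(self, x):
        e1 = self.enc1(x)                # (B, base,   H/2, W/2)
        e2 = self.enc2(e1)               # (B, 2*base, H/4, W/4)
        g = self.global_head(e2)         # (B, 1): image-level real/fake logit
        d1 = self.dec1(e2)               # (B, base,   H/2, W/2)
        d1 = torch.cat([d1, e1], dim=1)  # U-Net skip connection
        p = self.pixel_head(d1)          # (B, 1, H, W): per-pixel logits
        return g, p
```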