StyTr^2: Unbiased Image Style Transfer with Transformers
- URL: http://arxiv.org/abs/2105.14576v1
- Date: Sun, 30 May 2021 15:57:09 GMT
- Title: StyTr^2: Unbiased Image Style Transfer with Transformers
- Authors: Yingying Deng, Fan Tang, Xingjia Pan, Weiming Dong, Chongyang Ma, and Changsheng Xu
- Abstract summary: The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
Traditional neural style transfer methods are usually biased, and content leak can be observed when the style transfer process is run several times with the same reference image.
We propose a transformer-based approach, namely StyTr^2, to address this critical issue.
- Score: 59.34108877969477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of image style transfer is to render an image with artistic features
guided by a style reference while maintaining the original content. Due to the
locality and spatial invariance in CNNs, it is difficult to extract and
maintain the global information of input images. Therefore, traditional neural
style transfer methods are usually biased, and content leak can be observed when
the style transfer process is run several times with the same reference style
image. To address this critical issue, we take long-range dependencies of
input images into account for unbiased style transfer by proposing a
transformer-based approach, namely StyTr^2. In contrast with visual
transformers for other vision tasks, our StyTr^2 contains two different
transformer encoders to generate domain-specific sequences for content and
style, respectively. Following the encoders, a multi-layer transformer decoder
is adopted to stylize the content sequence according to the style sequence. In
addition, we analyze the deficiency of existing positional encoding methods and
propose the content-aware positional encoding (CAPE), which is scale-invariant
and more suitable for the image style transfer task. Qualitative and quantitative
experiments demonstrate the effectiveness of the proposed StyTr^2 compared to
state-of-the-art CNN-based and flow-based approaches.
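The abstract only outlines the architecture, so below is a minimal PyTorch sketch of such a two-encoder, one-decoder arrangement. The patch embedding, all layer sizes, the upsampling head, and the pooled 1x1-convolution positional encoding standing in for CAPE are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a StyTr^2-style pipeline based only on the abstract above:
# two domain-specific transformer encoders (content / style), a transformer
# decoder that stylizes the content sequence conditioned on the style
# sequence, and a content-aware positional encoding applied to the content
# branch. Every concrete choice below is an assumption for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContentAwarePE(nn.Module):
    """Assumed CAPE-like module: pool content features to a fixed grid,
    project with a 1x1 conv, and resize back to the input resolution, so
    the encoding depends on content and does not change with image scale."""
    def __init__(self, dim, grid=18):
        super().__init__()
        self.grid = grid
        self.proj = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, feat):              # feat: (B, C, H, W)
        pooled = F.adaptive_avg_pool2d(feat, self.grid)
        pe = self.proj(pooled)
        return F.interpolate(pe, size=feat.shape[-2:], mode="bilinear",
                             align_corners=False)


class StyTr2Sketch(nn.Module):
    def __init__(self, dim=256, patch=8, heads=8, enc_layers=3, dec_layers=3):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cape = ContentAwarePE(dim)
        make_enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), enc_layers)
        self.content_enc = make_enc()     # domain-specific encoder for content
        self.style_enc = make_enc()       # domain-specific encoder for style
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, heads, batch_first=True), dec_layers)
        self.to_rgb = nn.Sequential(      # naive upsampling head (assumption)
            nn.Conv2d(dim, 3 * patch * patch, 1), nn.PixelShuffle(patch))

    def forward(self, content, style):    # images: (B, 3, H, W)
        c = self.embed(content)           # (B, C, h, w) patch features
        s = self.embed(style)
        c = c + self.cape(c)              # content-aware PE on content branch
        B, C, h, w = c.shape
        c_seq = self.content_enc(c.flatten(2).transpose(1, 2))  # (B, h*w, C)
        s_seq = self.style_enc(s.flatten(2).transpose(1, 2))
        out = self.decoder(tgt=c_seq, memory=s_seq)  # stylize content w/ style
        out = out.transpose(1, 2).reshape(B, C, h, w)
        return self.to_rgb(out)


if __name__ == "__main__":
    model = StyTr2Sketch()
    y = model(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
    print(y.shape)  # torch.Size([1, 3, 256, 256])
```

In this sketch the content sequence provides the decoder queries and the style sequence its memory, so every output token can attend to the whole style image; this is the long-range dependency that, per the abstract, CNN-based methods struggle to capture.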
Related papers
- SwinStyleformer is a favorable choice for image inversion [2.8115030277940947]
This paper proposes the first pure Transformer structure inversion network called SwinStyleformer.
Experiments found that an inversion network with a plain Transformer backbone could not successfully invert the image, which motivates the Swin Transformer-based design of SwinStyleformer.
arXiv Detail & Related papers (2024-06-19T02:08:45Z) - Puff-Net: Efficient Style Transfer with Pure Content and Style Feature Fusion Network [32.12413686394824]
Style transfer aims to render an image with the artistic features of a style image, while maintaining the original structure.
It is difficult for CNN-based methods to handle global information and long-range dependencies between input images.
We propose a novel network termed Puff-Net, i.e., pure content and style feature fusion network.
arXiv Detail & Related papers (2024-05-30T07:41:07Z) - Improving the Transferability of Adversarial Examples with Arbitrary
Style Transfer [32.644062141738246]
A style transfer network can alter the distribution of low-level visual features in an image while preserving semantic content for humans.
We propose a novel attack method named Style Transfer Method (STM) that utilizes a proposed arbitrary style transfer network to transform the images into different domains.
Our proposed method can significantly improve the adversarial transferability on either normally trained models or adversarially trained models.
arXiv Detail & Related papers (2023-08-21T09:58:13Z) - Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot
Artistic Style Transfer [83.1333306079676]
In this paper, we devise a novel Transformer model termed Master specifically for style transfer.
In the proposed model, different Transformer layers share a common group of parameters, which (1) reduces the total number of parameters, (2) leads to more robust training convergence, and (3) makes it easy to control the degree of stylization (see the parameter-sharing sketch after this list).
Experiments demonstrate the superiority of Master under both zero-shot and few-shot style transfer settings.
arXiv Detail & Related papers (2023-04-24T04:46:39Z) - DiffStyler: Controllable Dual Diffusion for Text-Driven Image
Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a learnable noise derived from the content image, on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z) - Line Search-Based Feature Transformation for Fast, Stable, and Tunable
Content-Style Control in Photorealistic Style Transfer [26.657485176782934]
Photorealistic style transfer is the task of synthesizing a realistic-looking image when adapting the content from one image to appear in the style of another image.
Modern models embed a transformation that fuses features describing the content image and style image and then decodes the resulting feature into a stylized image.
We introduce a general-purpose transformation that enables controlling the balance between how much content is preserved and the strength of the infused style.
arXiv Detail & Related papers (2022-10-12T08:05:49Z) - Fine-Grained Image Style Transfer with Visual Transformers [59.85619519384446]
We propose a novel STyle TRansformer (STTR) network which breaks both content and style images into visual tokens to achieve a fine-grained style transformation.
To compare STTR with existing approaches, we conduct user studies on Amazon Mechanical Turk.
arXiv Detail & Related papers (2022-10-11T06:26:00Z) - Diffusion-based Image Translation using Disentangled Style and Content
Representation [51.188396199083336]
Diffusion-based image translation guided by semantic texts or a single target image has enabled flexible style transfer.
It is often difficult to maintain the original content of the image during the reverse diffusion.
We present a novel diffusion-based unsupervised image translation method using disentangled style and content representation.
Our experimental results show that the proposed method outperforms state-of-the-art baseline models in both text-guided and image-guided translation tasks.
arXiv Detail & Related papers (2022-09-30T06:44:37Z) - Diverse Image Inpainting with Bidirectional and Autoregressive
Transformers [55.21000775547243]
We propose BAT-Fill, an image inpainting framework with a novel bidirectional autoregressive transformer (BAT).
BAT-Fill inherits the merits of transformers and CNNs in a two-stage manner, which allows it to generate high-resolution content without being constrained by the quadratic complexity of attention in transformers.
arXiv Detail & Related papers (2021-04-26T03:52:27Z)
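As referenced in the Master entry above, the idea of sharing one group of parameters across Transformer layers can be illustrated with a weight-tied stack; the dimensions and depth below are illustrative assumptions, not the Master architecture.

```python
# Minimal sketch of cross-layer parameter sharing: a single
# TransformerEncoderLayer is reused at every depth, so adding depth adds no
# parameters. Sizes here are assumptions for illustration only.
import torch
import torch.nn as nn


class WeightTiedEncoder(nn.Module):
    def __init__(self, dim=256, heads=8, depth=6):
        super().__init__()
        self.shared_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.depth = depth

    def forward(self, x):                 # x: (B, N, dim) token sequence
        for _ in range(self.depth):       # same weights applied at every depth
            x = self.shared_layer(x)
        return x


if __name__ == "__main__":
    tokens = torch.randn(2, 64, 256)
    print(WeightTiedEncoder()(tokens).shape)  # torch.Size([2, 64, 256])
```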
This list is automatically generated from the titles and abstracts of the papers on this site.