CLIPstyler: Image Style Transfer with a Single Text Condition
- URL: http://arxiv.org/abs/2112.00374v1
- Date: Wed, 1 Dec 2021 09:48:53 GMT
- Title: CLIPstyler: Image Style Transfer with a Single Text Condition
- Authors: Gihyun Kwon, Jong Chul Ye
- Abstract summary: Existing neural style transfer methods require reference style images to transfer texture information of style images to content images.
We propose a new framework that enables style transfer 'without' a style image, using only a text description of the desired style.
- Score: 34.24876359759408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing neural style transfer methods require reference style images to
transfer texture information of style images to content images. However, in
many practical situations, users may not have reference style images but still
be interested in transferring styles by just imagining them. In order to deal
with such applications, we propose a new framework that enables a style
transfer 'without' a style image, but only with a text description of the
desired style. Using the pre-trained text-image embedding model of CLIP, we
demonstrate the modulation of the style of content images only with a single
text condition. Specifically, we propose a patch-wise text-image matching loss
with multiview augmentations for realistic texture transfer. Extensive
experimental results confirmed the successful image style transfer with
realistic textures that reflect semantic query texts.
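
The patch-wise text-image matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-distance form of the loss, the random horizontal flip used as the "multiview" augmentation, and the names `random_patches`, `augment`, `make_toy_encoder`, and `patch_text_loss` are all assumptions made for illustration. A fixed random projection stands in for the CLIP image encoder, so the example runs with NumPy alone.

```python
import numpy as np


def random_patches(image, num_patches, patch_size, rng):
    """Crop square patches at random positions from an HxWxC image."""
    h, w, _ = image.shape
    patches = []
    for _ in range(num_patches):
        top = rng.integers(0, h - patch_size + 1)
        left = rng.integers(0, w - patch_size + 1)
        patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)


def augment(patch, rng):
    """Toy stand-in for a multiview augmentation: random horizontal flip."""
    return patch[:, ::-1] if rng.random() < 0.5 else patch


def make_toy_encoder(patch_size, channels, dim, seed=0):
    """Hypothetical image encoder: a fixed random linear projection.

    In practice this would be a pre-trained text-image model such as CLIP.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((patch_size * patch_size * channels, dim))

    def encode(patches):
        return patches.reshape(len(patches), -1) @ weights

    return encode


def patch_text_loss(image, text_embedding, encode_image,
                    num_patches=8, patch_size=16, seed=0):
    """Mean cosine distance between augmented patch embeddings and the text embedding."""
    rng = np.random.default_rng(seed)
    patches = random_patches(image, num_patches, patch_size, rng)
    views = np.stack([augment(p, rng) for p in patches])
    emb = encode_image(views)  # shape: (num_patches, dim)
    emb = emb / np.linalg.norm(emb, axis=-1, keepdims=True)
    text = text_embedding / np.linalg.norm(text_embedding)
    return float(np.mean(1.0 - emb @ text))
```

In a real stylization loop, a loss of this kind would be minimized with respect to the stylized image (or the network producing it), so that every sampled patch, not only the image as a whole, moves toward the text description.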
Related papers
- Bridging Text and Image for Artist Style Transfer via Contrastive Learning [21.962361974579036]
We propose Contrastive Learning for Artistic Style Transfer (CLAST) to control arbitrary style transfer.
We introduce a supervised contrastive training strategy to effectively extract style descriptions from the image-text model.
We also propose novel and efficient adaLN-based state space models that explore style-content fusion.
arXiv Detail & Related papers (2024-10-12T15:27:57Z)
- StyleMamba : State Space Model for Efficient Text-driven Image Style Transfer [9.010012117838725]
StyleMamba is an efficient image style transfer framework that translates text prompts into corresponding visual styles.
Existing text-guided stylization requires hundreds of training iterations and takes a lot of computing resources.
arXiv Detail & Related papers (2024-05-08T12:57:53Z)
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z)
- ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors [105.37795139586075]
We propose a new task for "stylizing" text-to-image models, namely text-driven stylized image generation.
We present a new diffusion model (ControlStyle) via upgrading a pre-trained text-to-image model with a trainable modulation network.
Experiments demonstrate the effectiveness of our ControlStyle in producing more visually pleasing and artistic results.
arXiv Detail & Related papers (2023-11-09T15:50:52Z)
- StyleAdapter: A Unified Stylized Image Generation Model [97.24936247688824]
StyleAdapter is a unified stylized image generation model capable of producing a variety of stylized images.
It can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet.
arXiv Detail & Related papers (2023-09-04T19:16:46Z)
- Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate [58.83278629019384]
Style transfer aims to render the style of a given image for style reference to another given image for content reference.
Existing approaches either apply the holistic style of the style image in a global manner, or migrate local colors and textures of the style image to the content counterparts in a pre-defined way.
We propose Any-to-Any Style Transfer, which enables users to interactively select styles of regions in the style image and apply them to the prescribed content regions.
arXiv Detail & Related papers (2023-04-19T15:15:36Z)
- ITstyler: Image-optimized Text-based Style Transfer [25.60521982742093]
We present a text-based style transfer method that does not require optimization at the inference stage.
Specifically, we convert text input to the style space of the pre-trained VGG network to realize a more effective style swap.
Our method can transfer arbitrary new styles of text input in real-time and synthesize high-quality artistic images.
arXiv Detail & Related papers (2023-01-26T03:08:43Z)
- DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
- APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation [11.186226578337125]
Style-guided text image generation tries to synthesize a text image by imitating a reference image's appearance.
In this paper, we focus on transferring the style image's background and foreground color patterns to the content image to generate a photo-realistic text image.
arXiv Detail & Related papers (2022-03-15T07:48:34Z)
- Name Your Style: An Arbitrary Artist-aware Image Style Transfer [38.41608300670523]
We propose a text-driven image style transfer (TxST) that leverages advanced image-text encoders to control arbitrary style transfer.
We introduce a contrastive training strategy to effectively extract style descriptions from the image-text model.
We also propose a novel and efficient attention module that explores cross-attentions to fuse style and content features.
arXiv Detail & Related papers (2022-02-28T06:21:38Z)
- Language-Driven Image Style Transfer [72.36790598245096]
We introduce a new task, language-driven image style transfer (LDIST), to manipulate the style of a content image, guided by a text.
The discriminator considers the correlation between language and patches of style images or transferred results to jointly embed style instructions.
Experiments show that our CLVA is effective and achieves superb transferred results on LDIST.
arXiv Detail & Related papers (2021-06-01T01:58:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.