DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer
- URL: http://arxiv.org/abs/2410.15007v1
- Date: Sat, 19 Oct 2024 06:42:43 GMT
- Title: DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer
- Authors: Ying Hu, Chenyi Zhuang, Pan Gao
- Abstract summary: Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image.
Existing methods train specific networks or utilize pre-trained models to learn content and style features.
We propose a novel and training-free approach for style transfer, combining textual embedding with spatial features.
- Score: 13.588643982359413
- Abstract: Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image. Existing methods train specific networks or utilize pre-trained models to learn content and style features. However, they rely solely on textual or spatial representations that are inadequate to achieve the balance between content and style. In this work, we propose a novel and training-free approach for style transfer, combining textual embedding with spatial features and separating the injection of content or style. Specifically, we adopt the BLIP-2 encoder to extract the textual representation of the style image. We utilize the DDIM inversion technique to extract intermediate embeddings in content and style branches as spatial features. Finally, we harness the step-by-step property of diffusion models by separating the injection of content and style in the target branch, which improves the balance between content preservation and style fusion. Various experiments have demonstrated the effectiveness and robustness of our proposed DiffuseST for achieving balanced and controllable style transfer results, as well as the potential to extend to other tasks.
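The DDIM inversion step mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the linear beta schedule, the function names, and `toy_eps` (a hypothetical stand-in for a pretrained noise-prediction network) are all assumptions made for illustration. The sketch runs the deterministic DDIM update from a clean latent toward noise and records every intermediate latent, which is the kind of per-step spatial feature the abstract refers to.

```python
import math

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alphas_bar, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alphas_bar.append(prod)
    return alphas_bar

def toy_eps(x, t):
    """Hypothetical noise predictor; a real pipeline would call a pretrained U-Net."""
    return [0.1 * v for v in x]

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update between two cumulative-alpha noise levels."""
    out = []
    for v, e in zip(x, eps):
        x0_pred = (v - math.sqrt(1.0 - a_from) * e) / math.sqrt(a_from)
        out.append(math.sqrt(a_to) * x0_pred + math.sqrt(1.0 - a_to) * e)
    return out

def ddim_invert(x0, alphas_bar, eps_fn=toy_eps):
    """Go from the clean latent toward noise, keeping every intermediate latent."""
    levels = [1.0] + list(alphas_bar)  # level 0 is the clean latent
    x = list(x0)
    trajectory = [list(x)]
    for t in range(len(alphas_bar)):
        x = ddim_move(x, eps_fn(x, t), levels[t], levels[t + 1])
        trajectory.append(x)
    return trajectory

def ddim_reconstruct(x_T, alphas_bar, eps_fn=toy_eps):
    """Walk the same noise levels in reverse, from the noisy latent back to clean."""
    levels = [1.0] + list(alphas_bar)
    x = list(x_T)
    for t in reversed(range(len(alphas_bar))):
        x = ddim_move(x, eps_fn(x, t), levels[t + 1], levels[t])
    return x

if __name__ == "__main__":
    schedule = alpha_bar_schedule(50)
    latents = ddim_invert([0.5, -1.0, 2.0], schedule)
    print(len(latents), "latents recorded, one per noise level")
```

With a predictor that depends on the latent, reconstruction only approximates the input, because inversion and sampling evaluate the predictor at different points; the round trip is exact only when the predicted noise is the same in both directions.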
Related papers
- AEANet: Affinity Enhanced Attentional Networks for Arbitrary Style Transfer [4.639424509503966]
Arbitrary style transfer is a research area that combines rational academic study with emotive artistic creation.
It aims to create a new image from a content image according to a target artistic style, maintaining the content's textural structural information.
Existing style transfer methods often significantly damage the texture lines of the content image during the style transformation.
We propose an affinity-enhanced attentional network, which includes the content affinity-enhanced attention (CAEA) module, the style affinity-enhanced attention (SAEA) module, and the hybrid attention (HA) module.
arXiv Detail & Related papers (2024-09-23T01:39:11Z)
- ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images.
We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z)
- DiffStyler: Diffusion-based Localized Image Style Transfer [0.0]
Image style transfer aims to imbue digital imagery with the distinctive attributes of style targets, such as colors, brushstrokes, shapes.
Despite the advancements in arbitrary style transfer methods, a prevalent challenge remains the delicate equilibrium between content semantics and style attributes.
This paper introduces DiffStyler, a novel approach that facilitates efficient and precise arbitrary image style transfer.
arXiv Detail & Related papers (2024-03-27T11:19:34Z)
- ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer [60.6863849241972]
We learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image.
We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics.
arXiv Detail & Related papers (2023-04-12T10:33:18Z)
- A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
- DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
- Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
- Arbitrary Style Transfer via Multi-Adaptation Network [109.6765099732799]
A desired style transfer, given a content image and referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting.
A new disentanglement loss function enables our network to extract main style patterns and exact content structures to adapt to various input images.
arXiv Detail & Related papers (2020-05-27T08:00:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.