Related papers: DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer

URL: http://arxiv.org/abs/2410.15007v1
Date: Sat, 19 Oct 2024 06:42:43 GMT
Title: DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer
Authors: Ying Hu, Chenyi Zhuang, Pan Gao
Abstract summary: Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image. Existing methods train specific networks or utilize pre-trained models to learn content and style features. We propose a novel and training-free approach for style transfer, combining textual embedding with spatial features.
Score: 13.588643982359413
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image. Existing methods train specific networks or utilize pre-trained models to learn content and style features. However, they rely solely on textual or spatial representations that are inadequate to achieve the balance between content and style. In this work, we propose a novel and training-free approach for style transfer, combining textual embedding with spatial features and separating the injection of content or style. Specifically, we adopt the BLIP-2 encoder to extract the textual representation of the style image. We utilize the DDIM inversion technique to extract intermediate embeddings in content and style branches as spatial features. Finally, we harness the step-by-step property of diffusion models by separating the injection of content and style in the target branch, which improves the balance between content preservation and style fusion. Various experiments have demonstrated the effectiveness and robustness of our proposed DiffuseST for achieving balanced and controllable style transfer results, as well as the potential to extend to other tasks.
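
The DDIM inversion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the linear beta schedule, and the toy noise predictor (which stands in for a pretrained diffusion U-Net) are all assumptions made for illustration. The sketch only shows how deterministic DDIM updates map a clean input to progressively noisier latents while keeping every intermediate latent, which is the kind of per-step information the abstract describes extracting in the content and style branches.

```python
import numpy as np


def make_alphas_cumprod(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update (eta = 0) between two noise levels."""
    # Predict the clean sample implied by x and the noise estimate eps.
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    # Re-noise that prediction to the target noise level.
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def ddim_invert(x0, eps_model, alphas_cumprod):
    """Run DDIM updates from clean to noisy; return all latents [x_0, ..., x_T]."""
    a = np.concatenate([[1.0], alphas_cumprod])  # a[0] = 1 is the clean level
    latents = [x0]
    x = x0
    for t in range(len(alphas_cumprod)):
        eps = eps_model(x, t)  # t indexes the transition between levels t and t+1
        x = ddim_step(x, eps, a[t], a[t + 1])
        latents.append(x)
    return latents


def ddim_sample(x_T, eps_model, alphas_cumprod):
    """Run the same updates in reverse, from the noisiest latent back to x_0."""
    a = np.concatenate([[1.0], alphas_cumprod])
    x = x_T
    for t in reversed(range(len(alphas_cumprod))):
        eps = eps_model(x, t)
        x = ddim_step(x, eps, a[t + 1], a[t])
    return x


# Toy setup: a hypothetical noise predictor that ignores x and returns a fixed
# noise map per step. A real model depends on x, so inversion is then only
# approximately reversible.
num_steps = 50
rng = np.random.default_rng(0)
noise_bank = rng.standard_normal((num_steps, 4, 4))


def toy_eps_model(x, t):
    return noise_bank[t]


alphas_cumprod = make_alphas_cumprod(num_steps)
x0 = rng.standard_normal((4, 4))  # stands in for an encoded image latent
latents = ddim_invert(x0, toy_eps_model, alphas_cumprod)
recon = ddim_sample(latents[-1], toy_eps_model, alphas_cumprod)
```

With this toy predictor the round trip from `x0` to `latents[-1]` and back is exact up to floating-point error. With a learned predictor the inversion reuses the noise estimate from the previous noise level, so the reconstruction is only approximate.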

Related papers

One-shot Embroidery Customization via Contrastive LoRA Modulation [20.463441212598273]
We propose a novel contrastive learning framework that disentangles fine-grained style and content features with a single reference image. To evaluate our method on fine-grained style transfer, we build a benchmark for embroidery customization.
arXiv Detail & Related papers (2025-09-23T12:58:15Z)
Break Stylistic Sophon: Are We Really Meant to Confine the Imagination in Style Transfer? [12.2238770989173]
StyleWallfacer is a groundbreaking unified training and inference framework. It addresses various issues encountered in the style transfer process of traditional methods. It delivers artist-level style transfer and text-driven stylization.
arXiv Detail & Related papers (2025-06-18T00:24:29Z)
WikiStyle+: A Multimodal Approach to Content-Style Representation Disentanglement for Artistic Image Stylization [0.0]
Artistic image stylization aims to render the content provided by text or image with the target style. Current methods for content and style disentanglement rely on image supervision. This paper proposes a multimodal approach to content-style disentanglement for artistic image stylization.
arXiv Detail & Related papers (2024-12-19T03:42:58Z)
Z-STAR+: A Zero-shot Style Transfer Method via Adjusting Style Distribution [24.88532732093652]
Style transfer presents a significant challenge, primarily centered on identifying an appropriate style representation. In contrast to existing approaches, we have discovered that latent features in vanilla diffusion models inherently contain natural style and content distributions. Our method adopts dual denoising paths to represent content and style references in latent space, subsequently guiding the content image denoising process with style latent codes.
arXiv Detail & Related papers (2024-11-28T15:56:17Z)
AEANet: Affinity Enhanced Attentional Networks for Arbitrary Style Transfer [4.639424509503966]
Arbitrary style transfer is a research area that combines rational academic study with emotive artistic creation. It aims to create a new image from a content image according to a target artistic style, maintaining the content's textural and structural information. Existing style transfer methods often significantly damage the texture lines of the content image during the style transformation. We propose affinity-enhanced attentional networks, which include the content affinity-enhanced attention (CAEA) module, the style affinity-enhanced attention (SAEA) module, and the hybrid attention (HA) module.
arXiv Detail & Related papers (2024-09-23T01:39:11Z)
ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images. We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z)
DiffStyler: Diffusion-based Localized Image Style Transfer [0.0]
Image style transfer aims to imbue digital imagery with the distinctive attributes of style targets, such as colors, brushstrokes, and shapes. Despite the advancements in arbitrary style transfer methods, a prevalent challenge remains the delicate equilibrium between content semantics and style attributes. This paper introduces DiffStyler, a novel approach that facilitates efficient and precise arbitrary image style transfer.
arXiv Detail & Related papers (2024-03-27T11:19:34Z)
ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer [60.6863849241972]
We learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image. We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics.
arXiv Detail & Related papers (2023-04-12T10:33:18Z)
A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework. We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature. Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results. We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning. Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
Arbitrary Style Transfer via Multi-Adaptation Network [109.6765099732799]
A desired style transfer, given a content image and referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting. A new disentanglement loss function enables our network to extract main style patterns and exact content structures to adapt to various input images.
arXiv Detail & Related papers (2020-05-27T08:00:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.