U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers
- URL: http://arxiv.org/abs/2503.08157v1
- Date: Tue, 11 Mar 2025 08:12:38 GMT
- Title: U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers
- Authors: Zhanjie Zhang, Ao Ma, Ke Cao, Jing Wang, Shanyuan Liu, Yuhang Ma, Bo Cheng, Dawei Leng, Yuhui Yin
- Abstract summary: We propose a novel artistic image style transfer method, U-StyDiT, built on transformer-based diffusion (DiT). We first design a Multi-view Style Modulator (MSM) to learn style information from a style image from local and global perspectives. Then, we introduce a StyDiT Block to learn content and style conditions simultaneously from a style image.
- Score: 11.37321110116169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ultra-high quality artistic style transfer refers to repainting an ultra-high quality content image using the style information learned from a style image. Existing artistic style transfer methods can be categorized into style reconstruction-based and content-style disentanglement-based approaches. Although these methods can generate artistic stylized images, they still exhibit obvious artifacts and disharmonious patterns, which prevent them from producing ultra-high quality artistic stylized images. To address these issues, we propose a novel artistic image style transfer method, U-StyDiT, which is built on transformer-based diffusion (DiT) and learns content-style disentanglement to generate ultra-high quality artistic stylized images. Specifically, we first design a Multi-view Style Modulator (MSM) that learns style information from a style image from local and global perspectives, conditioning U-StyDiT to generate stylized images with the learned style information. Then, we introduce a StyDiT Block to learn content and style conditions simultaneously from a style image. Additionally, we propose an ultra-high quality artistic image dataset, Aes4M, comprising 10 categories with 400,000 style images each. This dataset addresses the limitation that existing style transfer methods cannot produce high-quality artistic stylized images because of the small size and low quality of available training datasets. Finally, extensive qualitative and quantitative experiments validate that U-StyDiT creates higher quality stylized images than state-of-the-art artistic style transfer methods. To our knowledge, our proposed method is the first to address the generation of ultra-high quality stylized images using transformer-based diffusion.
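The abstract describes the architecture only at a high level, so the following PyTorch sketch illustrates one plausible way style conditioning could enter a transformer-based diffusion block: self-attention over noisy image tokens followed by cross-attention to style tokens from a style encoder. The class name, layer layout, and the notion of mixing global and local style tokens are illustrative assumptions, not the paper's actual StyDiT Block or MSM.

```python
import torch
import torch.nn as nn

class StyleConditionedBlock(nn.Module):
    """Hypothetical DiT-style block: self-attention over image tokens,
    then cross-attention to style tokens, then an MLP. Illustrative only;
    not the paper's StyDiT Block."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.style_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x: torch.Tensor, style_tokens: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) noisy image tokens; style_tokens: (B, M, dim)
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h, need_weights=False)[0]
        h = self.norm2(x)
        # Cross-attend to style tokens so denoising is conditioned on style.
        x = x + self.style_attn(h, style_tokens, style_tokens, need_weights=False)[0]
        return x + self.mlp(self.norm3(x))
```

In this reading, global style tokens might come from pooled features of the whole style image and local tokens from patch crops, echoing the abstract's local and global perspectives.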
Related papers
- DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank [10.193101382716373]
Artistic style transfer aims to transfer the learned style onto an arbitrary content image.
Most existing style transfer methods can only render consistent artistic stylized images.
We propose a novel artistic style transfer framework called DyArtbank, which can generate diverse and highly realistic artistic stylized images.
arXiv Detail & Related papers (2025-03-11T12:56:47Z) - Multiscale style transfer based on a Laplacian pyramid for traditional Chinese painting [6.248530911794617]
We present a novel and effective multiscale style transfer method based on Laplacian pyramid decomposition and reconstruction.
In the first stage, the holistic patterns are transferred at low resolution by a Style Transfer Base Network.
The details of the content and style are then gradually enhanced at higher resolutions by a Detail Enhancement Network.
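For readers unfamiliar with the decomposition this entry relies on, here is a minimal, self-contained Laplacian pyramid in PyTorch. The level count and pooling/upsampling choices are illustrative, not the paper's networks; the idea is that the low-resolution base is stylized first and the high-frequency residuals are refined afterwards.

```python
import torch
import torch.nn.functional as F

def laplacian_pyramid(img: torch.Tensor, levels: int = 3):
    """Decompose (B, C, H, W) into high-frequency residuals plus a low-res base."""
    pyramid, current = [], img
    for _ in range(levels):
        down = F.avg_pool2d(current, kernel_size=2)
        up = F.interpolate(down, size=current.shape[-2:], mode="bilinear",
                           align_corners=False)
        pyramid.append(current - up)  # band-pass residual at this scale
        current = down
    pyramid.append(current)  # low-frequency base (stylize this first)
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition: upsample the base and add back each residual."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = F.interpolate(current, size=residual.shape[-2:], mode="bilinear",
                                align_corners=False) + residual
    return current
```

Reconstruction is exact because each residual stores precisely what the corresponding downsample/upsample round trip discards.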
arXiv Detail & Related papers (2025-02-07T01:04:49Z) - Art-Free Generative Models: Art Creation Without Graphic Art Knowledge [50.60063523054282]
We propose a text-to-image generation model trained without access to art-related content.
We then introduce a simple yet effective method to learn an art adapter using only a few examples of selected artistic styles.
arXiv Detail & Related papers (2024-11-29T18:59:01Z) - Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt [12.27693060663517]
Artistic style transfer aims to transfer the learned artistic style onto an arbitrary content image, generating artistic stylized images.
We propose a novel pre-trained diffusion-based artistic style transfer method called LSAST.
Our proposed method can generate more realistic artistic stylized images than state-of-the-art artistic style transfer methods.
arXiv Detail & Related papers (2024-04-17T15:28:53Z) - ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and
Implicit Style Prompt Bank [9.99530386586636]
Artistic style transfer aims to repaint the content image with the learned artistic style.
Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches.
We propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images.
arXiv Detail & Related papers (2023-12-11T05:53:40Z) - Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
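A minimal sketch of the attention-sharing idea, under the assumption of standard multi-head attention tensors (StyleAligned's full mechanism involves further details): each generated image's queries attend to its own keys and values concatenated with those cached from a reference image.

```python
import torch
import torch.nn.functional as F

def shared_attention(q, k, v, k_ref, v_ref):
    """q, k, v: (B, heads, N, d) for the image being generated;
    k_ref, v_ref: (B, heads, M, d) cached from the reference image's
    self-attention at the same layer and timestep. Illustrative only."""
    k_all = torch.cat([k, k_ref], dim=2)  # attend over both token sets
    v_all = torch.cat([v, v_ref], dim=2)
    # The softmax runs jointly over own and reference keys, pulling the
    # generated image's features toward the reference style.
    return F.scaled_dot_product_attention(q, k_all, v_all)
```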
arXiv Detail & Related papers (2023-12-04T18:55:35Z) - StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z) - NeAT: Neural Artistic Tracing for Beautiful Style Transfer [29.38791171225834]
Style transfer is the task of reproducing the semantic content of a source image in the artistic style of a second target image.
We present NeAT, a new state-of-the-art feed-forward style transfer method.
We use BBST-4M, a new large-scale dataset, to improve and measure the generalization of NeAT across a huge variety of styles.
arXiv Detail & Related papers (2023-04-11T11:08:13Z) - Inversion-Based Style Transfer with Diffusion Models [78.93863016223858]
Previous arbitrary example-guided artistic image generation methods often fail to control shape changes or convey elements.
We propose an inversion-based style transfer method (InST), which can efficiently and accurately learn the key information of an image.
arXiv Detail & Related papers (2022-11-23T18:44:25Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
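As a generic illustration of learning a style representation contrastively (an InfoNCE-style loss; CAST's actual objective may differ in its details), two crops of the same artwork can serve as a positive pair and crops from other artworks as negatives:

```python
import torch
import torch.nn.functional as F

def style_contrastive_loss(anchor, positive, negatives, tau: float = 0.07):
    """anchor, positive: (B, D) style codes of two crops of the same artwork;
    negatives: (B, K, D) style codes of crops from other artworks.
    Illustrative InfoNCE-style loss, not CAST's exact formulation."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos = (anchor * positive).sum(dim=-1, keepdim=True) / tau   # (B, 1)
    neg = torch.einsum("bd,bkd->bk", anchor, negatives) / tau   # (B, K)
    logits = torch.cat([pos, neg], dim=1)
    # The positive pair sits at index 0 of every row of logits.
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Minimizing this loss pulls style codes of the same artwork together and pushes codes of different artworks apart, which is the sense in which a style representation is learned contrastively.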
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer [103.54337984566877]
Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data.
We introduce a novel DualStyleGAN with flexible control of dual styles of the original face domain and the extended artistic portrait domain.
Experiments demonstrate the superiority of DualStyleGAN over state-of-the-art methods in high-quality portrait style transfer and flexible style control.
arXiv Detail & Related papers (2022-03-24T17:57:11Z)