TeleStyle: Content-Preserving Style Transfer in Images and Videos
- URL: http://arxiv.org/abs/2601.20175v1
- Date: Wed, 28 Jan 2026 02:16:03 GMT
- Title: TeleStyle: Content-Preserving Style Transfer in Images and Videos
- Authors: Shiwen Zhang, Xiaoyan Yang, Bojia Zi, Haibin Huang, Chi Zhang, Xuelong Li
- Abstract summary: We present TeleStyle, a lightweight model for both image and video stylization. We curated a high-quality dataset of distinct specific styles and synthesized triplets using thousands of diverse, in-the-wild style categories. TeleStyle achieves state-of-the-art performance across three core evaluation metrics: style similarity, content consistency, and aesthetic quality.
- Score: 52.76027947278353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Content-preserving style transfer, generating stylized outputs based on content and style references, remains a significant challenge for Diffusion Transformers (DiTs) due to the inherent entanglement of content and style features in their internal representations. In this technical report, we present TeleStyle, a lightweight yet effective model for both image and video stylization. Built upon Qwen-Image-Edit, TeleStyle leverages the base model's robust capabilities in content preservation and style customization. To facilitate effective training, we curated a high-quality dataset of distinct specific styles and further synthesized triplets using thousands of diverse, in-the-wild style categories. We introduce a Curriculum Continual Learning framework to train TeleStyle on this hybrid dataset of clean (curated) and noisy (synthetic) triplets. This approach enables the model to generalize to unseen styles without compromising precise content fidelity. Additionally, we introduce a video-to-video stylization module to enhance temporal consistency and visual quality. TeleStyle achieves state-of-the-art performance across three core evaluation metrics: style similarity, content consistency, and aesthetic quality. Code and pre-trained models are available at https://github.com/Tele-AI/TeleStyle
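The abstract describes a Curriculum Continual Learning framework over a hybrid of clean curated and noisy synthetic triplets, but gives no implementation details. Below is a minimal PyTorch sketch of one plausible reading, in which the share of noisy triplets per batch is ramped up over training; the `CurriculumMix` wrapper, the linear ramp, and `model.stylization_loss` are illustrative assumptions, not code from the paper.
```python
import random

import torch
from torch.utils.data import DataLoader, Dataset


class CurriculumMix(Dataset):
    """Serves curated ('clean') or synthetic ('noisy') triplets.

    p_noisy is raised once per epoch, so early epochs see mostly clean
    triplets and later epochs mix in noisy in-the-wild ones.
    """

    def __init__(self, clean: Dataset, noisy: Dataset):
        self.clean, self.noisy = clean, noisy
        self.p_noisy = 0.0  # curriculum knob, updated by the trainer

    def __len__(self):
        return len(self.clean)

    def __getitem__(self, idx):
        if random.random() < self.p_noisy:
            return self.noisy[random.randrange(len(self.noisy))]
        return self.clean[idx]  # each item: (content, style, target) tensors


def train(model, clean, noisy, epochs=10, batch_size=8, max_noisy=0.5):
    data = CurriculumMix(clean, noisy)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
    for epoch in range(epochs):
        # Linear curriculum: ramp 0 -> max_noisy over the first half of training.
        data.p_noisy = max_noisy * min(1.0, 2.0 * epoch / epochs)
        for content, style, target in DataLoader(data, batch_size=batch_size,
                                                 shuffle=True):
            loss = model.stylization_loss(content, style, target)  # hypothetical
            opt.zero_grad()
            loss.backward()
            opt.step()
```
The usual rationale for such a schedule is to let the model first fit reliable supervision (clean triplets with precise content fidelity) before broadening to noisier, more diverse styles for generalization.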
Related papers
- QwenStyle: Content-Preserving Style Transfer with Qwen-Image-Edit [54.11909509184315]
We propose the first content-preserving style transfer model trained on Qwen-Image-Edit. QwenStyle V1 achieves state-of-the-art performance in three core metrics: style similarity, content consistency, and aesthetic quality.
arXiv Detail & Related papers (2026-01-08T10:22:51Z)
- DreamStyle: A Unified Framework for Video Stylization [18.820518165759403]
We introduce DreamStyle, a unified framework for video stylization. It supports (1) text-guided, (2) style-image-guided, and (3) first-frame-guided video stylization. Both qualitative and quantitative evaluations demonstrate that DreamStyle is competent in all three video stylization tasks.
arXiv Detail & Related papers (2026-01-06T07:42:12Z)
- StyleMaster: Stylize Your Video with Artistic Generation and Translation [43.808656030545556]
Style control has been popular in video generation models. Current methods often generate videos far from the given style, cause content leakage, and struggle to transfer one video to the desired style. Our approach, StyleMaster, achieves significant improvement in both style resemblance and temporal coherence.
arXiv Detail & Related papers (2024-12-10T18:44:08Z)
- InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation [4.1177497612346]
Style transfer is an inventive process designed to create an image that maintains the essence of the original while embracing the visual style of another.
We introduce InstantStyle-Plus, an approach that prioritizes the integrity of the original content while seamlessly integrating the target style.
arXiv Detail & Related papers (2024-06-30T18:05:33Z)
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z)
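StyleCrafter's adapter internals are not given in the summary above. A minimal sketch of the general pattern it describes follows: a frozen backbone gains a residual cross-attention branch that attends only to embeddings of the style reference image, so style information bypasses the text prompt. Module names and dimensions are illustrative, not StyleCrafter's actual architecture.
```python
import torch
import torch.nn as nn


class StyleAdapter(nn.Module):
    """Residual style cross-attention on top of a frozen backbone block,
    so style comes solely from the reference image, not the text prompt."""

    def __init__(self, dim: int = 1024, n_heads: int = 8, n_tokens: int = 4):
        super().__init__()
        # Expand one global style-image embedding (e.g., from a CLIP image
        # encoder) into several style tokens via a learned projection.
        self.to_tokens = nn.Linear(dim, n_tokens * dim)
        self.n_tokens, self.dim = n_tokens, dim
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.scale = nn.Parameter(torch.tensor(0.5))  # style strength

    def forward(self, hidden: torch.Tensor, style_emb: torch.Tensor) -> torch.Tensor:
        # hidden: (B, L, dim) frame/patch features; style_emb: (B, dim)
        tokens = self.to_tokens(style_emb).view(-1, self.n_tokens, self.dim)
        style_out, _ = self.attn(hidden, tokens, tokens)
        return hidden + self.scale * style_out  # residual injection
```
In this pattern only the adapter parameters are trained; the pre-trained T2V backbone stays frozen.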
- Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
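The CAST entry names a multi-layer style projector trained via contrastive learning. The generic InfoNCE-style loss over style codes sketched below captures the basic idea of pulling together codes from two views of the same style image and pushing apart codes from different styles; it is a stand-in, not CAST's actual objective.
```python
import torch
import torch.nn.functional as F


def style_contrastive_loss(codes_a: torch.Tensor, codes_b: torch.Tensor,
                           temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE over style codes of shape (B, D): codes_a[i] and codes_b[i]
    come from two views (e.g., crops) of the same style image; every other
    pairing in the batch is treated as a different style."""
    a = F.normalize(codes_a, dim=1)
    b = F.normalize(codes_b, dim=1)
    logits = a @ b.t() / temperature                   # (B, B) similarities
    labels = torch.arange(a.size(0), device=a.device)  # positives on diagonal
    # Symmetric loss: each view must identify its counterpart in the batch.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```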
- Arbitrary Video Style Transfer via Multi-Channel Correlation [84.75377967652753]
We propose the Multi-Channel Correlation network (MCCNet) to fuse exemplar style features and input content features for efficient style transfer.
MCCNet works directly on the feature space of the style and content domains, where it learns to rearrange and fuse style features based on their similarity with content features.
The outputs generated by MCCNet are features containing the desired style patterns, which can further be decoded into images with vivid style textures.
arXiv Detail & Related papers (2020-09-17T01:30:46Z)
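The MCCNet summary says style features are rearranged and fused according to their similarity with content features. Below is a generic attention-style correlation fusion that illustrates this idea; it is not the paper's exact multi-channel operator.
```python
import torch


def correlation_fuse(content: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
    """Fuse style into content features via a correlation (attention) map.

    content, style: (B, C, H, W) feature maps from a shared encoder,
    assumed here to have matching spatial size. Each content position
    gathers style features weighted by similarity, so style patterns are
    rearranged to follow the content layout.
    """
    b, c, h, w = content.shape
    q = content.flatten(2).transpose(1, 2)  # (B, HW, C) content queries
    k = style.flatten(2).transpose(1, 2)    # (B, HW, C) style keys/values
    attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)  # (B, HW, HW)
    fused = attn @ k                        # style gathered per content position
    return fused.transpose(1, 2).reshape(b, c, h, w) + content  # residual
```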
- Arbitrary Style Transfer via Multi-Adaptation Network [109.6765099732799]
A desired style transfer, given a content image and a referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting.
A new disentanglement loss function enables our network to extract the main style patterns and the exact content structures to adapt to various input images.
arXiv Detail & Related papers (2020-05-27T08:00:22Z)
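The Multi-Adaptation entry mentions a disentanglement loss without detail. As a grounded point of reference, the classic split into a content term (feature matching) and a style term (Gram statistics) is sketched below; this is the standard Gatys-style baseline, not that paper's loss.
```python
import torch
import torch.nn.functional as F


def gram(feat: torch.Tensor) -> torch.Tensor:
    """Channel Gram matrix, a classic style statistic. feat: (B, C, H, W)."""
    b, c, h, w = feat.shape
    f = feat.flatten(2)                        # (B, C, HW)
    return f @ f.transpose(1, 2) / (c * h * w)


def disentangle_loss(out_feats, content_feats, style_feats,
                     style_weight: float = 10.0) -> torch.Tensor:
    """Content term matches features of the content image; style term
    matches Gram statistics of the style image. Each argument is a list
    of (B, C, H, W) feature maps from a shared, frozen encoder."""
    content_term = sum(F.mse_loss(o, c) for o, c in zip(out_feats, content_feats))
    style_term = sum(F.mse_loss(gram(o), gram(s))
                     for o, s in zip(out_feats, style_feats))
    return content_term + style_weight * style_term
```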
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.