UniVST: A Unified Framework for Training-free Localized Video Style Transfer
- URL: http://arxiv.org/abs/2410.20084v2
- Date: Tue, 05 Nov 2024 09:52:50 GMT
- Title: UniVST: A Unified Framework for Training-free Localized Video Style Transfer
- Authors: Quanjian Song, Mingbao Lin, Wengyi Zhan, Shuicheng Yan, Liujuan Cao,
- Abstract summary: This paper presents UniVST, a unified framework for localized video style transfer.
It operates without the need for training, offering a distinct advantage over existing methods that transfer style across entire videos.
- Score: 66.69471376934034
- Abstract: This paper presents UniVST, a unified framework for localized video style transfer. It operates without the need for training, offering a distinct advantage over existing methods that transfer style across entire videos. The contributions of this paper comprise: (1) A point-matching mask propagation strategy that leverages feature maps from the DDIM inversion. This streamlines the model's architecture by obviating the need for tracking models. (2) An AdaIN-guided style transfer mechanism that operates at both the latent and attention levels. This balances content fidelity and style richness, mitigating the loss of localized details commonly associated with direct video stylization. (3) A sliding window smoothing strategy that harnesses optical flow within the pixel representation and refines predicted noise to update the latent space. This significantly enhances temporal consistency and diminishes artifacts in video outputs. Our proposed UniVST has been validated to be superior to existing methods in both quantitative and qualitative metrics. It adeptly addresses the challenges of preserving the primary object's style while ensuring temporal consistency and detail preservation.
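To make component (2) more concrete, here is a minimal, illustrative PyTorch sketch of adaptive instance normalization (AdaIN) applied to diffusion latents with a localized mask. The function names, the blending weight `alpha`, and the masked-blend rule are assumptions made for illustration; this is not UniVST's exact latent- and attention-level scheme.

```python
import torch


def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Standard AdaIN: align the channel-wise mean/std of `content` to `style`.

    Both tensors are assumed to be latents shaped (B, C, H, W); statistics are
    computed per sample and per channel over the spatial dimensions.
    """
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


def adain_guided_blend(content_latent: torch.Tensor,
                       style_latent: torch.Tensor,
                       mask: torch.Tensor,
                       alpha: float = 0.6) -> torch.Tensor:
    """Hypothetical localized blend: apply AdaIN only inside the object mask.

    `mask` is a (B, 1, H, W) tensor in [0, 1] (e.g. a mask propagated from
    DDIM-inversion features); `alpha` trades style richness against content
    fidelity. This is a sketch of the idea, not UniVST's exact update rule.
    """
    stylized = adain(content_latent, style_latent)
    blended = alpha * stylized + (1.0 - alpha) * content_latent
    return mask * blended + (1.0 - mask) * content_latent
```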
Related papers
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework [33.46782517803435]
Make-Your-Anchor is a system requiring only a one-minute video clip of an individual for training.
We fine-tune a proposed structure-guided diffusion model on the input video to render 3D mesh conditions into human appearances.
A novel identity-specific face enhancement module is introduced to improve the visual quality of facial regions in the output videos.
arXiv Detail & Related papers (2024-03-25T07:54:18Z)
- Diffusion-based Human Motion Style Transfer with Semantic Guidance [23.600154466988073]
We propose a novel framework for few-shot style transfer learning based on the diffusion model.
In the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior.
In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer.
arXiv Detail & Related papers (2024-03-20T05:52:11Z)
- HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image/video.
We propose HiCAST, which is capable of explicitly customizing the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z)
- Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers [14.057935237805982]
Current arbitrary style transfer models are limited to either image or video domains.
We introduce UniST, a Unified Style Transfer framework for both images and videos.
We show that UniST performs favorably against state-of-the-art approaches on both tasks.
arXiv Detail & Related papers (2023-04-22T07:15:49Z)
- A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
- CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer [58.020470877242865]
We devise a universally versatile style transfer method capable of performing artistic, photo-realistic, and video style transfer jointly.
We make a mild and reasonable assumption that global inconsistency is dominated by local inconsistencies and devise a generic Contrastive Coherence Preserving Loss (CCPL) applied to local patches.
CCPL can preserve the coherence of the content source during style transfer without degrading stylization.
arXiv Detail & Related papers (2022-07-11T12:09:41Z)
- Arbitrary Video Style Transfer via Multi-Channel Correlation [84.75377967652753]
We propose the Multi-Channel Correlation network (MCCNet) to fuse exemplar style features and input content features for efficient style transfer.
MCCNet works directly on the feature space of the style and content domains, where it learns to rearrange and fuse style features based on their similarity with content features.
The outputs generated by MCCNet are features containing the desired style patterns, which can further be decoded into images with vivid style textures (a minimal sketch of this similarity-driven fusion follows this list).
arXiv Detail & Related papers (2020-09-17T01:30:46Z)
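As referenced in the MCCNet entry above, the following is a minimal, generic PyTorch sketch of rearranging and fusing style features according to their similarity with content features. It is an assumption-laden, attention-style approximation of the idea described in the abstract, not the authors' multi-channel correlation module; all names and shapes here are illustrative.

```python
import torch
import torch.nn.functional as F


def similarity_fusion(content_feat: torch.Tensor, style_feat: torch.Tensor) -> torch.Tensor:
    """Similarity-driven fusion of style features into content features (sketch).

    content_feat, style_feat: (B, C, H, W) encoder features. Each content
    location attends to style locations via normalized dot-product similarity,
    and the attended style feature is added back onto the content feature.
    The fused result would then be decoded into a stylized image.
    """
    b, c, h, w = content_feat.shape
    q = F.normalize(content_feat.flatten(2), dim=1).transpose(1, 2)  # (B, HW, C)
    k = F.normalize(style_feat.flatten(2), dim=1)                    # (B, C, HW_s)
    v = style_feat.flatten(2).transpose(1, 2)                        # (B, HW_s, C)
    attn = torch.softmax(q @ k, dim=-1)                              # (B, HW, HW_s)
    rearranged = (attn @ v).transpose(1, 2).reshape(b, c, h, w)      # (B, C, H, W)
    return content_feat + rearranged
```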