Diffusion-based Human Motion Style Transfer with Semantic Guidance
- URL: http://arxiv.org/abs/2405.06646v2
- Date: Wed, 7 Aug 2024 14:06:30 GMT
- Title: Diffusion-based Human Motion Style Transfer with Semantic Guidance
- Authors: Lei Hu, Zihao Zhang, Yongjing Ye, Yiwen Xu, Shihong Xia,
- Abstract summary: We propose a novel framework for few-shot style transfer learning based on the diffusion model.
In the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior.
In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer.
- Score: 23.600154466988073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Human motion style transfer is a fundamental problem in computer graphic and animation processing. Existing AdaIN- based methods necessitate datasets with balanced style distribution and content/style labels to train the clustered latent space. However, we may encounter a single unseen style example in practical scenarios, but not in sufficient quantity to constitute a style cluster for AdaIN-based methods. Therefore, in this paper, we propose a novel two-stage framework for few-shot style transfer learning based on the diffusion model. Specifically, in the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior so that it can cope with various content motion inputs. In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer. The key idea is regarding the reverse process of diffusion as a motion-style translation process since the motion styles can be viewed as special motion variations. During the fine-tuning for style transfer, a simple yet effective semantic-guided style transfer loss coordinated with style example reconstruction loss is introduced to supervise the style transfer in CLIP semantic space. The qualitative and quantitative evaluations demonstrate that our method can achieve state-of-the-art performance and has practical applications.
Related papers
- UniVST: A Unified Framework for Training-free Localized Video Style Transfer [66.69471376934034]
This paper presents UniVST, a unified framework for localized video style transfer.
It operates without the need for training, offering a distinct advantage over existing methods that transfer style across entire videos.
arXiv Detail & Related papers (2024-10-26T05:28:02Z) - DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer [13.588643982359413]
Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image.
Existing methods train specific networks or utilize pre-trained models to learn content and style features.
We propose a novel and training-free approach for style transfer, combining textual embedding with spatial features.
arXiv Detail & Related papers (2024-10-19T06:42:43Z) - Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer [19.355744690301403]
We introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model without any optimization.
Our experimental results demonstrate that our proposed method surpasses state-of-the-art methods in both conventional and diffusion-based style transfer baselines.
arXiv Detail & Related papers (2023-12-11T09:53:12Z) - ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z) - Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot
Artistic Style Transfer [83.1333306079676]
In this paper, we devise a novel Transformer model termed as emphMaster specifically for style transfer.
In the proposed model, different Transformer layers share a common group of parameters, which (1) reduces the total number of parameters, (2) leads to more robust training convergence, and (3) is readily to control the degree of stylization.
Experiments demonstrate the superiority of Master under both zero-shot and few-shot style transfer settings.
arXiv Detail & Related papers (2023-04-24T04:46:39Z) - A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive
Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - Unified Style Transfer [6.914642763754318]
It is hard to compare and evaluate different style transfer algorithms due to chaotic definitions of style.
In this paper, a novel approach, the Unified Style Transfer (UST) model, is proposed.
With the introduction of a generative model for internal style representation, UST can transfer images in two approaches, i.e., Domain-based and Image-based, simultaneously.
arXiv Detail & Related papers (2021-10-20T10:45:38Z) - 3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer [66.48720190245616]
We propose a learning-based approach for style transfer between 3D objects.
The proposed method can synthesize new 3D shapes both in the form of point clouds and meshes.
We extend our technique to implicitly learn the multimodal style distribution of the chosen domains.
arXiv Detail & Related papers (2020-11-26T16:59:12Z) - Anisotropic Stroke Control for Multiple Artists Style Transfer [36.92721585146738]
Stroke Control Multi-Artist Style Transfer framework is developed.
Anisotropic Stroke Module (ASM) endows the network with the ability of adaptive semantic-consistency among various styles.
In contrast to the single-scale conditional discriminator, our discriminator is able to capture multi-scale texture clue.
arXiv Detail & Related papers (2020-10-16T05:32:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.