On-the-fly Learning to Transfer Motion Style with Diffusion Models: A Semantic Guidance Approach
- URL: http://arxiv.org/abs/2405.06646v1
- Date: Wed, 20 Mar 2024 05:52:11 GMT
- Title: On-the-fly Learning to Transfer Motion Style with Diffusion Models: A Semantic Guidance Approach
- Authors: Lei Hu, Zihao Zhang, Yongjing Ye, Yiwen Xu, Shihong Xia
- Abstract summary: We propose an on-the-fly human motion style transfer learning method based on the diffusion model.
We first generate the corresponding neutral motion through the proposed Style-Neutral Motion Pair Generation module.
We then add noise to the generated neutral motion and denoise it to be close to the style example to fine-tune the style transfer diffusion model.
- Score: 23.600154466988073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, the emergence of generative models has spurred the development of human motion generation, among which the generation of stylized human motion has consistently been a focal point of research. The conventional approach to stylized human motion generation transfers the style from given style examples to new motions. Despite decades of research in human motion style transfer, it still faces three main challenges: 1) difficulty in decoupling motion content and style; 2) generalization to unseen motion styles; and 3) the need for dedicated motion style datasets. To address these issues, we propose an on-the-fly human motion style transfer learning method based on the diffusion model, which can learn a style transfer model in a few minutes of fine-tuning to transfer an unseen style to diverse content motions. The key idea of our method is to treat the denoising process of the diffusion model as a motion translation process that learns the difference between a style-neutral motion pair, thereby avoiding the challenge of style and content decoupling. Specifically, given an unseen style example, we first generate the corresponding neutral motion through the proposed Style-Neutral Motion Pair Generation module. We then add noise to the generated neutral motion and denoise it to be close to the style example, thereby fine-tuning the style transfer diffusion model. Our method requires only one style example and a text-to-motion dataset with predominantly neutral motion (e.g., HumanML3D). Qualitative and quantitative evaluations demonstrate that our method achieves state-of-the-art performance and has practical applications.
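The fine-tuning recipe above amounts to a standard diffusion training step with a swapped target: noise the generated neutral motion, then supervise the denoised output against the style example, so that denoising acts as neutral-to-style translation. Below is a minimal sketch of that loop, assuming an x0-prediction objective, a linear noise schedule, and hypothetical module names; it is an illustration, not the authors' released code.

```python
# Hedged sketch of the on-the-fly fine-tuning loop described in the
# abstract. `denoiser` is any network mapping (noisy motion, timestep)
# to a predicted clean motion; `neutral` is assumed to come from the
# Style-Neutral Motion Pair Generation module. The x0-prediction loss
# and linear beta schedule are assumptions for illustration.
import torch
import torch.nn.functional as F

T = 1000                                    # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)       # standard linear beta schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def finetune_step(denoiser, optimizer, neutral, style):
    """One fine-tuning step on a single style-neutral motion pair.

    neutral, style: (1, frames, pose_dim) tensors.
    """
    t = torch.randint(0, T, (1,))
    a_bar = alphas_bar[t].view(1, 1, 1)
    eps = torch.randn_like(neutral)
    # Noise the *neutral* motion...
    x_t = a_bar.sqrt() * neutral + (1.0 - a_bar).sqrt() * eps
    # ...and supervise the denoised result against the *style* example,
    # so the model learns the neutral-to-style difference instead of an
    # explicit content/style decomposition.
    pred_x0 = denoiser(x_t, t)
    loss = F.mse_loss(pred_x0, style)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the text-to-motion dataset (e.g., HumanML3D) would pre-train the base diffusion model before this per-style fine-tuning; that stage is omitted from the sketch.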
Related papers
- SMooDi: Stylized Motion Diffusion Model [46.293854851116215]
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
arXiv Detail & Related papers (2024-07-17T17:59:42Z)
- SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion [12.426879081036116]
Style transfer is widely applied in multimedia scenarios such as movies, games, and the Metaverse.
Most current work in this field adopts GANs, which may lead to instability and convergence issues.
We propose the Style Motion Conditioned Diffusion (SMCD) framework, the first of its kind, which can learn the style features of motion more comprehensively.
arXiv Detail & Related papers (2024-05-05T08:28:07Z)
- MoST: Motion Style Transformer between Diverse Action Contents [23.62426940733713]
We propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion.
Our method outperforms existing methods and demonstrates exceptionally high quality, particularly in motion pairs with different contents, without the need for post-processing.
arXiv Detail & Related papers (2024-03-10T14:11:25Z)
- MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
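As a rough sketch of this parallel spatial-temporal design, the block below pairs a frozen spatial-attention path (standing in for the frozen base model's appearance normalization) with a trainable temporal path into which projected reference-motion features are injected. All shapes, dimensions, and module names are assumptions for illustration; MotionCrafter's actual architecture may differ.

```python
# Illustrative parallel spatial-temporal block: frozen spatial attention
# keeps appearance fixed while a trainable, motion-conditioned temporal
# attention carries the injected reference motion. Hypothetical, not
# MotionCrafter's published code.
import torch
import torch.nn as nn

class ParallelSpatioTemporalBlock(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.motion_proj = nn.Linear(dim, dim)   # injects reference motion
        for p in self.spatial.parameters():      # freeze the appearance path
            p.requires_grad = False

    def forward(self, x, motion_feat):
        # x: (batch, frames, tokens, dim); motion_feat: (batch, frames, dim)
        b, f, n, d = x.shape
        xs = x.reshape(b * f, n, d)
        xs, _ = self.spatial(xs, xs, xs)          # frozen spatial pass
        xt = xs.reshape(b, f, n, d).permute(0, 2, 1, 3).reshape(b * n, f, d)
        xt = xt + self.motion_proj(motion_feat).repeat_interleave(n, dim=0)
        xt, _ = self.temporal(xt, xt, xt)         # motion-conditioned temporal pass
        return xt.reshape(b, n, f, d).permute(0, 2, 1, 3)
```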
arXiv Detail & Related papers (2023-12-08T16:31:04Z)
- Customizing Motion in Text-to-Video Diffusion Models [79.4121510826141]
We introduce an approach for augmenting text-to-video generation models with customized motions.
By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios.
arXiv Detail & Related papers (2023-12-07T18:59:03Z)
- Priority-Centric Human Motion Generation in Discrete Latent Space [59.401128190423535]
We introduce a Priority-Centric Motion Discrete Diffusion Model (M2DM) for text-to-motion generation.
M2DM incorporates a global self-attention mechanism and a regularization term to counteract code collapse.
We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token.
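One plausible reading of a noise schedule determined by token significance: low-importance tokens are corrupted at earlier diffusion steps, so the most informative motion tokens survive longest and are recovered first during denoising. The toy function below sketches this idea; the linear thresholding rule and the significance scores are assumptions, not M2DM's published formulation.

```python
# Toy significance-driven corruption for discrete motion tokens
# (a hypothetical sketch, not M2DM's actual schedule).
import torch

def corrupt_tokens(tokens, significance, t, T, mask_id):
    """Mask discrete motion tokens in ascending order of significance.

    tokens:       (batch, seq) codebook indices
    significance: (batch, seq) per-token importance scores in [0, 1]
    t:            current diffusion step in 0..T (t = 0 means clean)
    """
    # The masking threshold rises linearly with t, so low-significance
    # tokens are replaced by the mask token first.
    corrupted = tokens.clone()
    corrupted[significance < t / T] = mask_id
    return corrupted
```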
arXiv Detail & Related papers (2023-08-28T10:40:16Z)
- Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models [9.789705536694665]
Generating realistic motions for digital humans is a core but challenging part of computer animation and games.
We propose a denoising diffusion model solution for styled motion synthesis.
We design a multi-task architecture of diffusion model that strategically generates aspects of human motions for local guidance.
arXiv Detail & Related papers (2022-12-16T15:15:34Z)
- Freeform Body Motion Generation from Speech [53.50388964591343]
Body motion generation from speech is inherently difficult due to the non-deterministic mapping from speech to body motions.
We introduce a novel freeform motion generation model (FreeMo) equipped with a two-stream architecture.
Experiments demonstrate superior performance against several baselines.
arXiv Detail & Related papers (2022-03-04T13:03:22Z)
- Unpaired Motion Style Transfer from Video to Animation [74.15550388701833]
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation.
We present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels.
Our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion.
arXiv Detail & Related papers (2020-05-12T13:21:27Z)