SMooDi: Stylized Motion Diffusion Model
- URL: http://arxiv.org/abs/2407.12783v1
- Date: Wed, 17 Jul 2024 17:59:42 GMT
- Title: SMooDi: Stylized Motion Diffusion Model
- Authors: Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang
- Abstract summary: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
- Score: 46.293854851116215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences. Unlike existing methods that either generate motion of various content or transfer style from one sequence to another, SMooDi can rapidly generate motion across a broad range of content and diverse styles. To this end, we tailor a pre-trained text-to-motion model for stylization. Specifically, we propose style guidance to ensure that the generated motion closely matches the reference style, alongside a lightweight style adaptor that directs the motion towards the desired style while ensuring realism. Experiments across various applications demonstrate that our proposed framework outperforms existing methods in stylized motion generation.
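The abstract points to two mechanisms layered on a pre-trained text-to-motion diffusion model: a style-guidance term applied at sampling time and a lightweight style adaptor that nudges the denoiser toward the reference style. The sketch below illustrates how such a combined noise prediction could look; the module interfaces, the placeholder style loss, and the guidance weights are illustrative assumptions, not the paper's implementation.
```python
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Stand-in for a frozen, pre-trained text-to-motion denoiser."""
    def __init__(self, motion_dim=263, cond_dim=512):
        super().__init__()
        self.cond_dim = cond_dim
        self.net = nn.Linear(motion_dim + cond_dim, motion_dim)

    def forward(self, x_t, t, cond=None):
        B, T, _ = x_t.shape
        if cond is None:                      # unconditional branch
            cond = torch.zeros(B, self.cond_dim, device=x_t.device)
        cond = cond.unsqueeze(1).expand(B, T, -1)
        return self.net(torch.cat([x_t, cond], dim=-1))

class DummyStyleAdaptor(nn.Module):
    """Stand-in for a lightweight adaptor that outputs a style residual."""
    def __init__(self, motion_dim=263, style_dim=512):
        super().__init__()
        self.net = nn.Linear(motion_dim + style_dim, motion_dim)

    def forward(self, x_t, style_emb):
        style = style_emb.unsqueeze(1).expand(-1, x_t.shape[1], -1)
        return self.net(torch.cat([x_t, style], dim=-1))

def style_loss(x, style_emb):
    """Placeholder style distance; a real system would compare style
    features extracted from the motion against the reference."""
    return ((x.mean(dim=(1, 2)) - style_emb.mean(dim=-1)) ** 2).mean()

def stylized_noise_pred(x_t, t, text_emb, style_emb, denoiser, adaptor,
                        w_text=7.5, w_style=1.5):
    """One noise prediction combining text classifier-free guidance,
    an adaptor residual, and gradient-based style guidance."""
    eps_uncond = denoiser(x_t, t, cond=None)
    eps_text = denoiser(x_t, t, cond=text_emb)
    eps = eps_uncond + w_text * (eps_text - eps_uncond)   # text guidance

    eps = eps + adaptor(x_t, style_emb)                   # adaptor residual

    # Classifier-guidance-style term: adding the gradient of a style loss
    # to the noise prediction steers the sample toward the reference style.
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        grad = torch.autograd.grad(style_loss(x_in, style_emb), x_in)[0]
    return eps + w_style * grad

# Illustrative call with random tensors standing in for real inputs.
x_t = torch.randn(2, 60, 263)      # two noisy 60-frame motion sequences
text_emb = torch.randn(2, 512)     # content text embedding
style_emb = torch.randn(2, 512)    # reference style motion embedding
eps = stylized_noise_pred(x_t, 0, text_emb, style_emb,
                          DummyDenoiser(), DummyStyleAdaptor())
```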
Related papers
- MoST: Motion Style Transformer between Diverse Action Contents [23.62426940733713]
We propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion.
Our method outperforms existing methods and demonstrates exceptionally high quality, particularly in motion pairs with different contents, without the need for post-processing.
arXiv Detail & Related papers (2024-03-10T14:11:25Z)
- Generative Human Motion Stylization in Latent Space [42.831468727082694]
We present a novel generative model that produces diverse stylization results of a single motion (latent) code.
In inference, users can opt to stylize a motion using style cues from a reference motion or a label.
Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state-of-the-art in style reenactment, content preservation, and generalization.
arXiv Detail & Related papers (2024-01-24T14:53:13Z)
- MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z)
- Customizing Motion in Text-to-Video Diffusion Models [79.4121510826141]
We introduce an approach for augmenting text-to-video generation models with customized motions.
By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios.
arXiv Detail & Related papers (2023-12-07T18:59:03Z)
- Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal "attention sharing" during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
- MODIFY: Model-driven Face Stylization without Style Images [77.24793103549158]
Existing face stylization methods always require the presence of the target (style) domain during the translation process.
We propose a new method called MODel-drIven Face stYlization (MODIFY), which relies on the generative model to bypass the dependence on target images.
Experimental results on several different datasets validate the effectiveness of MODIFY for unsupervised face stylization.
arXiv Detail & Related papers (2023-03-17T08:35:17Z)
- ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech [6.8527462303619195]
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example.
Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings (a minimal sketch of such blending appears after this list).
In a user study, we show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal.
arXiv Detail & Related papers (2022-09-15T18:34:30Z)
- Style-ERD: Responsive and Coherent Online Motion Style Transfer [13.15016322155052]
Style transfer is a common method for enriching character animation.
We propose a novel style transfer model, Style-ERD, to stylize motions in an online manner.
Our method stylizes motions into multiple target styles with a unified model.
arXiv Detail & Related papers (2022-03-04T21:12:09Z)
- Real-Time Style Modelling of Human Locomotion via Feature-Wise Transformations and Local Motion Phases [13.034241298005044]
We present a style modelling system that uses an animation synthesis network to model motion content based on local motion phases.
An additional style modulation network uses feature-wise transformations to modulate style in real-time.
In comparison to other methods for real-time style modelling, we show our system is more robust and efficient in its style representation while improving motion quality.
arXiv Detail & Related papers (2022-01-12T12:25:57Z)
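Several entries above operate on an explicit style embedding; the ZeroEGGS summary, for example, mentions modifying style through blending and scaling of style embeddings in latent space. The minimal sketch below illustrates that idea on its own, with random vectors standing in for a style encoder's outputs; the function names and dimensions are illustrative assumptions.
```python
import torch

def blend_styles(style_a, style_b, alpha=0.5):
    """Linear interpolation between two style embeddings (alpha=0 -> a, alpha=1 -> b)."""
    return (1.0 - alpha) * style_a + alpha * style_b

def scale_style(style, neutral, gain=1.5):
    """Exaggerate (gain > 1) or attenuate (gain < 1) a style relative to a neutral one."""
    return neutral + gain * (style - neutral)

# Random embeddings standing in for encoder outputs.
happy, old, neutral = torch.randn(256), torch.randn(256), torch.zeros(256)
mixed = blend_styles(happy, old, alpha=0.3)        # mostly "happy", some "old"
stronger = scale_style(happy, neutral, gain=2.0)   # exaggerated "happy"
```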