SMooDi: Stylized Motion Diffusion Model
- URL: http://arxiv.org/abs/2407.12783v1
- Date: Wed, 17 Jul 2024 17:59:42 GMT
- Title: SMooDi: Stylized Motion Diffusion Model
- Authors: Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang
- Abstract summary: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences. Unlike existing methods that either generate motion of various content or transfer style from one sequence to another, SMooDi can rapidly generate motion across a broad range of content and diverse styles. To this end, we tailor a pre-trained text-to-motion model for stylization. Specifically, we propose style guidance to ensure that the generated motion closely matches the reference style, alongside a lightweight style adaptor that directs the motion towards the desired style while ensuring realism. Experiments across various applications demonstrate that our proposed framework outperforms existing methods in stylized motion generation.
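The abstract describes steering a pre-trained text-to-motion diffusion model with a style guidance term alongside the usual content conditioning. A minimal sketch of how two guidance signals can be combined in a single denoising step, in the spirit of classifier-free guidance, is below; the function name, weights, and exact formulation are illustrative assumptions, not SMooDi's actual implementation.

```python
def guided_noise_estimate(eps_uncond, eps_content, eps_style,
                          w_content=7.5, w_style=1.5):
    """Combine unconditional, content-conditioned, and style-conditioned
    noise predictions with two guidance weights, classifier-free-guidance
    style. Inputs may be floats or array-likes supporting arithmetic.
    Hypothetical sketch; SMooDi's exact guidance formulation may differ.
    """
    return (eps_uncond
            + w_content * (eps_content - eps_uncond)  # pull toward the content text
            + w_style * (eps_style - eps_uncond))     # pull toward the style reference
```

At each sampling step, the three noise predictions would come from the same denoiser evaluated with no condition, the content text, and the style sequence respectively; raising `w_style` trades content fidelity for stronger stylization.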
Related papers
- MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow [11.491447470132279]
In existing methods, information usually flows only from style to content, which may cause conflicts between the style and the content.
In this work we build a bidirectional control flow between the style and the content, also adjusting the style towards the content.
We extend the stylized motion generation from one modality, i.e. the style motion, to multiple modalities including texts and images through contrastive learning.
arXiv Detail & Related papers (2024-12-13T06:40:26Z) - MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models [59.10171699717122]
MoTrans is a customized motion transfer method that enables video generation of similar motion in new contexts.
Multimodal representations from the recaptioned prompt and video frames promote appearance modeling.
Our method effectively learns specific motion patterns from single or multiple reference videos.
arXiv Detail & Related papers (2024-12-02T10:07:59Z) - MoST: Motion Style Transformer between Diverse Action Contents [23.62426940733713]
We propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion.
Our method outperforms existing methods and demonstrates exceptionally high quality, particularly in motion pairs with different contents, without the need for post-processing.
arXiv Detail & Related papers (2024-03-10T14:11:25Z) - Generative Human Motion Stylization in Latent Space [42.831468727082694]
We present a novel generative model that produces diverse stylization results of a single motion (latent) code.
In inference, users can opt to stylize a motion using style cues from a reference motion or a label.
Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state-of-the-art in style reenactment, content preservation, and generalization.
arXiv Detail & Related papers (2024-01-24T14:53:13Z) - MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z) - NewMove: Customizing text-to-video models with novel motions [74.9442859239997]
We introduce an approach for augmenting text-to-video generation models with customized motions.
By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios.
arXiv Detail & Related papers (2023-12-07T18:59:03Z) - Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal attention sharing during the diffusion process, our method maintains style consistency across images within T2I models.
Evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z) - MODIFY: Model-driven Face Stylization without Style Images [77.24793103549158]
Existing face stylization methods require the presence of the target (style) domain during the translation process.
We propose a new method called MODel-drIven Face stYlization (MODIFY), which relies on a generative model to bypass the dependence on target images.
Experimental results on several different datasets validate the effectiveness of MODIFY for unsupervised face stylization.
arXiv Detail & Related papers (2023-03-17T08:35:17Z) - ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech [6.8527462303619195]
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example.
Our model uses a variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or by blending and scaling style embeddings.
In a user study, we show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal.
arXiv Detail & Related papers (2022-09-15T18:34:30Z) - Style-ERD: Responsive and Coherent Online Motion Style Transfer [13.15016322155052]
Style transfer is a common method for enriching character animation.
We propose a novel style transfer model, Style-ERD, to stylize motions in an online manner.
Our method stylizes motions into multiple target styles with a unified model.
arXiv Detail & Related papers (2022-03-04T21:12:09Z)
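Several of the papers above (notably StyleAligned) maintain style consistency by letting each generated image's self-attention also attend to keys and values taken from a reference image. A simplified sketch of such shared attention follows; the function names and shapes are illustrative assumptions, not any paper's actual API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def shared_attention(q, k, v, k_ref, v_ref):
    """Self-attention where the target's queries also attend to a
    reference image's keys/values, a simplified take on 'attention
    sharing'. q, k, v: (tokens, dim) for the target; k_ref, v_ref:
    (ref_tokens, dim) from the reference. Illustrative sketch only.
    """
    k_all = np.concatenate([k, k_ref], axis=0)   # extend keys with reference keys
    v_all = np.concatenate([v, v_ref], axis=0)   # extend values likewise
    scores = q @ k_all.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    return softmax(scores, axis=-1) @ v_all      # weighted sum over all values
```

Because the reference tokens enter every target image's attention, low-level style statistics propagate across the batch without fine-tuning the base model.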
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.