SMooDi: Stylized Motion Diffusion Model
- URL: http://arxiv.org/abs/2407.12783v1
- Date: Wed, 17 Jul 2024 17:59:42 GMT
- Title: SMooDi: Stylized Motion Diffusion Model
- Authors: Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang
- Abstract summary: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style sequences.
Our proposed framework outperforms existing methods in stylized motion generation.
- Score: 46.293854851116215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences. Unlike existing methods that either generate motion of various content or transfer style from one sequence to another, SMooDi can rapidly generate motion across a broad range of content and diverse styles. To this end, we tailor a pre-trained text-to-motion model for stylization. Specifically, we propose style guidance to ensure that the generated motion closely matches the reference style, alongside a lightweight style adaptor that directs the motion towards the desired style while ensuring realism. Experiments across various applications demonstrate that our proposed framework outperforms existing methods in stylized motion generation.
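The abstract describes two mechanisms on top of a pre-trained text-to-motion diffusion model: style guidance toward a reference style sequence and a lightweight style adaptor. As a rough illustration only, the sketch below combines an unconditional, a content-conditional, and a style-conditional noise prediction at one denoising step in a classifier-free-guidance style; the function name, weights, and exact combination rule are illustrative assumptions, not the paper's formulation.

```python
import torch

def guided_noise_prediction(eps_uncond, eps_content, eps_style,
                            w_content=7.5, w_style=2.5):
    """Hypothetical combination of content guidance with an extra style
    guidance term at one denoising step (weights are illustrative)."""
    return (eps_uncond
            + w_content * (eps_content - eps_uncond)
            + w_style * (eps_style - eps_content))

# Toy usage with random "noise predictions" for a 196-frame, 263-feature motion.
shape = (1, 196, 263)
eps_u, eps_c, eps_s = (torch.randn(shape) for _ in range(3))
print(guided_noise_prediction(eps_u, eps_c, eps_s).shape)
```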
Related papers
- StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion [14.213279927964903]
StyleMotif is a novel Stylized Motion Latent Diffusion model.
It generates motion conditioned on both content and style from multiple modalities.
arXiv Detail & Related papers (2025-03-27T17:59:46Z)
- Dance Like a Chicken: Low-Rank Stylization for Human Motion Diffusion [28.94750481325469]
We introduce LoRA-MDM, a framework for motion stylization that generalizes to complex actions while maintaining editability.
Our key insight is that adapting the generative prior to include the style, while preserving its overall distribution, is more effective than modifying each individual motion during generation.
LoRA-MDM learns to adapt the prior to include the reference style using only a few samples.
arXiv Detail & Related papers (2025-03-25T11:23:34Z)
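LoRA-MDM above adapts the motion prior with low-rank updates instead of editing individual motions during generation. The snippet below is a generic low-rank adapter wrapped around a frozen linear layer, offered as a minimal sketch; the rank, scaling, and layer placement are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B(A(x))."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pre-trained prior frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # start as an identity-preserving update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Toy usage: adapt one projection of a (hypothetical) motion transformer block.
layer = LoRALinear(nn.Linear(512, 512))
print(layer(torch.randn(2, 196, 512)).shape)  # torch.Size([2, 196, 512])
```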
- SOYO: A Tuning-Free Approach for Video Style Morphing via Style-Adaptive Interpolation in Diffusion Models [54.641809532055916]
We introduce SOYO, a novel diffusion-based framework for video style morphing.
Our method employs a pre-trained text-to-image diffusion model without fine-tuning, combining attention injection and AdaIN to preserve structural consistency.
To harmonize style across video frames, we propose a novel adaptive sampling scheduler between two style images.
arXiv Detail & Related papers (2025-03-10T07:27:01Z)
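SOYO above combines attention injection with AdaIN. AdaIN itself is standard: content features are re-normalized to match the channel-wise statistics of the style features. A minimal sketch follows; the feature-map shapes are assumptions, not taken from the paper.

```python
import torch

def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Adaptive instance normalization over (B, C, H, W) feature maps:
    normalize content per channel, then rescale with the style statistics."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return (content - c_mean) / c_std * s_std + s_mean

# Toy usage on random feature maps.
print(adain(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)).shape)
```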
- MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow [11.491447470132279]
In existing methods, information usually flows only from style to content, which may cause conflicts between the style and the content.
In this work, we build a bidirectional control flow between the style and the content, also adjusting the style toward the content.
We extend stylized motion generation from a single modality, i.e., the style motion, to multiple modalities, including text and images, through contrastive learning.
arXiv Detail & Related papers (2024-12-13T06:40:26Z)
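MulSMo above extends style conditioning from motion to text and images via contrastive learning. A generic InfoNCE-style objective that pulls paired style embeddings from two modalities together is sketched below; the encoders, batch construction, and temperature are assumptions rather than the paper's setup.

```python
import torch
import torch.nn.functional as F

def info_nce(z_motion: torch.Tensor, z_other: torch.Tensor, temperature: float = 0.07):
    """Symmetric InfoNCE loss between L2-normalized style embeddings of two
    modalities; the i-th motion and the i-th text/image form a positive pair."""
    z_motion = F.normalize(z_motion, dim=-1)
    z_other = F.normalize(z_other, dim=-1)
    logits = z_motion @ z_other.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(z_motion.size(0), device=z_motion.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random embeddings for a batch of 8 style pairs.
print(info_nce(torch.randn(8, 256), torch.randn(8, 256)).item())
```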
- MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models [59.10171699717122]
MoTrans is a customized motion transfer method that enables video generation of similar motion in new contexts.
Multimodal representations from the recaptioned prompt and video frames promote the modeling of appearance.
Our method effectively learns specific motion patterns from single or multiple reference videos.
arXiv Detail & Related papers (2024-12-02T10:07:59Z)
- MoST: Motion Style Transformer between Diverse Action Contents [23.62426940733713]
We propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion.
Our method outperforms existing methods and demonstrates exceptionally high quality, particularly in motion pairs with different contents, without the need for post-processing.
arXiv Detail & Related papers (2024-03-10T14:11:25Z)
- Generative Human Motion Stylization in Latent Space [42.831468727082694]
We present a novel generative model that produces diverse stylization results of a single motion (latent) code.
In inference, users can opt to stylize a motion using style cues from a reference motion or a label.
Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state-of-the-art in style reenactment, content preservation, and generalization.
arXiv Detail & Related papers (2024-01-24T14:53:13Z)
- MotionCrafter: One-Shot Motion Customization of Diffusion Models [66.44642854791807]
We introduce MotionCrafter, a one-shot instance-guided motion customization method.
MotionCrafter employs a parallel spatial-temporal architecture that injects the reference motion into the temporal component of the base model.
During training, a frozen base model provides appearance normalization, effectively separating appearance from motion.
arXiv Detail & Related papers (2023-12-08T16:31:04Z)
- Customizing Motion in Text-to-Video Diffusion Models [79.4121510826141]
We introduce an approach for augmenting text-to-video generation models with customized motions.
By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios.
arXiv Detail & Related papers (2023-12-07T18:59:03Z)
- Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models.
Evaluation across diverse styles and text prompts demonstrates that our method achieves high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
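StyleAligned above keeps a batch of generations on-style by letting each image's self-attention also attend to a reference image's keys and values. Below is a minimal single-head sketch of that sharing; the shapes, the choice of reference, and the absence of multi-head bookkeeping are simplifying assumptions.

```python
import torch

def shared_self_attention(q, k, v, k_ref, v_ref):
    """Single-head attention in which every sample attends to its own tokens
    plus the reference image's tokens, so style statistics are shared."""
    k_all = torch.cat([k, k_ref.expand_as(k)], dim=1)   # (B, 2N, D)
    v_all = torch.cat([v, v_ref.expand_as(v)], dim=1)
    scores = q @ k_all.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v_all

# Toy usage: 4 images, 64 tokens each, sharing keys/values of the first image.
q, k, v = (torch.randn(4, 64, 96) for _ in range(3))
print(shared_self_attention(q, k, v, k[:1], v[:1]).shape)  # torch.Size([4, 64, 96])
```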
- MODIFY: Model-driven Face Stylization without Style Images [77.24793103549158]
Existing face stylization methods always require the presence of the target (style) domain during the translation process.
We propose a new method called MODel-drIven Face stYlization (MODIFY), which relies on a generative model to bypass the dependence on target images.
Experimental results on several different datasets validate the effectiveness of MODIFY for unsupervised face stylization.
arXiv Detail & Related papers (2023-03-17T08:35:17Z)
- ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech [6.8527462303619195]
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example.
Our model uses a variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings.
In a user study, we show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal.
arXiv Detail & Related papers (2022-09-15T18:34:30Z)
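ZeroEGGS above controls style by manipulating a learned style embedding, e.g. blending two example styles or scaling one to exaggerate it. The helpers below only illustrate that latent manipulation; the embedding dimension and the encoder producing the vectors are assumptions.

```python
import torch

def blend_styles(style_a: torch.Tensor, style_b: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Linear interpolation between two style embeddings (t=0 -> a, t=1 -> b)."""
    return (1.0 - t) * style_a + t * style_b

def scale_style(style: torch.Tensor, mean_style: torch.Tensor, gain: float = 1.5) -> torch.Tensor:
    """Exaggerate (gain > 1) or attenuate (gain < 1) a style relative to a mean style."""
    return mean_style + gain * (style - mean_style)

# Toy usage with random 64-dimensional style embeddings.
a, b = torch.randn(64), torch.randn(64)
print(blend_styles(a, b, 0.3).shape, scale_style(a, torch.zeros(64), 2.0).shape)
```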
- Style-ERD: Responsive and Coherent Online Motion Style Transfer [13.15016322155052]
Style transfer is a common method for enriching character animation.
We propose a novel style transfer model, Style-ERD, to stylize motions in an online manner.
Our method stylizes motions into multiple target styles with a unified model.
arXiv Detail & Related papers (2022-03-04T21:12:09Z)
- Real-Time Style Modelling of Human Locomotion via Feature-Wise Transformations and Local Motion Phases [13.034241298005044]
We present a style modelling system that uses an animation synthesis network to model motion content based on local motion phases.
An additional style modulation network uses feature-wise transformations to modulate style in real-time.
In comparison to other methods for real-time style modelling, we show our system is more robust and efficient in its style representation while improving motion quality.
arXiv Detail & Related papers (2022-01-12T12:25:57Z)
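The real-time style modelling system above modulates style with feature-wise transformations in the spirit of FiLM: a small network maps a style representation to per-channel scale and shift applied to intermediate features. A generic sketch follows; the dimensions and naming are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FeatureWiseModulation(nn.Module):
    """FiLM-style modulation: a style vector is mapped to per-channel
    scale (gamma) and shift (beta) applied to the content features."""
    def __init__(self, style_dim: int, feature_dim: int):
        super().__init__()
        self.to_gamma_beta = nn.Linear(style_dim, 2 * feature_dim)

    def forward(self, features: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.to_gamma_beta(style).chunk(2, dim=-1)
        gamma = gamma.unsqueeze(1)               # broadcast over the time dimension
        beta = beta.unsqueeze(1)
        return (1.0 + gamma) * features + beta   # identity when gamma = beta = 0

# Toy usage: modulate per-frame motion features (B, T, C) with a style vector (B, S).
mod = FeatureWiseModulation(style_dim=32, feature_dim=256)
print(mod(torch.randn(2, 60, 256), torch.randn(2, 32)).shape)  # torch.Size([2, 60, 256])
```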