MotionMix: Weakly-Supervised Diffusion for Controllable Motion
Generation
- URL: http://arxiv.org/abs/2401.11115v3
- Date: Wed, 24 Jan 2024 13:08:59 GMT
- Title: MotionMix: Weakly-Supervised Diffusion for Controllable Motion
Generation
- Authors: Nhat M. Hoang, Kehong Gong, Chuan Guo, Michael Bi Mi
- Abstract summary: MotionMix is a weakly-supervised diffusion model that leverages both noisy and unannotated motion sequences.
Our framework consistently achieves state-of-the-art performances on text-to-motion, action-to-motion, and music-to-dance tasks.
- Score: 19.999239668765885
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Controllable generation of 3D human motions has become an important
topic as the world embraces digital transformation. Existing works, though making promising
progress with the advent of diffusion models, heavily rely on meticulously
captured and annotated (e.g., text) high-quality motion corpus, a
resource-intensive endeavor in the real world. This motivates our proposed
MotionMix, a simple yet effective weakly-supervised diffusion model that
leverages both noisy and unannotated motion sequences. Specifically, we
separate the denoising objectives of a diffusion model into two stages:
obtaining conditional rough motion approximations in the initial $T-T^*$ steps
by learning the noisy annotated motions, followed by the unconditional
refinement of these preliminary motions during the last $T^*$ steps using
unannotated motions. Notably, though learning from two sources of imperfect
data, our model does not compromise motion generation quality compared to fully
supervised approaches that access gold data. Extensive experiments on several
benchmarks demonstrate that our MotionMix, as a versatile framework,
consistently achieves state-of-the-art performances on text-to-motion,
action-to-motion, and music-to-dance tasks. Project page:
https://nhathoang2002.github.io/MotionMix-page/
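To make the two-stage split concrete, here is a minimal sketch of how it could look at sampling time, assuming a generic DDPM-style reverse process; the names (model, denoise_step, cond) and the step routing are illustrative assumptions, not the authors' released implementation.
```python
# Illustrative sketch (not the authors' code) of the two-stage split from the
# abstract, assuming a generic DDPM-style reverse process with T total steps.
# Steps t = T-1 .. T* are conditional and produce a rough motion approximation
# (the stage trained on noisy annotated motions); steps t = T*-1 .. 0 refine it
# unconditionally (the stage trained on clean but unannotated motions).
import torch

@torch.no_grad()
def sample_two_stage(model, cond, shape, T=1000, T_star=200, device="cpu"):
    x = torch.randn(shape, device=device)  # start from Gaussian noise x_T
    for t in reversed(range(T)):
        if t >= T_star:
            x = model.denoise_step(x, t, cond=cond)   # conditional rough approximation
        else:
            x = model.denoise_step(x, t, cond=None)   # unconditional refinement
    return x
```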
Related papers
- Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer [55.109778609058154]
Existing diffusion-based motion editing methods overlook the profound potential of the prior embedded within the weights of pre-trained models.
We uncover the roles and interactions of attention elements in capturing and representing motion patterns.
We integrate these elements to transfer a leader motion to a follower one while maintaining the nuanced characteristics of the follower, resulting in zero-shot motion transfer.
arXiv Detail & Related papers (2024-06-10T17:47:14Z)
- MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion [94.66090422753126]
MotionFollower is a lightweight score-guided diffusion model for video motion editing.
It delivers superior motion editing performance and exclusively supports large camera movements and actions.
Compared with MotionEditor, the most advanced motion editing model, MotionFollower achieves an approximately 80% reduction in GPU memory.
arXiv Detail & Related papers (2024-05-30T17:57:30Z)
- Animate Your Motion: Turning Still Images into Dynamic Videos [58.63109848837741]
We introduce Scene and Motion Conditional Diffusion (SMCD), a novel methodology for managing multimodal inputs.
SMCD incorporates a recognized motion conditioning module and investigates various approaches to integrate scene conditions.
Our design significantly enhances video quality, motion precision, and semantic coherence.
arXiv Detail & Related papers (2024-03-15T10:36:24Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten steps, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks; see the sketch below.
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
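For context on the ten-step claim in the Motion Flow Matching entry above, a minimal generic flow-matching sampler (Euler integration of a learned velocity field) is sketched here; velocity_net and its signature are hypothetical placeholders, and this is not the paper's implementation.
```python
# Generic few-step flow-matching sampler: integrate a learned velocity field
# from noise (t = 0) toward data (t = 1) with a handful of Euler steps.
# `velocity_net(x, t, cond)` is a hypothetical model interface.
import torch

@torch.no_grad()
def flow_matching_sample(velocity_net, cond, shape, num_steps=10, device="cpu"):
    x = torch.randn(shape, device=device)   # sample from the noise distribution
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        v = velocity_net(x, t, cond)         # predicted velocity dx/dt at time t
        x = x + dt * v                       # Euler step toward the data distribution
    return x
```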
- DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion [70.33381660741861]
We present DiverseMotion, a new approach for synthesizing high-quality human motions conditioned on textual descriptions.
We show that our DiverseMotion achieves state-of-the-art motion quality and competitive motion diversity.
arXiv Detail & Related papers (2023-09-04T05:43:48Z)
- Priority-Centric Human Motion Generation in Discrete Latent Space [59.401128190423535]
We introduce a Priority-Centric Motion Discrete Diffusion Model (M2DM) for text-to-motion generation.
M2DM incorporates a global self-attention mechanism and a regularization term to counteract code collapse.
We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token.
arXiv Detail & Related papers (2023-08-28T10:40:16Z)
- BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis [14.331548412833513]
Mixed reality applications require tracking the user's full-body motion to enable an immersive experience.
We propose BoDiffusion -- a generative diffusion model for motion synthesis to tackle this under-constrained reconstruction problem.
We present a time and space conditioning scheme that allows BoDiffusion to leverage sparse tracking inputs while generating smooth and realistic full-body motion sequences.
arXiv Detail & Related papers (2023-04-21T16:39:05Z)
- Human Motion Diffusion as a Generative Prior [20.004837564647367]
We introduce three forms of composition based on diffusion priors.
We tackle the challenge of long sequence generation.
Using parallel composition, we show promising steps toward two-person generation.
arXiv Detail & Related papers (2023-03-02T17:09:27Z)
- MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model [35.32967411186489]
MotionDiffuse is a diffusion model-based text-driven motion generation framework.
It excels at modeling complicated data distribution and generating vivid motion sequences.
It responds to fine-grained instructions on body parts and supports arbitrary-length motion synthesis with time-varied text prompts.
arXiv Detail & Related papers (2022-08-31T17:58:54Z)