DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation
- URL: http://arxiv.org/abs/2308.02915v1
- Date: Sat, 5 Aug 2023 16:18:57 GMT
- Title: DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation
- Authors: Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu,
Shuicheng Yan
- Abstract summary: We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: When hearing music, it is natural for people to dance to its rhythm.
Automatic dance generation, however, is a challenging task due to the physical
constraints of human motion and rhythmic alignment with target music.
Conventional autoregressive methods introduce compounding errors during
sampling and struggle to capture the long-term structure of dance sequences. To
address these limitations, we present a novel cascaded motion diffusion model,
DiffDance, designed for high-resolution, long-form dance generation. This model
comprises a music-to-dance diffusion model and a sequence super-resolution
diffusion model. To bridge the gap between music and motion for conditional
generation, DiffDance employs a pretrained audio representation learning model
to extract music embeddings and further aligns their embedding space with motion
via a contrastive loss. When training our cascaded diffusion model, we also
incorporate multiple geometric losses to constrain the model outputs to be
physically plausible, and add a dynamic loss weight that changes adaptively over
diffusion timesteps to encourage sample diversity. Through comprehensive
experiments performed on the benchmark dataset AIST++, we demonstrate that
DiffDance is capable of generating realistic dance sequences that align
effectively with the input music. These results are comparable to those
achieved by state-of-the-art autoregressive methods.
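The two training ingredients named in the abstract, contrastive alignment of music and motion embeddings and a timestep-dependent loss weight, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the symmetric InfoNCE form, the temperature value, and the linear weight schedule are all assumptions for the sake of the example.

```python
import numpy as np

def info_nce_alignment_loss(music_emb, motion_emb, temperature=0.1):
    """Symmetric InfoNCE-style contrastive loss that pulls paired
    music/motion embeddings together and pushes mismatched pairs apart.

    music_emb, motion_emb: (batch, dim) arrays; row i of each is a pair.
    """
    # Normalize so the dot product is a cosine similarity.
    music = music_emb / np.linalg.norm(music_emb, axis=1, keepdims=True)
    motion = motion_emb / np.linalg.norm(motion_emb, axis=1, keepdims=True)

    logits = music @ motion.T / temperature   # (batch, batch) similarity matrix
    labels = np.arange(len(logits))           # positives lie on the diagonal

    def cross_entropy(lg, lb):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(lb)), lb].mean()

    # Average the music-to-motion and motion-to-music directions.
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

def dynamic_loss_weight(t, T, w_min=0.1, w_max=1.0):
    """Illustrative timestep-dependent weight: down-weights auxiliary
    (e.g. geometric) losses at very noisy timesteps, which is one plausible
    way a schedule could trade physical constraints against diversity."""
    return w_min + (w_max - w_min) * (1.0 - t / T)
```

As expected of a contrastive objective, correctly paired embeddings yield a lower loss than shuffled pairings, and the weight interpolates between its endpoints across the diffusion timesteps.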
Related papers
- Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z)
- Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives [50.37531720524434]
We propose Lodge, a network capable of generating extremely long dance sequences conditioned on given music.
Our approach generates extremely long dance sequences in parallel, striking a balance between global choreographic patterns and local motion quality and expressiveness.
arXiv Detail & Related papers (2024-03-15T17:59:33Z)
- Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation.
A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions.
To make the generated dance motion smoother, a local information decoder is built for local motion enhancement.
arXiv Detail & Related papers (2024-02-06T19:42:18Z)
- LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model [3.036230795326545]
LongDanceDiff is a conditional diffusion model for sequence-to-sequence long-term dance generation.
It addresses the challenges of temporal coherence and spatial constraints.
We also address common visual quality issues in dance generation, such as foot sliding and unsmooth motion.
arXiv Detail & Related papers (2023-08-23T06:37:41Z)
- BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
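For intuition about the optimal transport distance mentioned here: between two one-dimensional empirical distributions with equal sample counts, the Wasserstein-1 distance reduces to comparing sorted samples. The sketch below is a generic illustration of that fact only; it is not the MDOT-Net formulation, which operates on dance and music distributions rather than scalar samples.

```python
import numpy as np

def wasserstein_1d(samples_p, samples_q):
    """Empirical 1-D Wasserstein-1 distance between two equal-size samples.

    In one dimension the optimal coupling matches order statistics, so the
    distance is simply the mean absolute gap between sorted values.
    """
    p = np.sort(np.asarray(samples_p, dtype=float))
    q = np.sort(np.asarray(samples_q, dtype=float))
    assert p.shape == q.shape, "this sketch assumes equal sample counts"
    return np.abs(p - q).mean()
```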
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
- Transflower: probabilistic autoregressive dance generation with multimodal attention [31.308435764603658]
We present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context.
We also introduce the currently largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies and including both professional and casual dancers.
arXiv Detail & Related papers (2021-06-25T20:14:28Z)
- Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning [55.854205371307884]
We formalize the music-conditioned dance generation as a sequence-to-sequence learning problem.
We propose a novel curriculum learning strategy to alleviate error accumulation of autoregressive models in long motion sequence generation.
Our approach significantly outperforms existing state-of-the-art methods on automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-06-11T00:08:25Z)
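The curriculum idea in the Dance Revolution entry above, easing an autoregressive model off ground-truth inputs to reduce error accumulation, can be illustrated with a scheduled-sampling-style sketch. The linear schedule and function names are assumptions for illustration, not the paper's exact scheme.

```python
import random

def teacher_forcing_prob(step, total_steps, p_start=1.0, p_end=0.0):
    """Linear curriculum: begin by always feeding ground-truth poses
    (teacher forcing), then gradually expose the model to its own
    predictions so the train/test input distributions match better."""
    frac = min(step / total_steps, 1.0)
    return p_start + (p_end - p_start) * frac

def choose_next_input(ground_truth_pose, predicted_pose, step, total_steps,
                      rng=random):
    """With probability p feed the ground-truth pose to the next step,
    otherwise feed the model's own prediction."""
    p = teacher_forcing_prob(step, total_steps)
    return ground_truth_pose if rng.random() < p else predicted_pose
```

Early in training `choose_next_input` almost always returns the ground truth; by the end it almost always returns the model's prediction, so the model learns to recover from its own errors.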
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.