LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model
- URL: http://arxiv.org/abs/2308.11945v1
- Date: Wed, 23 Aug 2023 06:37:41 GMT
- Title: LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model
- Authors: Siqi Yang, Zejun Yang, Zhisheng Wang
- Abstract summary: LongDanceDiff is a conditional diffusion model for sequence-to-sequence long-term dance generation.
It addresses the challenges of temporal coherency and spatial constraints.
We also address common visual quality issues in dance generation, such as foot sliding and unsmooth motion.
- Score: 3.036230795326545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dancing with music has always been an essential human art form for expressing emotion.
Due to its high temporal-spatial complexity, long-term realistic 3D dance
generation synchronized with music is challenging. Existing methods suffer from
the freezing problem when generating long-term dances due to error accumulation
and training-inference discrepancy. To address this, we design a conditional
diffusion model, LongDanceDiff, for this sequence-to-sequence long-term dance
generation, addressing the challenges of temporal coherency and spatial
constraints. LongDanceDiff contains a transformer-based diffusion model, where
the input is a concatenation of music, past motions, and noised future motions.
This partial noising strategy leverages the full-attention mechanism and learns
the dependencies among music and past motions. To enhance the diversity of
generated dance motions and mitigate the freezing problem, we introduce a
mutual information minimization objective that regularizes the dependency
between past and future motions. We also address common visual quality issues
in dance generation, such as foot sliding and unsmooth motion, by incorporating
spatial constraints through a Global-Trajectory Modulation (GTM) layer and
motion perceptual losses, thereby improving the smoothness and naturalness of
motion generation. Extensive experiments demonstrate a significant improvement
in our approach over the existing state-of-the-art methods. We plan to release
our code and models soon.
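The abstract describes the core mechanism in enough detail to illustrate: a transformer-based diffusion model whose input is the concatenation of music features, clean past motions, and noised future motions, with noise applied only to the future segment (the partial noising strategy). Below is a minimal PyTorch-style sketch of that input construction and denoising step. All module names, feature dimensions, the noise schedule, and the x0-prediction target are assumptions for illustration, not the released LongDanceDiff code; the mutual-information regularizer, GTM layer, and perceptual losses mentioned in the abstract are omitted because the abstract does not specify their form.

```python
import torch
import torch.nn as nn

class PartialNoisingDanceDiffusion(nn.Module):
    """Illustrative sketch only: a transformer diffusion step that consumes
    music, clean past motion, and noised future motion as one token sequence.
    Dimensions, schedule, and layer choices are assumptions, not the paper's."""

    def __init__(self, motion_dim=219, music_dim=35, d_model=512,
                 n_layers=8, n_steps=1000):
        super().__init__()
        self.motion_in = nn.Linear(motion_dim, d_model)
        self.music_in = nn.Linear(music_dim, d_model)
        self.step_emb = nn.Embedding(n_steps, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, motion_dim)
        # Linear beta schedule (assumption); alpha_bar is the cumulative product.
        betas = torch.linspace(1e-4, 2e-2, n_steps)
        self.register_buffer("alpha_bar", torch.cumprod(1.0 - betas, dim=0))

    def forward(self, music, past_motion, future_motion, t):
        """music: (B, T_music, music_dim); past_motion: (B, T_past, motion_dim);
        future_motion: (B, T_future, motion_dim); t: (B,) diffusion step indices."""
        # Partial noising: only the future segment is diffused; music and past
        # motions stay clean, so full self-attention can model the dependencies
        # among music, past motions, and the noised future motions.
        a = self.alpha_bar[t].view(-1, 1, 1)
        noise = torch.randn_like(future_motion)
        noised_future = a.sqrt() * future_motion + (1.0 - a).sqrt() * noise

        # One token sequence over all conditions (positional encodings omitted
        # for brevity); the diffusion step is injected as a broadcast embedding.
        tokens = torch.cat(
            [self.music_in(music),
             self.motion_in(past_motion),
             self.motion_in(noised_future)],
            dim=1,
        ) + self.step_emb(t).unsqueeze(1)

        hidden = self.backbone(tokens)
        # Read the prediction off the future-motion token positions only.
        pred_future = self.head(hidden[:, -future_motion.size(1):])
        return pred_future, noise
```

At training time one would sample t, run the module, and minimize a reconstruction loss (e.g. MSE between pred_future and the clean future motion), on top of which the paper's mutual-information minimization, GTM-based spatial constraint, and motion perceptual losses would be added.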
Related papers
- Scalable Group Choreography via Variational Phase Manifold Learning [8.504657927912076]
We propose a phase-based variational generative model for group dance generation that learns a generative manifold.
Our method achieves high-fidelity group dance motion and enables generation with an unlimited number of dancers.
arXiv Detail & Related papers (2024-07-26T16:02:37Z)
- Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives [50.37531720524434]
We propose Lodge, a network capable of generating extremely long dance sequences conditioned on given music.
Our approach can generate extremely long dance sequences in parallel, striking a balance between global choreographic patterns and local motion quality and expressiveness.
arXiv Detail & Related papers (2024-03-15T17:59:33Z) - Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation.
A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions.
To make the generated dance motion smoother, a local information decoder is built for local motion enhancement.
arXiv Detail & Related papers (2024-02-06T19:42:18Z) - DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation [89.50310360658791]
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
arXiv Detail & Related papers (2023-08-05T16:18:57Z) - BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z) - Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units into a fluent dance coherent with the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z) - Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z) - Dance Revolution: Long-Term Dance Generation with Music via Curriculum
Learning [55.854205371307884]
We formalize the music-conditioned dance generation as a sequence-to-sequence learning problem.
We propose a novel curriculum learning strategy to alleviate error accumulation of autoregressive models in long motion sequence generation.
Our approach significantly outperforms existing state-of-the-art methods on automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-06-11T00:08:25Z)