Music-to-Dance Generation with Optimal Transport
- URL: http://arxiv.org/abs/2112.01806v1
- Date: Fri, 3 Dec 2021 09:37:26 GMT
- Title: Music-to-Dance Generation with Optimal Transport
- Authors: Shuang Wu, Shijian Lu, Li Cheng
- Abstract summary: We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
- Score: 48.92483627635586
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dance choreography for a piece of music is a challenging task, having to be
creative in presenting distinctive stylistic dance elements while taking into
account the musical theme and rhythm. It has been tackled by different
approaches such as similarity retrieval, sequence-to-sequence modeling and
generative adversarial networks, but their generated dance sequences are often
short of motion realism, diversity and music consistency. In this paper, we
propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning
to generate 3D dance choreographs from music. We introduce an optimal transport
distance for evaluating the authenticity of the generated dance distribution
and a Gromov-Wasserstein distance to measure the correspondence between the
dance distribution and the input music. This gives a well defined and
non-divergent training objective that mitigates the limitation of standard GAN
training which is frequently plagued with instability and divergent generator
loss issues. Extensive experiments demonstrate that our MDOT-Net can synthesize
realistic and diverse dances which achieve an organic unity with the input
music, reflecting the shared intentionality and matching the rhythmic
articulation.
Related papers
- Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns [48.54956784928394]
Lodge++ is a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre.
To handle the challenges in computational efficiency, Lodge++ adopts a two-stage strategy to produce dances from coarse to fine.
Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres.
arXiv Detail & Related papers (2024-10-27T09:32:35Z) - Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation.
A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions.
To make the generated dance motion smoother, a local information decoder is built for local motion enhancement.
arXiv Detail & Related papers (2024-02-06T19:42:18Z) - DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation [89.50310360658791]
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
arXiv Detail & Related papers (2023-08-05T16:18:57Z) - FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance
Generation [33.9261932800456]
FineDance is the largest music-dance paired dataset with the most dance genres.
To address monotonous and unnatural hand movements existing in previous methods, we propose a full-body dance generation network.
To further enhance the genre-matching and long-term stability of generated dances, we propose a Genre&Coherent aware Retrieval Module.
arXiv Detail & Related papers (2022-12-07T16:10:08Z) - Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units to a fluent dance coherent to the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z) - Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis.
A massive dance motion data set is created from YouTube videos.
A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z) - Dance Revolution: Long-Term Dance Generation with Music via Curriculum
Learning [55.854205371307884]
We formalize the music-conditioned dance generation as a sequence-to-sequence learning problem.
We propose a novel curriculum learning strategy to alleviate error accumulation of autoregressive models in long motion sequence generation.
Our approach significantly outperforms the existing state-of-the-arts on automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-06-11T00:08:25Z) - Music2Dance: DanceNet for Music-driven Dance Generation [11.73506542921528]
We propose a novel autoregressive generative model, DanceNet, to take the style, rhythm and melody of music as the control signals.
We capture several synchronized music-dance pairs by professional dancers, and build a high-quality music-dance pair dataset.
arXiv Detail & Related papers (2020-02-02T17:18:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.