Music to Dance as Language Translation using Sequence Models
- URL: http://arxiv.org/abs/2403.15569v1
- Date: Fri, 22 Mar 2024 18:47:54 GMT
- Title: Music to Dance as Language Translation using Sequence Models
- Authors: André Correia, Luís A. Alexandre
- Abstract summary: We introduce MDLT, a novel approach that frames the choreography generation problem as a translation task.
We present two variants of MDLT: one utilising the Transformer architecture and the other employing the Mamba architecture.
We train our method on the AIST++ and PhantomDance datasets to teach a robotic arm to dance, but our method can also be applied to a full humanoid robot.
- Score: 1.4255659581428335
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Synthesising appropriate choreographies from music remains an open problem. We introduce MDLT, a novel approach that frames the choreography generation problem as a translation task. Our method leverages an existing dataset to learn to translate sequences of audio into corresponding dance poses. We present two variants of MDLT: one utilising the Transformer architecture and the other employing the Mamba architecture. We train our method on the AIST++ and PhantomDance datasets to teach a robotic arm to dance, but our method can also be applied to a full humanoid robot. Evaluation metrics, including Average Joint Error and Fréchet Inception Distance, consistently demonstrate that, when given a piece of music, MDLT excels at producing realistic and high-quality choreography. The code can be found at github.com/meowatthemoon/MDLT.
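As a reading aid, the following is a minimal sketch of the translation framing described in the abstract, assuming MFCC-style audio feature frames and a flat joint-angle pose vector per frame. It is not the code released at github.com/meowatthemoon/MDLT; the class names, dimensions, and the reading of Average Joint Error are illustrative assumptions.

```python
# Sketch of "music-to-dance as translation": audio frames feed a Transformer
# encoder-decoder that autoregressively emits pose vectors (teacher forcing
# shown). Dimensions and module choices are assumptions, not the paper's code.
import torch
import torch.nn as nn


class AudioToPoseTranslator(nn.Module):
    def __init__(self, audio_dim=35, pose_dim=24, d_model=256,
                 nhead=8, num_layers=4, max_len=1024):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)   # embed audio frames
        self.pose_proj = nn.Linear(pose_dim, d_model)     # embed previous poses
        self.pos_emb = nn.Embedding(max_len, d_model)     # learned positions
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.pose_head = nn.Linear(d_model, pose_dim)     # regress next pose

    def forward(self, audio, prev_poses):
        # audio: (B, T_a, audio_dim); prev_poses: (B, T_p, pose_dim)
        pos_a = torch.arange(audio.size(1), device=audio.device)
        pos_p = torch.arange(prev_poses.size(1), device=audio.device)
        src = self.audio_proj(audio) + self.pos_emb(pos_a)
        tgt = self.pose_proj(prev_poses) + self.pos_emb(pos_p)
        # causal mask so each pose attends only to earlier poses
        causal = self.transformer.generate_square_subsequent_mask(
            prev_poses.size(1)).to(audio.device)
        out = self.transformer(src, tgt, tgt_mask=causal)
        return self.pose_head(out)                        # (B, T_p, pose_dim)


def average_joint_error(pred, target):
    # Mean absolute per-joint deviation between predicted and ground-truth
    # poses; one plausible reading of the Average Joint Error metric.
    return (pred - target).abs().mean()


if __name__ == "__main__":
    model = AudioToPoseTranslator()
    audio = torch.randn(2, 100, 35)      # dummy audio features
    poses = torch.randn(2, 100, 24)      # dummy ground-truth poses
    pred = model(audio, poses[:, :-1])   # teacher forcing on shifted poses
    print(average_joint_error(pred, poses[:, 1:]).item())
```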
Related papers
- DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance [50.01162760878841]
We present DCM, a new multi-modal 3D dataset that combines camera movement with dance motion and music audio.
This dataset encompasses 108 dance sequences (3.2 hours) of paired dance-camera-music data from the anime community.
We propose DanceCamera3D, a transformer-based diffusion model that incorporates a novel body attention loss and a condition separation strategy.
arXiv Detail & Related papers (2024-03-20T15:24:57Z)
- LM2D: Lyrics- and Music-Driven Dance Synthesis [28.884929875333846]
LM2D is designed to create dance conditioned on both music and lyrics in one diffusion generation step.
We introduce the first 3D dance-motion dataset that encompasses both music and lyrics, obtained with pose estimation technologies.
The results demonstrate LM2D is able to produce realistic and diverse dance matching both lyrics and music.
arXiv Detail & Related papers (2024-03-14T13:59:04Z)
- TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
- BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z)
- Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
- Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units to a fluent dance coherent to the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
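For reference, the two distances named in the MDOT-Net entry above are usually defined as follows; the paper's exact ground cost and normalisation may differ.

```latex
% Optimal-transport (Wasserstein-type) distance between the generated dance
% distribution \mu and a reference dance distribution \nu, with ground cost c:
\[
W_c(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \int c(x, y)\, \mathrm{d}\gamma(x, y)
\]
% Gromov-Wasserstein distance comparing the dance space (X, d_X, \mu) with the
% music space (Y, d_Y, \nu) through discrepancies between pairwise distances:
\[
GW(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)}
  \left( \iint \bigl| d_X(x, x') - d_Y(y, y') \bigr|^{2}
  \,\mathrm{d}\gamma(x, y)\,\mathrm{d}\gamma(x', y') \right)^{1/2}
\]
```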
- Semi-Supervised Learning for In-Game Expert-Level Music-to-Dance Translation [0.0]
Music-to-dance translation is a powerful feature in recent role-playing games.
We re-formulate the translation problem as a piece-wise dance phrase retrieval problem based on choreography theory.
Our method generalizes well over various styles of music and succeeds in expert-level choreography for game players.
arXiv Detail & Related papers (2020-09-27T07:08:04Z)
- ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit [28.877908457607678]
We design a two-stage music-to-dance synthesis framework ChoreoNet to imitate human choreography procedure.
Our framework first devises a CAU prediction model to learn the mapping between music and CAU sequences.
We then devise a spatial-temporal inpainting model to convert the CAU sequence into continuous dance motions.
arXiv Detail & Related papers (2020-09-16T12:38:19Z)
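A rough sketch of the two-stage idea in the ChoreoNet entry above; it is not the paper's implementation, and every module choice here is an assumption.

```python
# Stage 1 predicts a choreographic action unit (CAU) id per music frame;
# stage 2 "inpaints" a continuous pose sequence from the predicted CAU ids.
import torch
import torch.nn as nn


class CAUPredictor(nn.Module):
    def __init__(self, audio_dim=35, hidden=128, num_caus=50):
        super().__init__()
        self.gru = nn.GRU(audio_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_caus)

    def forward(self, audio):                 # (B, T, audio_dim)
        h, _ = self.gru(audio)
        return self.head(h)                   # per-frame CAU logits


class MotionInpainter(nn.Module):
    def __init__(self, num_caus=50, pose_dim=24, hidden=128):
        super().__init__()
        self.cau_emb = nn.Embedding(num_caus, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, cau_ids):               # (B, T) integer CAU sequence
        h, _ = self.gru(self.cau_emb(cau_ids))
        return self.head(h)                   # (B, T, pose_dim) smooth poses


audio = torch.randn(1, 200, 35)
cau_logits = CAUPredictor()(audio)                 # stage 1: music -> CAUs
poses = MotionInpainter()(cau_logits.argmax(-1))   # stage 2: CAUs -> motion
print(poses.shape)
```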