Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment
- URL: http://arxiv.org/abs/2403.18811v1
- Date: Wed, 27 Mar 2024 17:57:02 GMT
- Title: Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment
- Authors: Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change Loy
- Abstract summary: We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
- Score: 87.20240797625648
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel task within the field of 3D dance generation, termed dance accompaniment, which necessitates the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm. Unlike existing solo or group dance generation tasks, a duet dance scenario entails a heightened degree of interaction between the two participants, requiring delicate coordination in both pose and position. To support this task, we first build a large-scale and diverse duet interactive dance dataset, DD100, by recording about 117 minutes of professional dancers' performances. To address the challenges inherent in this task, we propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements. To further enhance the GPT's capabilities of generating stable results on unseen conditions (music and leader motions), we devise an off-policy reinforcement learning strategy that allows the model to explore viable trajectories from out-of-distribution samplings, guided by human-defined rewards. Based on the collected dataset and proposed method, we establish a benchmark with several carefully designed metrics.
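For intuition, here is a minimal, hypothetical sketch of the two ingredients the abstract describes: a decoder-only transformer that autoregressively predicts the follower's next motion token from fused music, leader, and follower streams, and a reward-weighted update that replays previously sampled (possibly out-of-distribution) trajectories. All sizes, names, and the simplified reward-weighted loss are illustrative assumptions; the abstract does not specify Duolando's actual architecture or off-policy algorithm.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the abstract does not give Duolando's real ones.
VOCAB = 512       # assumed size of the follower's motion-token codebook
MUSIC_DIM = 54    # assumed dimensionality of per-frame music features
D_MODEL = 256

class FollowerGPT(nn.Module):
    """Decoder-only transformer predicting the next follower motion token
    from music features, leader tokens, and the follower's own history."""

    def __init__(self):
        super().__init__()
        self.follower_emb = nn.Embedding(VOCAB, D_MODEL)
        self.leader_emb = nn.Embedding(VOCAB, D_MODEL)
        self.music_proj = nn.Linear(MUSIC_DIM, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, music, leader, follower):
        # Fuse the three per-frame conditioning streams by summation.
        x = (self.music_proj(music)
             + self.leader_emb(leader)
             + self.follower_emb(follower))
        t = x.size(1)
        # Additive causal mask so position t only attends to <= t.
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        h = self.backbone(x, mask=causal)
        return self.head(h)  # logits over the next follower token

def off_policy_step(model, optimizer, music, leader, follower, reward):
    """Replay a previously sampled trajectory and weight its likelihood by a
    human-defined per-sequence reward in [0, 1] -- a simplification of
    whatever off-policy objective the full paper actually uses."""
    logits = model(music, leader, follower)
    logp = torch.log_softmax(logits[:, :-1], dim=-1)   # predict token t+1 from t
    chosen = logp.gather(-1, follower[:, 1:].unsqueeze(-1)).squeeze(-1)
    loss = -(reward.unsqueeze(1) * chosen).mean()      # reward-weighted NLL
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The abstract says only that the rewards are human-defined; the per-sequence scalar here stands in for whatever hand-designed criteria (e.g. rhythm or contact plausibility) the full paper employs.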
Related papers
- Scalable Group Choreography via Variational Phase Manifold Learning [8.504657927912076]
We propose a phase-based variational generative model for group dance generation that learns a generative manifold.
Our method achieves high-fidelity group dance motion and enables the generation with an unlimited number of dancers.
arXiv Detail & Related papers (2024-07-26T16:02:37Z)
- Harmonious Group Choreography with Trajectory-Controllable Diffusion [28.82215057058883]
Trajectory-Controllable Diffusion (TCDiff) is a novel approach that harnesses non-overlapping trajectories to facilitate coherent dance movements.
To tackle dancer collisions, we introduce a Dance-Beat Navigator capable of generating trajectories for multiple dancers based on the music.
To mitigate foot sliding, we present a Footwork Adaptor that utilizes trajectory displacement from adjacent frames to enable flexible footwork.
arXiv Detail & Related papers (2024-03-10T12:11:34Z)
- Dance with You: The Diversity Controllable Dancer Generation via Diffusion Models [27.82646255903689]
We introduce a novel multi-dancer synthesis task called partner dancer generation.
The core of this task is to ensure the controllable diversity of the generated partner dancer.
To address the lack of multi-person datasets, we introduce AIST-M, a new dataset for partner dancer generation.
arXiv Detail & Related papers (2023-08-23T15:54:42Z)
- DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation [89.50310360658791]
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
arXiv Detail & Related papers (2023-08-05T16:18:57Z)
- TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
- Music-Driven Group Choreography [10.501572863039852]
AIOZ-GDANCE is a new large-scale dataset for music-driven group dance generation.
We show that naively applying a single-dancer generation technique to group dance motion may lead to unsatisfactory results.
We propose a new method that takes an input music sequence and a set of 3D positions of dancers to efficiently produce multiple group-coherent choreographies.
arXiv Detail & Related papers (2023-03-22T06:26:56Z)
- BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z)
- Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent with the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music (the standard forms of both distances are recalled after this entry).
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
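As background, the two distances named above have standard discrete formulations, recalled below; the exact variants used by MDOT-Net may differ.

```latex
% Standard definitions, not taken from the MDOT-Net paper itself.
% Optimal transport (Wasserstein) cost between discrete distributions
% \mu = \sum_i a_i \delta_{x_i} and \nu = \sum_j b_j \delta_{y_j},
% with ground cost c:
\[
  \mathrm{OT}(\mu,\nu) \;=\; \min_{\gamma \in \Pi(a,b)}
  \sum_{i,j} \gamma_{ij}\, c(x_i, y_j),
  \qquad
  \Pi(a,b) \;=\; \{\gamma \ge 0 \;:\; \gamma\mathbf{1} = a,\ \gamma^{\top}\mathbf{1} = b\}.
\]
% Gromov-Wasserstein compares intra-domain similarity matrices C^X and C^Y,
% so music and motion need not live in a shared metric space:
\[
  \mathrm{GW}(\mu,\nu) \;=\; \min_{\gamma \in \Pi(a,b)}
  \sum_{i,j,k,l} \bigl(C^{X}_{ik} - C^{Y}_{jl}\bigr)^{2}\, \gamma_{ij}\, \gamma_{kl}.
\]
```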