Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory
- URL: http://arxiv.org/abs/2203.13055v2
- Date: Fri, 25 Mar 2022 03:07:26 GMT
- Title: Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory
- Authors: Li Siyao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian,
Chen Change Loy, Ziwei Liu
- Abstract summary: We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units into a fluent dance coherent with the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
- Score: 92.81383016482813
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Driving 3D characters to dance following a piece of music is highly
challenging due to the spatial constraints applied to poses by choreography
norms. In addition, the generated dance sequence also needs to maintain
temporal coherency with different music genres. To tackle these challenges, we
propose a novel music-to-dance framework, Bailando, with two powerful
components: 1) a choreographic memory that learns to summarize meaningful
dancing units from 3D pose sequences into a quantized codebook, and 2) an
actor-critic Generative Pre-trained Transformer (GPT) that composes these
units into a fluent dance coherent with the music. With the learned
choreographic memory, dance
generation is realized on the quantized units that meet high choreography
standards, such that the generated dancing sequences are confined within the
spatial constraints. To achieve synchronized alignment between diverse motion
tempos and music beats, we introduce an actor-critic-based reinforcement
learning scheme to the GPT with a newly-designed beat-align reward function.
Extensive experiments on the standard benchmark demonstrate that our proposed
framework achieves state-of-the-art performance both qualitatively and
quantitatively. Notably, the learned choreographic memory is shown to discover
human-interpretable dancing-style poses in an unsupervised manner.
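As a concrete picture of the two components, the choreographic memory is described as a VQ-VAE-style quantized codebook whose core operation is a nearest-neighbor lookup mapping encoded pose features to discrete dancing units. The Python sketch below shows that lookup under generic VQ-VAE conventions; the shapes, names, and the straight-through gradient trick are assumptions, not the paper's exact implementation.

    import torch

    def quantize(pose_features: torch.Tensor, codebook: torch.Tensor):
        """Map per-frame pose features onto discrete dancing units.

        pose_features: (T, D) features from a pose encoder (assumed shape).
        codebook:      (K, D) learned code vectors, one per dancing unit.
        """
        dists = torch.cdist(pose_features, codebook)  # (T, K) Euclidean distances
        indices = dists.argmin(dim=1)                 # nearest dancing unit per frame
        quantized = codebook[indices]                 # (T, D) quantized features
        # Straight-through estimator: route gradients around the argmin so the
        # encoder stays trainable end to end (standard VQ-VAE practice).
        quantized = pose_features + (quantized - pose_features).detach()
        return quantized, indices

Likewise, the beat-align reward can be pictured as scoring how closely kinematic beats (commonly taken as local minima of joint velocity in this literature) fall to music beats. The sketch below uses an assumed Gaussian tolerance; the paper's actual reward function may differ.

    import numpy as np

    def beat_align_reward(joint_velocity: np.ndarray,
                          music_beats: np.ndarray,
                          sigma: float = 3.0) -> float:
        """Reward dances whose kinematic beats land near music beats.

        joint_velocity: (T,) per-frame joint-velocity magnitudes (assumed input).
        music_beats:    frame indices of detected music beats.
        sigma:          tolerance in frames (assumed value).
        """
        # Kinematic beats: local minima of the velocity curve.
        kinematic_beats = [t for t in range(1, len(joint_velocity) - 1)
                           if joint_velocity[t - 1] > joint_velocity[t] < joint_velocity[t + 1]]
        if not kinematic_beats or len(music_beats) == 0:
            return 0.0
        # Score each kinematic beat by its distance to the nearest music beat.
        scores = [np.exp(-min(abs(t - b) for b in music_beats) ** 2 / (2 * sigma ** 2))
                  for t in kinematic_beats]
        return float(np.mean(scores))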
Related papers
- Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns [48.54956784928394]
Lodge++ is a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre.
To address challenges in computational efficiency, Lodge++ adopts a two-stage strategy that produces dances from coarse to fine.
Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres.
arXiv Detail & Related papers (2024-10-27T09:32:35Z)
- Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z)
- TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining performance comparable to generation from either single modality.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
- Dual Learning Music Composition and Dance Choreography [57.55406449959893]
Music and dance have always co-existed as pillars of human activities, contributing immensely to cultural, social, and entertainment functions.
Recent research works have studied generative models for dance sequences conditioned on music.
We propose a novel extension, where we jointly model both tasks in a dual learning approach.
arXiv Detail & Related papers (2022-01-28T09:20:28Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music (a toy sketch of the latter appears after this list).
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
- DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer [23.51701359698245]
In this paper, we reformulate music-conditioned 3D dance generation as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction.
We propose a large-scale music conditioned 3D dance dataset, called PhantomDance, that is accurately labeled by experienced animators.
Experiments demonstrate that the proposed method, even trained by existing datasets, can generate fluent, performative, and music-matched 3D dances.
arXiv Detail & Related papers (2021-03-18T12:17:38Z)
- ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit [28.877908457607678]
We design a two-stage music-to-dance synthesis framework, ChoreoNet, to imitate the human choreography procedure.
Our framework first devises a choreographic action unit (CAU) prediction model to learn the mapping between music and CAU sequences.
We then devise a spatial-temporal inpainting model to convert the CAU sequence into continuous dance motions.
arXiv Detail & Related papers (2020-09-16T12:38:19Z)
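The Gromov-Wasserstein idea referenced in the MDOT-Net entry can be illustrated with the POT (Python Optimal Transport) library: it compares two distributions through their internal pairwise-distance structure, so dance and music features may live in different spaces. Everything below (features, shapes, weights) is a toy assumption, not the paper's actual objective.

    import numpy as np
    import ot  # POT: Python Optimal Transport (pip install pot)

    rng = np.random.default_rng(0)
    dance_feats = rng.normal(size=(64, 32))  # 64 dance segments (toy features)
    music_feats = rng.normal(size=(48, 24))  # 48 music segments, different space

    # Uniform weights over the samples of each distribution.
    p = ot.unif(len(dance_feats))
    q = ot.unif(len(music_feats))

    # Intra-domain pairwise distance matrices; GW matches their structures.
    C1 = ot.dist(dance_feats, dance_feats)
    C2 = ot.dist(music_feats, music_feats)
    gw = ot.gromov.gromov_wasserstein2(C1, C2, p, q, loss_fun="square_loss")
    print(f"Gromov-Wasserstein discrepancy: {gw:.4f}")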
This list is automatically generated from the titles and abstracts of the papers on this site.