DanceFormer: Music Conditioned 3D Dance Generation with Parametric
Motion Transformer
- URL: http://arxiv.org/abs/2103.10206v5
- Date: Thu, 27 Jul 2023 08:49:55 GMT
- Title: DanceFormer: Music Conditioned 3D Dance Generation with Parametric
Motion Transformer
- Authors: Buyu Li, Yongchi Zhao, Zhelun Shi, Lu Sheng
- Abstract summary: In this paper, we reformulate it as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction.
We propose a large-scale music conditioned 3D dance dataset, called PhantomDance, that is accurately labeled by experienced animators.
Experiments demonstrate that the proposed method, even trained by existing datasets, can generate fluent, performative, and music-matched 3D dances.
- Score: 23.51701359698245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating 3D dances from music is an emerging research task that
benefits many applications in vision and graphics. Previous works treat this
task as sequence generation; however, it is challenging to render a
music-aligned long-term sequence with high kinematic complexity and coherent
movements. In this paper, we reformulate it as a two-stage process, i.e., key
pose generation followed by in-between parametric motion curve prediction,
where the key poses are easier to synchronize with the music beats and the
parametric curves can be efficiently regressed to render fluent,
rhythm-aligned movements. We name the proposed method DanceFormer; it includes
two cascading kinematics-enhanced transformer-guided networks (called
DanTrans) that tackle each stage, respectively. Furthermore, we propose a
large-scale music-conditioned 3D dance dataset, called PhantomDance, that is
accurately labeled by experienced animators rather than by reconstruction or
motion capture. This dataset also encodes dances as key poses and parametric
motion curves in addition to pose sequences, thus benefiting the training of
our DanceFormer. Extensive experiments demonstrate that the proposed method,
even trained on existing datasets, can generate fluent, performative, and
music-matched 3D dances that surpass previous works quantitatively and
qualitatively. Moreover, the proposed DanceFormer, together with the
PhantomDance dataset (https://github.com/libuyu/PhantomDanceDataset), is
seamlessly compatible with industrial animation software, thus facilitating
adaptation for various downstream applications.
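The two-stage idea can be made concrete with a small sketch. The snippet below is not the authors' code: it assumes stage one has already produced beat-aligned key poses, and it renders in-between frames with a cubic Hermite curve per pose channel, with the tangents standing in for the curve parameters a network such as DanTrans would regress.

```python
import numpy as np

def hermite_inbetween(p0, p1, m0, m1, ts):
    """Cubic Hermite in-betweening between two key poses.

    p0, p1: key-pose vectors (e.g., flattened joint rotations) at t=0 and t=1.
    m0, m1: tangents (velocities) at the two key poses; in a learned system
            these would be regressed by the motion network.
    ts:     sample times in [0, 1] for the in-between frames.
    """
    ts = np.asarray(ts)[:, None]          # (T, 1), broadcast over channels
    h00 = 2 * ts**3 - 3 * ts**2 + 1       # standard Hermite basis functions
    h10 = ts**3 - 2 * ts**2 + ts
    h01 = -2 * ts**3 + 3 * ts**2
    h11 = ts**3 - ts**2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1

# Toy usage: two 3-channel "poses" and 5 in-between frames.
p0 = np.array([0.0, 0.0, 0.0])
p1 = np.array([1.0, 0.5, -0.2])
m0 = np.array([0.3, 0.0, 0.0])
m1 = np.array([0.0, 0.2, 0.0])
frames = hermite_inbetween(p0, p1, m0, m1, np.linspace(0.0, 1.0, 5))
print(frames.shape)  # (5, 3): in-between frames anchored at the two key poses
```

Because the curve passes through the key poses exactly, the beat alignment fixed in stage one is preserved no matter how densely the in-between frames are sampled.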
Related papers
- DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis [49.614150163184064]
Dance camera movements involve both continuous sequences of variable lengths and sudden changes to simulate the switching of multiple cameras.
We propose to integrate cinematography knowledge by formulating this task as a three-stage process: camera keyframe detection, keyframe synthesis, and tween function prediction.
Following this formulation, we design a novel end-to-end dance camera framework, DanceCamAnimator, which imitates human animation procedures and offers strong keyframe-based controllability over variable-length sequences.
arXiv Detail & Related papers (2024-09-23T11:20:44Z) - Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z) - QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation [6.060426136203966]
We propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective.
First, a Spin Position Embedding (SPE) module embeds position information into self-attention in a rotational manner, leading to better learning of the features of motion and audio sequences.
Second, a Quaternion Rotary Attention (QRA) module represents and fuses 3D motion features and audio features as series of quaternions, enabling the model to better learn the temporal coordination of music and dance.
arXiv Detail & Related papers (2024-03-18T09:58:43Z) - TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z) - BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z) - Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes these memory units into a fluent dance coherent with the music (see the codebook sketch after this list).
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z) - Transflower: probabilistic autoregressive dance generation with
multimodal attention [31.308435764603658]
We present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context.
We also introduce what is currently the largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies and including both professional and casual dancers.
arXiv Detail & Related papers (2021-06-25T20:14:28Z) - Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis.
A massive dance motion dataset is created from YouTube videos.
A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z)
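Bailando's choreographic memory (and the tokenized motion that Duolando's GPT predicts) rests on quantizing continuous pose features against a learned codebook, after which a transformer models the discrete token sequence. The sketch below shows only that nearest-neighbor lookup; the codebook size, feature dimension, and function name are illustrative assumptions, not values from the papers.

```python
import numpy as np

def quantize(features, codebook):
    """Nearest-neighbor lookup into a learned pose codebook.

    features: (T, D) continuous pose features for T frames.
    codebook: (K, D) learned "choreographic memory" entries.
    Returns the code indices (the tokens a GPT would model) and the
    quantized features.
    """
    # Squared L2 distance between every frame and every code entry.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (T, K)
    idx = d.argmin(axis=1)                                            # (T,)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))  # K=512 codes, D=64 (assumed sizes)
features = rng.normal(size=(30, 64))   # 30 frames of pose features
idx, quantized = quantize(features, codebook)
print(idx[:5], quantized.shape)        # first 5 token ids, (30, 64)
```

In systems like Bailando the codebook is learned jointly with an encoder-decoder (a VQ-VAE), and decoding the predicted tokens back through the decoder yields the final pose sequence.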
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.