DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
- URL: http://arxiv.org/abs/2403.13667v1
- Date: Wed, 20 Mar 2024 15:24:57 GMT
- Title: DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
- Authors: Zixuan Wang, Jia Jia, Shikun Sun, Haozhe Wu, Rong Han, Zhenyu Li, Di Tang, Jiaqing Zhou, Jiebo Luo,
- Abstract summary: We present DCM, a new multi-modal 3D dataset that combines camera movement with dance motion and music audio.
This dataset encompasses 108 dance sequences (3.2 hours) of paired dance-camera-music data from the anime community.
We propose DanceCamera3D, a transformer-based diffusion model that incorporates a novel body attention loss and a condition separation strategy.
- Score: 50.01162760878841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Choreographers determine what the dances look like, while cameramen determine the final presentation of dances. Recently, various methods and datasets have showcased the feasibility of dance synthesis. However, camera movement synthesis with music and dance remains an unsolved challenging problem due to the scarcity of paired data. Thus, we present DCM, a new multi-modal 3D dataset, which for the first time combines camera movement with dance motion and music audio. This dataset encompasses 108 dance sequences (3.2 hours) of paired dance-camera-music data from the anime community, covering 4 music genres. With this dataset, we uncover that dance camera movement is multifaceted and human-centric, and possesses multiple influencing factors, making dance camera synthesis a more challenging task compared to camera or dance synthesis alone. To overcome these difficulties, we propose DanceCamera3D, a transformer-based diffusion model that incorporates a novel body attention loss and a condition separation strategy. For evaluation, we devise new metrics measuring camera movement quality, diversity, and dancer fidelity. Utilizing these metrics, we conduct extensive experiments on our DCM dataset, providing both quantitative and qualitative evidence showcasing the effectiveness of our DanceCamera3D model. Code and video demos are available at https://github.com/Carmenw1203/DanceCamera3D-Official.
Related papers
- DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis [49.614150163184064]
Dance camera movements involve both continuous sequences of variable lengths and sudden changes to simulate the switching of multiple cameras.
We propose to integrate cinematography knowledge by formulating this task as a three-stage process: animator detection, synthesis, and tween function prediction.
Following this formulation, we design a novel end-to-end dance camera framework textbfDanceCamAnimator, which imitates human animation procedures and shows powerful-based controllability with variable lengths.
arXiv Detail & Related papers (2024-09-23T11:20:44Z) - Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z) - Music- and Lyrics-driven Dance Synthesis [25.474741654098114]
We introduce JustLMD, a new dataset of 3D dance motion with music and lyrics.
To the best of our knowledge, this is the first dataset with triplet information including dance motion, music, and lyrics.
The proposed JustLMD dataset encompasses 4.6 hours of 3D dance motion in 1867 sequences, accompanied by musical tracks and their corresponding English lyrics.
arXiv Detail & Related papers (2023-09-30T18:27:14Z) - TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z) - Music-Driven Group Choreography [10.501572863039852]
$rm AIOZ-GDANCE$ is a new large-scale dataset for music-driven group dance generation.
We show that naively applying single dance generation technique to creating group dance motion may lead to unsatisfactory results.
We propose a new method that takes an input music sequence and a set of 3D positions of dancers to efficiently produce multiple group-coherent choreographies.
arXiv Detail & Related papers (2023-03-22T06:26:56Z) - BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z) - Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition [13.289339907084424]
We propose a Hierarchical Dance Video Recognition framework (HDVR)
HDVR estimates 2D pose sequences, tracks dancers, and then simultaneously estimates corresponding 3D poses and 3D-to-2D imaging parameters.
From the estimated 3D pose sequence, HDVR extracts body part movements, and therefrom dance genre.
arXiv Detail & Related papers (2021-09-19T16:59:37Z) - Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [28.623222697548456]
We present a transformer-based learning framework for 3D dance generation conditioned on music.
We also propose a new dataset of paired 3D motion and music called AIST++, which we reconstruct from the AIST multi-view dance videos.
arXiv Detail & Related papers (2021-01-21T18:59:22Z) - Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis.
A massive dance motion data set is created from YouTube videos.
A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.