QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation

URL: http://arxiv.org/abs/2403.11626v1
Date: Mon, 18 Mar 2024 09:58:43 GMT
Title: QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation
Authors: Zhizhen Zhou, Yejing Huo, Guoheng Huang, An Zeng, Xuhang Chen, Lian Huang, Zinuo Li
Abstract summary: We propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective. First, the Spin Position Embedding (SPE) module embeds position information into self-attention in a rotational manner, leading to better learning of features of movement sequences and audio sequences. Second, the Quaternion Rotary Attention (QRA) module represents and fuses 3D motion features and audio features in the form of a series of quaternions, enabling the model to better learn the temporal coordination of music and dance.
Score: 6.060426136203966
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The study of music-generated dance is a novel and challenging image generation task. It takes a piece of music and seed motions as input, then generates natural dance movements for the subsequent music. Transformer-based methods face challenges in time series prediction tasks related to human movements and music because they struggle to capture the nonlinear relationships and temporal aspects. This can lead to issues like joint deformation, role deviation, floating, and inconsistencies in dance movements generated in response to the music. In this paper, we propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective, which consists of a Spin Position Embedding (SPE) module and a Quaternion Rotary Attention (QRA) module. First, SPE embeds position information into self-attention in a rotational manner, leading to better learning of features of movement sequences and audio sequences, and improved understanding of the connection between music and dance. Second, QRA represents and fuses 3D motion features and audio features in the form of a series of quaternions, enabling the model to better learn the temporal coordination of music and dance under the complex temporal cycle conditions of dance generation. Finally, we conducted experiments on the AIST++ dataset, and the results show that our approach achieves better and more robust performance in generating accurate, high-quality dance movements. Our source code and dataset are available from https://github.com/MarasyZZ/QEAN and https://google.github.io/aistplusplus_dataset respectively.
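
Illustration (not from the paper): the abstract says that SPE embeds position information into self-attention "in a rotational manner". The sketch below shows one common way to do that, a generic rotary-style position embedding, and should not be read as the authors' SPE module. The function names `rotate_pairs` and `dot`, the frequency base of 10000, and the sample vectors are all assumptions made for this example. It rotates each consecutive pair of features by an angle proportional to the position, so that the query-key dot product depends only on the offset between the two positions.

```python
import math


def rotate_pairs(vec, pos, base=10000.0):
    """Rotate consecutive (even, odd) feature pairs by position-dependent angles.

    Hypothetical helper for illustration; `base` is an assumed frequency base.
    """
    d = len(vec)
    assert d % 2 == 0, "feature dimension must be even"
    out = []
    for i in range(0, d, 2):
        # Pairs later in the vector rotate more slowly.
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Made-up query and key features (not taken from any dataset).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Both position pairs are 3 steps apart, so the attention scores should agree.
score_a = dot(rotate_pairs(q, 5), rotate_pairs(k, 2))
score_b = dot(rotate_pairs(q, 13), rotate_pairs(k, 10))
```

Under these assumptions, `score_a` and `score_b` agree up to floating-point error, because rotating the query by position m and the key by position n leaves a score that depends only on m - n. This relative-position property is one reason rotational embeddings suit periodic signals such as music and dance motion.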

Related papers

PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation [51.2555550979386]
Plausibility-Aware Motion Diffusion (PAMD) is a framework for generating dances that are both musically aligned and physically realistic. To provide more effective guidance during generation, we incorporate Prior Motion Guidance (PMG). Experiments show that PAMD significantly improves musical alignment and enhances the physical plausibility of generated motions.
arXiv Detail & Related papers (2025-05-26T14:44:09Z)
TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities. Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis. We focus on breakdancing which features acrobatic moves and tangled postures. Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z)
Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos. Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music. We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer [23.51701359698245]
In this paper, we reformulate music-conditioned dance generation as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction. We propose a large-scale music-conditioned 3D dance dataset, called PhantomDance, that is accurately labeled by experienced animators. Experiments demonstrate that the proposed method, even when trained on existing datasets, can generate fluent, performative, and music-matched 3D dances.
arXiv Detail & Related papers (2021-03-18T12:17:38Z)
Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [28.623222697548456]
We present a transformer-based learning framework for 3D dance generation conditioned on music. We also propose a new dataset of paired 3D motion and music called AIST++, which we reconstruct from the AIST multi-view dance videos.
arXiv Detail & Related papers (2021-01-21T18:59:22Z)
Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis. A massive dance motion data set is created from YouTube videos. A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z)
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning [55.854205371307884]
We formalize music-conditioned dance generation as a sequence-to-sequence learning problem. We propose a novel curriculum learning strategy to alleviate error accumulation of autoregressive models in long motion sequence generation. Our approach significantly outperforms the existing state of the art on automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-06-11T00:08:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
