GCDance: Genre-Controlled Music-Driven 3D Full Body Dance Generation
- URL: http://arxiv.org/abs/2502.18309v3
- Date: Mon, 29 Sep 2025 11:12:31 GMT
- Title: GCDance: Genre-Controlled Music-Driven 3D Full Body Dance Generation
- Authors: Xinran Liu, Xu Dong, Shenbin Qian, Diptesh Kanojia, Wenwu Wang, Zhenhua Feng
- Abstract summary: GCDance is a framework for genre-specific 3D full-body dance generation conditioned on music and descriptive text. We develop a text-based control mechanism that maps input prompts, explicit genre labels or free-form descriptive text, into genre-specific control signals. To balance the objectives of extracting text-genre information and maintaining high-quality generation results, we propose a novel multi-task optimization strategy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music-driven dance generation is a challenging task, as it requires strict adherence to genre-specific choreography while producing physically realistic dance sequences precisely synchronized with the music's beats and rhythm. Although significant progress has been made in music-conditioned dance generation, most existing methods struggle to convey specific stylistic attributes in the generated dance. To bridge this gap, we propose a diffusion-based framework for genre-specific 3D full-body dance generation, conditioned on both music and descriptive text. To effectively incorporate genre information, we develop a text-based control mechanism that maps input prompts, either explicit genre labels or free-form descriptive text, into genre-specific control signals, enabling precise and controllable text-guided generation of genre-consistent dance motions. Furthermore, to enhance the alignment between music and textual conditions, we leverage the features of a music foundation model, facilitating coherent and semantically aligned dance synthesis. Finally, to balance the objectives of extracting text-genre information and maintaining high-quality generation results, we propose a novel multi-task optimization strategy. This effectively balances competing factors such as physical realism, spatial accuracy, and text classification, significantly improving the overall quality of the generated sequences. Extensive experimental results on the FineDance and AIST++ datasets demonstrate the superiority of GCDance over existing state-of-the-art approaches.
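The abstract mentions a multi-task optimization strategy balancing physical realism, spatial accuracy, and text classification, but gives no formula. As a hedged illustration only, one common way to balance competing training objectives is uncertainty-based loss weighting (learned log-variances per task); the function and loss values below are hypothetical, not the paper's implementation:

```python
import numpy as np

def multitask_loss(losses, log_vars):
    """Combine per-task losses with uncertainty-based weights:
    L = sum_i exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2).
    In training, the s_i would be learned alongside the model."""
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * losses + log_vars))

# Hypothetical per-task losses: physical realism, spatial accuracy, genre classification
task_losses = [0.8, 0.5, 1.2]

# With all log-variances at zero, the combination reduces to a plain sum
equal_weighting = multitask_loss(task_losses, [0.0, 0.0, 0.0])
```

The learned weights let the optimizer down-weight noisy or hard-to-fit objectives instead of hand-tuning fixed coefficients.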
Related papers
- MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation
We introduce Multimodal DuetDance (MDD), a diverse multimodal benchmark dataset designed for text-controlled and music-conditioned 3D duet dance motion generation. Our dataset comprises 620 minutes of high-quality motion capture data performed by professional dancers, synchronized with music, and detailed with over 10K fine-grained natural language descriptions. The annotations capture a rich movement vocabulary, detailing spatial relationships, body movements, and rhythm, making MDD the first dataset to seamlessly integrate human motions, music, and text for duet dance generation.
arXiv Detail & Related papers (2025-08-23T05:56:37Z)
- DanceChat: Large Language Model-Guided Music-to-Dance Generation
Music-to-dance generation aims to synthesize human dance motion conditioned on musical input. We introduce DanceChat, a Large Language Model (LLM)-guided music-to-dance generation approach.
arXiv Detail & Related papers (2025-06-12T11:03:47Z)
- Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns
Lodge++ is a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre.
To handle the challenges in computational efficiency, Lodge++ adopts a two-stage strategy to produce dances from coarse to fine.
Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres.
arXiv Detail & Related papers (2024-10-27T09:32:35Z)
- Flexible Music-Conditioned Dance Generation with Style Description Prompts
We introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framework suitable for diversified tasks of dance generation.
The core component of this framework is Music-Conditioned Style-Aware Diffusion (MCSAD), which comprises a Transformer-based network and a music Style Modulation module.
The proposed framework successfully generates realistic dance sequences accurately aligned with the music for a variety of tasks, such as long-term generation, dance in-betweening, and dance inpainting.
arXiv Detail & Related papers (2024-06-12T04:55:14Z)
- DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
arXiv Detail & Related papers (2023-08-05T16:18:57Z)
- TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
- FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation
FineDance is the largest music-dance paired dataset with the most dance genres.
To address the monotonous and unnatural hand movements seen in previous methods, we propose a full-body dance generation network.
To further enhance the genre-matching and long-term stability of generated dances, we propose a Genre&Coherent aware Retrieval Module.
arXiv Detail & Related papers (2022-12-07T16:10:08Z)
- Quantized GAN for Complex Music Generation from Dance Videos
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
- Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes these units into a fluent dance coherent with the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z)
- Dual Learning Music Composition and Dance Choreography
Music and dance have always co-existed as pillars of human activities, contributing immensely to cultural, social, and entertainment functions.
Recent research works have studied generative models for dance sequences conditioned on music.
We propose a novel extension, where we jointly model both tasks in a dual learning approach.
arXiv Detail & Related papers (2022-01-28T09:20:28Z)
- Music-to-Dance Generation with Optimal Transport
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.