Related papers: Flexible Music-Conditioned Dance Generation with Style Description Prompts

URL: http://arxiv.org/abs/2406.07871v1
Date: Wed, 12 Jun 2024 04:55:14 GMT
Title: Flexible Music-Conditioned Dance Generation with Style Description Prompts
Authors: Hongsong Wang, Yin Zhu, Xin Geng
Abstract summary: We introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framework suitable for diversified tasks of dance generation. The core component of this framework is Music-Conditioned Style-Aware Diffusion (MCSAD), which comprises a Transformer-based network and a music Style Modulation module. The proposed framework successfully generates realistic dance sequences that are accurately aligned with music for a variety of tasks such as long-term generation, dance in-betweening, and dance inpainting.
Score: 41.04549275897979
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Dance plays an important role as an artistic form and expression in human culture, yet the creation of dance remains a challenging task. Most dance generation methods rely solely on music, seldom taking into consideration intrinsic attributes such as music style or genre. In this work, we introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framework suitable for diversified tasks of dance generation by fully leveraging the semantics of music style. The core component of this framework is Music-Conditioned Style-Aware Diffusion (MCSAD), which comprises a Transformer-based network and a music Style Modulation module. The MCSAD seamlessly integrates music conditions and style description prompts into the dance generation framework, ensuring that generated dances are consistent with the music content and style. To facilitate flexible dance generation and accommodate different tasks, a spatial-temporal masking strategy is effectively applied in the backward diffusion process. The proposed framework successfully generates realistic dance sequences that are accurately aligned with music for a variety of tasks such as long-term generation, dance in-betweening, and dance inpainting. We hope that this work will inspire dance generation and creation, with promising applications in entertainment, art, and education.

Related papers

DanceChat: Large Language Model-Guided Music-to-Dance Generation [8.455652926559427]
Music-to-dance generation aims to synthesize human dance motion conditioned on musical input. We introduce DanceChat, a Large Language Model (LLM)-guided music-to-dance generation approach.
arXiv Detail & Related papers (2025-06-12T11:03:47Z)
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation [22.729568599120846]
We propose Danceba, a novel framework that leverages a gating mechanism to enhance rhythm-aware feature representation. It uses Phase-Based Rhythm Extraction (PRE) to precisely extract rhythmic information from musical phase data, Temporal-Gated Causal Attention (TGCA) to focus on global rhythmic features, and a Parallel Mamba Motion Modeling (PMMM) architecture to separately model upper and lower body motions.
arXiv Detail & Related papers (2025-03-21T17:42:50Z)
GCDance: Genre-Controlled 3D Full Body Dance Generation Driven By Music [22.352036716156967]
GCDance is a classifier-free diffusion framework for generating genre-specific dance motions conditioned on both music and textual prompts. Our approach extracts music features by combining high-level pre-trained music foundation model features with hand-crafted features for multi-granularity feature fusion.
arXiv Detail & Related papers (2025-02-25T15:53:18Z)
Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns [48.54956784928394]
Lodge++ is a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre. To handle the challenges in computational efficiency, Lodge++ adopts a two-stage strategy to produce dances from coarse to fine. Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres.
arXiv Detail & Related papers (2024-10-27T09:32:35Z)
Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation. A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions. To make the generated dance motion smoother, a local information decoder is built for local motion enhancement.
arXiv Detail & Related papers (2024-02-06T19:42:18Z)
FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation [33.9261932800456]
FineDance is the largest music-dance paired dataset with the most dance genres. To address monotonous and unnatural hand movements existing in previous methods, we propose a full-body dance generation network. To further enhance the genre-matching and long-term stability of generated dances, we propose a Genre&Coherent aware Retrieval Module.
arXiv Detail & Related papers (2022-12-07T16:10:08Z)
Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music. We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units to a fluent dance coherent to the music. Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z)
Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographies from music. We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis. A massive dance motion data set is created from YouTube videos. A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z)
Feel The Music: Automatically Generating A Dance For An Input Song [58.653867648572]
We present a general computational approach that enables a machine to generate a dance for any input music. We encode intuitive, flexible heuristics for what a 'good' dance is: the structure of the dance should align with the structure of the music.
arXiv Detail & Related papers (2020-06-21T20:29:50Z)
Music2Dance: DanceNet for Music-driven Dance Generation [11.73506542921528]
We propose a novel autoregressive generative model, DanceNet, to take the style, rhythm and melody of music as the control signals. We capture several synchronized music-dance pairs by professional dancers, and build a high-quality music-dance pair dataset.
arXiv Detail & Related papers (2020-02-02T17:18:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.