DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction
- URL: http://arxiv.org/abs/2411.04646v1
- Date: Thu, 07 Nov 2024 12:11:11 GMT
- Title: DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction
- Authors: Li Zhao, Zhengmin Lu
- Abstract summary: This paper introduces DanceFusion, a novel framework for reconstructing and generating dance movements synchronized to music.
The framework adeptly handles incomplete and noisy skeletal data common in short-form dance videos on social media platforms like TikTok.
Comprehensive evaluations demonstrate that DanceFusion surpasses existing methods, providing state-of-the-art performance in generating dynamic, realistic, and stylistically diverse dance motions.
- Abstract: This paper introduces DanceFusion, a novel framework for reconstructing and generating dance movements synchronized to music, utilizing a Spatio-Temporal Skeleton Diffusion Transformer. The framework adeptly handles incomplete and noisy skeletal data common in short-form dance videos on social media platforms like TikTok. DanceFusion incorporates a hierarchical Transformer-based Variational Autoencoder (VAE) integrated with a diffusion model, significantly enhancing motion realism and accuracy. Our approach introduces sophisticated masking techniques and a unique iterative diffusion process that refines the motion sequences, ensuring high fidelity in both motion generation and synchronization with accompanying audio cues. Comprehensive evaluations demonstrate that DanceFusion surpasses existing methods, providing state-of-the-art performance in generating dynamic, realistic, and stylistically diverse dance motions. Potential applications of this framework extend to content creation, virtual reality, and interactive entertainment, promising substantial advancements in automated dance generation. Visit our project page at https://th-mlab.github.io/DanceFusion/.
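The abstract sketches a concrete pipeline: mask out the unreliable joints of a noisy skeleton sequence, encode the masked sequence with a hierarchical Transformer VAE, then iteratively refine the motion with a diffusion process conditioned on audio. Below is a minimal PyTorch sketch of that flow; all module names, dimensions, and the denoiser interface are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch, assuming generic shapes: pose_seq (B, T, J, 3) plus a
# per-joint validity mask. Masking at the input (instead of dropping
# frames) is what would let the encoder use partially observed skeletons.
import torch
import torch.nn as nn

class SkeletonTransformerVAE(nn.Module):
    def __init__(self, n_joints=24, d_model=256, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(n_joints * 3, d_model)  # flatten (x, y, z) per joint
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.to_mu = nn.Linear(d_model, d_model)
        self.to_logvar = nn.Linear(d_model, d_model)

    def forward(self, pose_seq, joint_mask):
        # joint_mask: (B, T, J), 1 = observed joint, 0 = missing/noisy
        x = pose_seq * joint_mask.unsqueeze(-1)         # zero out unreliable joints
        h = self.encoder(self.embed(x.flatten(2)))      # (B, T, d_model)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z, mu, logvar

@torch.no_grad()
def refine_motion(denoiser, z_noisy, audio_feats, n_steps=50):
    """Generic DDPM-style refinement: each step asks the (assumed) denoiser
    for a cleaner latent given the timestep and audio conditioning."""
    z = z_noisy
    for t in reversed(range(n_steps)):
        t_idx = torch.full((z.size(0),), t, device=z.device, dtype=torch.long)
        z = denoiser(z, t_idx, audio_feats)             # one denoising step
    return z
```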
Related papers
- DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis [49.614150163184064]
Dance camera movements involve both continuous sequences of variable lengths and sudden changes to simulate the switching of multiple cameras.
We propose to integrate cinematography knowledge by formulating this task as a three-stage process: keyframe detection, keyframe synthesis, and tween function prediction.
Following this formulation, we design a novel end-to-end dance camera framework, DanceCamAnimator, which imitates human animation procedures and shows strong keyframe-based controllability with variable lengths.
arXiv Detail & Related papers (2024-09-23T11:20:44Z)
- Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation.
A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions.
To make the generated dance motion smoother, a local information decoder is built for local motion enhancement; a toy version of the bidirectional constraint is sketched after this entry.
arXiv Detail & Related papers (2024-02-06T19:42:18Z)
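A toy illustration of the bidirectional idea above: encode the generated motion and its time-reversal, and penalize disagreement so the dance reads as coherent in both directions. The encoder interface and the loss are assumptions, not BADM's actual implementation.

```python
# Hypothetical bidirectional consistency term: compare the encoding of a
# motion sequence with the time-aligned encoding of its reversal.
import torch
import torch.nn as nn

def bidirectional_consistency_loss(encoder: nn.Module, motion: torch.Tensor):
    # motion: (B, T, D). Encode the sequence and its time-reversal.
    fwd = encoder(motion)                           # (B, T, H)
    bwd = encoder(torch.flip(motion, dims=[1]))     # encode reversed sequence
    # Flip the backward encoding back to forward time before comparing.
    return nn.functional.mse_loss(fwd, torch.flip(bwd, dims=[1]))
```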
- DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation [89.50310360658791]
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music; the two-stage cascade is sketched after this entry.
arXiv Detail & Related papers (2023-08-05T16:18:57Z)
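A minimal sketch of the cascade pattern DiffDance describes: a base diffusion model generates coarse, low-frame-rate motion from music embeddings, and a super-resolution diffusion model refines it to the target frame rate. The sampler signatures, sequence lengths, and the repeat-based upsampling are illustrative assumptions.

```python
# Two-stage cascade sketch: coarse music-conditioned generation, then
# diffusion-based temporal super-resolution of the coarse motion.
import torch

def generate_dance(base_model, sr_model, music_emb, coarse_len=60, scale=4):
    # Stage 1: music-conditioned diffusion at coarse temporal resolution.
    coarse = base_model.sample(cond=music_emb, length=coarse_len)   # (B, 60, D)
    # Stage 2: super-resolution diffusion conditioned on the upsampled motion.
    upsampled = torch.repeat_interleave(coarse, scale, dim=1)       # (B, 240, D)
    return sr_model.sample(cond=upsampled, length=coarse_len * scale)
```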
- TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration [75.37311932218773]
We propose a novel task for generating 3D dance movements that simultaneously incorporate both text and music modalities.
Our approach can generate realistic and coherent dance movements conditioned on both text and music while maintaining comparable performance with the two single modalities.
arXiv Detail & Related papers (2023-04-05T12:58:33Z)
- Dance Style Transfer with Cross-modal Transformer [17.216186480300756]
CycleDance is a dance style transfer system that transforms a motion clip in one dance style into a motion clip in another style.
Our method extends an existing CycleGAN architecture for modeling audio sequences and integrates multimodal transformer encoders to account for music context.
arXiv Detail & Related papers (2022-08-19T15:48:30Z)
- BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis [123.73677487809418]
We introduce a new dataset aiming to challenge common assumptions in dance motion synthesis.
We focus on breakdancing which features acrobatic moves and tangled postures.
Our efforts produced the BRACE dataset, which contains over 3 hours and 30 minutes of densely annotated poses.
arXiv Detail & Related papers (2022-07-20T18:03:54Z)
- Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music; standard forms of both distances are recalled after this entry.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
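For reference, the standard definitions of the two distances named above; MDOT-Net's particular ground costs c, d_X, d_Y are not specified in this summary, so they are left abstract.

```latex
% Wasserstein (optimal transport) distance between the generated dance
% distribution \mu and a reference distribution \nu, over couplings \Pi:
\[
  W(\mu,\nu) \;=\; \inf_{\gamma \in \Pi(\mu,\nu)} \int c(x,y)\, d\gamma(x,y)
\]
% Gromov-Wasserstein distance, comparing the metric structure of the dance
% space (X, d_X) with that of the music space (Y, d_Y):
\[
  GW(\mu,\nu) \;=\; \inf_{\gamma \in \Pi(\mu,\nu)}
  \iint \bigl| d_X(x,x') - d_Y(y,y') \bigr|^{2} \, d\gamma(x,y)\, d\gamma(x',y')
\]
```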
- Transflower: probabilistic autoregressive dance generation with multimodal attention [31.308435764603658]
We present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context; a rough sampling loop is sketched after this entry.
We also introduce the currently largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies and including both professional and casual dancers.
arXiv Detail & Related papers (2021-06-25T20:14:28Z)
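A rough sketch of the autoregressive sampling loop implied above: at each timestep, a conditional normalizing flow maps Gaussian noise to the next pose, given a window of recent poses and the current music features. The flow.inverse interface, window size, and feature shapes are assumptions.

```python
# Hypothetical autoregressive sampling with a conditional normalizing flow.
import torch

@torch.no_grad()
def sample_dance(flow, music_feats, seed_poses, horizon=200, window=10):
    # seed_poses: (B, window, D) warm-up motion; music_feats: (B, horizon, M)
    poses = [seed_poses[:, i] for i in range(seed_poses.size(1))]
    for t in range(horizon):
        context = torch.stack(poses[-window:], dim=1)          # (B, window, D)
        cond = torch.cat([context.flatten(1), music_feats[:, t]], dim=-1)
        noise = torch.randn_like(poses[-1])                    # base sample
        poses.append(flow.inverse(noise, cond))                # noise -> pose
    return torch.stack(poses[window:], dim=1)                  # (B, horizon, D)
```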
- Learning to Generate Diverse Dance Motions with Transformer [67.43270523386185]
We introduce a complete system for dance motion synthesis.
A massive dance motion data set is created from YouTube videos.
A novel two-stream motion transformer generative model can generate motion sequences with high flexibility.
arXiv Detail & Related papers (2020-08-18T22:29:40Z)