Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives
- URL: http://arxiv.org/abs/2403.10518v3
- Date: Sat, 20 Apr 2024 02:01:11 GMT
- Title: Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives
- Authors: Ronghui Li, YuXiang Zhang, Yachao Zhang, Hongwen Zhang, Jie Guo, Yan Zhang, Yebin Liu, Xiu Li,
- Abstract summary: We propose Lodge, a network capable of generating extremely long dance sequences conditioned on given music.
Our approach can parallelly generate dance sequences of extremely long length, striking a balance between global choreographic patterns and local motion quality and expressiveness.
- Score: 50.37531720524434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Lodge, a network capable of generating extremely long dance sequences conditioned on given music. We design Lodge as a two-stage coarse to fine diffusion architecture, and propose the characteristic dance primitives that possess significant expressiveness as intermediate representations between two diffusion models. The first stage is global diffusion, which focuses on comprehending the coarse-level music-dance correlation and production characteristic dance primitives. In contrast, the second-stage is the local diffusion, which parallelly generates detailed motion sequences under the guidance of the dance primitives and choreographic rules. In addition, we propose a Foot Refine Block to optimize the contact between the feet and the ground, enhancing the physical realism of the motion. Our approach can parallelly generate dance sequences of extremely long length, striking a balance between global choreographic patterns and local motion quality and expressiveness. Extensive experiments validate the efficacy of our method.
Related papers
- Lodge++: High-quality and Long Dance Generation with Vivid Choreography Patterns [48.54956784928394]
Lodge++ is a choreography framework to generate high-quality, ultra-long, and vivid dances given the music and desired genre.
To handle the challenges in computational efficiency, Lodge++ adopts a two-stage strategy to produce dances from coarse to fine.
Lodge++ is validated by extensive experiments, which show that our method can rapidly generate ultra-long dances suitable for various dance genres.
arXiv Detail & Related papers (2024-10-27T09:32:35Z) - Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment [87.20240797625648]
We introduce a novel task within the field of 3D dance generation, termed dance accompaniment.
It requires the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm.
We propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements.
arXiv Detail & Related papers (2024-03-27T17:57:02Z) - Bidirectional Autoregressive Diffusion Model for Dance Generation [26.449135437337034]
We propose a Bidirectional Autoregressive Diffusion Model (BADM) for music-to-dance generation.
A bidirectional encoder is built to enforce that the generated dance is harmonious in both the forward and backward directions.
To make the generated dance motion smoother, a local information decoder is built for local motion enhancement.
arXiv Detail & Related papers (2024-02-06T19:42:18Z) - LongDanceDiff: Long-term Dance Generation with Conditional Diffusion
Model [3.036230795326545]
LongDanceDiff is a conditional diffusion model for sequence-to-sequence long-term dance generation.
It addresses the challenges of temporal coherency and spatial constraint.
We also address common visual quality issues in dance generation, such as foot sliding and unsmooth motion.
arXiv Detail & Related papers (2023-08-23T06:37:41Z) - DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation [89.50310360658791]
We present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
This model comprises a music-to-dance diffusion model and a sequence super-resolution diffusion model.
We demonstrate that DiffDance is capable of generating realistic dance sequences that align effectively with the input music.
arXiv Detail & Related papers (2023-08-05T16:18:57Z) - Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic
Memory [92.81383016482813]
We propose a novel music-to-dance framework, Bailando, for driving 3D characters to dance following a piece of music.
We introduce an actor-critic Generative Pre-trained Transformer (GPT) that composes units to a fluent dance coherent to the music.
Our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-24T13:06:43Z) - Music-to-Dance Generation with Optimal Transport [48.92483627635586]
We propose a Music-to-Dance with Optimal Transport Network (MDOT-Net) for learning to generate 3D dance choreographs from music.
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
arXiv Detail & Related papers (2021-12-03T09:37:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.