Explore 3D Dance Generation via Reward Model from Automatically-Ranked
Demonstrations
- URL: http://arxiv.org/abs/2312.11442v1
- Date: Mon, 18 Dec 2023 18:45:38 GMT
- Title: Explore 3D Dance Generation via Reward Model from Automatically-Ranked
Demonstrations
- Authors: Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun
Chen, Yu Yang, Boshi Tang, Zhiyong Wu
- Abstract summary: This paper presents an Exploratory 3D Dance generation framework, E3D2, designed to address the exploration capability deficiency in existing music-conditioned 3D dance generation models.
The E3D2 framework involves a reward model trained from automatically-ranked dance demonstrations, which then guides the reinforcement learning process.
- Score: 18.56485266484622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents an Exploratory 3D Dance generation framework, E3D2,
designed to address the exploration capability deficiency in existing
music-conditioned 3D dance generation models. Current models often generate
monotonous and simplistic dance sequences that misalign with human preferences
because they lack exploration capabilities. The E3D2 framework involves a
reward model trained from automatically-ranked dance demonstrations, which then
guides the reinforcement learning process. This approach encourages the agent
to explore and generate high quality and diverse dance movement sequences. The
soundness of the reward model is both theoretically and experimentally
validated. Empirical experiments demonstrate the effectiveness of E3D2 on the
AIST++ dataset. Project Page: https://sites.google.com/view/e3d2.
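The abstract above outlines a general recipe: train a reward model on demonstrations that have been ranked automatically, then use that model as the reward signal for reinforcement learning. Below is a minimal, illustrative sketch of that pattern in PyTorch; the network sizes, feature dimensions, and the policy.sample interface are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a reward model trained with a
# pairwise ranking loss on automatically-ranked demonstrations, then used
# as the reward signal in a REINFORCE-style policy update.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Scores a (music, motion) pair; the architecture is illustrative only."""
    def __init__(self, music_dim=35, motion_dim=219, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(music_dim + motion_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, music, motion):
        return self.net(torch.cat([music, motion], dim=-1)).squeeze(-1)

def ranking_loss(reward_model, music, better_motion, worse_motion):
    # Bradley-Terry style objective: the higher-ranked demonstration should
    # receive a higher score than the lower-ranked one.
    r_hi = reward_model(music, better_motion)
    r_lo = reward_model(music, worse_motion)
    return -F.logsigmoid(r_hi - r_lo).mean()

def policy_gradient_step(policy, reward_model, music, optimizer):
    # Sample motions from the policy and push up the log-probability of
    # sequences the learned reward model scores highly.
    motion, log_prob = policy.sample(music)   # hypothetical policy interface
    with torch.no_grad():
        reward = reward_model(music, motion)
    loss = -(reward * log_prob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```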
Related papers
- Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text [61.9973218744157]
We introduce Director3D, a robust open-world text-to-3D generation framework, designed to generate both real-world 3D scenes and adaptive camera trajectories.
Experiments demonstrate that Director3D outperforms existing methods, offering superior performance in real-world 3D generation.
arXiv Detail & Related papers (2024-06-25T14:42:51Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication [50.541882834405946]
We introduce Atlas3D, an automatic and easy-to-implement text-to-3D method.
Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization.
We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
arXiv Detail & Related papers (2024-05-28T18:33:18Z)
- MIDGET: Music Conditioned 3D Dance Generation [13.067687949642641]
We introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, to generate vibrant and high-quality dances that match the music rhythm.
To tackle challenges in the field, we introduce three new components: 1) a pre-trained memory codebook based on the Motion VQ-VAE model to store different human pose codes, 2) employing a Motion GPT model to generate pose codes from music and motion embeddings, and 3) a simple framework for music feature extraction.
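As a rough illustration of the first two components in this summary, the sketch below pairs a VQ-style pose codebook with a music-conditioned autoregressive transformer over code indices. The module names, the 512-code vocabulary, and the assumed 35-dimensional music features are placeholders, not the MIDGET implementation.

```python
# Illustrative sketch (not the MIDGET code): a VQ-style codebook that
# quantizes pose features into discrete codes, and a transformer that
# predicts the next pose-code index conditioned on music features.
import torch
import torch.nn as nn

class PoseCodebook(nn.Module):
    def __init__(self, num_codes=512, code_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(num_codes, code_dim)

    def quantize(self, pose_feat):
        # Nearest-neighbour lookup: map each pose feature to its closest code.
        dists = torch.cdist(pose_feat, self.embedding.weight)  # (T, num_codes)
        return dists.argmin(dim=-1)                            # code indices

class MusicConditionedGPT(nn.Module):
    def __init__(self, num_codes=512, d_model=256, nhead=8, num_layers=6):
        super().__init__()
        self.code_emb = nn.Embedding(num_codes, d_model)
        self.music_proj = nn.Linear(35, d_model)   # assumed music feature size
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, num_codes)

    def forward(self, code_indices, music_feat):
        # Sum pose-code and music embeddings, then predict next-code logits.
        x = self.code_emb(code_indices) + self.music_proj(music_feat)
        h = self.backbone(x)        # causal masking omitted for brevity
        return self.head(h)         # logits over the pose codebook
```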
arXiv Detail & Related papers (2024-04-18T10:20:37Z)
- Probing the 3D Awareness of Visual Foundation Models [56.68380136809413]
We analyze the 3D awareness of visual foundation models.
We conduct experiments using task-specific probes and zero-shot inference procedures on frozen features.
arXiv Detail & Related papers (2024-04-12T17:58:04Z)
- GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation [31.702907860448477]
GenH2R is a framework for learning generalizable vision-based human-to-robot (H2R) handover skills.
We acquire such generalizability by learning H2R handover at scale with a comprehensive solution.
We leverage large-scale 3D model repositories, dexterous grasp generation methods, and curve-based 3D animation.
arXiv Detail & Related papers (2024-01-01T18:20:43Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, implying its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- AG3D: Learning to Generate 3D Avatars from 2D Image Collections [96.28021214088746]
We propose a new adversarial generative model of realistic 3D people from 2D images.
Our method captures shape and deformation of the body and loose clothing by adopting a holistic 3D generator.
We experimentally find that our method outperforms previous 3D- and articulation-aware methods in terms of geometry and appearance.
arXiv Detail & Related papers (2023-05-03T17:56:24Z)
- DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer [23.51701359698245]
In this paper, we reformulate music-conditioned dance generation as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction.
We propose a large-scale music conditioned 3D dance dataset, called PhantomDance, that is accurately labeled by experienced animators.
Experiments demonstrate that the proposed method, even trained by existing datasets, can generate fluent, performative, and music-matched 3D dances.
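A minimal sketch of the second stage described above, assuming the key poses have already been predicted: frames between consecutive key poses are filled in with a smooth ease-in/ease-out blend, which here stands in for the paper's learned parametric motion curves.

```python
# Illustrative only: interpolate predicted key poses into a dense motion
# sequence. A smoothstep blend substitutes for learned parametric curves.
import numpy as np

def interpolate_keyframes(key_poses, key_times, fps=60):
    """key_poses: (K, D) array of key poses; key_times: (K,) times in seconds."""
    frames = []
    for (p0, t0), (p1, t1) in zip(zip(key_poses[:-1], key_times[:-1]),
                                  zip(key_poses[1:], key_times[1:])):
        n = max(int(round((t1 - t0) * fps)), 1)
        for i in range(n):
            u = i / n
            w = 3 * u**2 - 2 * u**3   # ease-in/ease-out blend weight
            frames.append((1 - w) * p0 + w * p1)
    frames.append(key_poses[-1])
    return np.stack(frames)          # (num_frames, D) dense pose sequence
```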
arXiv Detail & Related papers (2021-03-18T12:17:38Z)
- Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [28.623222697548456]
We present a transformer-based learning framework for 3D dance generation conditioned on music.
We also propose a new dataset of paired 3D motion and music called AIST++, which we reconstruct from the AIST multi-view dance videos.
arXiv Detail & Related papers (2021-01-21T18:59:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.