How to Train Your Dragon: Automatic Diffusion-Based Rigging for Characters with Diverse Topologies
- URL: http://arxiv.org/abs/2503.15586v1
- Date: Wed, 19 Mar 2025 17:46:36 GMT
- Title: How to Train Your Dragon: Automatic Diffusion-Based Rigging for Characters with Diverse Topologies
- Authors: Zeqi Gu, Difan Liu, Timothy Langlois, Matthew Fisher, Abe Davis
- Abstract summary: We extend the ability of such models to animate images of characters with more diverse skeletal topologies. We propose a procedural data generation pipeline that efficiently samples training data with diverse topologies on the fly. During fine-tuning, our model rapidly adapts to unseen target characters and generalizes well to rendering new poses.
- Score: 17.66381105705158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent diffusion-based methods have achieved impressive results on animating images of human subjects. However, most of that success has built on human-specific body pose representations and extensive training with labeled real videos. In this work, we extend the ability of such models to animate images of characters with more diverse skeletal topologies. Given a small number (3-5) of example frames showing the character in different poses with corresponding skeletal information, our model quickly infers a rig for that character that can generate images corresponding to new skeleton poses. We propose a procedural data generation pipeline that efficiently samples training data with diverse topologies on the fly. We use it, along with a novel skeleton representation, to train our model on articulated shapes spanning a large space of textures and topologies. Then during fine-tuning, our model rapidly adapts to unseen target characters and generalizes well to rendering new poses, both for realistic and more stylized cartoon appearances. To better evaluate performance on this novel and challenging task, we create the first 2D video dataset that contains both humanoid and non-humanoid subjects with per-frame keypoint annotations. With extensive experiments, we demonstrate the superior quality of our results. Project page: https://traindragondiffusion.github.io/
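To make the procedural data generation idea concrete, below is a minimal sketch of one way to sample skeletal topologies on the fly: each skeleton is a random tree of bones, posed by simple 2D forward kinematics. All names and sampling ranges here are illustrative assumptions, not the paper's actual pipeline.

```python
import math
import random

def sample_topology(min_joints=4, max_joints=12, rng=random):
    """Sample a random tree-structured skeleton (hypothetical sketch).

    Returns a parent list: parent[i] is the index of joint i's parent,
    with parent[0] = -1 for the root. Attaching each new joint to any
    existing joint yields chains, forks, and stars, covering a wide
    range of articulation patterns (limbs, tails, wings).
    """
    n = rng.randint(min_joints, max_joints)
    parent = [-1]
    for i in range(1, n):
        parent.append(rng.randrange(i))
    return parent

def pose_skeleton(parent, rng=random):
    """Assign random bone lengths/angles and run 2D forward kinematics."""
    pos = [(0.0, 0.0)] * len(parent)
    for i in range(1, len(parent)):
        length = rng.uniform(0.05, 0.3)          # bone length, normalized units
        angle = rng.uniform(0.0, 2.0 * math.pi)  # bone orientation
        px, py = pos[parent[i]]
        pos[i] = (px + length * math.cos(angle),
                  py + length * math.sin(angle))
    return pos

# One training sample: a single topology rendered in several random poses,
# mirroring the 3-5 example frames available for a new target character.
parent = sample_topology()
example_poses = [pose_skeleton(parent) for _ in range(5)]
```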
Related papers
- AnyTop: Character Animation Diffusion with Any Topology [54.07731933876742]
We introduce AnyTop, a diffusion model that generates motions for diverse characters with distinct motion dynamics. Our work features a transformer-based denoising network tailored for arbitrary skeleton learning. Our evaluation demonstrates that AnyTop generalizes well, even with as few as three training examples per topology, and can produce motions for unseen skeletons as well.
arXiv Detail & Related papers (2025-02-24T17:00:36Z)
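The summary above does not spell out the architecture; as a rough, hypothetical illustration of how a transformer denoiser can handle arbitrary skeletons, one can treat each joint as a token and mask attention with the skeleton's adjacency (plus self-loops), so the same weights apply to any topology. The PyTorch module below is an assumption-laden sketch, not AnyTop's implementation.

```python
import torch
import torch.nn as nn

class JointTokenDenoiser(nn.Module):
    """Toy sketch: one joint = one token; attention is masked to the
    skeleton graph so the same weights fit any topology. A real system
    would also condition on the diffusion timestep, etc."""

    def __init__(self, feat_dim=6, model_dim=64, heads=4, layers=2):
        super().__init__()
        self.embed = nn.Linear(feat_dim, model_dim)
        block = nn.TransformerEncoderLayer(
            d_model=model_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=layers)
        self.head = nn.Linear(model_dim, feat_dim)

    def forward(self, joints, adjacency):
        # joints: (B, J, feat_dim) noisy per-joint motion features
        # adjacency: (J, J) bool, True where attention is allowed
        x = self.embed(joints)
        x = self.encoder(x, mask=~adjacency)  # in PyTorch, True = blocked
        return self.head(x)  # predicted noise / clean features

# Toy chain skeleton 0-1-2-3-4 with self-loops (every row needs one True).
J = 5
adj = torch.eye(J, dtype=torch.bool)
for i in range(1, J):
    adj[i, i - 1] = adj[i - 1, i] = True
out = JointTokenDenoiser()(torch.randn(2, J, 6), adj)
```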
- Can Generative Video Models Help Pose Estimation? [42.10672365565019]
Pairwise pose estimation from images with little or no overlap is an open challenge in computer vision. Inspired by the human ability to infer spatial relationships from diverse scenes, we propose a novel approach, InterPose. We use a video model to hallucinate intermediate frames between two input images, effectively creating a dense, visual transition.
arXiv Detail & Related papers (2024-12-20T18:58:24Z)
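As a small illustration of why dense intermediate frames help (my reading of the stated pipeline, not InterPose's actual code): each consecutive pair of frames has high overlap, so its relative pose is easier to estimate, and the end-to-end pose follows by composing 4x4 rigid transforms.

```python
import numpy as np

def compose_relative_poses(pairwise):
    """Compose consecutive relative poses T_{i->i+1} (4x4 rigid
    transforms) into the end-to-end pose T_{0->N}. Each high-overlap
    pairwise estimate is much easier than the direct wide-baseline pair."""
    total = np.eye(4)
    for T in pairwise:
        total = T @ total  # later transforms act on the accumulated one
    return total
```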
- DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses [57.17501809717155]
We present DreamDance, a novel method for animating human images using only skeleton pose sequences as conditional inputs. Our key insight is that human images naturally exhibit multiple levels of correlation. We construct the TikTok-Dance5K dataset, comprising 5K high-quality dance videos with detailed frame annotations.
arXiv Detail & Related papers (2024-11-30T08:42:13Z)
- Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters [86.13319549186959]
We present Make-It-Animatable, a novel data-driven method to make any 3D humanoid model ready for character animation in less than one second.
Our framework generates high-quality blend weights, bones, and pose transformations.
Compared to existing methods, our approach demonstrates significant improvements in both quality and speed.
arXiv Detail & Related papers (2024-11-27T10:18:06Z)
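Blend weights, bones, and pose transformations are exactly the inputs to standard linear blend skinning, which is presumably how the generated rig gets applied; here is a minimal NumPy version of that textbook formula (not the paper's code).

```python
import numpy as np

def linear_blend_skinning(verts, weights, bone_transforms):
    """Standard linear blend skinning.

    verts:           (V, 3) rest-pose vertex positions
    weights:         (V, B) blend weights, each row summing to 1
    bone_transforms: (B, 4, 4) rest-to-posed transform per bone
    returns:         (V, 3) posed vertex positions
    """
    homo = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)  # (V, 4)
    per_bone = np.einsum('bij,vj->vbi', bone_transforms, homo)        # (V, B, 4)
    blended = np.einsum('vb,vbi->vi', weights, per_bone)              # (V, 4)
    return blended[:, :3]
```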
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [27.700371215886683]
Diffusion models have become mainstream in visual generation research, owing to their robust generative capabilities.
In this paper, we propose a novel framework tailored for character animation.
By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods.
arXiv Detail & Related papers (2023-11-28T12:27:15Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks [12.132886846993108]
High-resolution models enable photo-realistic avatars but at the cost of requiring studio settings not available to end users.
Our goal is to create avatars directly from raw images without relying on expensive studio setups and surface tracking.
We introduce a three-stage method that induces two inductive biases to better disentangle pose-dependent deformation.
arXiv Detail & Related papers (2022-05-03T17:56:46Z)
- Neural Rendering of Humans in Novel View and Pose from Monocular Video [68.37767099240236]
We introduce a new method that generates photo-realistic humans under novel views and poses given a monocular video as input.
Our method significantly outperforms existing approaches under unseen poses and novel views given monocular videos as input.
arXiv Detail & Related papers (2022-04-04T03:09:20Z)
- Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis [58.05389586712485]
We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis.
In this paper, we propose a 3D body mesh recovery module to disentangle the pose and shape.
We also build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
arXiv Detail & Related papers (2020-11-18T02:57:47Z)