Dessie: Disentanglement for Articulated 3D Horse Shape and Pose Estimation from Images
- URL: http://arxiv.org/abs/2410.03438v2
- Date: Thu, 31 Oct 2024 13:41:19 GMT
- Title: Dessie: Disentanglement for Articulated 3D Horse Shape and Pose Estimation from Images
- Authors: Ci Li, Yi Yang, Zehang Weng, Elin Hernlund, Silvia Zuffi, Hedvig Kjellström,
- Abstract summary: We introduce the first method using synthetic data generation and disentanglement to learn to regress 3D shape and pose.
Our method, Dessie, surpasses existing 3D horse reconstruction methods and generalizes to other large animals like zebras, cows, and deer.
- Score: 21.718426435322925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, 3D parametric animal models have been developed to aid in estimating 3D shape and pose from images and video. While progress has been made for humans, it's more challenging for animals due to limited annotated data. To address this, we introduce the first method using synthetic data generation and disentanglement to learn to regress 3D shape and pose. Focusing on horses, we use text-based texture generation and a synthetic data pipeline to create varied shapes, poses, and appearances, learning disentangled spaces. Our method, Dessie, surpasses existing 3D horse reconstruction methods and generalizes to other large animals like zebras, cows, and deer. See the project website at: \url{https://celiali.github.io/Dessie/}.
Related papers
- Generative Zoo [41.65977386204797]
We introduce a pipeline that samples a diverse set of poses and shapes for a variety of mammalian quadrupeds and generates realistic images with corresponding ground-truth pose and shape parameters.
We train a 3D pose and shape regressor on GenZoo, which achieves state-of-the-art performance on a real-world animal pose and shape estimation benchmark.
arXiv Detail & Related papers (2024-12-11T04:57:53Z) - Reconstructing Animals and the Wild [51.98009864071166]
We propose a method to reconstruct natural scenes from single images.
We base our approach on advances leveraging the strong world priors in Large Language Models.
We propose a synthetic dataset comprising one million images and thousands of assets.
arXiv Detail & Related papers (2024-11-27T23:24:27Z) - Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly.
One crucial bottleneck of modeling animals is the limited availability of training data.
We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z) - Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape [32.11280929126699]
We propose Animal3D, the first comprehensive dataset for mammal animal 3D pose and shape estimation.
Animal3D consists of 3379 images collected from 40 mammal species, high-quality annotations of 26 keypoints, and importantly the pose and shape parameters of the SMAL model.
Based on the Animal3D dataset, we benchmark representative shape and pose estimation models at: (1) supervised learning from only the Animal3D data, (2) synthetic to real transfer from synthetically generated images, and (3) fine-tuning human pose and shape estimation models.
arXiv Detail & Related papers (2023-08-22T18:57:07Z) - SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes [62.82552328188602]
We present SCULPT, a novel 3D generative model for clothed and textured 3D meshes of humans.
We devise a deep neural network that learns to represent the geometry and appearance distribution of clothed human bodies.
arXiv Detail & Related papers (2023-08-21T11:23:25Z) - A Horse with no Labels: Self-Supervised Horse Pose Estimation from
Unlabelled Images and Synthetic Prior [1.0878040851638]
We propose a simple yet effective self-supervised method for estimating animal pose.
We train our method with unlabelled images of horses mainly collected for YouTube videos and a prior consisting of 2D synthetic poses.
We demonstrate that it is possible to learn accurate animal poses even with as few assumptions as unlabelled images and a small set of 2D poses generated from synthetic data.
arXiv Detail & Related papers (2023-08-07T09:02:26Z) - LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D
Part Discovery [72.3681707384754]
We propose a practical problem setting to estimate 3D pose and shape of animals given only a few in-the-wild images of a particular animal species.
We do not assume any form of 2D or 3D ground-truth annotations, nor do we leverage any multi-view or temporal information.
Following these insights, we propose LASSIE, a novel optimization framework which discovers 3D parts in a self-supervised manner.
arXiv Detail & Related papers (2022-07-07T17:00:07Z) - Unsupervised Shape and Pose Disentanglement for 3D Meshes [49.431680543840706]
We present a simple yet effective approach to learn disentangled shape and pose representations in an unsupervised setting.
We use a combination of self-consistency and cross-consistency constraints to learn pose and shape space from registered meshes.
We demonstrate the usefulness of learned representations through a number of tasks including pose transfer and shape retrieval.
arXiv Detail & Related papers (2020-07-22T11:00:27Z) - RGBD-Dog: Predicting Canine Pose from RGBD Sensors [25.747221533627464]
We focus on the problem of 3D canine pose estimation from RGBD images.
We generate a dataset of synthetic RGBD images from this data.
A stacked hourglass network is trained to predict 3D joint locations.
arXiv Detail & Related papers (2020-04-16T17:34:45Z) - Chained Representation Cycling: Learning to Estimate 3D Human Pose and
Shape by Cycling Between Representations [73.11883464562895]
We propose a new architecture that facilitates unsupervised, or lightly supervised, learning.
We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images.
While we present results for modeling humans, our formulation is general and can be applied to other vision problems.
arXiv Detail & Related papers (2020-01-06T14:54:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.