Who Left the Dogs Out? 3D Animal Reconstruction with Expectation
Maximization in the Loop
- URL: http://arxiv.org/abs/2007.11110v2
- Date: Thu, 11 Feb 2021 13:47:24 GMT
- Title: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation
Maximization in the Loop
- Authors: Benjamin Biggs, Oliver Boyne, James Charles, Andrew Fitzgibbon and
Roberto Cipolla
- Abstract summary: We introduce an automatic, end-to-end method for recovering the 3D pose and shape of dogs from monocular internet images.
We learn a richer prior over shapes than previous work, which helps regularize parameter estimation.
We demonstrate results on the Stanford Dog dataset, an 'in the wild' dataset of 20,580 dog images.
- Score: 25.40930904714051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce an automatic, end-to-end method for recovering the 3D pose and
shape of dogs from monocular internet images. The large variation in shape
between dog breeds, significant occlusion and low quality of internet images
makes this a challenging problem. We learn a richer prior over shapes than
previous work, which helps regularize parameter estimation. We demonstrate
results on the Stanford Dog dataset, an 'in the wild' dataset of 20,580 dog
images for which we have collected 2D joint and silhouette annotations to split
for training and evaluation. In order to capture the large shape variety of
dogs, we show that the natural variation in the 2D dataset is enough to learn a
detailed 3D prior through expectation maximization (EM). As a by-product of
training, we generate a new parameterized model (including limb scaling) SMBLD
which we release alongside our new annotation dataset StanfordExtra to the
research community.
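The abstract's key technical step is learning a shape prior through expectation maximization. As a rough illustration of the general idea, the sketch below fits a mixture-of-Gaussians prior over per-image shape vectors with EM: the E-step assigns soft cluster responsibilities, the M-step re-estimates the mixture parameters. This is a minimal generic sketch, assuming spherical per-cluster covariances and a simple farthest-point initialisation; it is not the paper's actual SMBLD fitting procedure, and the function and parameter names are illustrative.

```python
import numpy as np

def em_gaussian_mixture(X, n_clusters=3, n_iters=50, seed=0):
    """Fit a mixture-of-Gaussians prior over shape vectors X (N, D) via EM."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Farthest-point initialisation keeps the starting means spread out.
    means = [X[rng.integers(N)]]
    for _ in range(n_clusters - 1):
        d = np.min(((X[:, None] - np.array(means)[None]) ** 2).sum(-1), axis=1)
        means.append(X[np.argmax(d)])
    means = np.array(means)
    variances = np.full(n_clusters, X.var() + 1e-6)  # spherical, per cluster
    weights = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iters):
        # E-step: posterior responsibility of each cluster for each sample.
        sq = ((X[:, None, :] - means[None]) ** 2).sum(-1)           # (N, K)
        log_p = -0.5 * sq / variances - 0.5 * D * np.log(2 * np.pi * variances)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)                   # stabilise
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture parameters from the responsibilities.
        Nk = resp.sum(axis=0) + 1e-12
        means = (resp.T @ X) / Nk[:, None]
        sq = ((X[:, None, :] - means[None]) ** 2).sum(-1)
        variances = (resp * sq).sum(axis=0) / (D * Nk) + 1e-6
        weights = Nk / N
    return means, variances, weights
```

In the paper's setting, the "samples" would be per-image shape estimates produced inside the training loop, so the prior and the per-image fits refine each other across EM rounds; here the mixture is simply fit to a fixed set of vectors.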
Related papers
- Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly.
One crucial bottleneck of modeling animals is the limited availability of training data.
We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z)
- Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape [32.11280929126699]
We propose Animal3D, the first comprehensive dataset for mammal animal 3D pose and shape estimation.
Animal3D consists of 3379 images collected from 40 mammal species, high-quality annotations of 26 keypoints, and importantly the pose and shape parameters of the SMAL model.
Based on the Animal3D dataset, we benchmark representative shape and pose estimation models at: (1) supervised learning from only the Animal3D data, (2) synthetic to real transfer from synthetically generated images, and (3) fine-tuning human pose and shape estimation models.
arXiv Detail & Related papers (2023-08-22T18:57:07Z)
- Of Mice and Pose: 2D Mouse Pose Estimation from Unlabelled Data and Synthetic Prior [0.7499722271664145]
We propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated empirical pose prior.
We adapt this method to the limb structure of the mouse and generate the empirical prior of 2D poses from a synthetic 3D mouse model.
In experiments on a new mouse video dataset, we evaluate the performance of the approach by comparing pose predictions to a manually obtained ground truth.
arXiv Detail & Related papers (2023-07-25T09:31:55Z)
- 3D generation on ImageNet [76.0440752186121]
We develop a 3D generator with Generic Priors (3DGP): a 3D synthesis framework with more general assumptions about the training data.
Our model is based on three new ideas.
We explore our model on four datasets: SDIP Dogs 256x256, SDIP Elephants 256x256, LSUN Horses 256x256, and ImageNet 256x256.
arXiv Detail & Related papers (2023-03-02T17:06:57Z)
- LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery [72.3681707384754]
We propose a practical problem setting to estimate 3D pose and shape of animals given only a few in-the-wild images of a particular animal species.
We do not assume any form of 2D or 3D ground-truth annotations, nor do we leverage any multi-view or temporal information.
Following these insights, we propose LASSIE, a novel optimization framework which discovers 3D parts in a self-supervised manner.
arXiv Detail & Related papers (2022-07-07T17:00:07Z)
- BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information [66.77206206569802]
Our goal is to recover the 3D shape and pose of dogs from a single image.
Recent work has proposed to directly regress the SMAL animal model, with additional limb scale parameters, from images.
Our method, called BARC (Breed-Augmented Regression using Classification), goes beyond prior work in several important ways.
This work shows that a-priori information about genetic similarity can help to compensate for the lack of 3D training data.
arXiv Detail & Related papers (2022-03-29T13:16:06Z)
- Coarse-to-fine Animal Pose and Shape Estimation [67.39635503744395]
We propose a coarse-to-fine approach to reconstruct 3D animal mesh from a single image.
The coarse estimation stage first estimates the pose, shape and translation parameters of the SMAL model.
The estimated meshes are then used as a starting point by a graph convolutional network (GCN) to predict a per-vertex deformation in the refinement stage.
arXiv Detail & Related papers (2021-11-16T01:27:20Z)
- Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild [25.647676661390282]
We propose a new task: shape and pose estimation from a group of multiple images of a human subject.
Our solution predicts distributions over SMPL body shape and pose parameters conditioned on the input images in the group.
We show that the additional body shape information present in multi-image input groups improves 3D human shape estimation metrics.
arXiv Detail & Related papers (2021-03-19T18:32:16Z)
- Cascaded deep monocular 3D human pose estimation with evolutionary training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that is scalable for massive amount of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z)
- RGBD-Dog: Predicting Canine Pose from RGBD Sensors [25.747221533627464]
We focus on the problem of 3D canine pose estimation from RGBD images.
We generate a dataset of synthetic RGBD images from this data.
A stacked hourglass network is trained to predict 3D joint locations.
arXiv Detail & Related papers (2020-04-16T17:34:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.