MagicPony: Learning Articulated 3D Animals in the Wild
- URL: http://arxiv.org/abs/2211.12497v3
- Date: Tue, 4 Apr 2023 03:29:39 GMT
- Title: MagicPony: Learning Articulated 3D Animals in the Wild
- Authors: Shangzhe Wu, Ruining Li, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi
- Abstract summary: We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category.
At its core is an implicit-explicit representation of articulated shape and appearance, combining the strengths of neural fields and meshes.
- Score: 81.63322697335228
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of predicting the 3D shape, articulation, viewpoint,
texture, and lighting of an articulated animal like a horse given a single test
image as input. We present a new method, dubbed MagicPony, that learns this
predictor purely from in-the-wild single-view images of the object category,
with minimal assumptions about the topology of deformation. At its core is an
implicit-explicit representation of articulated shape and appearance, combining
the strengths of neural fields and meshes. In order to help the model
understand an object's shape and pose, we distil the knowledge captured by an
off-the-shelf self-supervised vision transformer and fuse it into the 3D model.
To overcome local optima in viewpoint estimation, we further introduce a new
viewpoint sampling scheme that comes at no additional training cost. MagicPony
outperforms prior work on this challenging task and demonstrates excellent
generalisation in reconstructing art, despite the fact that it is only trained
on real images.
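To make the implicit-explicit idea concrete, below is a minimal sketch: a neural SDF models the shape implicitly, and an explicit mesh is extracted from it so that it can be articulated and rasterised. The paper's implementation differs (it needs a differentiable, DMTet-style extraction); here plain marching cubes and a sphere-plus-residual SDF stand in, and all names and hyper-parameters are illustrative assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn
from skimage.measure import marching_cubes  # non-differentiable stand-in

class SDFField(nn.Module):
    """Implicit shape: an MLP residual on top of a sphere prior, so the
    zero level set exists even at initialisation."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):  # xyz: (N, 3) points in [-1, 1]^3
        sphere = xyz.norm(dim=-1) - 0.5           # radius-0.5 sphere prior
        return sphere + 0.1 * self.net(xyz).squeeze(-1)

def extract_mesh(sdf, res=64, bound=1.0):
    """Explicit shape: evaluate the SDF on a grid and run marching cubes
    to obtain a mesh that can be posed and rendered."""
    axis = torch.linspace(-bound, bound, res)
    grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1)
    with torch.no_grad():
        values = sdf(grid.reshape(-1, 3)).reshape(res, res, res)
    verts, faces, _, _ = marching_cubes(values.numpy(), level=0.0)
    return verts, faces

verts, faces = extract_mesh(SDFField())
print(verts.shape, faces.shape)  # (V, 3) vertices, (F, 3) triangle faces
```

The viewpoint sampling scheme can be sketched in the same hedged spirit. One plausible reading, with all details assumed rather than taken from the paper: the network scores a small set of rotation hypotheses, and the objective is the score-weighted reconstruction error, so hypotheses compete through the same loss. Note the naive version below renders all K hypotheses, which is not free; the paper's actual scheme is what avoids extra training cost.

```python
# Continues from the imports above (torch).
def viewpoint_loss(render_fn, target, rotations, scores):
    """Hypothetical multi-hypothesis viewpoint objective.
    rotations: (K, 3, 3) candidate camera rotations.
    scores:    (K,) logits predicted by the network.
    render_fn: differentiable renderer, render_fn(R) -> image tensor."""
    probs = torch.softmax(scores, dim=0)
    per_hyp = torch.stack(
        [((render_fn(R) - target) ** 2).mean() for R in rotations])
    # Expected reconstruction error under the hypothesis distribution:
    # gradients update both shape/appearance (via render_fn) and the
    # scores, helping the model escape poor viewpoint local optima.
    return (probs * per_hyp).sum()
```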
Related papers
- Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that jointly learns a pan-category deformable 3D animal model for more than 100 animal species.
One crucial bottleneck of modeling animals is the limited availability of training data.
We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z)
- Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation [72.50214227616728]
Several methods have been proposed to learn image representations in a self-supervised fashion so as to disentangle the appearance information from the pose information.
We study disentanglement from the perspective of the self-supervised network, via diverse image synthesis experiments.
We design an adversarial strategy that generates natural appearance changes of the subject, to which a disentangled network should be robust.
arXiv Detail & Related papers (2023-09-20T22:22:21Z)
- SAOR: Single-View Articulated Object Reconstruction [17.2716639564414]
We introduce SAOR, a novel approach for estimating the 3D shape, texture, and viewpoint of an articulated object from a single image captured in the wild.
Unlike prior approaches that rely on pre-defined category-specific 3D templates or tailored 3D skeletons, SAOR learns to articulate shapes from single-view image collections with a skeleton-free part-based model without requiring any 3D object shape priors.
arXiv Detail & Related papers (2023-03-23T17:59:35Z)
- Learning 3D Photography Videos via Self-supervised Diffusion on Single Images [105.81348348510551]
3D photography renders a static image into a video with appealing 3D visual effects.
Existing approaches typically first perform monocular depth estimation, then re-render the input frame from a sequence of novel viewpoints.
We present a novel task: out-animation, which extends the space and time of input objects.
arXiv Detail & Related papers (2023-02-21T16:18:40Z)
- Neural Articulated Radiance Field [90.91714894044253]
We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
arXiv Detail & Related papers (2021-04-07T13:23:14Z)
- Unsupervised Shape and Pose Disentanglement for 3D Meshes [49.431680543840706]
We present a simple yet effective approach for learning disentangled shape and pose representations in an unsupervised setting.
We use a combination of self-consistency and cross-consistency constraints to learn pose and shape spaces from registered meshes; a minimal sketch of these constraints follows this list.
We demonstrate the usefulness of the learned representations through a number of tasks, including pose transfer and shape retrieval.
arXiv Detail & Related papers (2020-07-22T11:00:27Z)
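As referenced in the last entry above, here is a minimal sketch of what self- and cross-consistency constraints can look like for registered meshes (same vertex count and ordering). The encoders, decoder, and dimensions are illustrative assumptions, not the paper's architecture; the key point is that two poses of the same subject should reconstruct correctly even after swapping pose codes.

```python
import torch
import torch.nn as nn

N_VERTS, CODE = 1000, 64  # assumed mesh resolution and code size

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_VERTS * 3, 256), nn.ReLU(), nn.Linear(256, CODE))
    def forward(self, mesh):                # mesh: (B, N_VERTS, 3)
        return self.net(mesh.flatten(1))

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * CODE, 256), nn.ReLU(), nn.Linear(256, N_VERTS * 3))
    def forward(self, shape_code, pose_code):
        out = self.net(torch.cat([shape_code, pose_code], dim=-1))
        return out.view(-1, N_VERTS, 3)

enc_shape, enc_pose, dec = Encoder(), Encoder(), Decoder()

def losses(mesh_a, mesh_b):
    """mesh_a and mesh_b are two poses of the SAME subject, so exchanging
    pose codes between them should leave the reconstructions unchanged."""
    s_a, p_a = enc_shape(mesh_a), enc_pose(mesh_a)
    s_b, p_b = enc_shape(mesh_b), enc_pose(mesh_b)
    # Self-consistency: each mesh reconstructs from its own codes.
    l_self = ((dec(s_a, p_a) - mesh_a) ** 2).mean() + \
             ((dec(s_b, p_b) - mesh_b) ** 2).mean()
    # Cross-consistency: one mesh's shape code with the other's pose code
    # should still reconstruct the same subject in the other pose.
    l_cross = ((dec(s_a, p_b) - mesh_b) ** 2).mean() + \
              ((dec(s_b, p_a) - mesh_a) ** 2).mean()
    return l_self + l_cross

mesh_a = torch.randn(2, N_VERTS, 3)  # random stand-ins for two poses
mesh_b = torch.randn(2, N_VERTS, 3)  # of the same subjects
print(losses(mesh_a, mesh_b).item())
```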
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.