Neural Articulated Radiance Field
- URL: http://arxiv.org/abs/2104.03110v1
- Date: Wed, 7 Apr 2021 13:23:14 GMT
- Title: Neural Articulated Radiance Field
- Authors: Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada
- Abstract summary: We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
- Score: 90.91714894044253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images. While recent advances in 3D implicit representation have made it possible to learn models of complex objects, learning pose-controllable representations of articulated objects remains a challenge, as current methods require 3D shape supervision and are unable to render appearance. In formulating an implicit representation of 3D articulated objects, our method considers only the rigid transformation of the most relevant object part in solving for the radiance field at each 3D location. In this way, the proposed method represents pose-dependent changes without significantly increasing the computational complexity. NARF is fully differentiable and can be trained from images with pose annotations. Moreover, through the use of an autoencoder, it can learn appearance variations over multiple instances of an object class. Experiments show that the proposed method is efficient and can generalize well to novel poses. We make the code, model, and demo available for research purposes at https://github.com/nogu-atsu/NARF
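To make the mechanism concrete, below is a minimal, hypothetical sketch of the per-part formulation described in the abstract: each 3D query point is rigidly transformed into every part's local coordinate frame, a small selector network scores how relevant each part is to that point, and a shared MLP predicts density and color that are blended by those scores. All names (NARFSketch, selector, radiance_mlp) and the soft blending are illustrative assumptions for this sketch, not the authors' released implementation (see the repository above).

```python
import torch
import torch.nn as nn

class NARFSketch(nn.Module):
    """Hypothetical per-part radiance field (not the official NARF code)."""

    def __init__(self, num_parts: int, hidden: int = 256):
        super().__init__()
        self.num_parts = num_parts
        # Scores how relevant each part is to a query point, given the
        # point's coordinates in every part's local frame.
        self.selector = nn.Sequential(
            nn.Linear(3 * num_parts, hidden), nn.ReLU(),
            nn.Linear(hidden, num_parts),
        )
        # Shared MLP: (part-local coordinate, one-hot part id) -> (sigma, rgb).
        self.radiance_mlp = nn.Sequential(
            nn.Linear(3 + num_parts, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),
        )

    def forward(self, x, rotations, translations):
        # x: (N, 3) world-space points; rotations: (P, 3, 3) and
        # translations: (P, 3) give each part's rigid transformation.
        N, P = x.shape[0], self.num_parts
        # Express every point in each part's local frame: R_p^T (x - t_p).
        delta = x[:, None, :] - translations[None, :, :]           # (N, P, 3)
        local = torch.einsum("pij,npi->npj", rotations, delta)     # (N, P, 3)
        # Soft relevance weights (a hard argmax would use only the single
        # most relevant part, as the abstract describes).
        weights = torch.softmax(self.selector(local.reshape(N, -1)), dim=-1)
        # Evaluate the shared MLP once per part, tagged with a part id.
        part_id = torch.eye(P, device=x.device).expand(N, P, P)
        out = self.radiance_mlp(torch.cat([local, part_id], dim=-1))  # (N, P, 4)
        # Blend per-part (density, rgb) predictions by relevance.
        return (weights[..., None] * out).sum(dim=1)               # (N, 4)

# Example: query 1024 points under identity part poses.
model = NARFSketch(num_parts=4)
pts = torch.randn(1024, 3)
R = torch.eye(3).repeat(4, 1, 1)    # per-part rotations (local-to-world)
t = torch.zeros(4, 3)               # per-part translations
sigma_rgb = model(pts, R, t)        # (1024, 4): density + RGB per point
```

At inference, a hard argmax over the selector scores would use only the single most relevant part, matching the abstract's description, while the soft weights keep training differentiable; consult the released code for the actual design.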
Related papers
- ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization [40.36229450208817]
We present ShAPO, a method for joint multi-object detection, 3D textured reconstruction, 6D object pose and size estimation.
Key to ShAPO is a single-shot pipeline to regress shape, appearance and pose latent codes along with the masks of each object instance.
Our method significantly outperforms all baselines on the NOCS dataset, with an 8% absolute improvement in mAP for 6D pose estimation.
arXiv Detail & Related papers (2022-07-27T17:59:31Z)
- Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images [94.49117671450531]
State-of-the-art 3D generative models are GANs which use neural 3D volumetric representations for synthesis.
In this paper, we design a 3D GAN which can learn a disentangled model of objects, just from monocular observations.
arXiv Detail & Related papers (2022-03-29T22:03:18Z)
- De-rendering 3D Objects in the Wild [21.16153549406485]
We present a weakly supervised method that is able to decompose a single image of an object into shape, material, and lighting components.
For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process.
In our experiments, we show that the method can successfully de-render 2D images into a 3D representation and generalizes to unseen object categories.
arXiv Detail & Related papers (2022-01-06T23:50:09Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Sparse Pose Trajectory Completion [87.31270669154452]
We propose a method to learn to complete pose trajectories, even using a dataset where objects appear only in sparsely sampled views.
This is achieved with a cross-modal pose trajectory transfer mechanism.
Our method is evaluated on the Pix3D and ShapeNet datasets.
arXiv Detail & Related papers (2021-05-01T00:07:21Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
- Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations [73.11883464562895]
We propose a new architecture that facilitates unsupervised, or lightly supervised, learning.
We demonstrate the method by learning 3D human pose and shape from unpaired and unannotated images.
While we present results for modeling humans, our formulation is general and can be applied to other vision problems.
arXiv Detail & Related papers (2020-01-06T14:54:00Z)