Canonical Fields: Self-Supervised Learning of Pose-Canonicalized Neural Fields
- URL: http://arxiv.org/abs/2212.02493v3
- Date: Wed, 17 May 2023 11:02:22 GMT
- Title: Canonical Fields: Self-Supervised Learning of Pose-Canonicalized Neural Fields
- Authors: Rohith Agaram, Shaurya Dewan, Rahul Sajnani, Adrien Poulenard, Madhava Krishna, Srinath Sridhar
- Abstract summary: CaFi-Net is a self-supervised method to canonicalize the 3D pose of instances from an object category represented as neural fields.
During inference, our method takes pre-trained neural radiance fields of novel object instances at arbitrary 3D pose and estimates a canonical field with consistent pose across the category.
Experiments on a new dataset of 1300 NeRF models across 13 object categories show that our method matches or exceeds the performance of 3D point cloud-based methods.
- Score: 9.401281193955583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coordinate-based implicit neural networks, or neural fields, have emerged as
useful representations of shape and appearance in 3D computer vision. Despite
advances, however, it remains challenging to build neural fields for categories
of objects without datasets like ShapeNet that provide "canonicalized" object
instances that are consistently aligned for their 3D position and orientation
(pose). We present Canonical Field Network (CaFi-Net), a self-supervised method
to canonicalize the 3D pose of instances from an object category represented as
neural fields, specifically neural radiance fields (NeRFs). CaFi-Net directly
learns from continuous and noisy radiance fields using a Siamese network
architecture that is designed to extract equivariant field features for
category-level canonicalization. During inference, our method takes pre-trained
neural radiance fields of novel object instances at arbitrary 3D pose and
estimates a canonical field with consistent 3D pose across the entire category.
Extensive experiments on a new dataset of 1300 NeRF models across 13 object
categories show that our method matches or exceeds the performance of 3D point
cloud-based methods.
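To make the self-supervision concrete, here is a minimal, hypothetical sketch of a Siamese canonicalization objective in PyTorch. It is not the CaFi-Net architecture (which operates on radiance-field features with equivariant layers): it assumes points sampled from an object's density field and a placeholder `pose_net` that maps points to a 3x3 canonicalizing rotation. Two randomly rotated copies of the same shape must land in the same canonical frame.

```python
import torch

def random_rotation():
    # Random rotation from the QR decomposition of a Gaussian matrix.
    q, r = torch.linalg.qr(torch.randn(3, 3))
    q = q * torch.sign(torch.diagonal(r))  # fix QR sign ambiguity
    if torch.det(q) < 0:                   # ensure a proper rotation (det = +1)
        q = -q
    return q

def siamese_canonicalization_loss(pose_net, pts):
    """pts: (N, 3) points sampled from one object's density field.
    pose_net: assumed to map (N, 3) points to a (3, 3) canonicalizing rotation."""
    R1, R2 = random_rotation(), random_rotation()
    x1, x2 = pts @ R1.T, pts @ R2.T        # two rotated copies of the same shape
    C1, C2 = pose_net(x1), pose_net(x2)    # predicted rotations to canonical frame
    canon1, canon2 = x1 @ C1.T, x2 @ C2.T  # both copies should agree once canonicalized
    return ((canon1 - canon2) ** 2).mean()
```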
Related papers
- Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation [18.011044932979143] (arXiv, 2024-01-19)
3DUDA is a method capable of adapting to a nuisance-ridden target domain without 3D or depth data.
We represent object categories as simple cuboid meshes, and harness a generative model of neural feature activations.
We show that our method simulates fine-tuning on a global pseudo-labeled dataset under mild assumptions.
- Learning Neural Parametric Head Models [7.679586286000453] (arXiv, 2022-12-06)
We propose a novel 3D morphable model for complete human heads based on hybrid neural fields.
We capture a person's identity in a canonical space as a signed distance field (SDF), and model facial expressions with a neural deformation field.
Our representation achieves high-fidelity local detail by introducing an ensemble of local fields centered around facial anchor points.
- Topologically-Aware Deformation Fields for Single-View 3D Reconstruction [30.738926104317514] (arXiv, 2022-05-12)
We present a new framework for learning 3D object shapes and dense cross-object 3D correspondences from just an unaligned category-specific image collection.
The 3D shapes are generated implicitly as deformations to a category-specific signed distance field.
Our approach, dubbed TARS, achieves state-of-the-art reconstruction fidelity on several datasets.
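The deformation-to-template idea in this entry is simple to sketch. The following hypothetical PyTorch module (not the TARS architecture; the MLP sizes and the latent code `z` are illustrative assumptions) queries a shared category-level template SDF at coordinates offset by an instance-conditioned deformation field.

```python
import torch
import torch.nn as nn

class DeformedSDF(nn.Module):
    """Instance SDF = category template SDF queried at deformed coordinates."""
    def __init__(self, latent_dim=64, hidden=128):
        super().__init__()
        self.deform = nn.Sequential(nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 3))
        self.template = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 1))

    def forward(self, x, z):
        # x: (N, 3) query points; z: (latent_dim,) instance code
        zx = torch.cat([x, z.expand(x.shape[0], -1)], dim=-1)
        offset = self.deform(zx)           # per-point offset into template space
        return self.template(x + offset)   # signed distance from the shared template
```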
- Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos [63.16888987770885] (arXiv, 2022-03-15)
This paper addresses the challenge of reconstructing an animatable human model from a multi-view video.
We introduce a pose-driven deformation field based on the linear blend skinning algorithm.
We show that our approach significantly outperforms recent human modeling methods.
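Linear blend skinning, the core of the pose-driven deformation field mentioned above, is compact enough to show directly. This is a generic PyTorch sketch of standard LBS; in the paper the skinning weights come from a learned field, which is omitted here.

```python
import torch

def linear_blend_skinning(points, weights, rotations, translations):
    """Standard LBS: blend per-bone rigid transforms with skinning weights.
    points:       (N, 3) canonical-space points
    weights:      (N, J) skinning weights, each row summing to 1
    rotations:    (J, 3, 3) per-bone rotations
    translations: (J, 3)    per-bone translations
    returns:      (N, 3) posed points"""
    # Apply every bone transform to every point: (J, N, 3).
    per_bone = torch.einsum('jab,nb->jna', rotations, points) + translations[:, None, :]
    # Blend the per-bone results with the skinning weights: (N, 3).
    return torch.einsum('nj,jna->na', weights, per_bone)
```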
- Learning Smooth Neural Functions via Lipschitz Regularization [92.42667575719048] (arXiv, 2022-02-16)
We introduce a novel regularization designed to encourage smooth latent spaces in neural fields.
Compared with prior Lipschitz regularized networks, ours is computationally fast and can be implemented in four lines of code.
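A hypothetical re-implementation of the idea in PyTorch (the exact normalization in the paper may differ): each linear layer carries a learnable Lipschitz bound, weight rows are rescaled to respect it, and the product of per-layer bounds is added to the training loss to encourage a smooth function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LipschitzLinear(nn.Module):
    """Linear layer whose weights are rescaled to a learnable Lipschitz bound."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim) * in_dim ** -0.5)
        self.bias = nn.Parameter(torch.zeros(out_dim))
        # Initialize the (pre-softplus) bound so rescaling starts as a no-op.
        init = self.weight.detach().abs().sum(dim=1).max()
        self.c = nn.Parameter(torch.log(torch.expm1(init)))  # inverse softplus

    def forward(self, x):
        bound = F.softplus(self.c)                   # current per-layer bound
        w_norm = self.weight.abs().sum(dim=1).max()  # infinity-norm of the weights
        scale = torch.clamp(bound / w_norm, max=1.0)
        return F.linear(x, self.weight * scale, self.bias)

def lipschitz_penalty(model):
    """Product of per-layer bounds; adding it to the loss penalizes the
    network's overall Lipschitz constant and encourages smoothness."""
    penalty = torch.tensor(1.0)
    for m in model.modules():
        if isinstance(m, LipschitzLinear):
            penalty = penalty * F.softplus(m.c)
    return penalty
```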
- ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes [55.689763519293464] (arXiv, 2022-01-19)
ConDor is a self-supervised method that learns to canonicalize the 3D orientation and position for full and partial 3D point clouds.
During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose.
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887] (arXiv, 2021-08-30)
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
- Concentric Spherical GNN for 3D Representation Learning [53.45704095146161] (arXiv, 2021-03-18)
We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps.
Our hierarchical architecture is based on alternately learning to incorporate both intra-sphere and inter-sphere information.
We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
- Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene Understanding [19.134536179555102] (arXiv, 2020-11-29)
We propose an alternative approach that overcomes the limitations of CNN-based approaches by encoding the spatial features of raw 3D point clouds into undirected graph models.
The proposed method achieves accuracy on par with the state of the art, with improved training time and model stability, indicating strong potential for further research.
- Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation [76.21696417873311] (arXiv, 2020-03-25)
We introduce learnable cylindrical convolutional networks (CCNs), which exploit a cylindrical representation of a convolutional kernel defined in 3D space.
CCNs extract a view-specific feature through a view-specific convolutional kernel to predict object category scores at each viewpoint.
Our experiments demonstrate the effectiveness of the cylindrical convolutional networks on joint object detection and viewpoint estimation.
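A toy illustration of view-specific kernels (a simplification, not the paper's CCN: the cylindrical sampling of 3D space is replaced here by a plain bank of per-viewpoint 2D kernels over a backbone feature map).

```python
import torch
import torch.nn as nn

class ViewSpecificScores(nn.Module):
    """One kernel per azimuth bin; each predicts category scores for its viewpoint."""
    def __init__(self, in_ch, num_classes, num_views=8):
        super().__init__()
        self.view_kernels = nn.ModuleList(
            [nn.Conv2d(in_ch, num_classes, kernel_size=3, padding=1)
             for _ in range(num_views)])

    def forward(self, feat):
        # feat: (B, C, H, W) backbone feature map.
        scores = torch.stack([k(feat) for k in self.view_kernels], dim=1)
        # (B, V, num_classes, H, W); the max-scoring view index gives a
        # coarse viewpoint estimate alongside the category scores.
        return scores
```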