Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
- URL: http://arxiv.org/abs/2204.03642v1
- Date: Thu, 7 Apr 2022 17:59:25 GMT
- Title: Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
- Authors: Kalyan Vasudev Alwala, Abhinav Gupta, Shubham Tulsiani
- Abstract summary: We learn a unified model for single-view 3D reconstruction of objects from hundreds of semantic categories.
Our work relies on segmented image collections for learning the 3D structure of generic categories.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our work learns a unified model for single-view 3D reconstruction of objects
from hundreds of semantic categories. As a scalable alternative to direct 3D
supervision, our work relies on segmented image collections for learning the 3D
structure of generic categories. Unlike prior works that use similar supervision but learn
independent category-specific models from scratch, our approach of learning a
unified model simplifies the training process while also allowing the model to
benefit from the common structure across categories. Using image collections
from standard recognition datasets, we show that our approach allows learning
3D inference for over 150 object categories. We evaluate on two datasets and
show, both qualitatively and quantitatively, that our unified reconstruction
approach improves over prior category-specific reconstruction baselines. Our final 3D
reconstruction model is also capable of zero-shot inference on images from
unseen object categories and we empirically show that increasing the number of
training categories improves the reconstruction quality.
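
The title names a three-stage recipe (pre-train, self-train, distill), but neither the abstract nor this summary gives implementation details. Below is a minimal, hypothetical sketch of how such a mask-supervised pipeline could fit together, assuming a toy differentiable silhouette renderer and invented module names throughout; it illustrates the flow of the recipe under those assumptions, not the authors' actual code.

```python
# Hypothetical sketch of a "pre-train, self-train, distill" pipeline with
# mask-only (silhouette) supervision. All names and shapes are invented
# stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedReconstructor(nn.Module):
    """One network shared across all categories: image -> 3D shape code."""
    def __init__(self, latent_dim=64, grid=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * grid * grid, latent_dim))
        self.decoder = nn.Linear(latent_dim, grid * grid)  # stand-in shape decoder

    def forward(self, image):
        return self.decoder(self.encoder(image))

def render_silhouette(shape_code, grid=32):
    """Toy differentiable 'renderer': shape code -> predicted object mask."""
    return torch.sigmoid(shape_code).view(-1, grid, grid)

def pretrain(model, loader, opt):
    """Stage 1: fit the unified model on segmented image collections,
    supervising only the reprojected silhouette against the mask."""
    for image, mask in loader:
        loss = F.binary_cross_entropy(render_silhouette(model(image)), mask)
        opt.zero_grad(); loss.backward(); opt.step()

def self_train(model, images, pseudo_masks, steps=100):
    """Stage 2: per-instance refinement on new images against pseudo-masks
    (assumed here to come from an off-the-shelf segmenter), starting from
    the unified model's own prediction; returns (image, shape) pairs."""
    pseudo = []
    for image, pmask in zip(images, pseudo_masks):
        shape = model(image).detach().requires_grad_(True)
        inner_opt = torch.optim.Adam([shape], lr=1e-2)
        for _ in range(steps):
            loss = F.binary_cross_entropy(render_silhouette(shape), pmask)
            inner_opt.zero_grad(); loss.backward(); inner_opt.step()
        pseudo.append((image, shape.detach()))
    return pseudo

def distill(student, pseudo_pairs, opt):
    """Stage 3: distill the refined per-instance shapes into a single
    feed-forward student that predicts them directly from the image."""
    for image, shape in pseudo_pairs:
        loss = F.mse_loss(student(image), shape)
        opt.zero_grad(); loss.backward(); opt.step()
```

In a pipeline of this form, stage 2 buys quality through slow per-instance optimization, and stage 3 folds those refined results back into a single fast feed-forward model.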
Related papers
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction with a focus on model generalization to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z) - Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive
Network [18.000566656946475]
Few-shot 3D reconstruction of novel object categories is appealing in real-world applications.
We present a Memory Prior Contrastive Network (MPCN) that can store shape prior knowledge in a few-shot learning based 3D reconstruction framework.
arXiv Detail & Related papers (2022-07-30T10:49:39Z) - Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance
Consistency [59.427074701985795]
Single-view reconstruction methods typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry.
We avoid all of these supervisions and hypotheses by leveraging explicitly the consistency between images of different object instances.
Our main contributions are two approaches to leverage cross-instance consistency: (i) progressive conditioning, a training strategy to gradually specialize the model from category to instances in a curriculum learning fashion; (ii) swap reconstruction, a loss enforcing consistency between instances having similar shape or texture (see the sketch after this list).
arXiv Detail & Related papers (2022-04-21T17:47:35Z) - Multi-Category Mesh Reconstruction From Image Collections [90.24365811344987]
We present an alternative approach that infers the textured mesh of objects by combining a set of deformable 3D models with instance-specific deformation, pose, and texture parameters.
Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision.
Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
arXiv Detail & Related papers (2021-10-21T16:32:31Z) - StrobeNet: Category-Level Multiview Reconstruction of Articulated
Objects [17.698319441265223]
StrobeNet is a method for category-level 3D reconstruction of articulating objects from unposed RGB images.
Our approach reconstructs objects even when they are observed in different articulations in images with large baselines.
arXiv Detail & Related papers (2021-05-17T17:05:42Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - On the generalization of learning-based 3D reconstruction [10.516860541554632]
We study the inductive biases encoded in the model architecture that impact the generalization of learning-based 3D reconstruction methods.
We find that three inductive biases impact performance: the spatial extent of the encoder, the use of the underlying geometry of the scene to describe point features, and the mechanism to aggregate information from multiple views.
arXiv Detail & Related papers (2020-06-27T18:53:41Z) - Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors [30.262308825799167]
We show that complex encoder-decoder architectures perform similarly to nearest-neighbor baselines on standard benchmarks.
We propose three approaches that efficiently integrate a class prior into a 3D reconstruction model.
arXiv Detail & Related papers (2020-04-14T04:53:34Z) - Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)