Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues
- URL: http://arxiv.org/abs/2204.10235v1
- Date: Thu, 21 Apr 2022 16:13:31 GMT
- Title: Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues
- Authors: Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg
- Abstract summary: We present a novel 3D shape reconstruction method which learns to predict an implicit 3D shape representation from a single RGB image.
Our approach uses a set of single-view images of multiple object categories without viewpoint annotation.
We are the first to examine and quantify the benefit of class information in single-view supervised 3D shape reconstruction.
- Score: 42.59825584255742
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel 3D shape reconstruction method which learns to predict an
implicit 3D shape representation from a single RGB image. Our approach uses a
set of single-view images of multiple object categories without viewpoint
annotation, forcing the model to learn across multiple object categories
without 3D supervision. To facilitate learning with such minimal supervision,
we use category labels to guide shape learning with a novel categorical metric
learning approach. We also utilize adversarial and viewpoint regularization
techniques to further disentangle the effects of viewpoint and shape. We obtain
the first results for large-scale (more than 50 categories) single-viewpoint
shape prediction using a single model without any 3D cues. We are also the
first to examine and quantify the benefit of class information in single-view
supervised 3D shape reconstruction. Our method achieves superior performance
over state-of-the-art methods on ShapeNet-13, ShapeNet-55 and Pascal3D+.
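The abstract does not spell out the form of its categorical metric learning loss, so the following is only a minimal sketch of the general idea, assuming a supervised-contrastive-style objective: embeddings of images that share a category label are pulled together and embeddings from different categories are pushed apart. The function name `category_contrastive_loss`, the `temperature` parameter, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math


def category_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised-contrastive-style loss (an illustrative assumption,
    not the paper's actual objective).

    embeddings: list of equal-length lists of floats, one per image.
    labels:     list of category labels, one per image.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    per_anchor = []
    for i in range(n):
        # Scaled cosine similarity between anchor i and every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # anchors with no same-category partner are skipped
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of picking a same-category sample.
        per_anchor.append(
            -sum(sims[j] - log_denom for j in positives) / len(positives)
        )
    if not per_anchor:
        raise ValueError("no anchor has a same-category positive")
    return sum(per_anchor) / len(per_anchor)


# Toy 2D embeddings: the first two and the last two form tight clusters.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = category_contrastive_loss(toy, ["plane", "plane", "chair", "chair"])
mixed = category_contrastive_loss(toy, ["plane", "chair", "plane", "chair"])
```

In this toy setup the loss is lower when the category labels agree with the clusters in embedding space (`aligned`) than when they cut across them (`mixed`), which is the kind of signal a category-guided objective uses to shape the learned representation.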
Related papers
- MV-CLIP: Multi-View CLIP for Zero-shot 3D Shape Recognition [49.52436478739151]
Large-scale pre-trained models have demonstrated impressive performance in vision and language tasks within open-world scenarios.
Recent methods utilize language-image pre-training to realize zero-shot 3D shape recognition.
This paper aims to improve the confidence with view selection and hierarchical prompts.
arXiv Detail & Related papers (2023-11-30T09:51:53Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction and studies how well models generalize to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- End-to-End Learning of Multi-category 3D Pose and Shape Estimation [128.881857704338]
We propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D.
The proposed method learns both 2D detection and 3D lifting from 2D keypoint annotations alone.
In addition to being end-to-end in image to 3D learning, our method also handles objects from multiple categories using a single neural network.
arXiv Detail & Related papers (2021-12-19T17:10:40Z)
- Multi-Category Mesh Reconstruction From Image Collections [90.24365811344987]
We present an alternative approach that infers the textured mesh of objects combining a series of deformable 3D models and a set of instance-specific deformation, pose, and texture.
Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision.
Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
arXiv Detail & Related papers (2021-10-21T16:32:31Z)
- Learning Compositional Shape Priors for Few-Shot 3D Reconstruction [36.40776735291117]
We show that complex encoder-decoder architectures exploit large amounts of per-category data.
We propose three ways to learn a class-specific global shape prior, directly from data.
Experiments on the popular ShapeNet dataset show that our method outperforms a zero-shot baseline by over 40%.
arXiv Detail & Related papers (2021-06-11T14:55:49Z)
- Fine-Grained 3D Shape Classification with Hierarchical Part-View Attentions [70.0171362989609]
We propose a novel fine-grained 3D shape classification method named FG3D-Net to capture the fine-grained local details of 3D shapes from multiple rendered views.
Our results on the fine-grained 3D shape dataset show that our method outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-05-26T06:53:19Z)