Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
- URL: http://arxiv.org/abs/2104.00858v1
- Date: Fri, 2 Apr 2021 02:39:29 GMT
- Title: Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
- Authors: Feng Liu, Luan Tran, Xiaoming Liu
- Abstract summary: Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.
We take an alternative approach with semi-supervised learning. That is, for a 2D image of a generic object, we decompose it into latent representations of category, shape and albedo.
We show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting.
- Score: 33.95791350070165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inferring the 3D structure of a generic object from a 2D image is a
long-standing objective of computer vision. Conventional approaches either learn
entirely from CAD-generated synthetic data, which makes inference on real images
difficult, or generate a 2.5D depth image via intrinsic decomposition, which falls
short of full 3D reconstruction. One fundamental challenge lies in how to leverage
numerous real 2D images without any 3D ground truth. To address this issue, we take
an alternative, semi-supervised approach. That is, for a 2D image of a generic
object, we decompose it into latent representations of category, shape and albedo,
lighting, and camera projection matrix; decode the representations into a segmented
3D shape and albedo, respectively; and fuse these components to render an image
that closely approximates the input. Using a category-adaptive 3D joint occupancy
field (JOF), we show that complete shape and albedo modeling enables us to leverage
real 2D images in both modeling and model fitting. The effectiveness of our
approach is demonstrated through superior 3D reconstruction from a single image,
whether synthetic or real, and through shape segmentation.
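Below is a minimal PyTorch sketch of this analysis-by-synthesis pipeline: an encoder produces the latent codes, and a joint-occupancy-field decoder maps 3D query points plus the shape and albedo codes to occupancy and color. All module architectures, dimensions, and the omitted differentiable renderer are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (PyTorch) of the decompose-decode-render idea described above.
# Module sizes and names are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes an image into latent codes: shape, albedo, lighting, camera."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.z_shape = nn.Linear(64, z_dim)
        self.z_albedo = nn.Linear(64, z_dim)
        self.lighting = nn.Linear(64, 9)    # e.g. spherical-harmonics coefficients
        self.camera = nn.Linear(64, 12)     # flattened 3x4 projection matrix

    def forward(self, image):
        h = self.features(image)
        return self.z_shape(h), self.z_albedo(h), self.lighting(h), self.camera(h)

class JointOccupancyField(nn.Module):
    """Decodes (3D point, shape code, albedo code) -> occupancy and albedo,
    giving a complete, queryable model of surface and appearance."""
    def __init__(self, z_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 2 * z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.occ_head = nn.Linear(hidden, 1)     # inside/outside probability
        self.albedo_head = nn.Linear(hidden, 3)  # per-point RGB albedo

    def forward(self, points, z_shape, z_albedo):
        # points: (B, N, 3); latent codes are broadcast to all N query points.
        B, N, _ = points.shape
        codes = torch.cat([z_shape, z_albedo], dim=-1).unsqueeze(1).expand(B, N, -1)
        h = self.mlp(torch.cat([points, codes], dim=-1))
        return torch.sigmoid(self.occ_head(h)), torch.sigmoid(self.albedo_head(h))

# Analysis-by-synthesis: encode, decode, then (with a differentiable renderer,
# omitted here) compare the rendered image against the input photometrically.
encoder, field = Encoder(), JointOccupancyField()
image = torch.rand(1, 3, 64, 64)                  # stand-in input image
z_s, z_a, light, cam = encoder(image)
query = torch.rand(1, 1024, 3) * 2 - 1            # query points in [-1, 1]^3
occupancy, albedo = field(query, z_s, z_a)
print(occupancy.shape, albedo.shape)              # (1, 1024, 1), (1, 1024, 3)
```

Comparing the rendering against the input is what would let unlabeled real images supervise both the model and its fitting, per the abstract.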
Related papers
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves state-of-the-art semantic scene completion performance on two large-scale benchmark datasets, Matterport3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z)
- GAN2X: Non-Lambertian Inverse Rendering of Image GANs [85.76426471872855]
We present GAN2X, a new method for unsupervised inverse rendering that only uses unpaired images for training.
Unlike previous Shape-from-GAN approaches that mainly focus on 3D shapes, we make the first attempt to also recover non-Lambertian material properties by exploiting the pseudo-paired data generated by a GAN.
Experiments demonstrate that GAN2X can accurately decompose 2D images into 3D shape, albedo, and specular properties for different object categories, and achieves state-of-the-art performance for unsupervised single-view 3D face reconstruction. (A toy shading sketch follows this entry.)
arXiv Detail & Related papers (2022-06-18T16:58:49Z)
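To make the GAN2X entry above concrete: "non-Lambertian" means the recovered material includes a view-dependent specular term on top of the diffuse albedo. Below is a toy Blinn-Phong shader; the parameterization (a single specular strength and shininess) is a stand-in of my own, not the paper's material model.

```python
# Illustrative only: what "non-Lambertian" adds. A Lambertian model keeps the
# diffuse term alone; inverse rendering in the GAN2X style also recovers
# specular parameters (here a toy Blinn-Phong strength and shininess).
import numpy as np

def shade(normal, light_dir, view_dir, albedo, k_spec=0.5, shininess=32.0):
    """Per-point shading: diffuse (Lambertian) plus a Blinn-Phong specular lobe."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    diffuse = albedo * max(np.dot(n, l), 0.0)
    h = (l + v) / np.linalg.norm(l + v)           # half-vector
    specular = k_spec * max(np.dot(n, h), 0.0) ** shininess
    return diffuse + specular                     # RGB color of this surface point

# A glossy highlight appears where the half-vector aligns with the normal.
color = shade(normal=np.array([0.0, 0.0, 1.0]),
              light_dir=np.array([0.3, 0.3, 1.0]),
              view_dir=np.array([0.0, 0.0, 1.0]),
              albedo=np.array([0.8, 0.1, 0.1]))
print(color)
```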
- 3D object reconstruction and 6D-pose estimation from 2D shape for robotic grasping of objects [2.330913682033217]
We propose a method for 3D object reconstruction and 6D-pose estimation from 2D images.
Computing transformation parameters directly from the 2D images reduces the number of free parameters required during registration.
In robot experiments, successful grasping of objects demonstrates the method's usability in real-world environments.
arXiv Detail & Related papers (2022-03-02T11:58:35Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns a discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [21.962725416347855]
We learn a module that generates a realistic rendering of a 3D object and infers a realistic 3D shape from an image.
By leveraging generative domain translation methods, we are able to define a learning algorithm that requires only weak supervision, with unpaired data.
The resulting model is able to perform 3D shape, pose, and texture inference from 2D images, but can also generate novel textured 3D shapes and renderings. (A toy cycle-consistency sketch follows this entry.)
arXiv Detail & Related papers (2020-11-16T15:23:03Z)
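A toy sketch of the cycle-consistency idea from the entry above: with unpaired 2D and 3D data, an image mapped to a shape and re-rendered should return to the original image, and a shape mapped to an image and re-inferred should return to the original shape. The two linear "networks" and code dimensions below are placeholders of my own, not the paper's architecture.

```python
# Cycle-consistency with unpaired 2D/3D data: no (image, shape) pairs needed.
import torch
import torch.nn as nn

render = nn.Linear(64, 128)    # stand-in: 3D shape code -> 2D image code
infer = nn.Linear(128, 64)     # stand-in: 2D image code -> 3D shape code

images = torch.randn(8, 128)   # unpaired batch of image encodings
shapes = torch.randn(8, 64)    # unpaired batch of shape encodings

# Each domain must survive a round trip through the other domain.
loss_2d = nn.functional.mse_loss(render(infer(images)), images)
loss_3d = nn.functional.mse_loss(infer(render(shapes)), shapes)
loss = loss_2d + loss_3d
loss.backward()                # gradients flow to both translation modules
```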
- Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs [156.1209884183522]
State-of-the-art 2D generative models like GANs show unprecedented quality in modeling the natural image manifold.
We present the first attempt to directly mine 3D geometric cues from an off-the-shelf 2D GAN that is trained on RGB images only.
arXiv Detail & Related papers (2020-11-02T09:38:43Z)
- Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image.
We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose. (A toy retrieval sketch follows this entry.)
This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)
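A toy sketch of the retrieve-by-similarity step described for Mask2CAD above: embed the detected object crop and a library of CAD models into a shared space, then pick the nearest neighbor by cosine similarity. The embeddings here are random stand-ins; a real system would produce them with learned networks.

```python
# Nearest-neighbor CAD retrieval in a shared embedding space (illustrative).
import torch
import torch.nn.functional as F

num_cad_models, dim = 1000, 256
cad_embeddings = F.normalize(torch.randn(num_cad_models, dim), dim=-1)
region_embedding = F.normalize(torch.randn(1, dim), dim=-1)  # detected object crop

similarity = region_embedding @ cad_embeddings.T   # cosine similarity, (1, 1000)
best = similarity.argmax(dim=-1)                   # index of most similar CAD model
print(f"retrieved CAD model #{best.item()}")
```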
- Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations [92.89846887298852]
We present a framework to translate between 2D image views and 3D object shapes.
We propose SIST, a Self-supervised Image to Shape Translation framework.
arXiv Detail & Related papers (2020-03-22T22:44:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.