From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks
- URL: http://arxiv.org/abs/2005.01939v1
- Date: Tue, 5 May 2020 04:25:16 GMT
- Title: From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks
- Authors: K L Navaneet, Ansu Mathew, Shashank Kashyap, Wei-Chih Hung, Varun Jampani and R. Venkatesh Babu
- Abstract summary: Reconstructing 3D models from 2D images is one of the fundamental problems in computer vision.
We propose a deep learning technique for 3D object reconstruction from a single image.
We learn both 3D point cloud reconstruction and pose estimation networks in a self-supervised manner.
- Score: 53.71440550507745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing 3D models from 2D images is one of the fundamental
problems in computer vision. In this work, we propose a deep learning
technique for 3D object reconstruction from a single image. Contrary to
recent works that use either 3D supervision or multi-view supervision, we
train with only single-view images and no pose information. This makes our
approach more practical, requiring only an image collection of an object
category and the corresponding silhouettes. We learn both the 3D point cloud
reconstruction and pose estimation networks in a self-supervised manner,
making use of a differentiable point cloud renderer to train with 2D
supervision. A key novelty of the proposed technique is to impose 3D
geometric reasoning on the predicted point clouds by rotating them with
randomly sampled poses and then enforcing cycle consistency on both the 3D
reconstructions and the poses. In addition, single-view supervision allows
us to perform test-time optimization on a given test image. Experiments on
the synthetic ShapeNet and real-world Pix3D datasets demonstrate that our
approach, despite using less supervision, achieves performance competitive
with pose-supervised and multi-view supervised approaches.
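The two training signals in the abstract can be made concrete with a toy example. Below is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the networks, the soft splatting used as a stand-in for the differentiable point cloud renderer, the single-angle pose parameterization, and all hyperparameters are illustrative assumptions.

```python
# Toy sketch of the two losses the abstract describes: a 2D silhouette loss
# through a differentiable projection, and a pose cycle-consistency loss.
# Everything here (architectures, soft "renderer", hyperparameters) is assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_POINTS, IMG = 256, 32

class ShapeNet(nn.Module):
    """Silhouette image -> canonical point cloud (B, N_POINTS, 3)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(IMG * IMG, 256), nn.ReLU(),
            nn.Linear(256, N_POINTS * 3), nn.Tanh())
    def forward(self, x):
        return self.net(x).view(-1, N_POINTS, 3)

class PoseNet(nn.Module):
    """Silhouette image -> azimuth angle in (-pi, pi)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(IMG * IMG, 256), nn.ReLU(),
            nn.Linear(256, 1))
    def forward(self, x):
        return torch.pi * torch.tanh(self.net(x)).squeeze(-1)

def rotate(pts, az):
    """Rotate point clouds (B, N, 3) about the y axis by azimuths az (B,)."""
    c, s = torch.cos(az), torch.sin(az)
    o, z = torch.ones_like(c), torch.zeros_like(c)
    R = torch.stack([c, z, s, z, o, z, -s, z, c], -1).view(-1, 3, 3)
    return pts @ R.transpose(1, 2)

def render_silhouette(pts, sigma=0.5):
    """Differentiable 'renderer': orthographically splat x, y onto a grid."""
    lin = torch.linspace(-1, 1, IMG, device=pts.device)
    yy, xx = torch.meshgrid(lin, lin, indexing="ij")
    grid = torch.stack([xx, yy], -1).view(1, 1, IMG * IMG, 2)
    d2 = ((pts[:, :, None, :2] - grid) ** 2).sum(-1)   # (B, N, pixels)
    # A pixel is "on" if any point lands near it (soft OR over points).
    sil = 1.0 - torch.prod(1.0 - torch.exp(-d2 / sigma**2), dim=1)
    return sil.view(-1, IMG, IMG)

shape_net, pose_net = ShapeNet(), PoseNet()
opt = torch.optim.Adam(
    list(shape_net.parameters()) + list(pose_net.parameters()), lr=1e-3)

def training_step(sil_gt):
    pts, pose = shape_net(sil_gt), pose_net(sil_gt)
    # 2D supervision: render from the predicted pose, match the input mask.
    pred = render_silhouette(rotate(pts, pose)).clamp(1e-4, 1 - 1e-4)
    loss_sil = F.binary_cross_entropy(pred, sil_gt)
    # Cycle consistency: rotate by a random pose, re-render, re-predict, and
    # require the shape and the sampled pose to come back unchanged.
    rand = (2 * torch.rand(pts.shape[0]) - 1) * torch.pi
    sil2 = render_silhouette(rotate(pts, rand)).detach()
    loss_cyc = (F.mse_loss(shape_net(sil2), pts.detach())
                + F.mse_loss(pose_net(sil2), rand))
    loss = loss_sil + loss_cyc
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)

masks = (torch.rand(4, IMG, IMG) > 0.8).float()   # stand-in silhouettes
print(training_step(masks))
```

Because the supervision is per image, the same silhouette loss can in principle be minimized over a single test image alone, which is the test-time optimization the abstract refers to.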
Related papers
- Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models [97.58685709663287]
Generative pre-training can boost the performance of fundamental models in 2D vision.
In 3D vision, the over-reliance on Transformer-based backbones and the unordered nature of point clouds have restricted the further development of generative pre-training.
We propose a novel 3D-to-2D generative pre-training method that is adaptable to any point cloud model.
arXiv Detail & Related papers (2023-07-27T16:07:03Z)
- Leveraging Single-View Images for Unsupervised 3D Point Cloud Completion [53.93172686610741]
Cross-PCC is an unsupervised point cloud completion method without requiring any 3D complete point clouds.
To take advantage of the complementary information from 2D images, we use a single-view RGB image to extract 2D features.
Our method even achieves comparable performance to some supervised methods.
arXiv Detail & Related papers (2022-12-01T15:11:21Z)
- CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding [2.8661021832561757]
CrossPoint is a simple cross-modal contrastive learning approach to learn transferable 3D point cloud representations.
Our approach outperforms the previous unsupervised learning methods on a diverse range of downstream tasks including 3D object classification and segmentation.
arXiv Detail & Related papers (2022-03-01T18:59:01Z)
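The CrossPoint entry above rests on a cross-modal contrastive objective. As a rough illustration only, the following sketches an InfoNCE-style loss that aligns point cloud and image embeddings of the same object; the embedding dimensions and batch pairing are assumptions, not CrossPoint's actual code.

```python
# Hypothetical cross-modal InfoNCE loss: row i of each batch is assumed to be
# a matching (point cloud, image) pair of the same object.
import torch
import torch.nn.functional as F

def cross_modal_nce(pc_emb, img_emb, tau=0.07):
    """pc_emb, img_emb: (B, D) embeddings from two modality-specific encoders."""
    pc, img = F.normalize(pc_emb, dim=-1), F.normalize(img_emb, dim=-1)
    logits = pc @ img.t() / tau                      # (B, B) scaled similarities
    labels = torch.arange(pc.size(0), device=pc.device)
    # Symmetric InfoNCE: each modality must retrieve its own counterpart.
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.t(), labels))

print(cross_modal_nce(torch.randn(8, 128), torch.randn(8, 128)))
```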
- Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images [18.888384816156744]
We propose a novel unsupervised algorithm to learn a 3D reconstruction network from a multi-image dataset.
Our algorithm is more general and covers the symmetry-required scenario as a special case.
Our method surpasses the previous work in both quality and robustness.
arXiv Detail & Related papers (2021-09-06T08:34:04Z)
- Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching [66.98712589559028]
We propose an unsupervised approach for 3D point cloud generation with fine structures.
Our method can recover fine 3D structures from 2D silhouette images at different resolutions.
arXiv Detail & Related papers (2021-08-08T22:15:31Z)
- Model-based 3D Hand Reconstruction via Self-Supervised Learning [72.0817813032385]
Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity.
We propose S2HAND, a self-supervised 3D hand reconstruction network that can jointly estimate pose, shape, texture, and the camera viewpoint.
For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.
arXiv Detail & Related papers (2021-03-22T10:12:43Z)
- An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering [0.0]
Differentiable rendering is a very successful technique for single-view 3D reconstruction.
Current methods use pixel-based losses between rendered images of a reconstructed 3D object and ground-truth images from matched viewpoints to optimise the parameters of the 3D shape.
We propose a novel effective loss function that evaluates how well the projections of reconstructed 3D point clouds cover the ground truth object's silhouette.
arXiv Detail & Related papers (2021-03-05T00:02:18Z)
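One plausible reading of the rendering-free coverage loss described in the entry above is a symmetric 2D Chamfer distance between projected points and the silhouette's foreground pixels. The sketch below uses an assumed orthographic projection and is not the paper's exact formulation.

```python
# Illustrative coverage loss: project the point cloud to 2D and measure, in
# both directions, how far points fall from the silhouette's foreground pixels.
import torch

def coverage_loss(pts, sil):
    """pts: (N, 3) float point cloud in [-1, 1]^3; sil: (H, W) binary mask."""
    proj = pts[:, :2]                              # assumed orthographic projection
    ys, xs = torch.nonzero(sil, as_tuple=True)     # foreground pixel indices
    h, w = sil.shape
    fg = torch.stack([2 * xs / (w - 1) - 1,        # pixel centers in [-1, 1]
                      2 * ys / (h - 1) - 1], dim=-1)
    d = torch.cdist(proj, fg)                      # (N, M) pairwise distances
    # Every projected point should lie on the silhouette, and every
    # silhouette pixel should be covered by some projected point.
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

sil = torch.zeros(32, 32); sil[8:24, 8:24] = 1.0   # toy square silhouette
print(coverage_loss(torch.rand(256, 3) * 2 - 1, sil))
```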
- Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations [73.11883464562895]
We propose a new architecture that facilitates unsupervised, or lightly supervised, learning.
We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images.
While we present results for modeling humans, our formulation is general and can be applied to other vision problems.
arXiv Detail & Related papers (2020-01-06T14:54:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.