$PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D
Reconstruction
- URL: http://arxiv.org/abs/2302.10668v2
- Date: Thu, 23 Feb 2023 16:03:05 GMT
- Authors: Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi
- Abstract summary: Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
- Score: 97.06927852165464
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Reconstructing the 3D shape of an object from a single RGB image is a
long-standing and highly challenging problem in computer vision. In this paper,
we propose a novel method for single-image 3D reconstruction which generates a
sparse point cloud via a conditional denoising diffusion process. Our method
takes as input a single RGB image along with its camera pose and gradually
denoises a set of 3D points, whose positions are initially sampled randomly
from a three-dimensional Gaussian distribution, into the shape of an object.
The key to our method is a geometrically-consistent conditioning process which
we call projection conditioning: at each step in the diffusion process, we
project local image features onto the partially-denoised point cloud from the
given camera pose. This projection conditioning process enables us to generate
high-resolution sparse geometries that are well-aligned with the input image,
and can additionally be used to predict point colors after shape
reconstruction. Moreover, due to the probabilistic nature of the diffusion
process, our method is naturally capable of generating multiple different
shapes consistent with a single input image. In contrast to prior work, our
approach not only performs well on synthetic benchmarks, but also gives large
qualitative improvements on complex real-world data.
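The projection conditioning step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a pinhole camera with intrinsics `K` and pose `(R, t)`, a precomputed per-pixel `feature_map`, nearest-pixel feature lookup, and a hypothetical `denoiser` callable standing in for the learned network. The noise schedule of a real diffusion sampler is omitted.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project (N, 3) world points to (N, 2) pixel coordinates (pinhole model)."""
    cam = points @ R.T + t                      # world frame -> camera frame
    z = np.clip(cam[:, 2:3], 1e-6, None)        # guard against division by zero
    return (cam @ K.T)[:, :2] / z               # perspective divide

def sample_features(feature_map, uv):
    """Nearest-pixel lookup of (H, W, C) features at (N, 2) pixel coordinates."""
    h, w, _ = feature_map.shape
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)
    return feature_map[rows, cols]

def projection_conditioned_sampling(feature_map, K, R, t, denoiser,
                                    num_points=1024, num_steps=50, seed=0):
    """Denoise Gaussian points, re-projecting image features at every step.

    `denoiser(conditioned_points, step)` is a hypothetical placeholder that
    takes (N, 3 + C) inputs and returns updated (N, 3) point positions.
    """
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((num_points, 3))   # initial 3D Gaussian sample
    for step in reversed(range(num_steps)):
        uv = project_points(points, K, R, t)        # where each point lands in the image
        feats = sample_features(feature_map, uv)    # local image features per point
        conditioned = np.concatenate([points, feats], axis=1)
        points = denoiser(conditioned, step)        # one denoising update
    return points
```

Because the features are re-sampled at every step, each point is always conditioned on the image region it currently projects to, which is the sense in which the abstract calls the conditioning geometrically consistent. Drawing a different `seed` would give a different initial point set, in line with the abstract's remark that the method can produce multiple shapes for one image.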
Related papers
- RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images [13.051302134031808]
We introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image.
Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence.
arXiv Detail & Related papers (2024-05-14T10:10:45Z)
- Wonder3D: Single Image to 3D using Cross-Domain Diffusion [105.16622018766236]
Wonder3D is a novel method for efficiently generating high-fidelity textured meshes from single-view images.
To holistically improve the quality, consistency, and efficiency of image-to-3D tasks, we propose a cross-domain diffusion model.
arXiv Detail & Related papers (2023-10-23T15:02:23Z)
- Relightify: Relightable 3D Faces from a Single Image via Diffusion Models [86.3927548091627]
We present the first approach to use diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image.
In contrast to existing methods, we directly acquire the observed texture from the input image, resulting in more faithful and consistent estimation.
arXiv Detail & Related papers (2023-05-10T11:57:49Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator, to which we apply a hybrid inversion scheme where a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods, and is robust in different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry [12.511526058118143]
We propose a sampling scheme that theoretically encourages generalization and results in fast convergence for SGD-based optimization algorithms.
Based on the reflective symmetry of an object, we propose a feature fusion method that alleviates issues due to self-occlusions.
Our proposed system Ladybird is able to create high quality 3D object reconstructions from a single input image.
arXiv Detail & Related papers (2020-07-27T09:17:00Z)