UNeR3D: Versatile and Scalable 3D RGB Point Cloud Generation from 2D
Images in Unsupervised Reconstruction
- URL: http://arxiv.org/abs/2312.06706v1
- Date: Sun, 10 Dec 2023 15:18:55 GMT
- Title: UNeR3D: Versatile and Scalable 3D RGB Point Cloud Generation from 2D
Images in Unsupervised Reconstruction
- Authors: Hongbin Lin, Juangui Xu, Qingfeng Xu, Zhengyu Hu, Handing Xu, Yunzhi
Chen, Yongjun Hu, Zhenguo Nie
- Abstract summary: UNeR3D sets a new standard for generating detailed 3D reconstructions solely from 2D views.
Our model significantly cuts down the training costs tied to supervised approaches.
UNeR3D ensures seamless color transitions, enhancing visual fidelity.
- Score: 2.7848140839111903
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the realm of 3D reconstruction from 2D images, a persistent challenge is
to achieve high-precision reconstructions without relying on 3D ground-truth
data. We present UNeR3D, a pioneering unsupervised methodology that sets a
new standard for generating detailed 3D reconstructions solely from 2D views.
Our model significantly cuts down the training costs tied to supervised
approaches and introduces RGB coloration to 3D point clouds, enriching the
visual experience. Employing an inverse distance weighting technique for color
rendering, UNeR3D ensures seamless color transitions, enhancing visual
fidelity. Our model's flexible architecture supports training with any number
of views, and uniquely, it is not constrained by the number of views used
during training when performing reconstructions. It can infer with an arbitrary
count of views during inference, offering unparalleled versatility.
Additionally, the model's continuous spatial input domain allows the generation
of point clouds at any desired resolution, empowering the creation of
high-resolution 3D RGB point clouds. We solidify the reconstruction process
with a novel multi-view geometric loss and color loss, demonstrating that our
model excels with single-view inputs and beyond, thus reshaping the paradigm of
unsupervised learning in 3D vision. Our contributions signal a substantial leap
forward in 3D vision, offering new horizons for content creation across diverse
applications. Code is available at https://github.com/HongbinLin3589/UNeR3D.
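To make the color-rendering idea concrete, below is a minimal NumPy sketch of inverse distance weighting (IDW) applied to point cloud colors: each query point's RGB value is a distance-weighted average of its nearest colored neighbors, which is what produces the smooth color transitions described above. The function name, the neighborhood size k, and the power parameter p are illustrative assumptions, not details of the UNeR3D implementation.

```python
# Hypothetical IDW color interpolation sketch (not the paper's actual code).
import numpy as np

def idw_colors(query_xyz, ref_xyz, ref_rgb, k=8, p=2.0, eps=1e-8):
    """Estimate RGB colors at query points by inverse distance weighting
    over the k nearest reference points.

    query_xyz: (M, 3) points whose colors we want
    ref_xyz:   (N, 3) reference points with known colors
    ref_rgb:   (N, 3) colors in [0, 1]
    """
    # Pairwise distances between query and reference points, shape (M, N).
    d = np.linalg.norm(query_xyz[:, None, :] - ref_xyz[None, :, :], axis=-1)

    # Indices and distances of the k nearest reference points, shape (M, k).
    nn = np.argsort(d, axis=1)[:, :k]
    d_nn = np.take_along_axis(d, nn, axis=1)

    # Inverse distance weights; eps avoids division by zero at coincident points.
    w = 1.0 / (d_nn ** p + eps)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted average of neighbor colors yields smooth color transitions.
    return (w[..., None] * ref_rgb[nn]).sum(axis=1)

# Example usage with random data:
# colors = idw_colors(np.random.rand(100, 3),
#                     np.random.rand(1000, 3),
#                     np.random.rand(1000, 3))
```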
Related papers
- LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.
Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data.
We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D [16.66666619143761]
Multi-view (MV) 3D reconstruction is a promising solution to fuse generated MV images into consistent 3D objects.
However, the generated images usually suffer from inconsistent lighting, misaligned geometry, and sparse views, leading to poor reconstruction quality.
We present a novel 3D reconstruction framework that leverages intrinsic decomposition guidance, transient-mono prior guidance, and view augmentation to cope with the three issues.
arXiv Detail & Related papers (2024-01-29T02:30:31Z)
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture [47.44029968307207]
We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
arXiv Detail & Related papers (2023-11-01T11:46:15Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Points2NeRF: Generating Neural Radiance Fields from 3D point cloud [0.0]
We propose representing 3D objects as Neural Radiance Fields (NeRFs).
We leverage a hypernetwork paradigm and train the model to take a 3D point cloud with the associated color values and return the weights of a NeRF network that reconstructs the object from 2D images.
Our method provides efficient 3D object representation and offers several advantages over the existing approaches.
arXiv Detail & Related papers (2022-06-02T20:23:33Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
- SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans [34.397726189729994]
SPSG is a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations.
Our self-supervised approach learns to jointly inpaint geometry and color by correlating an incomplete RGB-D scan with a more complete version of that scan.
arXiv Detail & Related papers (2020-06-25T18:58:23Z)
- From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks [53.71440550507745]
Reconstructing 3D models from 2D images is one of the fundamental problems in computer vision.
We propose a deep learning technique for 3D object reconstruction from a single image.
We learn both 3D point cloud reconstruction and pose estimation networks in a self-supervised manner.
arXiv Detail & Related papers (2020-05-05T04:25:16Z)