Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture
- URL: http://arxiv.org/abs/2311.00457v1
- Date: Wed, 1 Nov 2023 11:46:15 GMT
- Title: Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture
- Authors: Yixin Chen, Junfeng Ni, Nan Jiang, Yaowei Zhang, Yixin Zhu, Siyuan Huang
- Abstract summary: We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
- Score: 47.44029968307207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing detailed 3D scenes from single-view images remains a
challenging task due to limitations in existing approaches, which primarily
focus on geometric shape recovery, overlooking object appearances and fine
shape details. To address these challenges, we propose a novel framework for
simultaneous high-fidelity recovery of object shapes and textures from
single-view images. Our approach utilizes the proposed Single-view neural
implicit Shape and Radiance field (SSR) representations to leverage both
explicit 3D shape supervision and volume rendering of color, depth, and surface
normal images. To overcome shape-appearance ambiguity under partial
observations, we introduce a two-stage learning curriculum incorporating both
3D and 2D supervisions. A distinctive feature of our framework is its ability
to generate fine-grained textured meshes while seamlessly integrating rendering
capabilities into the single-view 3D reconstruction model. This integration
enables not only improved textured 3D object reconstruction by 27.7% and 11.6%
on the 3D-FRONT and Pix3D datasets, respectively, but also supports the
rendering of images from novel viewpoints. Beyond individual objects, our
approach facilitates composing object-level representations into flexible scene
representations, thereby enabling applications such as holistic scene
understanding and 3D scene editing. We conduct extensive experiments to
demonstrate the effectiveness of our method.
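To make the supervision scheme described above more concrete, the sketch below shows, in PyTorch, how a neural implicit shape-and-radiance field can be volume-rendered into color, depth, and surface normal images for 2D supervision while its signed distance output receives explicit 3D supervision. This is a minimal illustration under assumed design choices, not the authors' SSR implementation: the `ShapeRadianceField` and `render_rays` names, the network sizes, and the VolSDF-style Laplace SDF-to-density transform are all hypothetical.

```python
# Minimal sketch (not the paper's code): an MLP predicts an SDF and a color at 3D
# points; the SDF is turned into a density (VolSDF-style Laplace transform, an
# assumption here) and volume-rendered along camera rays into color, depth, and
# normal images, which can then be compared against 2D supervision. The SDF itself
# can be supervised directly with explicit 3D shape targets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapeRadianceField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.sdf_net = nn.Sequential(nn.Linear(3, hidden), nn.Softplus(),
                                     nn.Linear(hidden, hidden), nn.Softplus(),
                                     nn.Linear(hidden, 1))
        self.rgb_net = nn.Sequential(nn.Linear(3 + 3, hidden), nn.Softplus(),
                                     nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, pts, view_dirs):
        sdf = self.sdf_net(pts)                                   # signed distance per point
        rgb = self.rgb_net(torch.cat([pts, view_dirs], dim=-1))   # view-dependent color
        return sdf, rgb

def render_rays(field, origins, dirs, n_samples=64, near=0.1, far=4.0, beta=0.05):
    """Volume-render color, depth, and normals along rays (hypothetical helper)."""
    t = torch.linspace(near, far, n_samples, device=origins.device)      # (S,)
    pts = origins[:, None, :] + t[None, :, None] * dirs[:, None, :]      # (R, S, 3)
    # Assumes origins/dirs do not require grad; pts needs grad for SDF normals.
    pts = pts.reshape(-1, 3).requires_grad_(True)
    dirs_flat = dirs[:, None, :].expand(-1, n_samples, -1).reshape(-1, 3)

    sdf, rgb = field(pts, dirs_flat)
    # Surface normals are the gradient of the SDF with respect to position.
    normals = torch.autograd.grad(sdf.sum(), pts, create_graph=True)[0]

    # Laplace-CDF density from the SDF (assumed transform; the paper's may differ).
    density = (0.5 + 0.5 * sdf.sign() * torch.expm1(-sdf.abs() / beta)) / beta
    density = density.reshape(-1, n_samples)
    rgb = rgb.reshape(-1, n_samples, 3)
    normals = F.normalize(normals, dim=-1).reshape(-1, n_samples, 3)

    # Standard alpha compositing of the per-sample contributions.
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-density * delta)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha[:, :-1] + 1e-10], dim=-1), dim=-1)
    weights = alpha * trans                                               # (R, S)

    color = (weights[..., None] * rgb).sum(dim=1)                         # (R, 3)
    depth = (weights * t[None, :]).sum(dim=1)                             # (R,)
    normal = (weights[..., None] * normals).sum(dim=1)                    # (R, 3)
    return color, depth, normal
```

Under a two-stage curriculum like the one described in the abstract, the SDF branch could first be fitted against explicit 3D shape supervision, after which photometric, depth, and normal losses on the rendered outputs would refine appearance and fine geometry.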
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- 3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface [8.824340350342512]
3DFIRES is a novel system for scene-level 3D reconstruction from posed images.
We show it matches the efficacy of single-view reconstruction methods with only one input.
arXiv Detail & Related papers (2024-03-13T17:59:50Z)
- Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z)
- Structured 3D Features for Reconstructing Controllable Avatars [43.36074729431982]
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation.
arXiv Detail & Related papers (2022-12-13T18:57:33Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract its 3-D shape, animate its plausible articulations and motions, and render it in-situ in its original image space?
We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)