Towards High-Fidelity Single-view Holistic Reconstruction of Indoor
Scenes
- URL: http://arxiv.org/abs/2207.08656v1
- Date: Mon, 18 Jul 2022 14:54:57 GMT
- Title: Towards High-Fidelity Single-view Holistic Reconstruction of Indoor
Scenes
- Authors: Haolin Liu, Yujian Zheng, Guanying Chen, Shuguang Cui and Xiaoguang
Han
- Abstract summary: We present a new framework to reconstruct holistic 3D indoor scenes from single-view images.
We propose an instance-aligned implicit function (InstPIFu) for detailed object reconstruction.
Our code and model will be made publicly available.
- Score: 50.317223783035075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new framework to reconstruct holistic 3D indoor scenes including
both room background and indoor objects from single-view images. Existing
methods can only produce 3D shapes of indoor objects with limited geometric
quality because of the heavy occlusion in indoor scenes. To solve this, we
propose an instance-aligned implicit function (InstPIFu) for detailed object
reconstruction. Combined with an instance-aligned attention module, our method
can decouple the mixed local features of occluded instances. Additionally,
unlike previous methods that simply represent the room background as a 3D
bounding box, a depth map, or a set of planes, we recover the fine geometry of
the background via an implicit representation. Extensive experiments on the
SUN RGB-D, Pix3D, 3D-FUTURE, and 3D-FRONT datasets
demonstrate that our method outperforms existing approaches in both background
and foreground object reconstruction. Our code and model will be made publicly
available.
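To make the idea concrete, here is a minimal sketch of an instance-aligned implicit function with an attention step over pixel-aligned features. All module names, layer sizes, and the attention layout are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of an InstPIFu-style instance-aligned implicit function.
# Shapes and the attention layout are assumptions for clarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InstanceAlignedImplicitFn(nn.Module):
    def __init__(self, feat_dim=256, inst_dim=64, hidden=256):
        super().__init__()
        self.inst_proj = nn.Linear(inst_dim, feat_dim)
        # Attention re-weights the mixed local features for the target instance.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # occupancy logit per query point
        )

    def forward(self, feat_map, inst_emb, points, uv):
        # feat_map: (B, C, H, W) image features; inst_emb: (B, inst_dim)
        # points: (B, N, 3) query points in the instance-aligned frame
        # uv: (B, N, 2) projections of the points into the image, in [-1, 1]
        local = F.grid_sample(feat_map, uv.unsqueeze(2), align_corners=True)
        local = local.squeeze(-1).transpose(1, 2)          # (B, N, C) pixel-aligned
        q = local + self.inst_proj(inst_emb).unsqueeze(1)  # instance-conditioned queries
        attended, _ = self.attn(q, local, local)           # suppress occluder features
        occ = self.mlp(torch.cat([attended, points], dim=-1))
        return occ.squeeze(-1)                             # (B, N) occupancy logits

# Smoke test with random tensors.
net = InstanceAlignedImplicitFn()
occ = net(torch.randn(2, 256, 64, 64), torch.randn(2, 64),
          torch.randn(2, 1024, 3), torch.rand(2, 1024, 2) * 2 - 1)
print(occ.shape)  # torch.Size([2, 1024])
```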
Related papers
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture [47.44029968307207]
We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
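As a rough illustration of pairing volume rendering with explicit 3D shape supervision, consider the sketch below; the SDF-to-density conversion (VolSDF-style) and the loss weighting are assumptions, not the SSR paper's exact formulation.

```python
# Sketch: explicit SDF supervision combined with a volume-rendering loss.
import torch

def sdf_to_density(sdf, beta=0.1):
    # VolSDF-style conversion: density concentrates near the SDF zero level set.
    return (0.5 / beta) * torch.exp(-sdf.abs() / beta)

def render_ray(rgb, sdf, deltas):
    # rgb: (S, 3) colors, sdf: (S,) signed distances, deltas: (S,) step sizes
    alpha = 1.0 - torch.exp(-sdf_to_density(sdf) * deltas)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans                      # standard alpha compositing
    return (weights.unsqueeze(-1) * rgb).sum(0)  # composited pixel color

def ssr_style_loss(pred_rgb, gt_rgb, pred_sdf, gt_sdf, w_shape=1.0):
    photo = ((pred_rgb - gt_rgb) ** 2).mean()    # 2D supervision via rendering
    shape = (pred_sdf - gt_sdf).abs().mean()     # explicit 3D shape supervision
    return photo + w_shape * shape

# Smoke test: 64 samples along one ray.
c = render_ray(torch.rand(64, 3), torch.randn(64), torch.full((64,), 0.05))
print(c.shape)  # torch.Size([3])
```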
arXiv Detail & Related papers (2023-11-01T11:46:15Z)
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild [61.090129285205805]
We introduce Anything-3D, a methodical framework that ingeniously combines a series of visual-language models and the Segment-Anything object segmentation model.
Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model for the effective extraction of objects of interest, and leverages a text-to-image diffusion model to lift the object into a neural radiance field.
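The three-stage structure can be summarized in a short orchestration sketch; the helper functions below are hypothetical placeholders standing in for BLIP, Segment-Anything, and a diffusion-guided NeRF optimizer, not real APIs of those codebases.

```python
# High-level sketch of the pipeline as the abstract describes it.

def caption_with_blip(image):
    """Hypothetical wrapper: image -> textual description."""
    raise NotImplementedError

def segment_with_sam(image):
    """Hypothetical wrapper: image -> binary mask of the object of interest."""
    raise NotImplementedError

def lift_to_nerf(masked_image, text_prompt):
    """Hypothetical wrapper: diffusion-guided optimization of a radiance field."""
    raise NotImplementedError

def anything_3d(image):
    text = caption_with_blip(image)     # 1. describe the object (vision-language)
    mask = segment_with_sam(image)      # 2. isolate the object (segmentation)
    return lift_to_nerf(image * mask,   # 3. lift the masked object to a NeRF,
                        text)           #    guided by the text description
```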
arXiv Detail & Related papers (2023-04-19T16:39:51Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
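A minimal sketch of the joint formulation, assuming a per-vertex offset head and an axis-angle-plus-translation pose head (both illustrative choices, not the paper's exact design):

```python
# Sketch: deform a synthetic category template and regress pose jointly.
import torch
import torch.nn as nn

class DeformPriorAndPose(nn.Module):
    def __init__(self, n_verts, feat_dim=512):
        super().__init__()
        self.deform_head = nn.Linear(feat_dim, n_verts * 3)  # per-vertex offsets
        self.pose_head = nn.Linear(feat_dim, 6)              # axis-angle + translation

    def forward(self, img_feat, template_verts):
        # img_feat: (B, F); template_verts: (V, 3) synthetic category shape prior
        B = img_feat.shape[0]
        offsets = self.deform_head(img_feat).view(B, -1, 3)
        verts = template_verts.unsqueeze(0) + offsets        # instance-specific shape
        rot, trans = self.pose_head(img_feat).split(3, dim=-1)
        return verts, rot, trans

net = DeformPriorAndPose(n_verts=642)
v, r, t = net(torch.randn(2, 512), torch.randn(642, 3))
print(v.shape, r.shape, t.shape)  # (2, 642, 3) (2, 3) (2, 3)
```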
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
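One way to picture the factorization is an encoder with separate shape, appearance, and camera heads, where the classifier reads only the viewpoint-invariant codes; the split sizes and heads below are assumptions:

```python
# Sketch: disentangle shape/appearance from viewpoint for recognition.
import torch
import torch.nn as nn

class CanonicalObjectEncoder(nn.Module):
    def __init__(self, feat=512, d_shape=128, d_app=128, d_cam=16, n_classes=200):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat), nn.ReLU())
        self.shape_head = nn.Linear(feat, d_shape)  # canonical 3D shape code
        self.app_head = nn.Linear(feat, d_app)      # appearance (texture) code
        self.cam_head = nn.Linear(feat, d_cam)      # viewpoint, dropped for recognition
        self.classifier = nn.Linear(d_shape + d_app, n_classes)

    def forward(self, img):
        f = self.backbone(img)
        z_shape, z_app, z_cam = self.shape_head(f), self.app_head(f), self.cam_head(f)
        # Fine-grained recognition uses only the viewpoint-invariant factors.
        logits = self.classifier(torch.cat([z_shape, z_app], dim=-1))
        return logits, (z_shape, z_app, z_cam)

logits, _ = CanonicalObjectEncoder()(torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 200])
```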
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract the 3-D shape of the chair, animate its plausible articulations and motions, and render it in-situ in its original image space?
We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)
- Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
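The fidelity-versus-simplicity trade-off can be sketched as a loss with two terms; the concrete simplicity penalty below (the spatial extent of each part) is an illustrative stand-in for the learned part shape prior:

```python
# Sketch: parts must jointly cover the object while staying geometrically simple.
import torch

def part_discovery_loss(part_points, object_points, w_simple=0.1):
    # part_points: (P, M, 3) points sampled from P predicted parts
    # object_points: (N, 3) points sampled from the target object surface
    parts_all = part_points.reshape(-1, 3)
    # Fidelity: every object point should lie near some part (one-sided Chamfer).
    d = torch.cdist(object_points, parts_all)          # (N, P*M)
    fidelity = d.min(dim=1).values.mean()
    # Simplicity: penalize parts with a large spatial extent.
    centroids = part_points.mean(dim=1, keepdim=True)  # (P, 1, 3)
    extent = (part_points - centroids).norm(dim=-1).mean()
    return fidelity + w_simple * extent

loss = part_discovery_loss(torch.randn(4, 128, 3), torch.randn(1024, 3))
print(loss.item())
```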
arXiv Detail & Related papers (2021-07-28T20:29:16Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
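A minimal sketch of this center-point formulation, in the spirit of CenterNet-style heads; the channel counts and the 9-DoF split (size, translation, rotation) are assumptions:

```python
# Sketch: a heatmap localizes object centers; per-pixel heads regress the rest.
import torch
import torch.nn as nn

class CenterPoint3DHead(nn.Module):
    def __init__(self, in_ch=64, n_classes=10, shape_dim=256):
        super().__init__()
        self.heatmap = nn.Conv2d(in_ch, n_classes, 1)  # object centers per class
        self.box9dof = nn.Conv2d(in_ch, 9, 1)          # size, translation, rotation
        self.shape = nn.Conv2d(in_ch, shape_dim, 1)    # latent code decoded to a mesh

    def forward(self, feat):
        return {
            "heatmap": self.heatmap(feat).sigmoid(),
            "box": self.box9dof(feat),
            "shape": self.shape(feat),
        }

# At inference, peaks in the heatmap give center pixels; box and shape maps are
# read out at those pixels, keeping detection and reconstruction single-stage.
out = CenterPoint3DHead()(torch.randn(1, 64, 128, 128))
print(out["heatmap"].shape)  # torch.Size([1, 10, 128, 128])
```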
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- CoReNet: Coherent 3D scene reconstruction from a single RGB image [43.74240268086773]
We build on advances in deep learning to reconstruct the shape of a single object given only one RGB image as input.
We propose three extensions: (1) ray-traced skip connections that propagate local 2D information to the output 3D volume in a physically correct manner; (2) a hybrid 3D volume representation that enables building translation equivariant models; and (3) a reconstruction loss tailored to capture overall object geometry.
We reconstruct all objects jointly in one pass, producing a coherent reconstruction, where all objects live in a single consistent 3D coordinate frame relative to the camera and they do not intersect in 3D space.
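Extension (1) can be sketched as projecting each voxel center through the camera into the 2D feature map and sampling there, so image evidence lands at geometrically correct 3D locations; the pinhole projection below is an assumed setup, not CoReNet's exact implementation:

```python
# Sketch of a ray-traced skip connection: voxel centers project into the image
# and gather local 2D features at the physically correct locations.
import torch
import torch.nn.functional as F

def ray_traced_skip(feat2d, voxel_xyz, K):
    # feat2d: (B, C, H, W) image features; voxel_xyz: (B, V, 3) voxel centers
    # (camera frame, z > 0); K: (3, 3) pinhole intrinsics
    uvw = voxel_xyz @ K.T                       # project: (B, V, 3)
    uv = uvw[..., :2] / uvw[..., 2:].clamp(min=1e-6)
    H, W = feat2d.shape[-2:]
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    uv = torch.stack([uv[..., 0] / (W - 1), uv[..., 1] / (H - 1)], -1) * 2 - 1
    sampled = F.grid_sample(feat2d, uv.unsqueeze(2), align_corners=True)
    return sampled.squeeze(-1).transpose(1, 2)  # (B, V, C) per-voxel 2D features

K = torch.tensor([[100., 0., 32.], [0., 100., 32.], [0., 0., 1.]])
pts = torch.rand(1, 500, 3) + torch.tensor([0., 0., 1.])  # keep z positive
f = ray_traced_skip(torch.randn(1, 32, 64, 64), pts, K)
print(f.shape)  # torch.Size([1, 500, 32])
```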
arXiv Detail & Related papers (2020-04-27T17:53:07Z)
- Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image [24.99186733297264]
We propose an end-to-end solution to jointly reconstruct room layout, object bounding boxes and meshes from a single image.
Our method builds upon holistic scene context and adopts a coarse-to-fine hierarchy with three components.
Experiments on the SUN RGB-D and Pix3D datasets demonstrate that our method consistently outperforms existing methods.
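The coarse-to-fine hierarchy can be pictured as three cooperating modules; the interfaces below are illustrative assumptions, not the released Total3DUnderstanding code:

```python
# Sketch: scene layout -> object boxes -> object meshes, trained jointly.
import torch.nn as nn

class HolisticSceneNet(nn.Module):
    """Coarse-to-fine composition of layout, box, and mesh modules."""

    def __init__(self, layout_net, box_net, mesh_net):
        super().__init__()
        self.layout_net = layout_net  # coarse: room layout from the full image
        self.box_net = box_net        # middle: per-object 3D bounding boxes
        self.mesh_net = mesh_net      # fine: a mesh for each detected object

    def forward(self, image, detections_2d):
        layout = self.layout_net(image)                     # holistic scene context
        boxes = self.box_net(image, detections_2d, layout)  # poses refined w/ context
        meshes = [self.mesh_net(image, box) for box in boxes]
        return layout, boxes, meshes  # supervised jointly, end to end
```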
arXiv Detail & Related papers (2020-02-27T16:00:52Z)