MVLayoutNet: 3D layout reconstruction with multi-view panoramas
- URL: http://arxiv.org/abs/2112.06133v1
- Date: Sun, 12 Dec 2021 03:04:32 GMT
- Title: MVLayoutNet: 3D layout reconstruction with multi-view panoramas
- Authors: Zhihua Hu, Bo Duan, Yanfeng Zhang, Mingwei Sun, Jingwei Huang
- Abstract summary: MVLayoutNet is an end-to-end network for holistic 3D reconstruction from multi-view panoramas.
We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry.
Our method leads to coherent layout geometry that enables the reconstruction of an entire scene.
- Score: 12.981269280023469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction
from multi-view panoramas. Our core contribution is to seamlessly combine
learned monocular layout estimation and multi-view stereo (MVS) for accurate
layout reconstruction in both 3D and image space. We jointly train a layout
module to produce an initial layout and a novel MVS module to obtain accurate
layout geometry. Unlike standard MVSNet [33], our MVS module takes a
newly-proposed layout cost volume, which aggregates multi-view costs at the
same depth layer into corresponding layout elements. We additionally provide an
attention-based scheme that guides the MVS module to focus on structural
regions. Such a design considers both local pixel-level costs and global
holistic information for better reconstruction. Experiments show that our
method outperforms state-of-the-arts in terms of depth rmse by 21.7% and 20.6%
on the 2D-3D-S [1] and ZInD [5] datasets. Finally, our method leads to coherent
layout geometry that enables the reconstruction of an entire scene.
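The layout cost volume described above, which aggregates multi-view matching costs at each depth layer into per-layout-element costs, can be sketched roughly as follows. All shapes and names here (`pixel_costs`, `element_map`, a simple mean over pixels) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def layout_cost_volume(pixel_costs: np.ndarray, element_map: np.ndarray,
                       num_elements: int) -> np.ndarray:
    """Pool a per-pixel cost volume into per-layout-element costs.

    pixel_costs: (D, H, W) multi-view matching cost at each depth layer.
    element_map: (H, W) integer map assigning each pixel to one of
                 num_elements layout elements (walls, floor, ceiling).
    Returns a (D, num_elements) matrix of aggregated costs.
    """
    D = pixel_costs.shape[0]
    flat_costs = pixel_costs.reshape(D, -1)   # (D, H*W)
    flat_map = element_map.reshape(-1)        # (H*W,)
    agg = np.zeros((D, num_elements))
    for e in range(num_elements):
        mask = flat_map == e
        if mask.any():
            # Average local pixel costs inside one layout element,
            # so each element gets a single cost per depth hypothesis.
            agg[:, e] = flat_costs[:, mask].mean(axis=1)
    return agg
```

Each layout element can then take the depth layer with minimal aggregated cost, which is how pooling pixel-level costs into layout elements trades local noise for globally coherent geometry.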
Related papers
- 3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface [8.824340350342512]
3DFIRES is a novel system for scene-level 3D reconstruction from posed images.
We show it matches the efficacy of single-view reconstruction methods with only one input.
arXiv Detail & Related papers (2024-03-13T17:59:50Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network [44.06968418800436]
We present a complete panoramic layout estimation framework that jointly learns panorama registration and layout estimation given a pair of panoramas.
The major improvement over PSMNet comes from a novel Geometry-aware Panorama Registration Network or GPR-Net.
Experimental results indicate that our method achieves state-of-the-art performance in both panorama registration and layout estimation on a large-scale indoor panorama dataset ZInD.
arXiv Detail & Related papers (2022-10-20T17:10:41Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- VPFusion: Joint 3D Volume and Pixel-Aligned Feature Fusion for Single and Multi-view 3D Reconstruction [23.21446438011893]
VPFusion attains high-quality reconstruction by fusing a 3D feature volume, which captures 3D-structure-aware context, with pixel-aligned image features.
Existing approaches use RNN, feature pooling, or attention computed independently in each view for multi-view fusion.
We show improved multi-view feature fusion by establishing transformer-based pairwise view association.
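The transformer-based pairwise view association mentioned above can be illustrated with a minimal scaled-dot-product attention step, where features from one view attend to another view's features. This is a generic attention sketch under assumed dense feature shapes, not VPFusion's actual architecture.

```python
import numpy as np

def pairwise_view_attention(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """Let view A's features attend to view B's features.

    feat_a: (Na, d) features from view A; feat_b: (Nb, d) from view B.
    Returns (Na, d) features for A, fused with associated content from B.
    """
    d = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(d)        # (Na, Nb) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over view B
    return weights @ feat_b                        # convex combo of B's features
```

Because attention is computed jointly over the pair rather than independently per view, corresponding regions in the two views can be associated explicitly, which is the point of pairwise fusion over per-view pooling.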
arXiv Detail & Related papers (2022-03-14T23:30:58Z)
- VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion [68.68537312256144]
VoRTX is an end-to-end volumetric 3D reconstruction network using transformers for wide-baseline, multi-view feature fusion.
We train our model on ScanNet and show that it produces better reconstructions than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-01T02:18:11Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment over a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
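The core conversion behind this formulation, mapping a per-column depth on the horizon line to where the floor-wall boundary projects in an equirectangular panorama, can be sketched as below. The conventions assumed here (known camera height, row 0 at elevation +π/2) are illustrative, not LED2-Net's actual code or API.

```python
import numpy as np

def floor_boundary_rows(horizon_depth: np.ndarray, cam_height: float,
                        pano_height: int) -> np.ndarray:
    """Map per-column horizon-line depths to floor-boundary pixel rows.

    horizon_depth: (W,) horizontal distance to the wall at each column.
    cam_height: camera height above the floor (same units as depth).
    Returns (W,) fractional row coordinates of the floor-wall boundary.
    """
    # Elevation angle looking down at the floor-wall junction.
    elevation = np.arctan2(-cam_height, horizon_depth)   # in (-pi/2, 0)
    # Equirectangular convention: row 0 is elevation +pi/2,
    # the bottom row is elevation -pi/2.
    return (0.5 - elevation / np.pi) * pano_height
```

Because this mapping is smooth in the predicted depths, a differentiable renderer along these lines lets layout error be supervised directly in depth space.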
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view.
Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.