BS3D: Building-scale 3D Reconstruction from RGB-D Images
- URL: http://arxiv.org/abs/2301.01057v1
- Date: Tue, 3 Jan 2023 11:46:14 GMT
- Title: BS3D: Building-scale 3D Reconstruction from RGB-D Images
- Authors: Janne Mustaniemi, Juho Kannala, Esa Rahtu, Li Liu and Janne Heikkilä
- Abstract summary: We propose an easy-to-use framework for acquiring building-scale 3D reconstruction using a consumer depth camera.
Unlike complex and expensive acquisition setups, our system enables crowd-sourcing, which can greatly benefit data-hungry algorithms.
- Score: 25.604775584883413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various datasets have been proposed for simultaneous localization and mapping
(SLAM) and related problems. Existing datasets often include small
environments, have incomplete ground truth, or lack important sensor data, such
as depth and infrared images. We propose an easy-to-use framework for acquiring
building-scale 3D reconstruction using a consumer depth camera. Unlike complex
and expensive acquisition setups, our system enables crowd-sourcing, which can
greatly benefit data-hungry algorithms. Compared to similar systems, we utilize
raw depth maps for odometry computation and loop closure refinement, which
results in better reconstructions. We acquire a building-scale 3D dataset
(BS3D) and demonstrate its value by training an improved monocular depth
estimation model. As a unique experiment, we benchmark visual-inertial odometry
methods using both color and active infrared images.
Related papers
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior neural radiance field-based representations by enabling faster rendering, scale awareness, and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatial-temporal information from multiple cameras and monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- 3D Reconstruction of Spherical Images based on Incremental Structure from Motion [2.6432771146480283]
This study investigates the algorithms for the relative orientation using spherical correspondences, absolute orientation using 3D correspondences between scene and spherical points, and the cost functions for BA (bundle adjustment) optimization.
An incremental SfM (Structure from Motion) workflow has been proposed for spherical images using the above-mentioned algorithms.
arXiv Detail & Related papers (2023-06-22T09:49:28Z)
- MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices [78.20154723650333]
High-quality 3D ground-truth shapes are critical for 3D object reconstruction evaluation.
We introduce a novel multi-view RGBD dataset captured using a mobile device.
We obtain precise 3D ground-truth shape without relying on high-end 3D scanners.
arXiv Detail & Related papers (2023-03-03T14:02:50Z)
- Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision [51.385731364529306]
This paper focuses on perceiving and navigating 3D environments using echoes and RGB images.
In particular, we perform depth estimation by fusing an RGB image with echoes received from multiple orientations.
We show that the echoes provide holistic and inexpensive information about the 3D structures, complementing the RGB image.
arXiv Detail & Related papers (2022-07-03T22:31:47Z)
- VR3Dense: Voxel Representation Learning for 3D Object Detection and Monocular Dense Depth Reconstruction [0.951828574518325]
We introduce a method for jointly training 3D object detection and monocular dense depth reconstruction neural networks.
It takes a LiDAR point cloud and a single RGB image as inputs during inference and produces object pose predictions as well as a densely reconstructed depth map.
While our object detection is trained in a supervised manner, the depth prediction network is trained with both self-supervised and supervised loss functions.
arXiv Detail & Related papers (2021-04-13T04:25:54Z)
- Depth-Enhanced Feature Pyramid Network for Occlusion-Aware Verification of Buildings from Oblique Images [15.466320414614971]
This paper proposes a fused feature pyramid network to detect changes in buildings in urban environments.
It uses both color and depth data for the 3D verification of existing buildings' 2D footprints from oblique images.
We demonstrate that the proposed method can successfully detect all changed buildings.
arXiv Detail & Related papers (2020-11-26T10:51:36Z)
- Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction [32.14605731030579]
3D reconstruction from a single RGB image is a challenging problem in computer vision.
Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability.
We present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry.
arXiv Detail & Related papers (2020-06-17T17:58:59Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
- Atlas: End-to-End 3D Scene Reconstruction from Posed Images [13.154808583020229]
We present an end-to-end 3D reconstruction method for a scene by directly regressing a truncated signed distance function (TSDF) from a set of posed RGB images.
A 2D CNN extracts features from each image independently, which are then back-projected and accumulated into a voxel volume.
A 3D CNN refines the accumulated features and predicts the TSDF values.
arXiv Detail & Related papers (2020-03-23T17:59:15Z)
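The Atlas entry above mentions back-projecting per-image features and accumulating them into a voxel volume. As a rough illustration of that general idea only, and not the Atlas implementation, the sketch below projects voxel centers into each posed view with a pinhole camera model and averages the features they land on. All names here (`world_to_camera`, `project`, `accumulate_features`), the single-channel feature maps, and the nearest-pixel sampling are assumptions made for this example.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]
# One posed view: world-to-camera rotation, translation, and a
# single-channel feature map indexed as feature_map[row][col].
View = Tuple[Mat3, Vec3, Sequence[Sequence[float]]]


def world_to_camera(p: Vec3, R: Mat3, t: Vec3) -> Vec3:
    """Rigid transform p_cam = R * p + t (hypothetical helper)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def project(p_cam: Vec3, fx: float, fy: float,
            cx: float, cy: float) -> Optional[Tuple[int, int]]:
    """Pinhole projection to the nearest pixel; None if behind the camera."""
    x, y, z = p_cam
    if z <= 0.0:
        return None
    u = math.floor(fx * x / z + cx + 0.5)
    v = math.floor(fy * y / z + cy + 0.5)
    return u, v


def accumulate_features(voxel_centers: Sequence[Vec3], views: Sequence[View],
                        fx: float, fy: float,
                        cx: float, cy: float) -> List[Optional[float]]:
    """Average, per voxel, the features of every view that observes it.

    Assumption: all views share the same intrinsics (fx, fy, cx, cy).
    Voxels seen by no view get None.
    """
    volume: List[Optional[float]] = []
    for center in voxel_centers:
        total, count = 0.0, 0
        for R, t, feature_map in views:
            pixel = project(world_to_camera(center, R, t), fx, fy, cx, cy)
            if pixel is None:
                continue
            u, v = pixel
            # Keep only samples that fall inside the feature map.
            if 0 <= v < len(feature_map) and 0 <= u < len(feature_map[0]):
                total += feature_map[v][u]
                count += 1
        volume.append(total / count if count else None)
    return volume
```

A running average is one simple accumulation choice; a learned system would typically use multi-channel features and pass the resulting volume to a 3D network, which this sketch does not attempt.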
This list is automatically generated from the titles and abstracts of the papers in this site.