Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering
- URL: http://arxiv.org/abs/2410.03861v1
- Date: Fri, 4 Oct 2024 18:50:28 GMT
- Title: Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering
- Authors: Laura Fink, Linus Franke, Joachim Keinert, Marc Stamminger,
- Abstract summary: We present a novel approach to generate view consistent and detailed depth maps from a number of posed images.
We leverage advances in monocular depth estimation, which generate topologically complete, but metrically inaccurate depth maps.
Our method is able to generate dense, detailed, high-quality depth maps, also in challenging indoor scenarios, and outperforms state-of-the-art depth reconstruction approaches.
- Score: 4.717325308876748
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The accurate reconstruction of per-pixel depth for an image is vital for many tasks in computer graphics, computer vision, and robotics. In this paper, we present a novel approach to generate view consistent and detailed depth maps from a number of posed images. We leverage advances in monocular depth estimation, which generate topologically complete, but metrically inaccurate depth maps and refine them in a two-stage optimization process based on a differentiable renderer. Taking the monocular depth map as input, we first scale this map to absolute distances based on structure-from-motion and transform the depths to a triangle surface mesh. We then refine this depth mesh in a local optimization, enforcing photometric and geometric consistency. Our evaluation shows that our method is able to generate dense, detailed, high-quality depth maps, also in challenging indoor scenarios, and outperforms state-of-the-art depth reconstruction approaches. Overview and supplemental material of this project can be found at https://lorafib.github.io/ref_depth/.
Related papers
- Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth
Approach with Saddle-shaped Depth Cells [23.345139129458122]
We show that different depth geometries have significant performance gaps, even using the same depth prediction error.
We introduce an ideal depth geometry composed of Saddle-Shaped Cells, whose predicted depth map oscillates upward and downward around the ground-truth surface.
Our method also points to a new research direction for considering depth geometry in MVS.
arXiv Detail & Related papers (2023-07-18T11:37:53Z) - TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using
Differentiable Rendering [54.35405028643051]
We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone.
Our method first introduces an RGBD-aided structure from motion, which can yield filtered depth maps.
We adopt the neural implicit surface reconstruction method, which allows for high-quality mesh.
arXiv Detail & Related papers (2023-03-27T10:07:52Z) - VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two stages framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the local computation of the local depth maps with a deep MVS technique, and, 2) the depth maps and images' features fusion to build a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
arXiv Detail & Related papers (2021-08-19T11:33:58Z) - Monocular Depth Parameterizing Networks [15.791732557395552]
We propose a network structure that provides a parameterization of a set of depth maps with feasible shapes.
This allows us to search the shapes for a photo consistent solution with respect to other images.
Our experimental evaluation shows that our method generates more accurate depth maps and generalizes better than competing state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-21T13:02:41Z) - Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for
3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multiview images.
We introduce a coarseto-fine depth inference strategy to achieve high resolution depth.
arXiv Detail & Related papers (2020-11-25T13:34:11Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z) - Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z) - OmniSLAM: Omnidirectional Localization and Dense Mapping for
Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and light-weighted deep neural networks for the omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z) - Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy and incomplete target depth maps, we reconstruct a depth map restored by virtue of using the CNN network structure as a prior.
arXiv Detail & Related papers (2020-01-21T21:56:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.