NeuralFusion: Online Depth Fusion in Latent Space
- URL: http://arxiv.org/abs/2011.14791v1
- Date: Mon, 30 Nov 2020 13:50:59 GMT
- Title: NeuralFusion: Online Depth Fusion in Latent Space
- Authors: Silvan Weder, Johannes L. Schönberger, Marc Pollefeys, Martin R. Oswald
- Abstract summary: We present a novel online depth map fusion approach that learns depth map aggregation in a latent feature space.
Our approach is real-time capable, handles high noise levels, and is particularly able to deal with gross outliers common for photometric stereo-based depth maps.
- Score: 77.59420353185355
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a novel online depth map fusion approach that learns depth map
aggregation in a latent feature space. While previous fusion methods use an
explicit scene representation like signed distance functions (SDFs), we propose
a learned feature representation for the fusion. The key idea is a separation
between the scene representation used for the fusion and the output scene
representation, via an additional translator network. Our neural network
architecture consists of two main parts: a depth and feature fusion
sub-network, which is followed by a translator sub-network to produce the final
surface representation (e.g. TSDF) for visualization or other tasks. Our
approach is real-time capable, handles high noise levels, and is particularly
able to deal with gross outliers common for photometric stereo-based depth
maps. Experiments on real and synthetic data demonstrate improved results
compared to the state of the art, especially in challenging scenarios with
large amounts of noise and outliers.
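To make the two-part design concrete, the sketch below mirrors the described split between a fusion sub-network that aggregates each incoming depth map into a latent feature grid and a translator sub-network that decodes that grid into a TSDF. It is a minimal illustration, not the authors' implementation: the module names, feature dimensions, dense-grid simplification, and the way depth evidence enters the network are all assumptions.
```python
# Minimal sketch of latent-space depth fusion followed by a translator network.
# All shapes, layer choices, and names are illustrative assumptions.
import torch
import torch.nn as nn


class LatentFusionNet(nn.Module):
    """Predicts an update to the latent feature grid from new depth evidence."""

    def __init__(self, feat_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(feat_dim + 1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(32, feat_dim, kernel_size=3, padding=1),
        )

    def forward(self, latent_grid: torch.Tensor, depth_evidence: torch.Tensor) -> torch.Tensor:
        # latent_grid: (B, F, D, H, W); depth_evidence: (B, 1, D, H, W),
        # e.g. a per-voxel signed distance to the current depth observation.
        update = self.net(torch.cat([latent_grid, depth_evidence], dim=1))
        return latent_grid + update  # residual update keeps fusion incremental


class TranslatorNet(nn.Module):
    """Decodes the latent feature grid into an explicit TSDF volume."""

    def __init__(self, feat_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(feat_dim, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 1, kernel_size=1),
        )

    def forward(self, latent_grid: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.net(latent_grid))  # TSDF values in [-1, 1]


if __name__ == "__main__":
    fusion, translator = LatentFusionNet(), TranslatorNet()
    latent = torch.zeros(1, 8, 32, 32, 32)        # empty latent scene grid
    for _ in range(3):                            # online: one fusion step per frame
        evidence = torch.randn(1, 1, 32, 32, 32)  # stand-in for projected depth
        latent = fusion(latent, evidence)
    tsdf = translator(latent)                     # decode only when output is needed
    print(tsdf.shape)                             # torch.Size([1, 1, 32, 32, 32])
```
The property the sketch preserves is the separation the abstract describes: aggregation happens entirely in the latent grid, and the translation to an explicit surface representation runs only when a TSDF is actually required for visualization or downstream tasks.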
Related papers
- Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping [51.739466714312805]
We introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Fields (NeRF).
Hi-Map is exceptional in its capacity to achieve efficient and high-fidelity mapping using only posed RGB inputs.
arXiv Detail & Related papers (2024-01-06T12:32:25Z)
- Learning Neural Implicit through Volume Rendering with Attentive Depth Fusion Priors [32.63878457242185]
We learn neural implicit representations from multi-view RGBD images through volume rendering with an attentive depth fusion prior.
Our attention mechanism works with either a one-time fused TSDF that represents a whole scene or an incrementally fused TSDF that represents a partial scene.
Our evaluations on widely used benchmarks including synthetic and real-world scans show our superiority over the latest neural implicit methods.
arXiv Detail & Related papers (2023-10-17T21:45:51Z)
- V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints [6.7197802356130465]
We introduce a learning-based depth map fusion framework that accepts a set of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm as input and improves them.
We also introduce a depth search window estimation sub-network trained jointly with the larger fusion sub-network to reduce the depth hypothesis search space along each ray.
Our method learns to model depth consensus and violations of visibility constraints directly from the data.
arXiv Detail & Related papers (2023-08-17T00:39:56Z)
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) local computation of depth maps with a deep MVS technique, and 2) fusion of the depth maps and image features into a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
- A Real-Time Online Learning Framework for Joint 3D Reconstruction and Semantic Segmentation of Indoor Scenes [87.74952229507096]
This paper presents a real-time online vision framework to jointly recover an indoor scene's 3D structure and semantic labels.
Given noisy depth maps, a camera trajectory, and 2D semantic labels at train time, the proposed neural network learns to fuse the depth over frames with suitable semantic labels in the scene space.
arXiv Detail & Related papers (2021-08-11T14:29:01Z)
- Volumetric Propagation Network: Stereo-LiDAR Fusion for Long-Range Depth Estimation [81.08111209632501]
We propose a geometry-aware stereo-LiDAR fusion network for long-range depth estimation.
We exploit sparse and accurate point clouds as a cue for guiding correspondences of stereo images in a unified 3D volume space.
Our network achieves state-of-the-art performance on the KITTI and the Virtual-KITTI datasets.
arXiv Detail & Related papers (2021-03-24T03:24:46Z)
- RoutedFusion: Learning Real-time Depth Map Fusion [73.0378509030908]
We present a novel real-time capable machine learning-based method for depth map fusion.
We propose a neural network that predicts non-linear updates to better account for typical fusion errors; the classical linear update that these learned updates replace is sketched after this list.
Our network is composed of a 2D depth routing network and a 3D depth fusion network which efficiently handle sensor-specific noise and outliers.
arXiv Detail & Related papers (2020-01-13T16:46:41Z)
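For context on what "explicit" fusion means in the abstracts above (the SDF-based aggregation that NeuralFusion moves into latent space, and the linear update that RoutedFusion replaces with learned non-linear updates), the sketch below shows the classical per-voxel weighted running average over truncated signed distances. It assumes a dense voxel grid and that each frame's per-voxel signed distances and weights have already been computed by projecting voxels into the depth map; the constants and helper names are illustrative, not from any of the papers.
```python
# Classical explicit TSDF fusion: each new frame's truncated signed distances
# are merged into the global grid with a per-voxel weighted running average.
import numpy as np

TRUNC = 0.05       # truncation distance in meters (assumed)
MAX_WEIGHT = 64.0  # weight cap keeps the map adaptable to new observations


def integrate_frame(tsdf: np.ndarray, weight: np.ndarray,
                    frame_sdf: np.ndarray, frame_weight: np.ndarray) -> None:
    """Fuse one frame's signed distances (same shape as the grid) in place."""
    d = np.clip(frame_sdf, -TRUNC, TRUNC) / TRUNC  # truncate and normalize to [-1, 1]
    valid = frame_weight > 0                       # voxels observed in this frame
    w_new = weight + frame_weight
    tsdf[valid] = (tsdf[valid] * weight[valid]
                   + d[valid] * frame_weight[valid]) / w_new[valid]
    weight[valid] = np.minimum(w_new[valid], MAX_WEIGHT)


# Usage: start from an empty grid and integrate frames as they arrive.
tsdf = np.ones((64, 64, 64), dtype=np.float32)   # "far from surface" everywhere
weight = np.zeros_like(tsdf)
frame_sdf = np.random.uniform(-0.1, 0.1, tsdf.shape).astype(np.float32)  # stand-in
frame_weight = np.ones_like(tsdf)
integrate_frame(tsdf, weight, frame_sdf, frame_weight)
```
Because this update is a fixed linear blend per voxel, it degrades under heavy noise and gross outliers; the learned methods listed above replace either the update rule itself (RoutedFusion) or the fused representation (NeuralFusion) to cope with exactly those cases.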
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.