Learning Online Multi-Sensor Depth Fusion
- URL: http://arxiv.org/abs/2204.03353v1
- Date: Thu, 7 Apr 2022 10:45:32 GMT
- Title: Learning Online Multi-Sensor Depth Fusion
- Authors: Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder,
Fisher Yu, Cristian Sminchisescu, Luc Van Gool
- Abstract summary: SenFuNet is a depth fusion approach that learns sensor-specific noise and outlier statistics.
We conduct experiments with various sensor combinations on the real-world CoRBS and Scene3D datasets.
- Score: 100.84519175539378
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many hand-held or mixed reality devices are used with a single sensor for 3D
reconstruction, although they often comprise multiple sensors. Multi-sensor
depth fusion is able to substantially improve the robustness and accuracy of 3D
reconstruction methods, but existing techniques are not robust enough to handle
sensors which operate with diverse value ranges as well as noise and outlier
statistics. To this end, we introduce SenFuNet, a depth fusion approach that
learns sensor-specific noise and outlier statistics and combines the data
streams of depth frames from different sensors in an online fashion. Our method
fuses multi-sensor depth streams regardless of time synchronization and
calibration and generalizes well with little training data. We conduct
experiments with various sensor combinations on the real-world CoRBS and
Scene3D datasets, as well as the Replica dataset. Experiments demonstrate that
our fusion strategy outperforms traditional and recent online depth fusion
approaches. In addition, the combination of multiple sensors yields more robust
outlier handling and precise surface reconstruction than the use of a single
sensor.
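As a rough illustration of the kind of learned, per-sensor weighting the abstract describes (not the authors' actual SenFuNet architecture), the sketch below blends two depth streams with per-pixel confidences predicted by small networks; all module names, network sizes, and the ToF/stereo sensor pairing are assumptions.
```python
import torch
import torch.nn as nn

class ConfidenceNet(nn.Module):
    """Tiny per-sensor network that maps a depth frame to per-pixel confidence.
    Illustrative stand-in for a learned sensor noise/outlier model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, depth):                   # depth: (B, 1, H, W)
        return torch.sigmoid(self.net(depth))   # confidence in (0, 1)

# One confidence network per sensor (e.g. ToF and stereo), trained end to end.
conf_tof, conf_stereo = ConfidenceNet(), ConfidenceNet()

def fuse_frame(depth_tof, depth_stereo, eps=1e-6):
    """Blend two depth frames with learned per-pixel weights."""
    w_tof, w_stereo = conf_tof(depth_tof), conf_stereo(depth_stereo)
    return (w_tof * depth_tof + w_stereo * depth_stereo) / (w_tof + w_stereo + eps)

# Online use: fuse each incoming frame pair before integrating it into the map.
fused = fuse_frame(torch.rand(1, 1, 240, 320), torch.rand(1, 1, 240, 320))
```
Because each frame pair is fused as it arrives, this style of weighting does not require the two streams to be time-synchronized beyond per-frame pairing.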
Related papers
- MSSIDD: A Benchmark for Multi-Sensor Denoising [55.41612200877861]
We introduce a new benchmark, the Multi-Sensor SIDD dataset, which is the first raw-domain dataset designed to evaluate the sensor transferability of denoising models.
We propose a sensor consistency training framework that enables denoising models to learn the sensor-invariant features.
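The summary above points to a consistency objective across sensors; the following is a hedged sketch of one such objective, with the tiny denoiser, the feature choice, and the loss weighting all assumed rather than taken from the paper.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Illustrative denoiser that also exposes an intermediate feature map."""
    def __init__(self, channels=4):
        super().__init__()
        self.enc = nn.Conv2d(channels, 16, 3, padding=1)
        self.dec = nn.Conv2d(16, channels, 3, padding=1)

    def forward(self, x):
        feat = F.relu(self.enc(x))
        return self.dec(feat), feat

def training_step(denoiser, noisy_a, noisy_b, clean, lam=0.1):
    """noisy_a / noisy_b: the same raw scene captured by two different sensors."""
    out_a, feat_a = denoiser(noisy_a)
    out_b, feat_b = denoiser(noisy_b)
    recon = F.l1_loss(out_a, clean) + F.l1_loss(out_b, clean)
    # Consistency term: push intermediate features toward being sensor-invariant.
    consistency = F.mse_loss(feat_a, feat_b)
    return recon + lam * consistency

model = TinyDenoiser()
a, b, gt = torch.rand(2, 4, 64, 64), torch.rand(2, 4, 64, 64), torch.rand(2, 4, 64, 64)
loss = training_step(model, a, b, gt)
loss.backward()
```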
arXiv Detail & Related papers (2024-11-18T13:32:59Z)
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
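A loose sketch of the prompt-tuning idea stated above: a frozen, RGB-pretrained backbone receives an additive prompt computed from polarization features, and only that small module is trained. The module name, injection point, and channel counts are assumptions, not the paper's PPFT design.
```python
import torch
import torch.nn as nn

class PromptFusion(nn.Module):
    """Small trainable module that turns polarization features into an additive
    prompt for a frozen, RGB-pretrained backbone (illustrative only)."""
    def __init__(self, pol_channels=4, embed_dim=64):
        super().__init__()
        self.proj = nn.Conv2d(pol_channels, embed_dim, 1)

    def forward(self, backbone_feat, pol):
        return backbone_feat + self.proj(pol)

# Frozen pretrained backbone (stand-in); only the prompt module is optimized.
backbone = nn.Conv2d(3, 64, 3, padding=1)
for p in backbone.parameters():
    p.requires_grad = False

prompt = PromptFusion()
optimizer = torch.optim.Adam(prompt.parameters(), lr=1e-4)

rgb = torch.rand(1, 3, 128, 128)
pol = torch.rand(1, 4, 128, 128)
feat = prompt(backbone(rgb), pol)   # prompted features go on to the depth head
```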
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- Virtual Fusion with Contrastive Learning for Single Sensor-based Activity Recognition [5.225544155289783]
Various types of sensors can be used for Human Activity Recognition (HAR).
A single sensor sometimes cannot fully observe the user's motions from its viewpoint, which leads to incorrect predictions.
We propose Virtual Fusion - a new method that takes advantage of unlabeled data from multiple time-synchronized sensors during training, but only needs one sensor for inference.
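A minimal sketch of the training-time idea, under assumed names: embeddings of time-synchronized windows from two sensors are pulled together with an InfoNCE-style contrastive loss, and only one encoder is kept for inference.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """Contrastive loss: matching time steps across sensors are positives."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(z_a.size(0))
    return F.cross_entropy(logits, targets)

# One encoder per sensor; only enc_a is kept for single-sensor inference.
enc_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 100, 64))   # e.g. wrist IMU window
enc_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 100, 64))   # e.g. ankle IMU window

x_a = torch.rand(8, 3, 100)   # 8 synchronized windows from sensor A
x_b = torch.rand(8, 3, 100)   # the same windows seen by sensor B (unlabeled)
loss = info_nce(enc_a(x_a), enc_b(x_b))
loss.backward()
```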
arXiv Detail & Related papers (2023-12-01T17:03:27Z)
- Towards a Robust Sensor Fusion Step for 3D Object Detection on Corrupted Data [4.3012765978447565]
This work presents a novel fusion step that addresses data corruptions and makes sensor fusion for 3D object detection more robust.
We demonstrate that our method performs on par with state-of-the-art approaches on normal data and outperforms them on misaligned data.
arXiv Detail & Related papers (2023-06-12T18:06:29Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
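A hedged sketch of fusion by proposal matching: a similarity matrix between 3D and 2D proposal features yields soft assignments, and each 3D proposal is augmented with its matched 2D ROI feature. The shapes, the softmax assignment, and the concatenation are assumptions, not FBMNet's exact design.
```python
import torch
import torch.nn.functional as F

def match_and_fuse(feat_3d, feat_2d, temperature=1.0):
    """feat_3d: (N3, C) 3D proposal features, feat_2d: (N2, C) 2D ROI features.
    Soft assignment of every 3D proposal to the 2D proposals, then fusion."""
    sim = feat_3d @ feat_2d.t() / temperature       # (N3, N2) similarities
    assign = F.softmax(sim, dim=1)                  # soft 3D-to-2D assignment
    matched_2d = assign @ feat_2d                   # (N3, C) aggregated 2D features
    return torch.cat([feat_3d, matched_2d], dim=1)  # fused per-proposal feature

fused = match_and_fuse(torch.rand(50, 128), torch.rand(40, 128))
print(fused.shape)   # torch.Size([50, 256])
```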
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Representation Learning for Remote Sensing: An Unsupervised Sensor Fusion Approach [0.0]
We propose Contrastive Sensor Fusion, which exploits coterminous data from multiple sources to learn useful representations of every possible combination of those sources.
Using a dataset of 47 million unlabeled coterminous image triplets, we train an encoder to produce meaningful representations from any possible combination of channels from the input sensors.
These representations outperform fully supervised ImageNet weights on a remote sensing classification task and improve as more sensors are fused.
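A small sketch of the channel-combination idea with hypothetical names: two random channel subsets of the same stacked sensor image are encoded and treated as positives in a contrastive loss.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_channel_subset(x, keep):
    """Zero out all but `keep` randomly chosen channels of a stacked sensor image."""
    idx = torch.randperm(x.size(1))[:keep]
    mask = torch.zeros(x.size(1))
    mask[idx] = 1.0
    return x * mask.view(1, -1, 1, 1)

encoder = nn.Sequential(
    nn.Conv2d(12, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64),
)

x = torch.rand(8, 12, 64, 64)            # 12 stacked channels from several sensors
z1 = F.normalize(encoder(random_channel_subset(x, keep=4)), dim=1)
z2 = F.normalize(encoder(random_channel_subset(x, keep=4)), dim=1)
logits = z1 @ z2.t() / 0.1               # same scene under two channel subsets = positive
loss = F.cross_entropy(logits, torch.arange(8))
loss.backward()
```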
arXiv Detail & Related papers (2021-08-11T08:32:58Z)
- CalibDNN: Multimodal Sensor Calibration for Perception Using Deep Neural Networks [27.877734292570967]
We propose a novel deep learning-driven technique (CalibDNN) for accurate calibration among multimodal sensors, specifically LiDAR-camera pairs.
The entire process is fully automatic, using a single model and a single iteration.
Comparisons with different methods and extensive experiments on multiple datasets demonstrate state-of-the-art performance.
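As a hedged illustration of what such a calibration regressor could look like (not CalibDNN's actual architecture), a network can map a camera image plus a projected LiDAR depth map to a 6-DoF correction, here parameterized as axis-angle rotation and translation.
```python
import torch
import torch.nn as nn

class CalibRegressor(nn.Module):
    """Illustrative LiDAR-camera calibration head: predicts a 6-DoF correction
    (3 axis-angle rotation parameters + 3 translation parameters)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3 + 1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 6)

    def forward(self, rgb, lidar_depth):
        x = torch.cat([rgb, lidar_depth], dim=1)   # image + projected LiDAR depth
        return self.head(self.features(x))         # (B, 6) calibration correction

model = CalibRegressor()
pose = model(torch.rand(1, 3, 128, 256), torch.rand(1, 1, 128, 256))
```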
arXiv Detail & Related papers (2021-03-27T02:43:37Z)
- GEM: Glare or Gloom, I Can Still See You -- End-to-End Multimodal Object Detector [11.161639542268015]
We propose sensor-aware multi-modal fusion strategies for 2D object detection in harsh-lighting conditions.
Our network learns to estimate the measurement reliability of each sensor modality in the form of scalar weights and masks.
We show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset.
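A minimal sketch of scalar reliability weighting, with all names assumed: each modality's feature map is reduced to one reliability score, and the fused feature is the softmax-weighted sum of the modality features.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReliabilityGate(nn.Module):
    """Maps a modality's feature map to a single scalar reliability weight."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1),
        )

    def forward(self, feat):
        return self.score(feat)                    # (B, 1) unnormalized score

gate_rgb, gate_thermal = ReliabilityGate(32), ReliabilityGate(32)

def fuse(feat_rgb, feat_thermal):
    scores = torch.cat([gate_rgb(feat_rgb), gate_thermal(feat_thermal)], dim=1)
    w = F.softmax(scores, dim=1)                   # (B, 2) relative reliability
    return (w[:, 0, None, None, None] * feat_rgb
            + w[:, 1, None, None, None] * feat_thermal)

fused = fuse(torch.rand(2, 32, 40, 40), torch.rand(2, 32, 40, 40))
```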
arXiv Detail & Related papers (2021-02-24T14:56:37Z)
- Deep Continuous Fusion for Multi-Sensor 3D Object Detection [103.5060007382646]
We propose a novel 3D object detector that can exploit both LIDAR and cameras to perform very accurate localization.
We design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution.
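A simplified sketch of a continuous-convolution-style fusion step, assuming the image-to-LiDAR correspondences (neighbor indices and geometric offsets) are precomputed; the MLP, shapes, and summation are illustrative rather than the paper's exact design.
```python
import torch
import torch.nn as nn

class ContinuousFusion(nn.Module):
    """For each LiDAR/BEV location, aggregate K nearby image features by
    passing (feature, geometric offset) pairs through a small MLP and summing."""
    def __init__(self, img_dim=64, out_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(img_dim + 3, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, img_feats, neighbor_idx, offsets):
        # img_feats: (M, img_dim) image features; neighbor_idx: (N, K) indices of
        # the K image pixels nearest to each LiDAR point; offsets: (N, K, 3)
        # 3D offsets between each point and its neighbors.
        gathered = img_feats[neighbor_idx]              # (N, K, img_dim)
        x = torch.cat([gathered, offsets], dim=-1)      # (N, K, img_dim + 3)
        return self.mlp(x).sum(dim=1)                   # (N, out_dim) fused features

fusion = ContinuousFusion()
out = fusion(torch.rand(1000, 64), torch.randint(0, 1000, (500, 4)), torch.rand(500, 4, 3))
```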
arXiv Detail & Related papers (2020-12-20T18:43:41Z)
- Learning Selective Sensor Fusion for States Estimation [47.76590539558037]
We propose SelectFusion, an end-to-end selective sensor fusion module.
During prediction, the network is able to assess the reliability of the latent features from different sensor modalities.
We extensively evaluate all fusion strategies in both public datasets and on progressively degraded datasets.
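A small sketch of selective fusion under assumed names: a gating network produces per-dimension soft masks from the concatenated latents, down-weighting unreliable dimensions of either modality before they are fused.
```python
import torch
import torch.nn as nn

class SoftSelectFusion(nn.Module):
    """Per-dimension soft masks over two modality latents (illustrative)."""
    def __init__(self, dim=128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2 * dim), nn.Sigmoid())

    def forward(self, z_visual, z_inertial):
        joint = torch.cat([z_visual, z_inertial], dim=1)
        mask = self.gate(joint)                       # soft mask values in (0, 1)
        m_v, m_i = mask.chunk(2, dim=1)
        return torch.cat([m_v * z_visual, m_i * z_inertial], dim=1)

fusion = SoftSelectFusion()
fused = fusion(torch.rand(4, 128), torch.rand(4, 128))   # (4, 256) masked latent
```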
arXiv Detail & Related papers (2019-12-30T20:25:16Z)