Decoder Modulation for Indoor Depth Completion
- URL: http://arxiv.org/abs/2005.08607v2
- Date: Mon, 8 Feb 2021 08:20:51 GMT
- Title: Decoder Modulation for Indoor Depth Completion
- Authors: Dmitry Senushkin, Mikhail Romanov, Ilia Belikov, Anton Konushin,
Nikolay Patakin
- Abstract summary: Depth completion recovers a dense depth map from sensor measurements.
Current methods are mostly tailored for very sparse depth measurements from LiDARs in outdoor settings.
We propose a new model that takes into account the statistical difference between such regions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth completion recovers a dense depth map from sensor measurements. Current
methods are mostly tailored for very sparse depth measurements from LiDARs in
outdoor settings, while for indoor scenes Time-of-Flight (ToF) or structured
light sensors are mostly used. These sensors provide semi-dense maps, with
dense measurements in some regions and almost empty in others. We propose a new
model that takes into account the statistical difference between such regions.
Our main contribution is a new decoder modulation branch added to the
encoder-decoder architecture. The encoder extracts features from the
concatenated RGB image and raw depth. Given the mask of missing values as
input, the proposed modulation branch controls the decoding of a dense depth
map from these features differently for different regions. This is implemented
by modifying the spatial distribution of output signals inside the decoder via
Spatially-Adaptive Denormalization (SPADE) blocks. Our second contribution is a
novel training strategy that allows us to train on semi-dense sensor data
when the ground-truth depth map is not available. Our model achieves
state-of-the-art results on the indoor Matterport3D dataset. Being designed for
semi-dense input depth, our model is still competitive with LiDAR-oriented
approaches on the KITTI dataset. Our training strategy significantly improves
prediction quality with no dense ground truth available, as validated on the
NYUv2 dataset.
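The Spatially-Adaptive Denormalization (SPADE) mechanism described in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique (normalize features, then apply a spatially varying scale and shift predicted from the missing-values mask), not the authors' implementation; `w_gamma` and `w_beta` are hypothetical stand-ins for the learned convolutions that map the mask to per-pixel modulation parameters.

```python
import numpy as np

def spade_modulate(x, mask, w_gamma, w_beta, eps=1e-5):
    """Minimal SPADE-style modulation sketch.

    x:       feature map, shape (C, H, W)
    mask:    missing-values mask resized to (H, W), values in {0, 1}
    w_gamma, w_beta: per-channel weights of shape (C,) standing in for
        the learned layers that predict scale and shift from the mask
    """
    # Parameter-free normalization over the spatial dimensions
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)

    # Spatially varying scale (gamma) and shift (beta) driven by the mask,
    # so regions with and without sensor measurements are decoded differently
    gamma = w_gamma[:, None, None] * mask[None]   # (C, H, W)
    beta = w_beta[:, None, None] * mask[None]
    return (1.0 + gamma) * x_norm + beta
```

In regions where the mask is zero, the modulation reduces to plain normalization; where measurements are present, the features are rescaled and shifted, which is how the decoder can treat dense and empty regions of a semi-dense depth map differently.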
Related papers
- SDformer: Efficient End-to-End Transformer for Depth Completion [5.864200786548098]
Depth completion aims to predict dense depth maps with sparse depth measurements from a depth sensor.
Currently, Convolutional Neural Network (CNN) based models are the most popular methods applied to depth completion tasks.
To overcome the drawbacks of CNNs, the authors present a more effective and powerful method: a sequence-to-sequence model with adaptive self-attention.
arXiv Detail & Related papers (2024-09-12T15:52:08Z)
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy in which a neural network is trained to estimate a dense, complete depth map from polarization data together with a sensor depth map from any of several sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- LiDAR Meta Depth Completion [47.99004789132264]
We propose a meta depth completion network that uses data patterns to learn a task network to solve a given depth completion task effectively.
While using a single model, our method yields significantly better results than a non-adaptive baseline trained on different LiDAR patterns.
These advantages allow flexible deployment of a single depth completion model on different sensors.
arXiv Detail & Related papers (2023-07-24T13:05:36Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model [3.5290359800552946]
HiMODE is a novel monocular omnidirectional depth estimation model based on a CNN+Transformer architecture.
We show that HiMODE can achieve state-of-the-art performance for 360° monocular depth estimation.
arXiv Detail & Related papers (2022-04-11T11:11:43Z)
- Neural RF SLAM for unsupervised positioning and mapping with channel state information [51.484516640867525]
We present a neural network architecture for jointly learning user locations and environment mapping up to isometry.
The proposed model learns an interpretable latent, i.e., user location, by just enforcing a physics-based decoder.
arXiv Detail & Related papers (2022-03-15T21:32:44Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated by comparing it with current state-of-the-art methods on Vari dataset and a significant improvement is observed in experiments.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z)
- Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data [26.050220048154596]
We propose a domain adaptation approach for sparse-to-dense depth completion that is trained from synthetic data, without annotations in the real domain or additional sensors.
Our approach simulates the real sensor noise in an RGB+LiDAR set-up, and consists of three modules: simulating the real LiDAR input in the synthetic domain via projections, filtering the real noisy LiDAR for supervision and adapting the synthetic RGB image using a CycleGAN approach.
arXiv Detail & Related papers (2020-08-03T17:21:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.