Depth Completion via Inductive Fusion of Planar LIDAR and Monocular Camera
- URL: http://arxiv.org/abs/2009.01875v1
- Date: Thu, 3 Sep 2020 18:39:57 GMT
- Title: Depth Completion via Inductive Fusion of Planar LIDAR and Monocular Camera
- Authors: Chen Fu, Chiyu Dong, Christoph Mertz and John M. Dolan
- Abstract summary: We introduce an inductive late-fusion block, inspired by a probabilistic model, that better fuses different sensor modalities.
This block uses dense context features to guide depth prediction, conditioned on demonstrations from sparse depth features.
Our method shows promising results compared to previous approaches on both the benchmark and simulated datasets.
- Score: 27.978780155504467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern high-definition LIDAR is expensive for commercial autonomous driving
vehicles and small indoor robots. An affordable solution to this problem is the
fusion of planar LIDAR with RGB images to provide a similar level of perception
capability. Although state-of-the-art methods can predict depth from such
limited sensor input, they usually rely on a simple concatenation of sparse
LIDAR features and dense RGB features in an end-to-end fusion architecture. In
this paper, we introduce an inductive late-fusion block, inspired by a
probabilistic model, that better fuses the different sensor modalities. The
proposed demonstration and aggregation network propagates the mixed context and
depth features to the prediction network and serves as prior knowledge for
depth completion. This late-fusion block uses the dense context features to
guide the depth prediction, conditioned on demonstrations from the sparse depth
features. In addition to evaluating the proposed method on benchmark depth
completion datasets, including NYUDepthV2 and KITTI, we also test it on a
simulated planar LIDAR dataset. Our method shows promising results compared to
previous approaches on both the benchmark datasets and the simulated dataset
with various 3D point densities.
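A minimal sketch of how a late-fusion block of this kind could look, assuming PyTorch: dense RGB context features gate and re-weight fused features built from sparse depth "demonstrations" before they reach the prediction head. The class name `InductiveLateFusion` and the channel sizes are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class InductiveLateFusion(nn.Module):
    """Illustrative late-fusion block: dense context features gate and
    re-weight aggregated context/depth features that are passed on to the
    prediction network. A hedged sketch of the abstract's idea only."""

    def __init__(self, ctx_ch: int = 64, depth_ch: int = 64, out_ch: int = 64):
        super().__init__()
        # Aggregate the two modalities into a shared embedding.
        self.aggregate = nn.Sequential(
            nn.Conv2d(ctx_ch + depth_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Context-driven gate: where RGB context is informative, trust the
        # propagated depth demonstrations more.
        self.gate = nn.Sequential(
            nn.Conv2d(ctx_ch, out_ch, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, ctx_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        fused = self.aggregate(torch.cat([ctx_feat, depth_feat], dim=1))
        return self.gate(ctx_feat) * fused  # prior passed to the prediction network


# Toy usage: NYU-style feature resolution, batch of 2.
ctx = torch.randn(2, 64, 60, 80)     # dense RGB context features
sparse = torch.randn(2, 64, 60, 80)  # features from sparse planar-LIDAR depth
prior = InductiveLateFusion()(ctx, sparse)
print(prior.shape)  # torch.Size([2, 64, 60, 80])
```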
Related papers
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
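A hedged sketch of what prompt-style fusion onto a frozen RGB-pretrained backbone could look like: a small trainable branch injects polarization/sensor-depth cues additively into one stage of frozen features. The class `PromptFusionLayer`, the zero-initialised scale, and the channel sizes are assumptions for illustration, not the paper's PPFT module.

```python
import torch
import torch.nn as nn

class PromptFusionLayer(nn.Module):
    """Illustrative prompt-style fusion: a lightweight, trainable branch adds
    polarization/depth cues to frozen RGB features at one backbone stage."""

    def __init__(self, feat_ch: int, prompt_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(prompt_ch, feat_ch, kernel_size=1)
        self.scale = nn.Parameter(torch.zeros(1))  # start as identity

    def forward(self, rgb_feat, prompt_feat):
        # The frozen backbone features are only perturbed additively, so the
        # large-scale RGB pre-training is preserved while the prompt adapts it.
        return rgb_feat + self.scale * self.proj(prompt_feat)


layer = PromptFusionLayer(feat_ch=256, prompt_ch=64)
out = layer(torch.randn(1, 256, 30, 40), torch.randn(1, 64, 30, 40))
```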
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve the depth predictions.
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
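One plausible way to feed sparse triangulated depths into such a network, sketched below under the assumption that they are rasterised into an image-sized prior map plus validity mask and stacked with the RGB frame; the function name and tensor layout are illustrative, not the FLSea pipeline.

```python
import torch

def sparse_prior_inputs(rgb: torch.Tensor, pts_uv: torch.Tensor, pts_z: torch.Tensor):
    """Rasterise sparse triangulated feature depths into a prior map and mask,
    then stack them with the RGB frame as network input (illustrative only)."""
    _, h, w = rgb.shape
    depth = torch.zeros(1, h, w)
    mask = torch.zeros(1, h, w)
    u = pts_uv[:, 0].round().long().clamp(0, w - 1)
    v = pts_uv[:, 1].round().long().clamp(0, h - 1)
    depth[0, v, u] = pts_z
    mask[0, v, u] = 1.0
    return torch.cat([rgb, depth, mask], dim=0)  # 5 x H x W network input

# Toy usage: 3 triangulated features in a 480x640 frame.
x = sparse_prior_inputs(torch.rand(3, 480, 640),
                        torch.tensor([[100., 50.], [320., 240.], [600., 400.]]),
                        torch.tensor([2.0, 5.5, 8.1]))
print(x.shape)  # torch.Size([5, 480, 640])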
arXiv Detail & Related papers (2023-10-25T16:32:31Z)
- Attentive Multimodal Fusion for Optical and Scene Flow [24.08052492109655]
Existing methods typically rely solely on RGB images or fuse the modalities at later stages.
We propose a novel deep neural network approach named FusionRAFT, which enables early-stage information fusion between sensor modalities.
Our approach exhibits improved robustness in the presence of noise and low-lighting conditions that affect the RGB images.
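A toy sketch of early-stage attentive fusion, assuming shallow per-modality stems followed by a learned channel attention that decides how much to trust each stream; this is an illustrative stand-in, not FusionRAFT's encoder.

```python
import torch
import torch.nn as nn

class EarlyAttentiveFusion(nn.Module):
    """Illustrative early fusion: per-modality stems, then channel attention
    re-weights the concatenated features (helpful when RGB is noisy/dark)."""

    def __init__(self, ch: int = 32):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, ch, 7, stride=2, padding=3)
        self.depth_stem = nn.Conv2d(1, ch, 7, stride=2, padding=3)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * ch, 2 * ch, 1),
            nn.Sigmoid(),
        )
        self.mix = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, rgb, depth):
        f = torch.cat([self.rgb_stem(rgb), self.depth_stem(depth)], dim=1)
        return self.mix(self.attn(f) * f)  # fused features for the flow backbone
```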
arXiv Detail & Related papers (2023-07-28T04:36:07Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated against current state-of-the-art methods on the Vari dataset, and a significant improvement is observed in experiments.
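A minimal sketch of cross attention over dilated features, assuming queries come from local features and keys/values from a wide-receptive-field dilated branch; the module name and the single dilation rate are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class DilatedCrossAttention(nn.Module):
    """Illustrative dilated cross attention: local features attend to a
    dilated-convolution branch over flattened spatial tokens (toy version,
    quadratic in the number of tokens)."""

    def __init__(self, ch: int = 64, heads: int = 4, dilation: int = 3):
        super().__init__()
        self.dilated = nn.Conv2d(ch, ch, 3, padding=dilation, dilation=dilation)
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)

    def forward(self, feat):
        b, c, h, w = feat.shape
        q = feat.flatten(2).transpose(1, 2)                 # B x HW x C
        kv = self.dilated(feat).flatten(2).transpose(1, 2)  # B x HW x C
        out, _ = self.attn(q, kv, kv)
        return out.transpose(1, 2).reshape(b, c, h, w)
```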
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and depth completion.
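A hedged sketch of the unified prediction/completion idea: one monocular backbone, plus an optional auxiliary branch that injects sparse-depth features only when measurements are available. Class and layer names are illustrative, not the SAN implementation.

```python
import torch
import torch.nn as nn

class MonoDepthWithSparseAux(nn.Module):
    """Illustrative shared network: RGB-only inference gives depth prediction;
    passing sparse depth switches the same network into completion mode."""

    def __init__(self, ch: int = 32):
        super().__init__()
        self.rgb_enc = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.sparse_enc = nn.Sequential(nn.Conv2d(2, ch, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, rgb, sparse_depth=None):
        feat = self.rgb_enc(rgb)
        if sparse_depth is not None:                      # completion mode
            mask = (sparse_depth > 0).float()
            feat = feat + self.sparse_enc(torch.cat([sparse_depth, mask], dim=1))
        return self.head(feat)                            # dense depth map


net = MonoDepthWithSparseAux()
d_pred = net(torch.rand(1, 3, 64, 64))                            # prediction only
d_comp = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))  # with sparse input
```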
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [86.25022248968908]
We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
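A purely illustrative toy of what depth-conditioned message propagation could mean: each pixel aggregates neighbour features with weights that decay as the estimated depth gap grows. The class name, 4-neighbourhood, and Gaussian weighting are assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn

class DepthConditionedMessagePassing(nn.Module):
    """Toy depth-conditioned message passing over a 4-neighbourhood."""

    def __init__(self, ch: int = 64, sigma: float = 1.0):
        super().__init__()
        self.proj = nn.Conv2d(ch, ch, 1)
        self.sigma = sigma

    def forward(self, feat, depth):
        msgs, weights = 0.0, 0.0
        for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nb_feat = torch.roll(feat, shifts=(dy, dx), dims=(2, 3))
            nb_depth = torch.roll(depth, shifts=(dy, dx), dims=(2, 3))
            w = torch.exp(-((depth - nb_depth) ** 2) / (2 * self.sigma ** 2))
            msgs = msgs + w * nb_feat
            weights = weights + w
        return feat + self.proj(msgs / weights.clamp(min=1e-6))
```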
arXiv Detail & Related papers (2021-03-30T16:20:24Z)
- Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models use a feature fusion strategy but are limited by low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities.
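A minimal sketch of mutual attention between modalities, assuming each modality derives a spatial attention map that re-weights the other modality's features (going beyond elementwise point-to-point fusion); this is a simplified stand-in, not the paper's selective mutual attention module.

```python
import torch
import torch.nn as nn

class MutualAttention(nn.Module):
    """Illustrative mutual attention: depth guides RGB and RGB guides depth."""

    def __init__(self, ch: int = 64):
        super().__init__()
        self.rgb_to_attn = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())
        self.depth_to_attn = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        rgb_out = rgb_feat * self.depth_to_attn(depth_feat)   # depth guides RGB
        depth_out = depth_feat * self.rgb_to_attn(rgb_feat)   # RGB guides depth
        return rgb_out + depth_out
```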
arXiv Detail & Related papers (2020-10-12T08:50:10Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply an attention mechanism to the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
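A hedged sketch of the symmetric gated fusion step: each branch produces a gate for its own features from the joint representation, and the two gated streams are merged. Layer names and channel sizes are illustrative; ACMNet's released code defines the real module.

```python
import torch
import torch.nn as nn

class SymmetricGatedFusion(nn.Module):
    """Illustrative symmetric gated fusion of RGB and depth feature branches."""

    def __init__(self, ch: int = 64):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.Sigmoid())
        self.gate_dep = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, rgb_feat, dep_feat):
        joint = torch.cat([rgb_feat, dep_feat], dim=1)
        rgb_g = self.gate_rgb(joint) * rgb_feat
        dep_g = self.gate_dep(joint) * dep_feat
        return self.merge(torch.cat([rgb_g, dep_g], dim=1))
```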
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- MSDPN: Monocular Depth Prediction with Partial Laser Observation using Multi-stage Neural Networks [1.1602089225841632]
We propose a deep-learning-based multi-stage network architecture called the Multi-Stage Depth Prediction Network (MSDPN).
MSDPN predicts a dense depth map from a 2D LiDAR scan and a monocular camera image.
As verified experimentally, our network yields promising performance against state-of-the-art methods.
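A toy coarse-to-fine sketch of the multi-stage idea, assuming stage 1 maps RGB plus the 2D LiDAR scanline depth to a coarse map and stage 2 refines it given the same inputs; the tiny conv stacks and names are illustrative, not MSDPN's architecture.

```python
import torch
import torch.nn as nn

def stage(in_ch: int, ch: int = 32) -> nn.Sequential:
    # One illustrative stage: a few convolutions mapping inputs to a depth map.
    return nn.Sequential(
        nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU(),
        nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        nn.Conv2d(ch, 1, 3, padding=1),
    )

class MultiStageDepth(nn.Module):
    """Illustrative two-stage predictor: coarse depth, then refinement."""

    def __init__(self):
        super().__init__()
        self.stage1 = stage(in_ch=3 + 1)      # RGB + sparse scanline depth
        self.stage2 = stage(in_ch=3 + 1 + 1)  # ... + coarse depth from stage 1

    def forward(self, rgb, scanline_depth):
        coarse = self.stage1(torch.cat([rgb, scanline_depth], dim=1))
        fine = self.stage2(torch.cat([rgb, scanline_depth, coarse], dim=1))
        return coarse, fine
```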
arXiv Detail & Related papers (2020-08-04T08:27:40Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single-stream network in which the depth map guides both early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
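A minimal sketch of the single-stream idea: the depth map joins the RGB input directly (early fusion) and later gates an intermediate feature map (middle fusion), so no separate depth encoder is needed. The layer layout is an assumption for illustration, not the paper's network.

```python
import torch
import torch.nn as nn

class SingleStreamRGBD(nn.Module):
    """Illustrative single-stream RGB-D saliency network with depth-guided
    early and middle fusion."""

    def __init__(self, ch: int = 32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(4, ch, 3, padding=1), nn.ReLU())
        self.depth_gate = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.Sigmoid())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(ch, 1, 1)  # saliency logits

    def forward(self, rgb, depth):
        x = self.enc1(torch.cat([rgb, depth], dim=1))  # early fusion
        x = self.enc2(x * self.depth_gate(depth))      # middle fusion
        return self.head(x)
```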
arXiv Detail & Related papers (2020-07-14T04:40:14Z)