Deformable spatial propagation network for depth completion
- URL: http://arxiv.org/abs/2007.04251v2
- Date: Sun, 19 Jul 2020 09:52:56 GMT
- Title: Deformable spatial propagation network for depth completion
- Authors: Zheyuan Xu, Hongche Yin, Jian Yao
- Abstract summary: We propose a deformable spatial propagation network (DSPN) that adaptively generates a different receptive field and affinity matrix for each pixel.
It allows the network to obtain information from far fewer but more relevant pixels for propagation.
- Score: 2.5306673456895306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth completion, which aims to recover a dense depth map from sparse
depth measurements, has attracted extensive attention recently due to the
development of autonomous driving. The convolutional spatial propagation
network (CSPN) is one of the state-of-the-art methods for this task; it adopts
a linear propagation model to refine coarse depth maps with local context.
However, the propagation at each pixel occurs in a fixed receptive field. This
may not be optimal for refinement, since different pixels need different local
contexts. To tackle this issue, in this paper, we propose a deformable spatial
propagation network (DSPN) that adaptively generates a different receptive
field and affinity matrix for each pixel. It allows the network to obtain
information from far fewer but more relevant pixels for propagation.
Experimental results on the KITTI depth completion benchmark demonstrate that
our proposed method achieves state-of-the-art performance.
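To make the propagation model concrete, the following is a minimal sketch of one deformable propagation step, assuming PyTorch; the function name, tensor shapes, and the use of bilinear grid sampling are illustrative choices, not the authors' released code. Each pixel's depth is updated as a convex, affinity-weighted combination of K neighbours sampled at learned per-pixel offsets.

```python
import torch
import torch.nn.functional as F

def deformable_propagation_step(depth, offsets, affinity):
    """One linear propagation step over K deformable neighbours per pixel.

    depth:    (B, 1, H, W) coarse depth map being refined
    offsets:  (B, K, 2, H, W) learned per-pixel (x, y) sampling offsets, in pixels
    affinity: (B, K, H, W) learned per-pixel affinity logits
    """
    B, K, _, H, W = offsets.shape
    # Normalise affinities so each update is a convex combination,
    # which keeps the linear propagation stable (as in CSPN).
    weights = F.softmax(affinity, dim=1)

    # Base sampling grid in pixel coordinates, stacked in (x, y) order.
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype, device=depth.device),
        torch.arange(W, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=-1)  # (H, W, 2)

    refined = torch.zeros_like(depth)
    for k in range(K):
        # Shift the grid by this neighbour's learned offset and map it to
        # the normalised [-1, 1] range expected by grid_sample.
        grid = base.unsqueeze(0) + offsets[:, k].permute(0, 2, 3, 1)
        gx = 2.0 * grid[..., 0] / (W - 1) - 1.0
        gy = 2.0 * grid[..., 1] / (H - 1) - 1.0
        grid = torch.stack((gx, gy), dim=-1)            # (B, H, W, 2)
        # Bilinearly sample depth at the deformed neighbour location.
        neighbour = F.grid_sample(depth, grid, align_corners=True)
        refined = refined + weights[:, k:k + 1] * neighbour
    return refined
```

In the full network, the offsets and affinities would be predicted by convolutional heads from image and sparse-depth features, and the step would be iterated several times, as in CSPN.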
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used (a generic sketch of such graph propagation follows this entry).
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
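As a rough illustration of the graph propagation GraphCSPN builds on, the sketch below refines per-pixel depths from k-nearest neighbours found in 3D rather than in a fixed 2D window. It is a hypothetical simplification (dense pairwise distances, an assumed affinity_net module), not the paper's implementation.

```python
import torch

def knn_propagation_step(points, depth, affinity_net, k=9):
    """
    points:       (N, 3) back-projected 3D points, one per pixel
    depth:        (N,)   current depth estimates
    affinity_net: module mapping edge features (N, k, 3) -> logits (N, k, 1)
    """
    # k-nearest neighbours by Euclidean distance in 3D (O(N^2) memory;
    # a real implementation would use a spatial index or local windows).
    dists = torch.cdist(points, points)                     # (N, N)
    knn = dists.topk(k + 1, largest=False).indices[:, 1:]   # drop self
    edge_feats = points[knn] - points.unsqueeze(1)          # (N, k, 3)
    # Affinities from geometric edge features, normalised per pixel.
    w = torch.softmax(affinity_net(edge_feats).squeeze(-1), dim=1)
    # Each depth becomes a weighted average of its graph neighbours.
    return (w * depth[knn]).sum(dim=1)
```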
- FSOINet: Feature-Space Optimization-Inspired Network for Image Compressive Sensing [11.352530132548912]
We propose achieving information flow phase by phase in feature space and design a Feature-Space Optimization-Inspired Network (dubbed FSOINet).
Experiments show that the proposed FSOINet outperforms the existing state-of-the-art methods by a large margin both quantitatively and qualitatively.
arXiv Detail & Related papers (2022-04-12T03:30:22Z)
- DVMN: Dense Validity Mask Network for Depth Completion [0.0]
We develop a guided convolutional neural network focusing on gathering dense and valid information from sparse depth maps.
We evaluate our Dense Validity Mask Network (DVMN) on the KITTI depth completion benchmark and achieve state-of-the-art results (a generic sketch of validity-mask-guided convolution follows this entry).
arXiv Detail & Related papers (2021-07-14T13:57:44Z)
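The idea of gathering only valid information from a sparse depth map can be sketched with a generic sparsity-aware convolution that renormalises by the number of valid pixels under each kernel window. This is the standard validity-mask trick in depth completion, shown here as a hedged illustration rather than DVMN's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Module):
    """Convolution over sparse inputs, renormalised by a validity mask."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        self.pad = kernel_size // 2
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=self.pad, bias=False)
        self.pool = nn.MaxPool2d(kernel_size, stride=1, padding=self.pad)

    def forward(self, x, mask):
        # Zero out invalid pixels before convolving.
        out = self.conv(x * mask)
        # Count valid pixels under each window and renormalise.
        valid = F.avg_pool2d(mask, self.k, stride=1,
                             padding=self.pad) * self.k ** 2
        out = out / valid.clamp(min=1.0)
        # Propagate the mask: a window is valid if it saw any valid pixel.
        return out, self.pool(mask)
```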
- LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between the inputs, which leads to degraded accuracy in cross-resolution settings.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z)
- PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss [49.66736599668501]
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.
Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta_1$ metric on the KITTI dataset (a sketch of this metric follows this entry).
arXiv Detail & Related papers (2021-03-12T15:54:46Z)
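For reference, the $\delta_1$ metric cited above is the standard depth-accuracy measure: the fraction of pixels whose ratio between predicted and ground-truth depth is within 1.25. A minimal NumPy sketch (not the authors' evaluation code):

```python
import numpy as np

def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    pred, gt: arrays of positive depths at valid (ground-truth) pixels.
    """
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < threshold))
```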
- An Empirical Method to Quantify the Peripheral Performance Degradation in Deep Networks [18.808132632482103]
The regions of the image affected by the borders of convolutional neural network (CNN) kernels compound with each convolutional layer.
Deeper and deeper networks combined with stride-based down-sampling mean that the propagation of this region can end up covering a non-negligible portion of the image.
Our dataset is constructed by inserting objects into high-resolution backgrounds, thereby allowing us to crop sub-images which place target objects at specific locations relative to the image border (a sketch of this cropping protocol follows this entry).
By probing the behaviour of Mask R-CNN across a selection of target locations, we see clear patterns of performance degradation near the image boundary, and in particular in the image corners.
arXiv Detail & Related papers (2020-12-04T18:00:47Z)
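The probing protocol described above can be sketched as a simple cropping helper: given an object's bounding box in a large background image, choose a crop window that places the object at a controlled offset from the image border. This is a hypothetical illustration of the idea, not the paper's code.

```python
def crop_for_border_offset(obj_box, crop_size, offset_xy):
    """Crop window placing an object at a chosen offset from the border.

    obj_box:   (x0, y0, x1, y1) object bounds in the full image
    crop_size: (w, h) of the sub-image to cut out
    offset_xy: desired (dx, dy) from the crop's top-left corner to the
               object's top-left corner
    Returns (cx0, cy0, cx1, cy1) in full-image coordinates; the caller
    must check that the window lies inside the background image.
    """
    x0, y0, _, _ = obj_box
    w, h = crop_size
    dx, dy = offset_xy
    cx0, cy0 = x0 - dx, y0 - dy
    return (cx0, cy0, cx0 + w, cy0 + h)
```

Sweeping offset_xy over a grid of positions, including the corners, yields the per-location performance measurements used to expose boundary effects.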
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves state-of-the-art performance on two benchmarks (a sketch of the gated-fusion idea follows this entry).
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
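The gated-fusion idea can be illustrated with a small module in which each modality's features are re-weighted by gates computed from both modalities and exchanged symmetrically. Layer shapes and names here are assumptions for illustration, not ACMNet's published architecture.

```python
import torch
import torch.nn as nn

class SymmetricGatedFusion(nn.Module):
    """Exchange image and depth features through learned, symmetric gates."""

    def __init__(self, channels):
        super().__init__()
        self.gate_rgb = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.Sigmoid())
        self.gate_depth = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.Sigmoid())

    def forward(self, f_rgb, f_depth):
        both = torch.cat([f_rgb, f_depth], dim=1)
        # Each branch receives the other modality, scaled by a gate that
        # is predicted from the concatenation of both feature maps.
        out_rgb = f_rgb + self.gate_depth(both) * f_depth
        out_depth = f_depth + self.gate_rgb(both) * f_rgb
        return out_rgb, out_depth
```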
- Probabilistic Pixel-Adaptive Refinement Networks [21.233814875276803]
Image-adaptive post-processing methods have proven beneficial by leveraging the high-resolution input image(s) as guidance data.
We introduce probabilistic pixel-adaptive convolutions (PPACs), which not only depend on image guidance data for filtering, but also respect the reliability of per-pixel predictions.
We demonstrate their utility in refinement networks for optical flow and semantic segmentation, where PPACs lead to a clear reduction in boundary artifacts (a sketch of such a confidence-gated, guidance-adaptive filter follows this entry).
arXiv Detail & Related papers (2020-03-31T17:53:21Z)
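In the same spirit, a pixel-adaptive filter whose kernel depends on guidance features and is additionally gated by per-pixel confidence can be sketched as follows. The Gaussian kernel on guidance distances and all names are illustrative assumptions; this is not the paper's PPAC implementation.

```python
import torch
import torch.nn.functional as F

def confidence_gated_filter(pred, guide, confidence, kernel_size=5, sigma=0.1):
    """
    pred:       (B, C, H, W) dense prediction to refine (e.g. optical flow)
    guide:      (B, G, H, W) guidance features from the input image
    confidence: (B, 1, H, W) per-pixel reliability of pred, in [0, 1]
    """
    k, pad = kernel_size, kernel_size // 2
    B, C, H, W = pred.shape
    G = guide.shape[1]

    def neighbours(t):
        # (B, C_t, H, W) -> (B, C_t, k*k, H*W) patches around each pixel
        return F.unfold(t, k, padding=pad).view(B, t.shape[1], k * k, H * W)

    p, g, c = neighbours(pred), neighbours(guide), neighbours(confidence)

    # Gaussian kernel on guidance-feature distance, gated by each
    # contributing neighbour's confidence, then normalised per pixel.
    centre = guide.view(B, G, 1, H * W)
    w = torch.exp(-((g - centre) ** 2).sum(1, keepdim=True) / (2 * sigma ** 2))
    w = w * c
    w = w / w.sum(2, keepdim=True).clamp(min=1e-6)

    out = (w * p).sum(2)              # (B, C, H*W)
    return out.view(B, C, H, W)
```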
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)