Learning Inverse Laplacian Pyramid for Progressive Depth Completion
- URL: http://arxiv.org/abs/2502.07289v1
- Date: Tue, 11 Feb 2025 06:21:42 GMT
- Title: Learning Inverse Laplacian Pyramid for Progressive Depth Completion
- Authors: Kun Wang, Zhiqiang Yan, Junkai Fan, Jun Li, Jian Yang
- Abstract summary: LP-Net is an innovative framework that implements a multi-scale, progressive prediction paradigm based on Laplacian Pyramid decomposition.
At the time of submission, LP-Net ranks 1st among all peer-reviewed methods on the official KITTI leaderboard.
- Score: 18.977393635158048
- Abstract: Depth completion endeavors to reconstruct a dense depth map from sparse depth measurements, leveraging the information provided by a corresponding color image. Existing approaches mostly hinge on single-scale propagation strategies that iteratively ameliorate initial coarse depth estimates through pixel-level message passing. Despite their commendable outcomes, these techniques are frequently hampered by computational inefficiencies and a limited grasp of scene context. To circumvent these challenges, we introduce LP-Net, an innovative framework that implements a multi-scale, progressive prediction paradigm based on Laplacian Pyramid decomposition. Diverging from propagation-based approaches, LP-Net initiates with a rudimentary, low-resolution depth prediction to encapsulate the global scene context, subsequently refining this through successive upsampling and the reinstatement of high-frequency details at incremental scales. We have developed two novel modules to bolster this strategy: 1) the Multi-path Feature Pyramid module, which segregates feature maps into discrete pathways, employing multi-scale transformations to amalgamate comprehensive spatial information, and 2) the Selective Depth Filtering module, which dynamically learns to apply both smoothness and sharpness filters to judiciously mitigate noise while accentuating intricate details. By integrating these advancements, LP-Net not only secures state-of-the-art (SOTA) performance across both outdoor and indoor benchmarks such as KITTI, NYUv2, and TOFDC, but also demonstrates superior computational efficiency. At the time of submission, LP-Net ranks 1st among all peer-reviewed methods on the official KITTI leaderboard.
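The coarse-to-fine idea in the abstract rests on the classical Laplacian pyramid: an image is split into a low-resolution base plus one high-frequency residual per scale, and it can be recovered by repeatedly upsampling and adding the residuals back. The sketch below shows only that decomposition and its inverse on a synthetic depth map. It is not LP-Net itself, where the residuals are predicted by a network rather than computed from a known dense map. The 2x2 average pooling, the nearest-neighbour upsampling, and all function names are assumptions chosen for brevity.

```python
import numpy as np


def downsample(x):
    """Halve resolution by 2x2 average pooling (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double resolution by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(depth, levels):
    """Return the coarsest map and the residuals, ordered fine to coarse."""
    residuals = []
    current = depth
    for _ in range(levels):
        coarse = downsample(current)
        # High-frequency detail that is lost by going one scale down.
        residuals.append(current - upsample(coarse))
        current = coarse
    return current, residuals


def reconstruct(coarse, residuals):
    """Invert the pyramid: upsample, then restore detail at each scale."""
    current = coarse
    for residual in reversed(residuals):
        current = upsample(current) + residual
    return current


# Synthetic dense depth map (hypothetical values, in metres).
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 80.0, size=(16, 32))

coarse, residuals = build_laplacian_pyramid(depth, levels=3)
restored = reconstruct(coarse, residuals)
```

With these simple filters the reconstruction matches the input up to floating-point error. A learned model in this style would instead have to estimate the coarse map and each residual from the sparse depth and the color image.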
Related papers
- Mesh Denoising Transformer [104.5404564075393]
Mesh denoising is aimed at removing noise from input meshes while preserving their feature structures.
SurfaceFormer is a pioneering Transformer-based mesh denoising framework.
A new representation known as the Local Surface Descriptor captures local geometric intricacies.
The Denoising Transformer module receives the multimodal information and achieves efficient global feature aggregation.
Date: 2024-05-10T15:27:43Z
- Bilateral Propagation Network for Depth Completion [41.163328523175466]
Depth completion aims to derive a dense depth map from sparse depth measurements with a synchronized color image.
Current state-of-the-art (SOTA) methods are predominantly propagation-based, which work as an iterative refinement on the initial estimated dense depth.
We present a Bilateral Propagation Network (BP-Net) that propagates depth at the earliest stage to avoid directly convolving on sparse data.
Date: 2024-03-17T16:48:46Z
- LRRU: Long-short Range Recurrent Updating Networks for Depth Completion [45.48580252300282]
The Long-short Range Recurrent Updating (LRRU) network is proposed to accomplish depth completion more efficiently.
LRRU first roughly fills the sparse input to obtain an initial dense depth map, and then iteratively updates it through learned spatially-variant kernels.
Our initial depth map has coarse but complete scene depth information, which helps relieve the burden of directly regressing the dense depth from sparse ones.
Date: 2023-10-13T09:04:52Z
- Learning an Efficient Multimodal Depth Completion Model [11.740546882538142]
RGB image-guided sparse depth completion has attracted extensive attention recently, but still faces some problems.
The proposed method can outperform some state-of-the-art methods with a lightweight architecture.
The method also wins the championship in the MIPI2022 RGB+TOF depth completion challenge.
Date: 2022-08-23T07:03:14Z
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the multi-task learning (MTL) problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic
Date: 2022-06-21T17:40:55Z
- Densely Nested Top-Down Flows for Salient Object Detection [137.74130900326833]
This paper revisits the role of top-down modeling in salient object detection.
It designs a novel densely nested top-down flows (DNTDF)-based framework.
In every stage of DNTDF, features from higher levels are read in via the progressive compression shortcut paths (PCSP).
Date: 2021-02-18T03:14:02Z
- FCFR-Net: Feature Fusion based Coarse-to-Fine Residual Learning for Monocular Depth Completion [15.01291779855834]
Recent approaches mainly formulate the depth completion as a one-stage end-to-end learning task.
We propose a novel end-to-end residual learning framework, which formulates the depth completion as a two-stage learning task.
Date: 2020-12-15T13:09:56Z
- Deep Shells: Unsupervised Shape Correspondence with Optimal Transport [52.646396621449]
We propose a novel unsupervised learning approach to 3D shape correspondence.
We show that the proposed method significantly improves over the state-of-the-art on multiple datasets.
Date: 2020-10-28T22:24:07Z
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
Date: 2020-08-25T06:00:06Z
- FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for Optical Flow Estimation [72.41370576242116]
We propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.
It consists of two main modules: pyramid correlation mapping and residual reconstruction.
Experimental results show that the proposed scheme achieves state-of-the-art performance, with improvements of 0.80, 1.15 and 0.10 in average end-point error (AEE) over competing baseline methods.
Date: 2020-01-17T07:13:51Z