Hierarchical Normalization for Robust Monocular Depth Estimation
- URL: http://arxiv.org/abs/2210.09670v1
- Date: Tue, 18 Oct 2022 08:18:29 GMT
- Title: Hierarchical Normalization for Robust Monocular Depth Estimation
- Authors: Chi Zhang, Wei Yin, Zhibin Wang, Gang Yu, Bin Fu, Chunhua Shen
- Abstract summary: We propose a novel multi-scale depth normalization method that hierarchically normalizes the depth representations based on spatial information and depth distributions.
Our experiments show that the proposed normalization strategy remarkably outperforms previous normalization methods.
- Score: 85.2304122536962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address monocular depth estimation with deep neural
networks. To enable training of deep monocular estimation models with various
sources of datasets, state-of-the-art methods adopt image-level normalization
strategies to generate affine-invariant depth representations. However,
learning with image-level normalization mainly emphasizes the relations of
pixel representations to the global statistics of the image, such as the
structure of the scene, while fine-grained depth differences may be
overlooked. In this paper, we propose a novel multi-scale depth normalization
method that hierarchically normalizes the depth representations based on
spatial information and depth distributions. Compared with previous
normalization strategies applied only at the holistic image level, the proposed
hierarchical normalization can effectively preserve the fine-grained details
and improve accuracy. We present two strategies that define the hierarchical
normalization contexts in the depth domain and the spatial domain,
respectively. Our extensive experiments show that the proposed normalization
strategy remarkably outperforms previous normalization methods, and we set a new
state-of-the-art on five zero-shot transfer benchmark datasets.
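To make the distinction concrete, the following minimal NumPy sketch contrasts standard image-level affine-invariant normalization (median shift, mean-absolute-deviation scale) with a hierarchical variant that re-applies the same normalization inside progressively smaller spatial windows. This is an illustration of the general idea rather than the paper's exact formulation: the window sizes, the median/MAD statistics, and the simple per-level L1 loss are assumptions, and the depth-domain grouping strategy is not shown.

```python
import numpy as np

def affine_invariant(d, eps=1e-6):
    """Image-level normalization: remove a per-image shift (median)
    and scale (mean absolute deviation)."""
    shift = np.median(d)
    scale = np.mean(np.abs(d - shift)) + eps
    return (d - shift) / scale

def hierarchical_normalize(d, window_sizes=(None, 64, 16), eps=1e-6):
    """Hierarchical (multi-scale) normalization sketch: normalize the depth
    map globally and inside progressively smaller spatial windows, returning
    one normalized map per level."""
    h, w = d.shape
    levels = []
    for s in window_sizes:
        out = np.empty_like(d, dtype=np.float64)
        step = s or max(h, w)          # None -> a single window covering the image
        for y in range(0, h, step):
            for x in range(0, w, step):
                patch = d[y:y + step, x:x + step]
                out[y:y + step, x:x + step] = affine_invariant(patch, eps)
        levels.append(out)
    return levels

# Usage: compare prediction and ground truth under the same normalization contexts.
pred = np.random.rand(128, 128)
gt   = np.random.rand(128, 128)
loss = sum(np.mean(np.abs(p - g))
           for p, g in zip(hierarchical_normalize(pred), hierarchical_normalize(gt)))
```

Normalizing within smaller windows forces the loss to compare depth values against local statistics, which is what lets the supervision retain fine-grained relative depth that a single image-level normalization tends to wash out.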
Related papers
- Scale Propagation Network for Generalizable Depth Completion [16.733495588009184]
We propose a novel scale propagation normalization (SP-Norm) method to propagate scales from input to output.
We also develop a new network architecture based on SP-Norm and the ConvNeXt V2 backbone.
Our model consistently achieves the best accuracy with faster speed and lower memory when compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-24T03:53:06Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- DELAD: Deep Landweber-guided deconvolution with Hessian and sparse prior [0.22940141855172028]
We present a model for non-blind image deconvolution that incorporates the classic iterative method into a deep learning application.
We build our network based on the iterative Landweber deconvolution algorithm, which is integrated with trainable convolutional layers to enhance the recovered image structures and details.
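For context, the sketch below shows the classic (non-trainable) Landweber iteration for non-blind deconvolution that DELAD builds on, x <- x + tau * K^T(y - Kx); the box blur kernel, step size, and iteration count are illustrative, and the trainable convolutional layers and Hessian/sparse priors described above are not included.

```python
import numpy as np
from scipy.signal import fftconvolve

def landweber_deconv(y, kernel, n_iters=100, tau=1.0):
    """Classic Landweber iteration for non-blind deconvolution:
    x <- x + tau * K^T (y - K x), where K is convolution with `kernel`
    and K^T is convolution with the flipped kernel."""
    k_flip = kernel[::-1, ::-1]
    x = y.copy()                       # initialize with the blurry observation
    for _ in range(n_iters):
        residual = y - fftconvolve(x, kernel, mode="same")
        x = x + tau * fftconvolve(residual, k_flip, mode="same")
    return x

# Usage: deblur an image degraded by a known 5x5 box kernel.
kernel = np.full((5, 5), 1.0 / 25.0)
sharp = np.random.rand(64, 64)
blurry = fftconvolve(sharp, kernel, mode="same")
restored = landweber_deconv(blurry, kernel, n_iters=200, tau=1.0)
```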
arXiv Detail & Related papers (2022-09-30T11:15:03Z)
- Deep Recursive Embedding for High-Dimensional Data [9.611123249318126]
We propose to combine deep neural networks (DNN) with mathematics-guided embedding rules for high-dimensional data embedding.
We introduce a generic deep embedding network (DEN) framework, which is able to learn a parametric mapping from high-dimensional space to low-dimensional space.
arXiv Detail & Related papers (2021-10-31T23:22:33Z)
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction [71.83308989022635]
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) local computation of depth maps with a deep MVS technique, and 2) fusion of the depth maps and image features into a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
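As background for step 2, here is a minimal sketch of conventional TSDF fusion, which integrates posed depth maps into a single volume by keeping a running weighted average of truncated signed distances per voxel; the volume extent, truncation distance, and pinhole intrinsics are placeholders, and the paper's learned fusion and PosedConv kernel are not modelled.

```python
import numpy as np

def fuse_tsdf(depth_maps, poses, K, vol_dim=64, vol_size=2.0, trunc=0.05):
    """Standard TSDF fusion sketch: project every voxel center into each posed
    depth map, compute the truncated signed distance along the viewing ray,
    and keep a running average over views."""
    n_vox = vol_dim ** 3
    tsdf = np.ones(n_vox)                             # 1 = free space
    weight = np.zeros(n_vox)
    # Voxel centers in world coordinates, cube of side `vol_size` at the origin.
    grid = (np.indices((vol_dim,) * 3).reshape(3, -1).T + 0.5) / vol_dim
    xyz_w = (grid - 0.5) * vol_size
    xyz_h = np.c_[xyz_w, np.ones(n_vox)]              # homogeneous coordinates
    for depth, pose in zip(depth_maps, poses):        # pose: 4x4 camera-to-world
        xyz_c = (np.linalg.inv(pose) @ xyz_h.T)[:3].T # world -> camera
        z = xyz_c[:, 2]
        z_safe = np.where(z > 1e-6, z, 1e-6)          # avoid divide-by-zero behind camera
        uv = (K @ xyz_c.T).T                          # camera -> pixel (pinhole K)
        u = np.round(uv[:, 0] / z_safe).astype(int)
        v = np.round(uv[:, 1] / z_safe).astype(int)
        h, w = depth.shape
        ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        sdf = depth[v[ok], u[ok]] - z[ok]             # + in front of surface, - behind
        keep = sdf > -trunc                           # discard voxels far behind the surface
        idx = np.where(ok)[0][keep]
        d = np.clip(sdf[keep] / trunc, -1.0, 1.0)
        tsdf[idx] = (tsdf[idx] * weight[idx] + d) / (weight[idx] + 1.0)
        weight[idx] += 1.0
    return tsdf.reshape(vol_dim, vol_dim, vol_dim)

# Usage with a single synthetic view: identity pose, simple pinhole intrinsics.
K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
depth = np.full((128, 128), 1.0)                      # flat wall 1 m in front of the camera
vol = fuse_tsdf([depth], [np.eye(4)], K)
```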
arXiv Detail & Related papers (2021-08-19T11:33:58Z)
- Deep Reparametrization of Multi-Frame Super-Resolution and Denoising [167.42453826365434]
We propose a deep reparametrization of the maximum a posteriori formulation commonly employed in multi-frame image restoration tasks.
Our approach is derived by introducing a learned error metric and a latent representation of the target image.
We validate our approach through comprehensive experiments on burst denoising and burst super-resolution datasets.
arXiv Detail & Related papers (2021-08-18T17:57:02Z)
- Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation [39.845944724079814]
Self-supervised depth estimation has shown great effectiveness in producing high-quality depth maps given only image sequences as input.
However, its performance usually drops when estimating depth on border areas or objects with thin structures, due to limited depth representation ability.
We propose a semantic-guided depth representation enhancement method, which promotes both local and global depth feature representations.
arXiv Detail & Related papers (2020-12-15T02:24:57Z)
- Depth image denoising using nuclear norm and learning graph model [107.51199787840066]
Group-based image restoration methods are more effective at exploiting the similarity among patches.
For each patch, we find and group the most similar patches within a search window.
The proposed method is superior to other current state-of-the-art denoising methods by both subjective and objective criteria.
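A minimal sketch of the patch-grouping step described above, using plain L2 block matching within a local search window; the patch size, window radius, and group size are illustrative, and the nuclear-norm and graph-model processing applied to each group is not shown.

```python
import numpy as np

def group_similar_patches(img, y, x, patch=8, search=20, k=16):
    """Group the k patches most similar (in L2 distance) to the reference
    patch at (y, x), searched within a local window around it."""
    h, w = img.shape
    ref = img[y:y + patch, x:x + patch]
    candidates = []
    for yy in range(max(0, y - search), min(h - patch, y + search) + 1):
        for xx in range(max(0, x - search), min(w - patch, x + search) + 1):
            cand = img[yy:yy + patch, xx:xx + patch]
            candidates.append((np.sum((cand - ref) ** 2), yy, xx))
    candidates.sort(key=lambda c: c[0])
    # Stack the k best matches into a group (the reference itself is included).
    return np.stack([img[yy:yy + patch, xx:xx + patch] for _, yy, xx in candidates[:k]])

# Usage: group patches around one reference location in a noisy depth image.
noisy = np.random.rand(64, 64)
group = group_similar_patches(noisy, y=20, x=20)   # shape (16, 8, 8)
```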
arXiv Detail & Related papers (2020-08-09T15:12:16Z)
- Deformable spatial propagation network for depth completion [2.5306673456895306]
We propose a deformable spatial propagation network (DSPN) that adaptively generates a different receptive field and affinity matrix for each pixel.
It allows the network to obtain information from far fewer but more relevant pixels for propagation.
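For context, the sketch below shows a generic affinity-weighted spatial propagation step of the kind DSPN makes deformable; here the neighbourhood is a fixed 8-connected stencil with wrap-around borders, whereas the deformable version would instead predict per-pixel neighbour offsets and affinities.

```python
import numpy as np

def propagate_depth(depth, affinity, n_iters=10):
    """Generic spatial-propagation refinement: each pixel's depth is
    repeatedly replaced by an affinity-weighted mix of itself and its
    8 neighbours. `affinity` has shape (8, H, W) with non-negative weights."""
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               ( 0, -1),          ( 0, 1),
               ( 1, -1), ( 1, 0), ( 1, 1)]
    d = depth.copy()
    for _ in range(n_iters):
        acc = np.zeros_like(d)
        wsum = np.zeros_like(d)
        for a, (dy, dx) in zip(affinity, offsets):
            # Shift so each pixel sees one of its 8 neighbours (borders wrap around).
            shifted = np.roll(np.roll(d, dy, axis=0), dx, axis=1)
            acc += a * shifted
            wsum += a
        # Keep a unit weight on the pixel's own current estimate.
        d = (d + acc) / (1.0 + wsum)
    return d

# Usage with random affinities standing in for network predictions.
H, W = 32, 32
depth0 = np.random.rand(H, W)
affinity = np.random.rand(8, H, W)
refined = propagate_depth(depth0, affinity)
```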
arXiv Detail & Related papers (2020-07-08T16:39:50Z)
- Optimization Theory for ReLU Neural Networks Trained with Normalization Layers [82.61117235807606]
The success of deep neural networks is in part due to the use of normalization layers.
Our analysis shows how the introduction of normalization changes the optimization landscape and can enable faster convergence.
arXiv Detail & Related papers (2020-06-11T23:55:54Z)