NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth
Estimation
- URL: http://arxiv.org/abs/2203.01502v1
- Date: Thu, 3 Mar 2022 03:27:20 GMT
- Title: NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth
Estimation
- Authors: Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan
- Abstract summary: Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed.
We take the path of CRFs optimization and leverage the potential of fully-connected CRFs.
Our method significantly improves the performance across all metrics on both the KITTI and NYUv2 datasets.
- Score: 42.062788492398674
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating the accurate depth from a single image is challenging since it is
inherently ambiguous and ill-posed. While recent works design increasingly
complicated and powerful networks to directly regress the depth map, we take
the path of CRFs optimization. Due to the expensive computation, CRFs are
usually performed between neighborhoods rather than the whole graph. To
leverage the potential of fully-connected CRFs, we split the input into windows
and perform the FC-CRFs optimization within each window, which reduces the
computation complexity and makes FC-CRFs feasible. To better capture the
relationships between nodes in the graph, we exploit the multi-head attention
mechanism to compute a multi-head potential function, which is fed to the
networks to output an optimized depth map. Then we build a bottom-up-top-down
structure, where this neural window FC-CRFs module serves as the decoder, and a
vision transformer serves as the encoder. The experiments demonstrate that our
method significantly improves the performance across all metrics on both the
KITTI and NYUv2 datasets, compared to previous methods. Furthermore, the
proposed method can be directly applied to panorama images and outperforms all
previous panorama methods on the MatterPort3D dataset. The source code of our
method will be made public.
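The window idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`window_partition`, `window_reverse`, `window_attention`), the random projection matrices `wq` and `wk`, and the toy sizes are all assumptions made for illustration. The sketch splits a feature map into non-overlapping windows, computes multi-head attention weights between all nodes inside each window, and uses those weights to pass messages between the current predictions of those nodes.

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) map into non-overlapping win x win windows.
    Returns an array of shape (num_windows, win*win, C)."""
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0, "H and W must be multiples of win"
    x = x.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def window_reverse(windows, win, H, W):
    """Inverse of window_partition: back to an (H, W, C) map."""
    C = windows.shape[-1]
    x = windows.reshape(H // win, W // win, win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(feat, pred, wq, wk, win, heads):
    """Fully-connected message passing restricted to each window.
    feat: (H, W, C) image features used to build queries and keys.
    pred: (H, W, C) current per-node prediction features (the values).
    wq, wk: (C, C) projection matrices (hypothetical, random here)."""
    H, W, C = feat.shape
    d = C // heads
    f = window_partition(feat, win)            # (nW, N, C)
    p = window_partition(pred, win)            # (nW, N, C)
    nW, N, _ = f.shape
    split = lambda t: t.reshape(nW, N, heads, d).transpose(0, 2, 1, 3)
    q, k, v = split(f @ wq), split(f @ wk), split(p)   # (nW, heads, N, d)
    # Pairwise weights between every pair of nodes inside a window.
    attn = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(d))  # (nW, heads, N, N)
    out = (attn @ v).transpose(0, 2, 1, 3).reshape(nW, N, C)
    return window_reverse(out, win, H, W), attn

# Toy usage with assumed sizes.
rng = np.random.default_rng(0)
H, W, C, win, heads = 8, 8, 16, 4, 4
feat = rng.standard_normal((H, W, C))
pred = rng.standard_normal((H, W, C))
wq = rng.standard_normal((C, C)) * 0.1
wk = rng.standard_normal((C, C)) * 0.1
out, attn = window_attention(feat, pred, wq, wk, win, heads)

full_pairs = (H * W) ** 2                            # fully-connected over the whole map
window_pairs = (H // win) * (W // win) * (win * win) ** 2  # fully-connected per window
print(out.shape, attn.shape, full_pairs, window_pairs)
```

With these toy sizes, the whole-graph formulation would need 4096 pairwise terms, while the windowed version needs 1024. The gap widens as the image grows, because the windowed cost scales linearly with the number of windows.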
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond [19.544213396776268]
We introduce regularized Frank-Wolfe, a general and effective algorithm for inference in dense conditional random fields (CRFs).
We show that dense CRFs, coupled with our new algorithms, yield significant improvements over strong CNN baselines.
arXiv Detail & Related papers (2021-10-27T20:44:47Z)
- Continuous Conditional Random Field Convolution for Point Cloud Segmentation [12.154944192318936]
Conditional random fields (CRFs) are usually formulated as discrete models in label space to encourage label consistency.
In this paper, we reconsider the CRF in feature space for point cloud segmentation because it can capture the structure of features well.
Experiments on various point cloud benchmarks demonstrate the effectiveness and robustness of the proposed method.
arXiv Detail & Related papers (2021-10-12T15:35:38Z)
- Single Image Depth Estimation using Wavelet Decomposition [37.486778463181]
We present a novel method for predicting accurate depths from monocular images with high efficiency.
This optimal efficiency is achieved by exploiting wavelet decomposition.
We demonstrate that we can reconstruct high-fidelity depth maps by predicting sparse wavelet coefficients.
arXiv Detail & Related papers (2021-06-03T17:42:25Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions at comparable error bounds, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- TFill: Image Completion via a Transformer-Based Architecture [69.62228639870114]
We propose treating image completion as a directionless sequence-to-sequence prediction task.
We employ a restrictive CNN with small and non-overlapping RF for token representation.
In a second phase, to improve appearance consistency between visible and generated regions, a novel attention-aware layer (AAL) is introduced.
arXiv Detail & Related papers (2021-04-02T01:42:01Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.