IEBins: Iterative Elastic Bins for Monocular Depth Estimation
- URL: http://arxiv.org/abs/2309.14137v1
- Date: Mon, 25 Sep 2023 13:48:39 GMT
- Title: IEBins: Iterative Elastic Bins for Monocular Depth Estimation
- Authors: Shuwei Shao, Zhongcai Pei, Xingming Wu, Zhong Liu, Weihai Chen,
Zhengguo Li
- Abstract summary: We propose a novel concept of iterative elastic bins (IEBins) for the classification-regression-based MDE.
The proposed IEBins aims to search for high-quality depth by progressively optimizing the search range.
We develop a dedicated framework composed of a feature extractor and an iterative optimizer that benefits from a GRU-based architecture.
- Score: 25.71386321706134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular depth estimation (MDE) is a fundamental topic of geometric computer
vision and a core technique for many downstream applications. Recently, several
methods reframe the MDE as a classification-regression problem where a linear
combination of probabilistic distribution and bin centers is used to predict
depth. In this paper, we propose a novel concept of iterative elastic bins
(IEBins) for the classification-regression-based MDE. The proposed IEBins aims
to search for high-quality depth by progressively optimizing the search range,
which involves multiple stages and each stage performs a finer-grained depth
search in the target bin on top of its previous stage. To alleviate the
possible error accumulation during the iterative process, we utilize a novel
elastic target bin to replace the original target bin, the width of which is
adjusted elastically based on the depth uncertainty. Furthermore, we develop a
dedicated framework composed of a feature extractor and an iterative optimizer
that has powerful temporal context modeling capabilities benefiting from the
GRU-based architecture. Extensive experiments on the KITTI, NYU-Depth-v2 and
SUN RGB-D datasets demonstrate that the proposed method surpasses prior
state-of-the-art competitors. The source code is publicly available at
https://github.com/ShuweiShao/IEBins.
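The classification-regression idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`uniform_bins`, `expected_depth`, `refine_range`) are hypothetical, and widening the target bin by one standard deviation of the predicted distribution is only an assumed stand-in for the paper's uncertainty-based elastic width. Depth is computed as the probability-weighted sum of bin centers, and the next search range is the bin containing that depth, optionally stretched when the distribution is uncertain.

```python
import numpy as np

def uniform_bins(lo, hi, n):
    """Split [lo, hi] into n equal-width bins; return (edges, centers)."""
    edges = np.linspace(lo, hi, n + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return edges, centers

def expected_depth(probs, centers):
    """Depth as a linear combination of bin probabilities and bin centers."""
    return float(np.dot(probs, centers))

def depth_std(probs, centers):
    """Standard deviation of the bin distribution, used here as uncertainty."""
    mean = np.dot(probs, centers)
    return float(np.sqrt(np.dot(probs, (centers - mean) ** 2)))

def refine_range(probs, edges, centers, elastic=True):
    """Choose the search range for the next, finer-grained stage.

    The target bin is the bin that contains the current depth estimate.
    With elastic=True the range is widened to cover depth +/- one standard
    deviation (an assumption made for this sketch), then clipped to the
    current overall range.
    """
    depth = expected_depth(probs, centers)
    k = int(np.searchsorted(edges, depth, side="right")) - 1
    k = min(max(k, 0), len(centers) - 1)
    lo, hi = float(edges[k]), float(edges[k + 1])
    if elastic:
        sigma = depth_std(probs, centers)
        lo, hi = min(lo, depth - sigma), max(hi, depth + sigma)
    return max(lo, float(edges[0])), min(hi, float(edges[-1]))

# One pixel, four bins over 0..8 m, with made-up probabilities.
edges, centers = uniform_bins(0.0, 8.0, 4)
probs = np.array([0.1, 0.6, 0.2, 0.1])
print(expected_depth(probs, centers))
print(refine_range(probs, edges, centers, elastic=False))
print(refine_range(probs, edges, centers, elastic=True))
```

In an iterative scheme, the returned range would be passed back to `uniform_bins` to build the bins of the next stage, and a network would predict a new distribution over them. The elastic variant trades some resolution for a lower risk of excluding the true depth when an earlier stage picked the wrong bin.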
Related papers
- Hyperboloid GPLVM for Discovering Continuous Hierarchies via Nonparametric Estimation [41.13597666007784]
Dimensionality reduction (DR) offers a useful representation of complex high-dimensional data.
Recent DR methods focus on hyperbolic geometry to derive a faithful low-dimensional representation of hierarchical data.
This paper presents hGP-LVMs to embed high-dimensional hierarchical data with implicit continuity via nonparametric estimation.
arXiv Detail & Related papers (2024-10-22T05:07:30Z)
- Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
Our method (named MG) ranks among the most accurate on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z)
- Probabilistic partition of unity networks for high-dimensional regression problems [1.0227479910430863]
We explore the probabilistic partition of unity network (PPOU-Net) model in the context of high-dimensional regression problems.
We propose a general framework focusing on adaptive dimensionality reduction.
The PPOU-Nets consistently outperform the baseline fully-connected neural networks of comparable sizes in numerical experiments.
arXiv Detail & Related papers (2022-10-06T06:01:36Z)
- Non-parametric Depth Distribution Modelling based Depth Inference for Multi-view Stereo [43.415242967722804]
Recent cost volume pyramid based deep neural networks have unlocked the potential of efficiently leveraging high-resolution images for depth inference from multi-view stereo.
In general, those approaches assume that the depth of each pixel follows a unimodal distribution.
We propose constructing the cost volume by non-parametric depth distribution modeling to handle pixels with unimodal and multi-modal distributions.
arXiv Detail & Related papers (2022-05-08T05:13:04Z)
- BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation [46.678016537618845]
We present a novel framework called BinsFormer, tailored for the classification-regression-based depth estimation.
It mainly focuses on two crucial components in the specific task: 1) proper generation of adaptive bins and 2) sufficient interaction between probability distribution and bins predictions.
Experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that BinsFormer surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-04-03T04:38:02Z)
- Accelerated replica exchange stochastic gradient Langevin diffusion enhanced Bayesian DeepONet for solving noisy parametric PDEs [7.337247167823921]
We propose a training framework for replica-exchange Langevin diffusion that exploits the neural network architecture of DeepONets.
We show that the proposed framework's exploration and exploitation capabilities enable improved training convergence for DeepONets in noisy scenarios.
We also show that replica-exchange Langevin diffusion improves the DeepONet's mean prediction accuracy in noisy scenarios.
arXiv Detail & Related papers (2021-11-03T19:23:59Z)
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z)
- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [86.25022248968908]
We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
arXiv Detail & Related papers (2021-03-30T16:20:24Z)
- CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders learning susceptible to initialization.
We propose fusing neighboring layers of deeper networks that are trained with random initialization.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.