Versatile Depth Estimator Based on Common Relative Depth Estimation and
Camera-Specific Relative-to-Metric Depth Conversion
- URL: http://arxiv.org/abs/2303.10991v1
- Date: Mon, 20 Mar 2023 10:19:50 GMT
- Authors: Jinyoung Jun, Jae-Han Lee, and Chang-Su Kim
- Abstract summary: We propose a versatile depth estimator (VDE) composed of a common relative depth estimator (CRDE) and multiple relative-to-metric converters (R2MCs)
The proposed VDE can cope with diverse scenes, including both indoor and outdoor scenes, with only a 1.12% parameter increase per camera.
Experimental results demonstrate that VDE supports multiple cameras effectively and efficiently and also achieves state-of-the-art performance in the conventional single-camera scenario.
- Score: 36.36012484044768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A typical monocular depth estimator is trained for a single camera, so its
performance drops severely on images taken with different cameras. To address
this issue, we propose a versatile depth estimator (VDE), composed of a common
relative depth estimator (CRDE) and multiple relative-to-metric converters
(R2MCs). The CRDE extracts relative depth information, and each R2MC converts
the relative information to predict metric depths for a specific camera. The
proposed VDE can cope with diverse scenes, including both indoor and outdoor
scenes, with only a 1.12% parameter increase per camera. Experimental results
demonstrate that VDE supports multiple cameras effectively and efficiently and
also achieves state-of-the-art performance in the conventional single-camera
scenario.
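The abstract describes a shared backbone (the CRDE) that produces camera-agnostic relative depth, plus a lightweight per-camera converter (R2MC) that maps relative depth to metric depth. The following is a minimal, purely illustrative sketch of that decomposition; the function names, the affine form of the conversion, and the depth ranges are assumptions for illustration, not the paper's actual design.

```python
def crde(image):
    """Stand-in for the shared common relative depth estimator (CRDE):
    maps an input to relative depth values normalized to [0, 1].
    Here it simply min-max rescales raw values."""
    lo, hi = min(image), max(image)
    if hi == lo:
        return [0.0] * len(image)
    return [(v - lo) / (hi - lo) for v in image]

class R2MC:
    """Per-camera relative-to-metric converter, modeled here as an
    affine map with camera-specific scale and shift parameters."""
    def __init__(self, scale, shift):
        self.scale = scale   # span of the camera's metric depth range
        self.shift = shift   # minimum metric depth for this camera

    def __call__(self, relative_depth):
        return [self.scale * d + self.shift for d in relative_depth]

# One shared CRDE, one small R2MC per camera. The paper reports only a
# ~1.12% parameter increase per camera; two floats per camera is the
# toy analogue of that lightweight per-camera head.
converters = {
    "indoor_cam": R2MC(scale=9.0, shift=1.0),    # e.g. 1-10 m range
    "outdoor_cam": R2MC(scale=79.0, shift=1.0),  # e.g. 1-80 m range
}

image = [12, 40, 200, 90]             # fake 4-pixel "image"
rel = crde(image)                     # camera-agnostic relative depth
indoor = converters["indoor_cam"](rel)
outdoor = converters["outdoor_cam"](rel)
```

The key point the sketch captures is that the expensive component is computed once and shared, while adapting to a new camera only requires training a small converter.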
Related papers
- RePoseD: Efficient Relative Pose Estimation With Known Depth Information [45.40994214285799]
We propose a novel framework for estimating the relative pose of two cameras from point correspondences with associated monocular depths.
New solvers outperform state-of-the-art depth-aware solvers in terms of speed and accuracy.
arXiv Detail & Related papers (2025-01-13T23:13:33Z)
- GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion [7.588468985212172]
Generalizing metric monocular depth estimation presents a significant challenge due to its ill-posed nature.
We propose a novel canonical representation that maintains consistency across varied camera setups.
We also propose a novel architecture that adaptively and probabilistically fuses depths estimated via object size and vertical image position cues.
arXiv Detail & Related papers (2024-12-08T22:04:34Z)
- SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model [72.0795843450604]
Current approaches face challenges in maintaining consistent accuracy across diverse scenes.
These methods rely on extensive datasets comprising millions, if not tens of millions, of training samples.
This paper presents SM4Depth, a model that seamlessly works for both indoor and outdoor scenes.
arXiv Detail & Related papers (2024-03-13T14:08:25Z)
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model [34.85279074665031]
Methods for monocular depth estimation have made significant strides on standard benchmarks, but zero-shot metric depth estimation remains unsolved.
Recent work has proposed specialized multi-head architectures for jointly modeling indoor and outdoor scenes.
We advocate a generic, task-agnostic diffusion model, with several advancements such as log-scale depth parameterization.
arXiv Detail & Related papers (2023-12-20T18:27:47Z)
- Robust Self-Supervised Extrinsic Self-Calibration [25.727912226753247]
Multi-camera self-supervised monocular depth estimation from videos is a promising way to reason about the environment.
We introduce a novel method for extrinsic calibration that builds upon the principles of self-supervised monocular depth and ego-motion learning.
arXiv Detail & Related papers (2023-08-04T06:20:20Z)
- Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on DDAD and NuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z)
- Unsupervised Visible-light Images Guided Cross-Spectrum Depth Estimation from Dual-Modality Cameras [33.77748026254935]
Cross-spectrum depth estimation aims to provide a depth map in all illumination conditions with a pair of dual-spectrum images.
In this paper, we propose an unsupervised visible-light image guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS in short) depth estimation framework.
Our method achieves better performance than the compared existing methods.
arXiv Detail & Related papers (2022-04-30T12:58:35Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.