DepthP+P: Metric Accurate Monocular Depth Estimation using Planar and Parallax
- URL: http://arxiv.org/abs/2301.02092v1
- Date: Thu, 5 Jan 2023 14:53:21 GMT
- Title: DepthP+P: Metric Accurate Monocular Depth Estimation using Planar and Parallax
- Authors: Sadra Safadoust, Fatma Güney
- Abstract summary: Current self-supervised monocular depth estimation methods are mostly based on estimating a rigid-body motion representing camera motion.
We propose DepthP+P, a method that learns to estimate outputs in metric scale by following the traditional planar parallax paradigm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current self-supervised monocular depth estimation methods are mostly based
on estimating a rigid-body motion representing camera motion. These methods
suffer from the well-known scale ambiguity problem in their predictions. We
propose DepthP+P, a method that learns to estimate outputs in metric scale by
following the traditional planar parallax paradigm. We first align the two
frames using a common ground plane which removes the effect of the rotation
component in the camera motion. With two neural networks, we predict the depth
and the camera translation, which is easier to predict on its own than jointly
with rotation. By assuming a known camera height, we can
then calculate the induced 2D image motion of a 3D point and use it for
reconstructing the target image in a self-supervised monocular approach. We
perform experiments on the KITTI driving dataset and show that the planar
parallax approach, which only needs to predict camera translation, can be a
metrically accurate alternative to the current methods that rely on estimating
6DoF camera motion.
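The metric scale in this setup comes from the known camera height: once the frames are aligned to a common ground plane, the depth of any pixel on that plane follows from simple pinhole geometry, Z = f * h / (v - v0), where h is the camera height above the road, v the pixel row, and v0 the horizon row. A minimal sketch of that geometric relation (the focal length, horizon row, and camera height below are hypothetical KITTI-like values, not taken from the paper's code):

```python
import numpy as np

def ground_plane_depth(v, f, v0, cam_height):
    """Metric depth Z = f * h / (v - v0) for a ground-plane pixel at image
    row v (pixels), seen by a pinhole camera with focal length f (pixels),
    horizon at row v0, mounted cam_height meters above a flat road.
    Only valid for rows below the horizon (v > v0)."""
    v = np.asarray(v, dtype=float)
    return f * cam_height / (v - v0)

# Hypothetical KITTI-like setup: f = 721 px, horizon at row 173,
# camera 1.65 m above the road. Rows further below the horizon
# correspond to ground points closer to the camera.
depths = ground_plane_depth(v=[200, 250, 300], f=721.0, v0=173.0, cam_height=1.65)
```

Because h is a known physical length, the recovered depths are in meters, which is why predicting only the translation after plane alignment can sidestep the scale ambiguity of full 6DoF self-supervision.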
Related papers
- Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation [74.28509379811084]
Metric3D v2 is a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image.
We propose solutions for both metric depth estimation and surface normal estimation.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2024-03-22T02:30:46Z)
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image [85.91935485902708]
We show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models.
We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2023-07-20T16:14:23Z)
- Tame a Wild Camera: In-the-Wild Monocular Camera Calibration [12.55056916519563]
Previous methods for monocular camera calibration rely on specific 3D objects or strong geometric priors.
Our method is assumption-free and calibrates the complete $4$ Degree-of-Freedom (DoF) intrinsic parameters.
We demonstrate downstream applications in image manipulation detection & restoration, uncalibrated two-view pose estimation, and 3D sensing.
arXiv Detail & Related papers (2023-06-19T14:55:26Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z)
- DiffPoseNet: Direct Differentiable Camera Pose Estimation [11.941057800943653]
We introduce a network NFlowNet, for normal flow estimation which is used to enforce robust and direct constraints.
We perform extensive qualitative and quantitative evaluation of the proposed DiffPoseNet's sensitivity to noise and its generalization across datasets.
arXiv Detail & Related papers (2022-03-21T17:54:30Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- Beyond Weak Perspective for Monocular 3D Human Pose Estimation [6.883305568568084]
We consider the task of 3D joints location and orientation prediction from a monocular video.
We first infer 2D joints locations with an off-the-shelf pose estimation algorithm.
We then adhere to the SMPLify algorithm which receives those initial parameters.
arXiv Detail & Related papers (2020-09-14T16:23:14Z)
- Unsupervised Learning of Camera Pose with Compositional Re-estimation [10.251550038802343]
Given an input video sequence, our goal is to estimate the camera pose (i.e. the camera motion) between consecutive frames.
We propose an alternative approach that utilizes a compositional re-estimation process for camera pose estimation.
Our approach significantly improves the predicted camera motion both quantitatively and visually.
arXiv Detail & Related papers (2020-01-17T18:59:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.