PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
- URL: http://arxiv.org/abs/2210.01612v3
- Date: Tue, 28 Mar 2023 05:06:59 GMT
- Title: PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
- Authors: Ruoyu Wang, Zehao Yu and Shenghua Gao
- Abstract summary: Depth representations based on multiple near frontal-parallel planes have demonstrated impressive results in self-supervised monocular depth estimation (MDE).
We propose PlaneDepth, a novel orthogonal-planes-based representation comprising vertical planes and ground planes.
Our method can extract the ground plane in an unsupervised manner, which is important for autonomous driving.
- Score: 41.517947010531074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth representations based on multiple near frontal-parallel planes have
demonstrated impressive results in self-supervised monocular depth estimation
(MDE). However, such a representation causes discontinuities on the ground,
which is perpendicular to the frontal-parallel planes; this is detrimental to
the identification of drivable space in autonomous driving. In this paper, we
propose PlaneDepth, a novel orthogonal-planes-based representation that
includes vertical planes and ground planes. PlaneDepth estimates the depth
distribution of an input image using a Laplacian Mixture Model defined over
the orthogonal planes. These planes are used to synthesize a reference view
that provides the self-supervision signal. Further, we find that the widely
used resizing and cropping data augmentation breaks the orthogonality
assumptions, leading to inferior plane predictions. We address this problem by
explicitly constructing the resizing and cropping transformation to rectify
the predefined planes and the predicted camera pose. Moreover, we propose an
augmented self-distillation loss, supervised with a bilateral occlusion mask,
to boost the robustness of the orthogonal-planes representation under
occlusion. Thanks to our orthogonal-planes representation, we can extract the
ground plane in an unsupervised manner, which is important for autonomous
driving. Extensive experiments on the KITTI dataset demonstrate the
effectiveness and efficiency of our method. The code is available at
https://github.com/svip-lab/PlaneDepth.
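To make the planes-based representation concrete, here is a minimal NumPy sketch of two ideas from the abstract: the depth a single predefined plane induces at each pixel under a pinhole camera, and the expected depth of a mixture whose components are centered at those plane-induced depths (the mean of a Laplacian is its location parameter, so the mixture mean reduces to a softmax-weighted sum of the plane depth maps). The function names and the back-projection setup are illustrative assumptions, not the actual PlaneDepth implementation:

```python
import numpy as np

def plane_induced_depth(n, d, K, H, W):
    """Depth map induced by the plane {X : n.X = d} under a pinhole camera
    with intrinsics K (hypothetical helper, not the paper's code)."""
    us, vs = np.meshgrid(np.arange(W), np.arange(H))       # pixel grid
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1)    # homogeneous pixels, H x W x 3
    rays = np.linalg.inv(K) @ pix.reshape(-1, 3).T         # back-projected rays, 3 x HW
    t = d / (n @ rays)                                     # ray-plane intersection distance
    return (t * rays[2]).reshape(H, W)                     # z-component = depth

def mixture_expected_depth(plane_depths, logits):
    """Expected depth of a Laplacian mixture whose components are centered
    at the per-plane depth maps; weights come from a per-pixel softmax."""
    w = np.exp(logits - logits.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)                      # softmax over the plane axis
    return (w * plane_depths).sum(axis=0)                  # mixture mean
```

For a frontal-parallel plane n = (0, 0, 1) at offset d, this recovers a constant depth map of value d, matching the classic multi-plane setup the abstract starts from; the ground plane would instead use a vertical normal such as (0, 1, 0).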
Related papers
- MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction [37.481945507799594]
This paper presents a generalizable 3D plane detection and reconstruction framework named MonoPlane.
We first leverage large-scale pre-trained neural networks to obtain the depth and surface normals from a single image.
These monocular geometric cues are then incorporated into a proximity-guided RANSAC framework to sequentially fit each plane instance.
arXiv Detail & Related papers (2024-11-02T12:15:29Z) - Plane2Depth: Hierarchical Adaptive Plane Guidance for Monocular Depth Estimation [38.81275292687583]
We propose Plane2Depth, which adaptively utilizes plane information to improve depth prediction within a hierarchical framework.
In the proposed plane guided depth generator (PGDG), we design a set of plane queries as prototypes to softly model planes in the scene and predict per-pixel plane coefficients.
In the proposed adaptive plane query aggregation (APGA) module, we introduce a novel feature interaction approach to improve the aggregation of multi-scale plane features.
arXiv Detail & Related papers (2024-09-04T07:45:06Z) - AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings [26.845588648999417]
We tackle the problem of estimating the planar surfaces in a 3D scene from posed images.
We propose a method that predicts multi-view consistent plane embeddings that complement geometry when clustering points into planes.
We show through extensive evaluation on the ScanNetV2 dataset that our new method outperforms existing approaches.
arXiv Detail & Related papers (2024-06-13T09:49:31Z) - Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction [84.94140661523956]
We propose a tri-perspective view (TPV) representation which accompanies BEV with two additional perpendicular planes.
We model each point in the 3D space by summing its projected features on the three planes.
Experiments show that our model trained with sparse supervision effectively predicts the semantic occupancy for all voxels.
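The point-feature construction the TPV summary describes can be sketched in a few lines; the function name, plane layouts, and nearest-neighbor sampling below are simplifying assumptions (the paper uses learned plane features with interpolation):

```python
import numpy as np

def tpv_point_feature(p, f_hw, f_dh, f_dw, res):
    """Feature of a 3D point p = (h, w, d): the sum of its projected features
    on the three perpendicular planes (nearest-neighbor sampling for brevity)."""
    h, w, d = (int(np.clip(c, 0, res - 1)) for c in p)
    return f_hw[h, w] + f_dh[d, h] + f_dw[d, w]
```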
arXiv Detail & Related papers (2023-02-15T17:58:10Z) - Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection [92.75961303269548]
The ground plane prior is a very informative geometric clue in monocular 3D object detection (M3OD).
We propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.
Our GPENet outperforms other methods and achieves state-of-the-art performance, demonstrating the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2022-11-03T02:21:35Z) - Occupancy Planes for Single-view RGB-D Human Reconstruction [120.5818162569105]
Single-view RGB-D human reconstruction with implicit functions is often formulated as per-point classification.
We propose the occupancy planes (OPlanes) representation, which enables formulating single-view RGB-D human reconstruction as occupancy prediction on planes that slice through the camera's view frustum.
arXiv Detail & Related papers (2022-08-04T17:59:56Z) - Pose Estimation for Vehicle-mounted Cameras via Horizontal and Vertical Planes [37.653076607939745]
We propose two novel solvers for estimating the egomotion of a calibrated camera mounted to a moving vehicle from a single affine correspondence.
Both methods are solved via a linear system with a small coefficient matrix and are thus extremely efficient.
They are tested on synthetic data and on publicly available real-world datasets.
arXiv Detail & Related papers (2020-08-13T08:01:48Z) - Plan2Vec: Unsupervised Representation Learning by Latent Plans [106.37274654231659]
We introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning.
Plan2vec constructs a weighted graph on an image dataset using near-neighbor distances, and then extrapolates this local metric to a global embedding by distilling a path integral over planned paths.
We demonstrate the effectiveness of plan2vec on one simulated and two challenging real-world image datasets.
arXiv Detail & Related papers (2020-05-07T17:52:23Z) - From Planes to Corners: Multi-Purpose Primitive Detection in Unorganized 3D Point Clouds [59.98665358527686]
We propose a new method for segmentation-free joint estimation of orthogonal planes.
Such unified scene exploration allows for multitudes of applications such as semantic plane detection or local and global scan alignment.
Our experiments demonstrate the validity of our approach in numerous scenarios from wall detection to 6D tracking.
arXiv Detail & Related papers (2020-01-21T06:51:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.