Improved Point Transformation Methods For Self-Supervised Depth
Prediction
- URL: http://arxiv.org/abs/2102.09142v1
- Date: Thu, 18 Feb 2021 03:42:40 GMT
- Title: Improved Point Transformation Methods For Self-Supervised Depth
Prediction
- Authors: Chen Ziwen, Zixuan Guo, Jerod Weinman
- Abstract summary: Given stereo or egomotion image pairs, a popular and successful method for unsupervised learning of monocular depth estimation is to measure the quality of image reconstructions resulting from the learned depth predictions.
This paper introduces a z-buffering algorithm that correctly and efficiently handles points occluded after transformation to a novel viewpoint.
Because our algorithm is implemented with operators typical of machine learning libraries, it can be incorporated into any existing unsupervised depth learning framework with automatic support for differentiation.
- Score: 4.103701929881022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given stereo or egomotion image pairs, a popular and successful method for
unsupervised learning of monocular depth estimation is to measure the quality
of image reconstructions resulting from the learned depth predictions.
Continued research has improved the overall approach in recent years, yet the
common framework still suffers from several important limitations, particularly
when dealing with points occluded after transformation to a novel viewpoint.
While prior work has addressed this problem heuristically, this paper
introduces a z-buffering algorithm that correctly and efficiently handles
occluded points. Because our algorithm is implemented with operators typical of
machine learning libraries, it can be incorporated into any existing
unsupervised depth learning framework with automatic support for
differentiation. Additionally, because points having negative depth after
transformation often signify erroneously shallow depth predictions, we
introduce a loss function to penalize this undesirable behavior explicitly.
Experimental results on the KITTI data set show that the z-buffer and negative
depth loss both improve the performance of a state-of-the-art depth-prediction
network.
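The two contributions above lend themselves to a short illustration. The sketch below is not the authors' released code: it shows how a z-buffer can be expressed with ordinary tensor operators (PyTorch's scatter_reduce with an "amin" reduction keeps only the nearest transformed point per target pixel) and adds one plausible, hinge-style form of the negative-depth penalty. Function names, the sentinel depth, and the tolerance are illustrative assumptions.

```python
# Hedged sketch (not the paper's released implementation): a differentiable
# z-buffer built from standard tensor ops, plus a simple negative-depth penalty.
import torch
import torch.nn.functional as F

def zbuffer_min_depth(pix_x, pix_y, z, height, width, sentinel=1e6, eps=1e-6):
    """Keep, for each target pixel, only the nearest point projected onto it.

    pix_x, pix_y: (N,) rounded pixel coordinates after transformation to the
    novel viewpoint; z: (N,) transformed depths. Points that fall outside the
    image or behind the camera are ignored.
    """
    valid = (z > 0) & (pix_x >= 0) & (pix_x < width) & (pix_y >= 0) & (pix_y < height)
    flat_idx = (pix_y.clamp(0, height - 1) * width + pix_x.clamp(0, width - 1)).long()

    # Fill the buffer with a large sentinel depth, then scatter-reduce with
    # "amin" so every pixel retains the closest surface, as a z-buffer would.
    zbuf = torch.full((height * width,), sentinel, dtype=z.dtype, device=z.device)
    zbuf = zbuf.scatter_reduce(0, flat_idx[valid], z[valid], reduce="amin")

    # Points deeper than the buffer value at their pixel are occluded.
    visible = valid & (z <= zbuf[flat_idx] + eps)
    return zbuf.view(height, width), visible

def negative_depth_loss(z):
    """One plausible penalty on points whose transformed depth is negative,
    i.e. points that end up behind the camera, which typically signals an
    erroneously shallow depth prediction."""
    return F.relu(-z).mean()
```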
Related papers
- Adaptive Learning for Multi-view Stereo Reconstruction [6.635583283522551]
We first analyze the properties of existing loss functions for deep depth-based MVS approaches.
We then propose a novel loss function, named adaptive Wasserstein loss, which narrows the difference between the true and predicted probability distributions of depth.
Experiments on different benchmarks, including DTU, Tanks and Temples, and BlendedMVS, show that the proposed method with the adaptive Wasserstein loss and the offset module achieves state-of-the-art performance.
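As a hedged illustration only: for a per-pixel probability volume over ordered depth hypotheses, the Wasserstein-1 distance to a target distribution reduces to an L1 difference of cumulative distributions. The sketch below shows that plain form; the adaptive weighting and the offset module described in the paper are not reproduced, and the tensor shapes are assumptions.

```python
# Hedged sketch: plain Wasserstein-1 loss between predicted and ground-truth
# per-pixel depth distributions over D ordered, uniformly spaced depth bins.
import torch

def wasserstein1_depth_loss(prob_pred, prob_gt, bin_width=1.0):
    """prob_pred, prob_gt: (B, D, H, W) distributions along the depth axis."""
    cdf_pred = torch.cumsum(prob_pred, dim=1)
    cdf_gt = torch.cumsum(prob_gt, dim=1)
    # For 1-D distributions, W1 is the integral of the |CDF difference|.
    return (cdf_pred - cdf_gt).abs().sum(dim=1).mean() * bin_width
```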
arXiv Detail & Related papers (2024-04-08T04:13:35Z) - Depth Estimation Algorithm Based on Transformer-Encoder and Feature
Fusion [3.490784807576072]
This research adopts a transformer model, initially renowned for its success in natural language processing, to capture intricate spatial relationships in visual data for depth estimation tasks.
A significant innovation of the research is the integration of a composite loss function that combines Structural Similarity Index Measure (SSIM) with Mean Squared Error (MSE).
This approach addresses the over-smoothing often seen with MSE-based losses and enhances the model's ability to predict depth maps that are not only accurate but also structurally coherent with the input images.
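A minimal sketch of such a composite loss, assuming the common average-pooling approximation of SSIM and an illustrative weight alpha; this is not the paper's exact formulation.

```python
# Hedged sketch of a composite SSIM + MSE loss for depth maps shaped (B, 1, H, W).
import torch
import torch.nn.functional as F

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Local statistics via 3x3 average pooling, a lightweight SSIM approximation.
    mu_x, mu_y = F.avg_pool2d(x, 3, 1, 1), F.avg_pool2d(y, 3, 1, 1)
    var_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def composite_depth_loss(pred, target, alpha=0.85):
    # Structural term (1 - SSIM) balanced against a pointwise MSE term.
    ssim_term = (1 - ssim(pred, target)).clamp(0, 2).mean() / 2
    return alpha * ssim_term + (1 - alpha) * F.mse_loss(pred, target)
```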
arXiv Detail & Related papers (2024-03-03T02:10:00Z) - AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
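A minimal sketch of the undo idea, using a horizontal flip as the simplest invertible geometric transformation; depth_net and the helper names are placeholders, and the actual method covers a much broader family of augmentations.

```python
# Hedged sketch (not the AugUndo implementation): augment the input, predict,
# then invert the transformation on the prediction so any reconstruction loss
# is computed in the original, un-augmented reference frame.
import torch

def augment_hflip(image):
    # Horizontal flip: an invertible (self-inverse) geometric augmentation.
    return torch.flip(image, dims=[-1])

def undo_hflip(depth):
    # Warp the predicted depth map back to the original reference frame.
    return torch.flip(depth, dims=[-1])

def predict_in_original_frame(depth_net, image):
    depth_aug = depth_net(augment_hflip(image))
    return undo_hflip(depth_aug)
```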
arXiv Detail & Related papers (2023-10-15T05:15:45Z) - VA-DepthNet: A Variational Approach to Single Image Depth Prediction [163.14849753700682]
VA-DepthNet is a simple, effective, and accurate deep neural network approach for the single-image depth prediction problem.
The paper demonstrates the usefulness of the proposed approach via extensive evaluation and ablation analysis over several benchmark datasets.
arXiv Detail & Related papers (2023-02-13T17:55:58Z) - Robust Depth Completion with Uncertainty-Driven Loss Functions [60.9237639890582]
We introduce uncertainty-driven loss functions to improve the robustness of depth completion and to handle its inherent uncertainty.
Our method has been tested on the KITTI Depth Completion Benchmark and achieves state-of-the-art robustness in terms of MAE, IMAE, and IRMSE.
arXiv Detail & Related papers (2021-12-15T05:22:34Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study of this problem and observes that the current monocular 3D detection problem can be simplified to an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [36.414471128890284]
We tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
Most existing methods assume that a consistent scale of depth and pose can be learned across all input samples.
We propose a novel system that explicitly disentangles scale from the network estimation.
arXiv Detail & Related papers (2020-04-03T00:28:09Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z) - Beyond Dropout: Feature Map Distortion to Regularize Deep Neural
Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural networks with higher testing performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.