Unsupervised Light Field Depth Estimation via Multi-view Feature
Matching with Occlusion Prediction
- URL: http://arxiv.org/abs/2301.08433v2
- Date: Fri, 18 Aug 2023 08:11:15 GMT
- Title: Unsupervised Light Field Depth Estimation via Multi-view Feature
Matching with Occlusion Prediction
- Authors: Shansi Zhang, Nan Meng and Edmund Y. Lam
- Abstract summary: It is costly to obtain sufficient depth labels for supervised training.
In this paper, we propose an unsupervised framework to estimate depth from LF images.
- Score: 15.421219881815956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth estimation from light field (LF) images is a fundamental step for
numerous applications. Recently, learning-based methods have achieved higher
accuracy and efficiency than traditional methods. However, it is costly to
obtain sufficient depth labels for supervised training. In this paper, we
propose an unsupervised framework to estimate depth from LF images. First, we
design a disparity estimation network (DispNet) with a coarse-to-fine structure
to predict disparity maps from different view combinations. It explicitly
performs multi-view feature matching to learn correspondences effectively.
Since occlusions can violate photo-consistency, we introduce an occlusion
prediction network (OccNet) to predict occlusion maps, which serve as
element-wise weights on the photometric loss to handle occlusions and assist
disparity learning. Given the disparity maps estimated from multiple input
combinations, we then propose a disparity fusion strategy, based on the
estimated errors and with effective occlusion handling, to obtain a more
accurate final disparity map. Experimental results demonstrate that our method
achieves superior performance on both dense and sparse LF images, and shows
better robustness and generalization on real-world LF images compared to other
methods.
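To make the occlusion-weighted photometric loss concrete, the sketch below shows one plausible PyTorch formulation. It is an illustration only, assuming disparity-based warping of side views toward a reference view and predicted occlusion maps in [0, 1]; the names (`warp_view`, `occlusion_weighted_photometric_loss`, `du`, `dv`) are hypothetical and not taken from the authors' code.

```python
import torch
import torch.nn.functional as F


def warp_view(src, disparity, du, dv):
    """Warp a source sub-aperture view toward the reference view using the
    predicted disparity and the angular offset (du, dv) between the views."""
    b, _, h, w = src.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=src.device),
        torch.linspace(-1.0, 1.0, w, device=src.device),
        indexing="ij",
    )
    d = disparity.squeeze(1)  # (b, h, w), disparity in pixels
    # Shift sampling locations by the disparity scaled by the view offset.
    grid_x = xs + 2.0 * du * d / (w - 1)
    grid_y = ys + 2.0 * dv * d / (h - 1)
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (b, h, w, 2)
    return F.grid_sample(src, grid, align_corners=True)


def occlusion_weighted_photometric_loss(reference, warped_views, occ_maps):
    """L1 photometric loss between the reference view and each warped view,
    weighted element-wise by the predicted occlusion maps so that pixels
    violating photo-consistency contribute less to disparity learning."""
    loss = 0.0
    for warped, occ in zip(warped_views, occ_maps):
        photometric = torch.abs(reference - warped)  # per-pixel error
        loss = loss + (occ * photometric).mean()     # down-weight occluded pixels
    return loss / len(warped_views)
```

Weighting the per-pixel error by the occlusion map lets the network discount pixels where photo-consistency does not hold, which is the role the abstract assigns to OccNet during disparity learning.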
Related papers
- Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation [3.6337378417255177]
We propose a lightweight disparity estimation method based on a completion-based network.
By modeling the DP-specific disparity error parametrically and using it for sampling during training, the network acquires the unique properties of DP.
As a result, the proposed method achieved state-of-the-art results while reducing the overall system size to 1/5 of that of the conventional method.
arXiv Detail & Related papers (2024-11-06T09:03:53Z) - OccCasNet: Occlusion-aware Cascade Cost Volume for Light Field Depth
Estimation [26.572015989990845]
We propose an occlusion-aware cascade cost volume for LF depth (disparity) estimation.
Our strategy reduces the number of samples while keeping the sampling interval constant during the construction of a finer cost volume.
Our method achieves a superior balance between accuracy and efficiency and ranks first in terms of MSE and Q25 metrics.
arXiv Detail & Related papers (2023-05-28T12:31:27Z) - Single Image Depth Prediction Made Better: A Multivariate Gaussian Take [163.14849753700682]
We introduce an approach that performs continuous modeling of per-pixel depth.
Our method (named MG) ranks among the top entries on the KITTI depth-prediction benchmark leaderboard.
arXiv Detail & Related papers (2023-03-31T16:01:03Z) - CAMERAS: Enhanced Resolution And Sanity preserving Class Activation
Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z) - Differentiable Diffusion for Dense Depth Estimation from Multi-view
Images [31.941861222005603]
We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision.
We also develop an efficient optimization routine that can simultaneously optimize the 50k+ points required for complex scene reconstruction.
arXiv Detail & Related papers (2021-06-16T16:17:34Z) - An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z) - SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware
Feature Extraction [27.750031877854717]
We propose SAFENet, which is designed to leverage semantic information to overcome the limitations of the photometric loss.
Our key idea is to exploit semantic-aware depth features that integrate the semantic and geometric knowledge.
Experiments on the KITTI dataset demonstrate that our methods compete with or even outperform state-of-the-art methods.
arXiv Detail & Related papers (2020-10-06T17:22:25Z) - Adaptive confidence thresholding for monocular depth estimation [83.06265443599521]
We propose a new approach to leverage pseudo ground truth depth maps of stereo images generated from self-supervised stereo matching methods.
A confidence map for the pseudo ground truth depth is estimated to mitigate the performance degradation caused by inaccurate pseudo depth maps.
Experimental results demonstrate superior performance to state-of-the-art monocular depth estimation methods.
arXiv Detail & Related papers (2020-09-27T13:26:16Z) - A Lightweight Neural Network for Monocular View Generation with
Occlusion Handling [46.74874316127603]
We present a very lightweight neural network architecture, trained on stereo data pairs, which performs view synthesis from a single image.
The approach outperforms state-of-the-art methods both visually and quantitatively on the challenging KITTI dataset.
arXiv Detail & Related papers (2020-07-24T15:29:01Z) - Light Field Spatial Super-resolution via Deep Combinatorial Geometry
Embedding and Structural Consistency Regularization [99.96632216070718]
Light field (LF) images acquired by hand-held devices usually suffer from low spatial resolution.
The high-dimensional spatial characteristics and complex geometric structure of LF images make the problem more challenging than traditional single-image SR.
We propose a novel learning-based LF framework, in which each view of an LF image is first individually super-resolved.
arXiv Detail & Related papers (2020-04-05T14:39:57Z) - Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.