Learning Multi-modal Information for Robust Light Field Depth Estimation
- URL: http://arxiv.org/abs/2104.05971v1
- Date: Tue, 13 Apr 2021 06:51:27 GMT
- Title: Learning Multi-modal Information for Robust Light Field Depth Estimation
- Authors: Yongri Piao, Xinxin Ji, Miao Zhang, Yukun Zhang
- Abstract summary: Existing learning-based depth estimation methods from the focal stack lead to suboptimal performance because of the defocus blur.
We propose a multi-modal learning method for robust light field depth estimation.
Our method outperforms existing representative methods on two light field datasets.
- Score: 32.64928379844675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Light field data has been demonstrated to facilitate the depth estimation
task. Most learning-based methods estimate depth information from EPIs or
sub-aperture images, while fewer methods pay attention to the focal stack.
Existing learning-based depth estimation methods from the focal stack lead to
suboptimal performance because of the defocus blur. In this paper, we propose a
multi-modal learning method for robust light field depth estimation. We first
excavate the internal spatial correlation by designing a context reasoning unit
which separately extracts comprehensive contextual information from the focal
stack and RGB images. Then we integrate the contextual information by
exploiting an attention-guided cross-modal fusion module. Extensive
experiments demonstrate that our method outperforms existing representative
methods on two light field datasets. Moreover, visual results on
a mobile phone dataset show that our method can be widely used in daily life.
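The abstract names a context reasoning unit and an attention-guided cross-modal fusion module but ships no code here, so the following is a minimal, hypothetical PyTorch sketch of what attention-guided fusion between focal-stack and RGB feature maps could look like. The module name, the channel-gating design, and all parameters are illustrative assumptions, not the authors' actual architecture:

```python
import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    """Illustrative sketch (not the paper's implementation): fuse
    focal-stack and RGB feature maps with a learned per-channel gate."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Predict a per-channel gate from the concatenated modalities.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # (B, 2C, 1, 1)
            nn.Conv2d(2 * channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                      # gate in [0, 1]
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, focal_feat: torch.Tensor, rgb_feat: torch.Tensor) -> torch.Tensor:
        # focal_feat, rgb_feat: (B, C, H, W) features from the two branches.
        both = torch.cat([focal_feat, rgb_feat], dim=1)
        gate = self.attn(both)
        # The gate arbitrates, channel by channel, how much each modality
        # contributes before the final fusion convolution.
        gated = torch.cat([focal_feat * gate, rgb_feat * (1.0 - gate)], dim=1)
        return self.fuse(gated)

# Usage: fuse 64-channel features from the two encoder branches.
fusion = CrossModalAttentionFusion(channels=64)
out = fusion(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

The complementary gating (gate vs. 1 - gate) mirrors the abstract's motivation: where focal-stack features are unreliable because of defocus blur, the RGB branch can take over, and vice versa.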
Related papers
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- Towards Real-World Focus Stacking with Deep Learning [97.34754533628322]
We introduce a new dataset consisting of 94 high-resolution bursts of raw images with focus bracketing.
This dataset is used to train the first deep learning algorithm for focus stacking capable of handling bursts of sufficient length for real-world applications.
arXiv Detail & Related papers (2023-11-29T17:49:33Z)
- Towards Multimodal Depth Estimation from Light Fields [29.26003765978794]
We argue that current depth estimation methods fail because they consider only a single "true" depth per pixel, even when multiple objects at different depths contribute to the color of that pixel.
We contribute the first "multimodal light field depth dataset" that contains the depths of all objects which contribute to the color of a pixel.
arXiv Detail & Related papers (2022-03-30T18:00:00Z)
- Unsupervised Learning Based Focal Stack Camera Depth Estimation [2.0625936401496237]
We present an unsupervised deep learning based method to estimate depth from focal stack camera images.
On the NYU-v2 dataset, our method achieves much better depth estimation accuracy compared to single-image based methods.
arXiv Detail & Related papers (2022-03-14T02:52:23Z)
- Occlusion-aware Unsupervised Learning of Depth from 4-D Light Fields [50.435129905215284]
We present an unsupervised learning-based depth estimation method for 4-D light field processing and analysis.
Exploiting the unique geometric structure of light field data, we explore the angular coherence among subsets of the light field views to estimate depth maps (a generic sketch of this warping-based consistency idea appears after the list below).
Our method significantly shrinks the performance gap between previous unsupervised methods and supervised ones, and produces depth maps with accuracy comparable to traditional methods at a substantially reduced computational cost.
arXiv Detail & Related papers (2021-06-06T06:19:50Z)
- Dynamic Fusion Network For Light Field Depth Estimation [32.64928379844675]
We propose a dynamic multi-modal learning strategy that incorporates RGB data and the focal stack in our framework.
Our method achieves state-of-the-art performance on two datasets.
arXiv Detail & Related papers (2021-04-13T06:45:11Z)
- A learning-based view extrapolation method for axial super-resolution [52.748944517480155]
Axial light field resolution refers to the ability to distinguish features at different depths by refocusing.
We propose a learning-based method to extrapolate novel views from axial volumes of sheared epipolar plane images.
arXiv Detail & Related papers (2021-03-11T07:22:13Z)
- View-consistent 4D Light Field Depth Estimation [37.04038603184669]
We propose a method to compute depth maps for every sub-aperture image in a light field in a view-consistent way.
Our method precisely defines depth edges via EPIs, then diffuses these edges spatially within the central view.
arXiv Detail & Related papers (2020-09-09T01:47:34Z)
- Learning Light Field Angular Super-Resolution via a Geometry-Aware Network [101.59693839475783]
We propose an end-to-end learning-based approach aiming at angularly super-resolving a sparsely-sampled light field with a large baseline.
Our method improves the PSNR over the second-best method by up to 2 dB on average, while reducing the execution time by a factor of 48.
arXiv Detail & Related papers (2020-02-26T02:36:57Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, instead of relying on different views, we rely on depth-from-focus cues.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
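The occlusion-aware unsupervised entry above leans on angular coherence across light field views. As a point of reference, here is a minimal, generic sketch of the warping-based photometric consistency idea that unsupervised light field methods typically build on: a side sub-aperture view is warped into the central view using a candidate disparity map, and the photometric mismatch is penalized. The function names and the (du, dv) offset convention are hypothetical assumptions, and this is not that paper's actual loss:

```python
import torch
import torch.nn.functional as F

def warp_to_center(side_view, disparity, du, dv):
    """Warp a side sub-aperture view toward the central view using a
    per-pixel disparity map. (du, dv) is the side view's angular offset
    from the center. Illustrative sketch only."""
    b, _, h, w = side_view.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    grid = grid.to(side_view.device)
    # Shift sampling locations in proportion to disparity and angular offset
    # (the factor 2 / (size - 1) converts pixels to normalized coordinates).
    shift_x = 2.0 * du * disparity.squeeze(1) / max(w - 1, 1)
    shift_y = 2.0 * dv * disparity.squeeze(1) / max(h - 1, 1)
    offset = torch.stack([shift_x, shift_y], dim=-1)
    return F.grid_sample(side_view, grid + offset, align_corners=True)

def photometric_loss(center_view, side_view, disparity, du, dv):
    # Angular coherence: with a correct disparity, the warped side view
    # should match the central view photometrically.
    warped = warp_to_center(side_view, disparity, du, dv)
    return F.l1_loss(warped, center_view)

# Usage: the view at angular offset (du, dv) = (2, 0) from the center.
center = torch.rand(1, 3, 64, 64)
side = torch.rand(1, 3, 64, 64)
disp = torch.zeros(1, 1, 64, 64, requires_grad=True)  # disparity to optimize
print(photometric_loss(center, side, disp, du=2, dv=0))
```

Occlusion-aware methods add handling on top of this raw loss, since pixels hidden in a side view cannot match the central view no matter how accurate the disparity is.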