Dynamic Fusion Network For Light Field Depth Estimation
- URL: http://arxiv.org/abs/2104.05969v1
- Date: Tue, 13 Apr 2021 06:45:11 GMT
- Title: Dynamic Fusion Network For Light Field Depth Estimation
- Authors: Yongri Piao, Yukun Zhang, Miao Zhang, Xinxin Ji
- Abstract summary: We propose a dynamic multi-modal learning strategy that incorporates RGB data and the focal stack in our framework.
The success of our method is demonstrated by achieving state-of-the-art performance on two datasets.
- Score: 32.64928379844675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Focus-based methods have shown promising results for the task of depth
estimation. However, most existing focus-based depth estimation approaches
depend on the maximal sharpness of the focal stack, and out-of-focus information
in the focal stack poses challenges for this task. In this paper, we propose a
dynamic multi-modal learning strategy that incorporates RGB data and the
focal stack in our framework. Our goal is to deeply excavate the spatial
correlation in the focal stack by designing a spatial correlation perception
module, and to dynamically fuse multi-modal information between RGB data and
the focal stack in an adaptive way by designing a multi-modal dynamic fusion
module. The success of our method is demonstrated by achieving state-of-the-art
performance on two datasets. Furthermore, we test our network on a set of
differently focused images captured by a smartphone camera to show that the
proposed method not only breaks the limitation of using only light field data,
but also opens a path toward practical depth estimation on data from common
consumer-level cameras.
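The abstract names two components: a spatial correlation perception module over the focal stack and a multi-modal dynamic fusion module. The paper's code is not reproduced here; the following PyTorch sketch only illustrates the general idea under assumed names and shapes (`SpatialCorrelationPerception`, `DynamicFusion`, the 3D convolution over focal slices, and the per-pixel sigmoid gate are all illustrative choices, not the authors' implementation).

```python
import torch
import torch.nn as nn

class SpatialCorrelationPerception(nn.Module):
    """Illustrative stand-in: relate neighbouring focal slices with a 3D conv."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv3d = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, stack: torch.Tensor) -> torch.Tensor:
        # stack: (B, C, N, H, W) with N focal slices; collapse the focal axis.
        return self.conv3d(stack).mean(dim=2)  # -> (B, C, H, W)

class DynamicFusion(nn.Module):
    """Illustrative gated fusion: a per-pixel gate weights the two modalities."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, focal_feat: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([rgb_feat, focal_feat], dim=1))  # (B, 1, H, W)
        return g * rgb_feat + (1.0 - g) * focal_feat

# Toy usage: 12 focal slices of 64-channel features fused with an RGB branch.
corr = SpatialCorrelationPerception(channels=64)
fusion = DynamicFusion(channels=64)
focal_feat = corr(torch.randn(1, 64, 12, 32, 32))
fused = fusion(torch.randn(1, 64, 32, 32), focal_feat)
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```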
Related papers
- Depth Estimation Based on 3D Gaussian Splatting Siamese Defocus [14.354405484663285]
We propose a self-supervised framework based on 3D Gaussian splatting and Siamese networks for depth estimation in 3D geometry.
The proposed framework has been validated on both artificially synthesized and real blurred datasets.
arXiv Detail & Related papers (2024-09-18T21:36:37Z)
- Learning Monocular Depth from Focus with Event Focal Stack [6.200121342586474]
We propose the EDFF Network to estimate sparse depth from the Event Focal Stack.
We use the event voxel grid to encode intensity change information and project event time surface into the depth domain.
A Focal-Distance-guided Cross-Modal Attention Module is presented to fuse the information above (a generic sketch of such an attention layer follows this entry).
arXiv Detail & Related papers (2024-05-11T07:54:49Z)
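The entry only names the module, so the following is a rough, hedged illustration of a cross-modal attention layer of this general kind, conditioned on focal distance. All names, shapes, and the distance-embedding trick are assumptions, not the EDFF implementation.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Sketch: event-derived tokens attend to focal-stack tokens.

    Focal-distance guidance is emulated by adding a learned embedding of the
    scalar focal distance to each focal token; EDFF's actual design may differ.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.dist_embed = nn.Linear(1, dim)  # embed the scalar focal distance

    def forward(self, event_tokens, focal_tokens, focal_distance):
        # event_tokens: (B, S, dim); focal_tokens: (B, T, dim);
        # focal_distance: (B, T, 1), one scalar per focal token.
        focal_tokens = focal_tokens + self.dist_embed(focal_distance)
        fused, _ = self.attn(event_tokens, focal_tokens, focal_tokens)
        return fused  # (B, S, dim)

# Toy usage with made-up token counts.
layer = CrossModalAttention(dim=64)
out = layer(torch.randn(2, 100, 64), torch.randn(2, 10, 64), torch.rand(2, 10, 1))
print(out.shape)  # torch.Size([2, 100, 64])
```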
- Towards Real-World Focus Stacking with Deep Learning [97.34754533628322]
We introduce a new dataset consisting of 94 high-resolution bursts of raw images with focus bracketing.
This dataset is used to train the first deep learning algorithm for focus stacking capable of handling bursts of sufficient length for real-world applications.
arXiv Detail & Related papers (2023-11-29T17:49:33Z)
- Guided Focal Stack Refinement Network for Light Field Salient Object Detection [20.42257631830276]
Light field salient object detection (SOD) is an emerging research direction attributed to the richness of light field data.
We propose to utilize multi-modal features to refine focal stacks in a guided manner, resulting in a novel guided focal stack refinement network called GFRNet.
Experimental results on four benchmark datasets demonstrate the superiority of our GFRNet model over 12 state-of-the-art models.
arXiv Detail & Related papers (2023-05-09T08:32:06Z)
- Fully Self-Supervised Depth Estimation from Defocus Clue [79.63579768496159]
We propose a self-supervised framework that estimates depth purely from a sparse focal stack.
We show that our framework circumvents the need for depth and AIF image ground truth and achieves superior predictions.
arXiv Detail & Related papers (2023-03-19T19:59:48Z)
- Multi-task Learning for Monocular Depth and Defocus Estimations with Real Images [3.682618267671887]
Most existing methods treat depth estimation and defocus estimation as two separate tasks, ignoring the strong connection between them.
We propose a multi-task learning network consisting of one encoder with two decoders to estimate the depth and defocus map from a single focused image (a minimal sketch of this shared-encoder layout follows this entry).
Our depth and defocus estimations achieve significantly better performance than other state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-21T08:59:56Z)
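A hedged sketch of the shared-encoder, two-decoder layout this entry describes; the layer sizes, names, and toy encoder/decoder are all assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

def _decoder() -> nn.Sequential:
    # Tiny upsampling head; both tasks get an identical copy with separate weights.
    return nn.Sequential(
        nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
        nn.ReLU(inplace=True),
        nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
    )

class MultiTaskDepthDefocus(nn.Module):
    """Sketch: one shared encoder feeding a depth decoder and a defocus decoder."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.depth_head = _decoder()
        self.defocus_head = _decoder()

    def forward(self, image: torch.Tensor):
        feat = self.encoder(image)  # shared features couple the two tasks
        return self.depth_head(feat), self.defocus_head(feat)

# Toy usage on a single 64x64 RGB image.
depth, defocus = MultiTaskDepthDefocus()(torch.randn(1, 3, 64, 64))
print(depth.shape, defocus.shape)  # both torch.Size([1, 1, 64, 64])
```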
- Single image deep defocus estimation and its applications [82.93345261434943]
We train a deep neural network to classify image patches into one of 20 levels of blurriness.
The trained model is used to determine the patch blurriness which is then refined by applying an iterative weighted guided filter.
The result is a defocus map that carries the information of the degree of blurriness for each pixel (a toy version of the patch classifier is sketched after this entry).
arXiv Detail & Related papers (2021-07-30T06:18:16Z)
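A minimal, assumption-laden sketch of the patch-level blurriness classifier described above: the 20 levels come from the entry itself, while the network layers, patch size, and names are illustrative, and the iterative weighted guided-filter refinement is only noted in a comment.

```python
import torch
import torch.nn as nn

NUM_BLUR_LEVELS = 20  # the entry states patches are classified into 20 levels

class PatchBlurrinessClassifier(nn.Module):
    """Sketch: small CNN mapping a fixed-size patch to a blurriness level."""

    def __init__(self, patch_size: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
        )
        # Two 2x poolings shrink each spatial side by a factor of 4.
        self.classifier = nn.Linear(32 * (patch_size // 4) ** 2, NUM_BLUR_LEVELS)

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(patch).flatten(1))  # logits

# Toy usage: per-patch levels would be assembled into a coarse defocus map and
# then refined (the paper applies an iterative weighted guided filter).
logits = PatchBlurrinessClassifier()(torch.randn(8, 3, 32, 32))
levels = logits.argmax(dim=1)  # one discrete blurriness level per patch
print(levels.shape)  # torch.Size([8])
```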
- Learning Multi-modal Information for Robust Light Field Depth Estimation [32.64928379844675]
Existing learning-based methods for depth estimation from the focal stack lead to suboptimal performance because of defocus blur.
We propose a multi-modal learning method for robust light field depth estimation.
Our method outperforms existing representative methods on two light field datasets.
arXiv Detail & Related papers (2021-04-13T06:51:27Z)
- High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
- Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses [67.01164492518481]
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses.
We propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input.
Our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission.
arXiv Detail & Related papers (2021-02-14T06:44:47Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely on depth-from-focus cues instead of different views.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.