Disparity Estimation Using a Quad-Pixel Sensor
- URL: http://arxiv.org/abs/2409.00665v1
- Date: Sun, 1 Sep 2024 08:50:32 GMT
- Title: Disparity Estimation Using a Quad-Pixel Sensor
- Authors: Zhuofeng Wu, Doehyung Lee, Zihua Liu, Kazunori Yoshizaki, Yusuke Monno, Masatoshi Okutomi
- Abstract summary: A quad-pixel (QP) sensor is increasingly integrated into commercial mobile cameras.
We propose a QP disparity estimation network (QPDNet).
We present a synthetic pipeline to generate a training dataset from an existing RGB-Depth dataset.
- Score: 12.34044154078824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A quad-pixel (QP) sensor is increasingly integrated into commercial mobile cameras. The QP sensor has a unit of 2$\times$2 four photodiodes under a single microlens, generating multi-directional phase shifting when out-focus blurs occur. Similar to a dual-pixel (DP) sensor, the phase shifting can be regarded as stereo disparity and utilized for depth estimation. Based on this, we propose a QP disparity estimation network (QPDNet), which exploits abundant QP information by fusing vertical and horizontal stereo-matching correlations for effective disparity estimation. We also present a synthetic pipeline to generate a training dataset from an existing RGB-Depth dataset. Experimental results demonstrate that our QPDNet outperforms state-of-the-art stereo and DP methods. Our code and synthetic dataset are available at https://github.com/Zhuofeng-Wu/QPDNet.
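The sensor layout described in the abstract can be illustrated with a minimal sketch. It assumes a single-channel raw frame in which every 2x2 block holds the four photodiodes of one microlens in the order top-left, top-right, bottom-left, bottom-right, and it ignores the color filter array. The function names `split_quad_pixel` and `stereo_pairs` are hypothetical, and averaging two photodiodes to form each view is an illustrative choice, not the procedure used by QPDNet.

```python
import numpy as np


def split_quad_pixel(raw):
    """Split a (2H, 2W) quad-pixel frame into four (H, W) sub-images.

    Assumed layout (not taken from the paper): each 2x2 block is
    [[top-left, top-right], [bottom-left, bottom-right]].
    """
    if raw.ndim != 2 or raw.shape[0] % 2 or raw.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")
    tl = raw[0::2, 0::2]  # top-left photodiode of every microlens
    tr = raw[0::2, 1::2]  # top-right
    bl = raw[1::2, 0::2]  # bottom-left
    br = raw[1::2, 1::2]  # bottom-right
    return tl, tr, bl, br


def stereo_pairs(raw):
    """Form one horizontal and one vertical view pair from a QP frame."""
    tl, tr, bl, br = (v.astype(np.float32) for v in split_quad_pixel(raw))
    left, right = (tl + bl) / 2, (tr + br) / 2   # horizontal baseline
    top, bottom = (tl + tr) / 2, (bl + br) / 2   # vertical baseline
    return (left, right), (top, bottom)


if __name__ == "__main__":
    frame = np.arange(16).reshape(4, 4)
    (left, right), (top, bottom) = stereo_pairs(frame)
    print(left.shape, right.shape, top.shape, bottom.shape)
```

Under these assumptions, the left/right pair carries the horizontal phase shift and the top/bottom pair carries the vertical one, which are the two directions the abstract says QPDNet fuses.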
Related papers
- bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction [57.199618102578576]
We propose bit2bit, a new method for reconstructing high-quality image stacks at the original spatiotemporal resolution from sparse binary quanta image data.
Inspired by recent work on Poisson denoising, we developed an algorithm that creates a dense image sequence from sparse binary photon data.
We present a novel dataset containing a wide range of real SPAD high-speed videos under various challenging imaging conditions.
arXiv Detail & Related papers (2024-10-30T17:30:35Z) - Learning to Make Keypoints Sub-Pixel Accurate [80.55676599677824]
This work addresses the challenge of sub-pixel accuracy in detecting 2D local features.
We propose a novel network that enhances any detector with sub-pixel precision by learning an offset vector for detected features.
arXiv Detail & Related papers (2024-07-16T12:39:56Z) - Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging [25.851398356458425]
Single-shot 3D sensing is useful in many application areas such as microscopy, medical imaging, surgical navigation, and autonomous driving.
We propose CADS (Coded Aperture Dual-Pixel Sensing), in which we use a coded aperture in the imaging lens along with a DP sensor.
Our resulting CADS imaging system demonstrates improvement of >1.5dB PSNR in all-in-focus (AIF) estimates and 5-6% in depth estimation quality over naive DP sensing.
arXiv Detail & Related papers (2024-02-28T06:45:47Z) - Continuous Cost Aggregation for Dual-Pixel Disparity Extraction [3.1153758106426603]
We propose a continuous cost aggregation scheme for Dual-Pixel (DP) images.
The proposed algorithm fits parabolas to matching costs and aggregates parabola coefficients along image paths.
Experiments on DP data from both DSLR and phone cameras show that the proposed scheme attains state-of-the-art performance in DP disparity estimation.
arXiv Detail & Related papers (2023-06-13T17:26:50Z) - Learning Dual-Pixel Alignment for Defocus Deblurring [73.80328094662976]
We propose a Dual-Pixel Alignment Network (DPANet) for defocus deblurring.
It is notably superior to state-of-the-art deblurring methods in reducing defocus blur while recovering visually plausible sharp structures and textures.
arXiv Detail & Related papers (2022-04-26T07:02:58Z) - High-Resolution Depth Maps Based on TOF-Stereo Fusion [27.10059147107254]
We propose a novel TOF-stereo fusion method based on an efficient seed-growing algorithm.
We show that the proposed algorithm outperforms 2D image-based stereo algorithms.
The algorithm potentially exhibits real-time performance on a single CPU.
arXiv Detail & Related papers (2021-07-30T15:11:42Z) - EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, which is important for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z) - Learning to Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data [42.06108142009718]
Recent work has shown impressive results on data-driven deblurring using the two-image views available on modern dual-pixel (DP) sensors.
Despite many cameras having DP sensors, only a limited number provide access to the low-level DP sensor images.
We propose a procedure to generate realistic DP data synthetically.
arXiv Detail & Related papers (2020-12-06T13:12:43Z) - Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the DP pair which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z) - Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z) - Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays [16.531637803429277]
We present a novel method of CNN inference for pixel processor array (PPA) vision sensors.
Our approach can perform convolutional layers, max pooling, ReLU, and a final fully connected layer entirely upon the PPA sensor.
This is the first work demonstrating CNN inference conducted entirely upon the processor array of a PPA vision sensor device, requiring no external processing.
arXiv Detail & Related papers (2020-04-27T01:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.