CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo
- URL: http://arxiv.org/abs/2104.04641v1
- Date: Fri, 9 Apr 2021 23:44:52 GMT
- Title: CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo
- Authors: Shiyu Tan, Yicheng Wu, Shoou-I Yu, Ashok Veeraraghavan
- Abstract summary: Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio.
We propose a novel end-to-end learning-based technique to overcome this limitation.
In simulation, we show a 6x increase in the volume that can be imaged.
- Score: 24.193656749401075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional stereo suffers from a fundamental trade-off between imaging
volume and signal-to-noise ratio (SNR) -- due to the conflicting impact of
aperture size on both these variables. Inspired by the extended depth of field
cameras, we propose a novel end-to-end learning-based technique to overcome
this limitation, by introducing a phase mask at the aperture plane of the
cameras in a stereo imaging system. The phase mask creates a depth-dependent
point spread function, allowing us to recover sharp image texture and stereo
correspondence over a significantly extended depth of field (EDOF) compared with
conventional stereo. The phase mask pattern, the EDOF image reconstruction, and
the stereo disparity estimation are all trained together using an end-to-end
learned deep neural network. We perform theoretical analysis and
characterization of the proposed approach and show, in simulation, a 6x increase in the
volume that can be imaged. We also build an experimental prototype and
validate the approach using real-world results acquired using this prototype
system.
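
The abstract outlines a fully differentiable pipeline: a learnable phase mask at the aperture plane induces a depth-dependent PSF, the coded left and right captures are simulated, and the EDOF reconstruction and disparity networks are trained jointly with the mask. The sketch below only illustrates that structure under assumed names and a toy Fourier-optics PSF model; it is not the authors' implementation.

```python
# Illustrative sketch of an end-to-end coded-stereo pipeline (assumed names,
# toy optics model) -- not the CodedStereo authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def psf_from_phase(phase, defocus):
    """Toy Fourier-optics PSF: |FFT(pupil * exp(i*(phase + defocus*r^2)))|^2."""
    n = phase.shape[-1]
    coords = torch.linspace(-1.0, 1.0, n, device=phase.device)
    y, x = torch.meshgrid(coords, coords, indexing="ij")
    r2 = x * x + y * y
    pupil = (r2 <= 1.0).float()
    field = pupil * torch.exp(1j * (phase + defocus * r2))
    psf = torch.fft.fftshift(torch.fft.fft2(field)).abs() ** 2
    return psf / psf.sum()

class CodedStereoSketch(nn.Module):
    def __init__(self, mask_res=64, kernel=21):
        super().__init__()
        self.kernel = kernel
        # Learnable phase mask at the aperture plane, shared by both cameras.
        self.phase = nn.Parameter(torch.zeros(mask_res, mask_res))
        # Placeholder heads standing in for the EDOF and disparity networks.
        self.recon = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                   nn.Conv2d(16, 1, 3, padding=1))
        self.disp = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 1, 3, padding=1))

    def render(self, sharp, defocus):
        # Depth-dependent blur: convolve the sharp view with the PSF for this defocus.
        psf = psf_from_phase(self.phase, defocus)[None, None]
        psf = F.interpolate(psf, size=(self.kernel, self.kernel),
                            mode="bilinear", align_corners=False)
        psf = psf / psf.sum()
        return F.conv2d(sharp, psf, padding=self.kernel // 2)

    def forward(self, left_sharp, right_sharp, defocus):
        left = self.render(left_sharp, defocus)
        right = self.render(right_sharp, defocus)
        edof = self.recon(left)                               # deblurred texture
        disparity = self.disp(torch.cat([left, right], 1))    # stereo correspondence
        return edof, disparity
```

Because the rendered images depend on `self.phase`, reconstruction and disparity losses backpropagate into the mask, which is the sense in which the optic and the networks are optimized together; a real system would use a calibrated wave-optics model, per-depth PSF stacks, and much larger networks.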
Related papers
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Stereo-Depth Fusion through Virtual Pattern Projection [37.519762078762575]
This paper presents a novel general-purpose stereo and depth data fusion paradigm.
It mimics the active stereo principle by replacing the unreliable physical pattern projector with a depth sensor.
It works by projecting virtual patterns consistent with the scene geometry onto the left and right images acquired by a conventional stereo camera (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-06-06T17:59:58Z)
- Aperture Diffraction for Compact Snapshot Spectral Imaging [27.321750056840706]
We demonstrate a compact, cost-effective snapshot spectral imaging system named the Aperture Diffraction Imaging Spectrometer (ADIS).
A new optical design is introduced in which each point in the object space is multiplexed to discrete encoding locations on the mosaic filter sensor.
The Cascade Shift-Shuffle Spectral Transformer (CSST) with strong perception of the diffraction degeneration is designed to solve a sparsity-constrained inverse problem.
arXiv Detail & Related papers (2023-09-27T16:48:46Z)
- S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction [48.83280067393851]
A representative hyperspectral image acquisition procedure conducts a 3D-to-2D encoding by the coded aperture snapshot spectral imager (CASSI).
Two major challenges stand in the way of a high-fidelity reconstruction: (i) to obtain 2D measurements, CASSI dislocates multiple channels by disperser tilting and squeezes them onto the same spatial region, yielding entangled data loss.
We propose a spatial-spectral (S2-) transformer architecture with a mask-aware learning strategy to tackle these challenges.
arXiv Detail & Related papers (2022-09-24T19:26:46Z)
- End-to-end Learning for Joint Depth and Image Reconstruction from Diffracted Rotation [10.896567381206715]
We propose a novel end-to-end learning approach for depth from diffracted rotation.
Our approach requires a significantly less complex model and less training data, yet it is superior to existing methods in the task of monocular depth estimation.
arXiv Detail & Related papers (2022-04-14T16:14:37Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo [52.68109922159688]
We introduce a novel differentiable image formation model for active stereo, relying on both wave and geometric optics, and a novel trinocular reconstruction network.
The jointly optimized pattern, which we dub "Polka Lines," together with the reconstruction network, achieve state-of-the-art active-stereo depth estimates across imaging conditions.
arXiv Detail & Related papers (2020-11-26T04:02:43Z)
- Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels [16.797169907541164]
We present a novel approach based on neural networks for depth estimation that combines stereo from dual cameras with stereo from a dual-pixel sensor.
Our network uses a novel architecture to fuse these two sources of information and can overcome the limitations of pure binocular stereo matching.
arXiv Detail & Related papers (2020-03-31T15:39:43Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
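
As referenced above, the virtual pattern projection idea in "Stereo-Depth Fusion through Virtual Pattern Projection" can be summarized in a few lines: depth from an auxiliary sensor is converted to disparity, and the same random texture is blended into both views at geometrically consistent pixels, so that any stereo matcher sees strong, consistent cues. The sketch below is an assumption-laden illustration of that description (function name, blending scheme, and parameters are invented), not the paper's code.

```python
# Minimal sketch (assumed, not the paper's code) of virtual pattern projection:
# paint one random pattern into the left and right images at pixels that
# correspond according to depth from an auxiliary sensor.
import numpy as np

def project_virtual_pattern(left, right, depth, focal_px, baseline_m, alpha=0.6, seed=0):
    """Blend a random texture into both views at corresponding pixels.

    left, right: (H, W) grayscale images in [0, 1]
    depth:       (H, W) metric depth from an auxiliary sensor (0 where invalid)
    focal_px:    focal length in pixels; baseline_m: stereo baseline in meters
    """
    rng = np.random.default_rng(seed)
    H, W = depth.shape
    out_l, out_r = left.copy(), right.copy()

    ys, xs = np.nonzero(depth > 0)                  # pixels with known depth
    disp = focal_px * baseline_m / depth[ys, xs]    # disparity = f * B / Z
    xr = np.round(xs - disp).astype(int)            # corresponding right-image column

    valid = (xr >= 0) & (xr < W)
    ys, xs, xr = ys[valid], xs[valid], xr[valid]

    pattern = rng.random(len(ys))                   # one random intensity per point
    out_l[ys, xs] = (1 - alpha) * out_l[ys, xs] + alpha * pattern
    out_r[ys, xr] = (1 - alpha) * out_r[ys, xr] + alpha * pattern
    return out_l, out_r
```

The augmented image pair can then be passed unchanged to any off-the-shelf stereo matcher.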