FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras
- URL: http://arxiv.org/abs/2210.02785v1
- Date: Thu, 6 Oct 2022 09:57:09 GMT
- Title: FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras
- Authors: Andreas Meuleman, Hakyeong Kim, James Tompkin, Min H. Kim
- Abstract summary: Smartphones now have multimodal camera systems with time-of-flight (ToF) depth sensors and multiple color cameras.
Producing accurate high-resolution depth is still challenging due to the low resolution and limited active illumination power of ToF sensors.
We propose an automatic calibration technique based on dense 2D/3D matching that can estimate camera parameters from a single snapshot.
- Score: 37.812681878193914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-accuracy per-pixel depth is vital for computational photography, so
smartphones now have multimodal camera systems with time-of-flight (ToF) depth
sensors and multiple color cameras. However, producing accurate high-resolution
depth is still challenging due to the low resolution and limited active
illumination power of ToF sensors. Fusing RGB stereo and ToF information is a
promising direction to overcome these issues, but a key problem remains: to
provide high-quality 2D RGB images, the main color sensor's lens is optically
stabilized, resulting in an unknown pose for the floating lens that breaks the
geometric relationships between the multimodal image sensors. Leveraging ToF
depth estimates and a wide-angle RGB camera, we design an automatic calibration
technique based on dense 2D/3D matching that can estimate camera extrinsic,
intrinsic, and distortion parameters of a stabilized main RGB sensor from a
single snapshot. This lets us fuse stereo and ToF cues via a correlation
volume. For fusion, we apply deep learning via a real-world training dataset
with depth supervision estimated by a neural reconstruction method. For
evaluation, we acquire a test dataset using a commercial high-power depth
camera and show that our approach achieves higher accuracy than existing
baselines.
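The calibration step described in the abstract reduces to single-view 2D/3D registration: ToF depth back-projected through the wide-angle camera yields 3D points, dense matching ties them to pixels in the stabilized main camera, and a standard solver recovers that camera's parameters. Below is a minimal OpenCV sketch of this last step, assuming the 2D/3D matches are already given (the function and variable names are illustrative, not the paper's code):

```python
import numpy as np
import cv2

def calibrate_floating_lens(pts3d, pts2d, image_size, K_init):
    """Recover intrinsics, distortion, and pose of the OIS-stabilized
    main camera from a single snapshot, given dense 2D/3D matches.

    pts3d:      (N, 3) points back-projected from ToF depth.
    pts2d:      (N, 2) matched pixel locations in the main RGB image.
    image_size: (width, height) of the main RGB image.
    K_init:     (3, 3) nominal intrinsics, e.g. from EXIF focal length.
    """
    obj = [pts3d.astype(np.float32).reshape(-1, 1, 3)]
    img = [pts2d.astype(np.float32).reshape(-1, 1, 2)]
    # One view of non-planar structure is enough for calibrateCamera
    # when it is seeded with an intrinsic guess.
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj, img, image_size, K_init.astype(np.float64), None,
        flags=cv2.CALIB_USE_INTRINSIC_GUESS)
    R, _ = cv2.Rodrigues(rvecs[0])  # extrinsic rotation
    return K, dist, R, tvecs[0], rms
```

Because the floating lens moves between shots, a solve like this has to run per capture; the recovered parameters then let stereo and ToF cues be compared in a common geometry, e.g. inside a correlation volume.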
Related papers
- RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods [30.34690112905212]
Integrating an RGB camera into a ToF imaging system has become a significant technique for perceiving the real world.
This paper comprehensively reviews the works related to RGB guided ToF imaging, including network structures, learning strategies, evaluation metrics, benchmark datasets, and objective functions.
arXiv Detail & Related papers (2024-05-16T17:59:58Z)
- Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor [58.305341034419136]
We present the first dense SLAM system with a monocular camera and a light-weight ToF sensor.
We propose a multi-modal implicit scene representation that supports rendering both the signals from the RGB camera and light-weight ToF sensor.
Experiments demonstrate that our system effectively exploits the signals of light-weight ToF sensors and achieves competitive results.
arXiv Detail & Related papers (2023-08-28T07:56:13Z)
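One way to realize the multi-modal implicit representation above is a single density/color field rendered twice per ray: accumulated color is compared against the RGB frames, and expected depth against the ToF readings. A PyTorch sketch under that assumption (`rgb_sigma_fn` is a stand-in MLP, not the paper's model):

```python
import torch

def render_rgb_and_depth(rgb_sigma_fn, origins, dirs, t_vals):
    """Volume-render color and expected depth along rays so one implicit
    field can be supervised by both an RGB camera and a ToF sensor.

    origins, dirs: (R, 3) ray origins and directions.
    t_vals:        (S,) sample distances along each ray.
    """
    pts = origins[:, None, :] + t_vals[None, :, None] * dirs[:, None, :]
    rgb, sigma = rgb_sigma_fn(pts)               # (R, S, 3), (R, S)
    delta = torch.cat([t_vals[1:] - t_vals[:-1],
                       torch.full_like(t_vals[-1:], 1e10)])
    alpha = 1.0 - torch.exp(-sigma * delta)      # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], -1)
    weights = alpha * trans                      # (R, S) render weights
    color = (weights[..., None] * rgb).sum(dim=1)   # RGB photometric target
    depth = (weights * t_vals).sum(dim=1)           # ToF depth target
    return color, depth
```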
- Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
We propose a novel Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided DSR.
Our method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-06-01T06:35:59Z)
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst" (forty-two 12-megapixel RAW frames captured in a two-second sequence) there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
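The test-time optimization in "Shakes on a Plane" can be pictured as jointly fitting a depth model and per-frame poses to the burst under a photometric loss. A sketch of that loop, where `depth_net` and `warp_to_reference` are hypothetical stand-ins for the paper's neural RGB-D representation and its reprojection step:

```python
import torch

def fit_long_burst(frames, depth_net, warp_to_reference, steps=2000):
    """Jointly fit scene depth and camera motion to a hand-tremor burst.

    frames: (N, 3, H, W) burst tensor (N is ~42 for a two-second burst).
    """
    n = frames.shape[0]
    poses = torch.zeros(n, 6, requires_grad=True)  # se(3) pose per frame
    opt = torch.optim.Adam([
        {"params": depth_net.parameters(), "lr": 1e-4},
        {"params": [poses], "lr": 1e-3},
    ])
    ref = frames[0]
    for _ in range(steps):
        depth = depth_net(ref)                     # current depth estimate
        loss = 0.0
        for i in range(1, n):
            # Warp frame i into the reference view with the current depth
            # and pose, then penalize the photometric error.
            warped = warp_to_reference(frames[i], depth, poses[i])
            loss = loss + (warped - ref).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return depth_net(ref).detach(), poses.detach()
```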
- DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image [39.389538555506256]
We propose DELTAR, a novel method to empower light-weight ToF sensors with the capability of measuring high resolution and accurate depth.
As the core of DELTAR, a feature extractor customized for depth distributions and an attention-based neural architecture are proposed to fuse information from the color and ToF domains efficiently.
Experiments show that our method produces more accurate depth than existing frameworks designed for depth completion and depth super-resolution, and achieves performance on par with a commodity-level RGB-D sensor.
arXiv Detail & Related papers (2022-09-27T13:11:37Z)
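DELTAR's attention-based fusion can be approximated by cross-attention in which dense color features query a small set of ToF tokens. A PyTorch sketch (layer sizes and the token layout are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ToFColorFusion(nn.Module):
    """Cross-attention fusion of color and ToF features (a sketch)."""

    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, color_feat, tof_feat):
        # color_feat: (B, H*W, C) flattened image features (queries).
        # tof_feat:   (B, Z, C) tokens from ToF depth distributions.
        fused, _ = self.attn(color_feat, tof_feat, tof_feat)
        return self.norm(color_feat + fused)  # residual + layer norm
```

Querying from the image side lets every pixel pull depth evidence from the few low-resolution ToF measurements, which is one way to reconcile the dense image grid with the sparse ToF signal.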
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection, and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments [56.306567220448684]
We propose a new learning-based end-to-end depth prediction network that takes noisy raw I-ToF signals as well as an RGB image.
We show more than 40% RMSE improvement on the final depth map compared to the baseline approach.
arXiv Detail & Related papers (2021-12-07T15:04:14Z)
- The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement [25.637162990928676]
We show how we can combine dense micro-baseline parallax cues with kilopixel LiDAR depth estimates during viewfinding.
The proposed method brings high-resolution depth estimates to 'point-and-shoot' tabletop photography and requires no additional hardware, artificial hand motion, or user interaction beyond the press of a button.
arXiv Detail & Related papers (2021-11-26T20:24:07Z)
- High-Resolution Depth Maps Based on TOF-Stereo Fusion [27.10059147107254]
We propose a novel TOF-stereo fusion method based on an efficient seed-growing algorithm.
We show that the proposed algorithm outperforms 2D image-based stereo algorithms.
The algorithm potentially exhibits real-time performance on a single CPU.
arXiv Detail & Related papers (2021-07-30T15:11:42Z)
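In a seed-growing TOF-stereo fusion, disparities projected from the ToF camera act as trusted seeds, and stereo matches grow outward best-first, each neighbor searching only a small disparity window around its parent. A sketch of that propagation (the `ncc` matching score is an assumed stand-in, e.g. normalized cross-correlation):

```python
import heapq
import numpy as np

def seed_grow_disparity(seeds, ncc, H, W, d_range=2):
    """Best-first disparity growing from ToF-projected seeds.

    seeds: list of (y, x, d) tuples projected from ToF depth.
    ncc:   callable (y, x, d) -> matching score (higher is better).
    """
    disp = np.full((H, W), -1, dtype=np.int32)
    heap = [(-ncc(y, x, d), y, x, d) for (y, x, d) in seeds]
    heapq.heapify(heap)
    while heap:
        _, y, x, d = heapq.heappop(heap)
        if disp[y, x] != -1:
            continue                      # pixel already matched
        disp[y, x] = d
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < H and 0 <= nx < W and disp[ny, nx] == -1:
                # Search only near the parent's disparity; push the best.
                best = max(range(d - d_range, d + d_range + 1),
                           key=lambda dd: ncc(ny, nx, dd))
                heapq.heappush(heap, (-ncc(ny, nx, best), ny, nx, best))
    return disp
```

Restricting each pixel's search to a narrow window around its parent is what keeps the method cheap enough for the single-CPU real-time performance the abstract mentions.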
- Self-supervised Depth Denoising Using Lower- and Higher-quality RGB-D Sensors [8.34403807284064]
We propose a self-supervised depth denoising approach to denoise and refine depth coming from a low quality sensor.
We record simultaneous RGB-D sequences with unsynchronized lower- and higher-quality cameras and solve the challenging problem of aligning the sequences both temporally and spatially.
We then learn a deep neural network to denoise the lower-quality depth using the matched higher-quality data as a source of supervision signal.
arXiv Detail & Related papers (2020-09-10T11:18:11Z)
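Once the two streams are aligned, the cross-sensor supervision itself can be as simple as masked regression against the higher-quality depth. A minimal sketch of such a loss (an assumption about the objective, not the authors' exact formulation):

```python
import torch

def cross_sensor_depth_loss(pred, hq_depth, valid):
    """L1 loss between denoised low-quality depth and aligned
    higher-quality depth, ignoring pixels without a reliable
    high-quality measurement.

    pred, hq_depth: (B, 1, H, W) depth maps in the same frame.
    valid:          (B, 1, H, W) mask of reliable high-quality pixels.
    """
    diff = (pred - hq_depth).abs() * valid
    return diff.sum() / valid.sum().clamp(min=1)
```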
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.