Supersampling of Data from Structured-light Scanner with Deep Learning
- URL: http://arxiv.org/abs/2311.07432v2
- Date: Mon, 26 Feb 2024 09:53:20 GMT
- Title: Supersampling of Data from Structured-light Scanner with Deep Learning
- Authors: Martin Melicherčík, Lukáš Gajdošech, Viktor Kocur, Martin Madaras
- Abstract summary: Two deep learning models FDSR and DKN are modified to work with high-resolution data.
The resulting high-resolution depth maps are evaluated using qualitative and quantitative metrics.
- Score: 1.6385815610837167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper focuses on increasing the resolution of depth maps obtained from
3D cameras using structured light technology. Two deep learning models, FDSR and
DKN, are modified to work with high-resolution data, and data pre-processing
techniques are implemented for stable training. The models are trained on our
custom dataset of 1200 3D scans. The resulting high-resolution depth maps are
evaluated using qualitative and quantitative metrics. Depth map upsampling
offers two benefits: it can reduce the processing time of a pipeline by first
downsampling a high-resolution depth map, performing the various processing
steps at the lower resolution, and then upsampling the result; and it can
increase the resolution of a point cloud captured at lower resolution by a
cheaper device. The experiments demonstrate that the FDSR model excels in terms
of faster processing time, making it a suitable choice for applications where
speed is crucial. On the other hand, the DKN model provides results with higher
precision, making it more suitable for applications that prioritize accuracy.
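The downsample-process-upsample pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `process` step is a hypothetical placeholder for whatever low-resolution processing a pipeline performs, and nearest-neighbour upsampling stands in for the learned models (FDSR, DKN) the paper actually uses.

```python
import numpy as np

def downsample(depth, factor):
    # Simple strided downsampling; real pipelines may use area averaging instead.
    return depth[::factor, ::factor]

def upsample_nearest(depth, factor):
    # Nearest-neighbour upsampling as a cheap stand-in for a learned
    # super-resolution model such as FDSR or DKN.
    return np.repeat(np.repeat(depth, factor, axis=0), factor, axis=1)

def process(depth):
    # Hypothetical processing step (e.g. denoising or hole filling)
    # performed at the lower resolution, where it is cheaper.
    return np.clip(depth, 0.0, None)

# Synthetic high-resolution depth map in metres (placeholder data).
hi_res = np.random.rand(480, 640).astype(np.float32)

factor = 4
low = downsample(hi_res, factor)          # (120, 160): cheaper to process
low = process(low)                        # processing at low resolution
restored = upsample_nearest(low, factor)  # back to (480, 640)
```

The point of the scheme is that `process` runs on a map with `factor**2` fewer pixels; the quality of the final result then hinges on how well the upsampling step recovers fine detail, which is where the learned models come in.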
Related papers
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized
Photography [54.36608424943729]
We show that in a "long-burst" (forty-two 12-megapixel RAW frames captured in a two-second sequence) there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z) - Efficient High-Resolution Deep Learning: A Survey [90.76576712433595]
Cameras in modern devices such as smartphones, satellites and medical equipment are capable of capturing very high resolution images and videos.
Such high-resolution data often need to be processed by deep learning models for cancer detection, automated road navigation, weather prediction, surveillance, optimizing agricultural processes and many other applications.
Using high-resolution images and videos as direct inputs for deep learning models creates many challenges due to their high number of parameters, computation cost, inference latency and GPU memory consumption.
Several works in the literature propose better alternatives in order to deal with the challenges of high-resolution data and improve accuracy and speed while complying with hardware limitations.
arXiv Detail & Related papers (2022-07-26T17:13:53Z) - RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation [27.679479140943503]
We propose a resolution adaptive self-supervised monocular depth estimation method (RA-Depth) by learning the scale invariance of the scene depth.
RA-Depth achieves state-of-the-art performance, and also exhibits a good ability of resolution adaptation.
arXiv Detail & Related papers (2022-07-25T08:49:59Z) - A Low Memory Footprint Quantized Neural Network for Depth Completion of
Very Sparse Time-of-Flight Depth Maps [14.885472968649937]
We simulate ToF datasets for indoor 3D perception with challenging sparsity levels.
Our model achieves optimal depth map quality by means of input pre-processing and carefully tuned training.
We also achieve low memory footprint for weights and activations by means of mixed precision quantization-at-training techniques.
arXiv Detail & Related papers (2022-05-25T17:11:31Z) - SALISA: Saliency-based Input Sampling for Efficient Video Object
Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z) - Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map.
Our evaluation results outperform the existing work consistently on both indoor and outdoor datasets.
arXiv Detail & Related papers (2021-12-10T13:01:06Z) - Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images
with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input the RGB image and its corresponding sparse depth image, outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z) - Towards Unpaired Depth Enhancement and Super-Resolution in the Wild [121.96527719530305]
State-of-the-art data-driven methods of depth map super-resolution rely on registered pairs of low- and high-resolution depth maps of the same scenes.
We consider an approach to depth map enhancement based on learning from unpaired data.
arXiv Detail & Related papers (2021-05-25T16:19:16Z) - A new public Alsat-2B dataset for single-image super-resolution [1.284647943889634]
The paper introduces a novel public remote sensing dataset (Alsat2B) of low and high spatial resolution images (10m and 2.5m respectively) for the single-image super-resolution task.
The high-resolution images are obtained through pan-sharpening.
The obtained results reveal that the proposed scheme is promising and highlight the challenges in the dataset.
arXiv Detail & Related papers (2021-03-21T10:47:38Z) - Learning When and Where to Zoom with Deep Reinforcement Learning [101.79271767464947]
We propose a reinforcement learning approach to identify when and where to use/acquire high resolution data conditioned on paired, cheap, low resolution images.
We conduct experiments on CIFAR10, CIFAR100, ImageNet and fMoW datasets where we use significantly less high resolution data while maintaining similar accuracy to models which use full high resolution images.
arXiv Detail & Related papers (2020-03-01T07:16:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.