Physics-based Differentiable Depth Sensor Simulation
- URL: http://arxiv.org/abs/2103.16563v1
- Date: Tue, 30 Mar 2021 17:59:43 GMT
- Title: Physics-based Differentiable Depth Sensor Simulation
- Authors: Benjamin Planche, Rajat Vikram Singh
- Abstract summary: We introduce a novel end-to-end differentiable simulation pipeline for the generation of realistic 2.5D scans.
Each module can be differentiated w.r.t. sensor and scene parameters.
Our simulation greatly improves the performance of the resulting models on real scans.
- Score: 5.134435281973137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-based algorithms are crucial to modern computer-vision and graphics
applications, enabling learning-based optimization and inverse problems. For
example, photorealistic differentiable rendering pipelines for color images
have been proven highly valuable to applications aiming to map 2D and 3D
domains. However, to the best of our knowledge, no effort has been made so far
towards extending these gradient-based methods to the generation of depth
(2.5D) images, as simulating structured-light depth sensors implies solving
complex light transport and stereo-matching problems. In this paper, we
introduce a novel end-to-end differentiable simulation pipeline for the
generation of realistic 2.5D scans, built on physics-based 3D rendering and
custom block-matching algorithms. Each module can be differentiated w.r.t.
sensor and scene parameters; e.g., to automatically tune the simulation for new
devices over some provided scans or to leverage the pipeline as a 3D-to-2.5D
transformer within larger computer-vision applications. Applied to the training
of deep-learning methods for various depth-based recognition tasks
(classification, pose estimation, semantic segmentation), our simulation
greatly improves the performance of the resulting models on real scans, thereby
demonstrating the fidelity and value of its synthetic depth data compared to
previous static simulations and learning-based domain adaptation schemes.
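As a hint of how such a pipeline can be made differentiable end to end, the sketch below implements a generic soft block-matching step in PyTorch: replacing the hard winner-takes-all disparity selection with a soft argmin keeps gradients flowing from the disparity map back to the rendered images. This is an illustration of the general technique under stated assumptions, not the authors' exact algorithm; the function name and parameters are hypothetical.

```python
import torch
import torch.nn.functional as F

def soft_block_matching(left, right, max_disp=64, patch=7, temperature=0.01):
    """Differentiable stereo block matching via a soft argmin over disparities.

    left, right: (B, 1, H, W) rectified intensity images.
    Returns a (B, 1, H, W) disparity map that admits gradients w.r.t. both
    images (and hence w.r.t. upstream sensor/scene parameters).
    """
    B, _, H, W = left.shape
    costs = []
    for d in range(max_disp):
        # Shift the right image by d pixels and compute a patch-wise SAD cost.
        shifted = F.pad(right, (d, 0))[..., :W]
        sad = F.avg_pool2d((left - shifted).abs(), patch, stride=1,
                           padding=patch // 2)
        costs.append(sad)
    cost_volume = torch.stack(costs, dim=1)              # (B, D, 1, H, W)
    # Soft argmin: a low temperature approaches hard winner-takes-all matching
    # while keeping the operation differentiable.
    weights = F.softmax(-cost_volume / temperature, dim=1)
    disps = torch.arange(max_disp, dtype=left.dtype).view(1, -1, 1, 1, 1)
    return (weights * disps).sum(dim=1)                  # (B, 1, H, W)
```

Since disparity converts to depth via depth = f * b / disparity, gradients from any downstream depth loss also reach sensor parameters such as the focal length f and the stereo baseline b.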
Related papers
- Efficient Physics-Based Learned Reconstruction Methods for Real-Time 3D Near-Field MIMO Radar Imaging [0.0]
Near-field multiple-input multiple-output (MIMO) radar imaging systems have recently gained significant attention.
In this paper, we develop novel non-iterative deep learning-based reconstruction methods for real-time near-field imaging.
The goal is to achieve high image quality with low computational cost in compressive measurement settings.
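As a rough illustration of the non-iterative, physics-based learned reconstruction pattern this summary describes, the sketch below applies the adjoint of a known measurement operator and refines the result with a small CNN; the class and its layout are assumptions and likely differ from the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AdjointThenRefine(nn.Module):
    """Generic non-iterative learned reconstruction: a physics-based adjoint
    (back-projection) step followed by a small refinement CNN."""
    def __init__(self, forward_op: torch.Tensor):
        super().__init__()
        self.A = forward_op  # (M, N) measurement matrix of the radar system
        self.refine = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, y, hw):
        # The adjoint A^H y maps measurements back onto the image grid...
        x0 = (self.A.conj().T @ y.T).T.real.view(-1, 1, *hw)
        # ...and the CNN removes residual reconstruction artifacts.
        return self.refine(x0)
```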
arXiv Detail & Related papers (2023-12-28T11:05:36Z)
- RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of 3D foreground objects as a 2D self-occlusion map.
We show that our representation map allows us not only to enhance image quality but also to model temporally coherent, complex shadow effects.
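A toy sketch of the general ray-marching-in-camera-space idea follows; the exact RiCS formulation is not reproduced here, and the `occupancy` callable, ray parameterization, and constants are all assumptions.

```python
import torch

def self_occlusion_map(occupancy, depth, n_steps=32):
    """Toy ray-marching in camera space: for each pixel, march from the camera
    toward the surface point at `depth` and accumulate how much of the ray is
    blocked by the object's own occupancy volume.

    occupancy: callable mapping (N, 3) camera-space points to (N,) densities.
    depth:     (H, W) per-pixel depth of the foreground surface.
    Returns:   (H, W) self-occlusion values in [0, 1].
    """
    H, W = depth.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    occ = torch.zeros(H, W)
    ts = torch.linspace(0.05, 0.95, n_steps)         # samples along each ray
    for t in ts:
        pts = torch.stack([xs * t * depth, ys * t * depth, t * depth], dim=-1)
        occ += occupancy(pts.view(-1, 3)).view(H, W)  # accumulate density
    return (occ / n_steps).clamp(0, 1)
```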
arXiv Detail & Related papers (2022-05-14T05:35:35Z)
- Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map.
Our method consistently outperforms existing work on both indoor and outdoor datasets.
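A minimal sketch of the network input described above, assuming the sparse depth is encoded alongside a validity mask; the semantic mesh deformation post-optimization is omitted, and the layer choices are illustrative.

```python
import torch
import torch.nn as nn

class DepthCompletionNet(nn.Module):
    """Minimal sketch: concatenate RGB, sparse depth, and a validity mask,
    then regress a dense depth map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.ReLU(),  # depths are positive
        )

    def forward(self, rgb, sparse_depth):
        mask = (sparse_depth > 0).float()   # 1 where a depth sample exists
        return self.net(torch.cat([rgb, sparse_depth, mask], dim=1))
```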
arXiv Detail & Related papers (2021-12-10T13:01:06Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual depths.
The rendering module takes an RGB image and its corresponding sparse depth image as input and outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
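A minimal sketch of such joint optimization through an auxiliary depth estimation task, assuming hypothetical `det_head` and `depth_head` modules that share backbone features:

```python
import torch
import torch.nn.functional as F

def joint_loss(det_head, depth_head, feats, det_targets, depth_gt, w=0.5):
    """Sketch of joint optimization with an auxiliary depth task: the detection
    head and a depth head share backbone features, so depth supervision also
    shapes the detector. All module names here are hypothetical."""
    det_loss = det_head.loss(feats, det_targets)   # main detection loss
    depth_pred = depth_head(feats)                 # auxiliary depth branch
    depth_loss = F.l1_loss(depth_pred, depth_gt)
    return det_loss + w * depth_loss
```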
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- Deep Direct Volume Rendering: Learning Visual Feature Mappings From Exemplary Images [57.253447453301796]
We introduce Deep Direct Volume Rendering (DeepDVR), a generalization of Direct Volume Rendering (DVR) that allows for the integration of deep neural networks into the DVR algorithm.
We conceptualize the rendering in a latent color space, thus enabling the use of deep architectures to learn implicit mappings for feature extraction and classification.
Our generalization serves to derive novel volume rendering architectures that can be trained end-to-end directly from examples in image space.
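The core idea, replacing the hand-crafted transfer function with a learned mapping inside the volume rendering integral, might look roughly like the sketch below (illustrative only, not the DeepDVR code; dimensions and layer sizes are assumptions):

```python
import torch
import torch.nn as nn

class LearnedTransferDVR(nn.Module):
    """Sketch of DVR with a learned per-sample mapping: an MLP replaces the
    hand-crafted transfer function, emitting latent color and opacity that
    are alpha-composited front to back."""
    def __init__(self, feat_dim=8):
        super().__init__()
        self.tf = nn.Sequential(nn.Linear(1, 32), nn.ReLU(),
                                nn.Linear(32, feat_dim + 1))

    def forward(self, samples):
        # samples: (B, R, S, 1) volume intensities along R rays, S steps each.
        out = self.tf(samples)                    # (B, R, S, feat_dim + 1)
        color, alpha = out[..., :-1], torch.sigmoid(out[..., -1:])
        trans = torch.cumprod(1 - alpha + 1e-6, dim=2)   # transmittance
        trans = torch.cat([torch.ones_like(trans[:, :, :1]),
                           trans[:, :, :-1]], dim=2)     # shift front-to-back
        return (trans * alpha * color).sum(dim=2)  # (B, R, feat_dim) image
```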
arXiv Detail & Related papers (2021-06-09T23:03:00Z)
- Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction [0.0]
Learning-based approaches for 3D model reconstruction have attracted attention owing to their modern applications.
We present a novel sampling algorithm that optimizes the gradient of predicted coordinates based on the variance of the sampling image.
We also adopt the Frechet Inception Distance (FID) to form a loss function for learning, which helps bridge the gap between rendered images and input images.
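Using FID-style feature statistics as a training loss could look roughly like the sketch below, which assumes a frozen ResNet-18 feature extractor and diagonal Gaussians instead of the full Inception-based FID computation:

```python
import torch
import torchvision.models as models

# Frozen feature extractor standing in for the Inception network used by FID.
_backbone = models.resnet18(weights=None)
_backbone.fc = torch.nn.Identity()
_backbone.eval()
for p in _backbone.parameters():
    p.requires_grad_(False)

def frechet_feature_loss(rendered, real):
    """Simplified, differentiable FID-style loss: the Frechet distance between
    per-feature Gaussian statistics of rendered and real image batches,
    assuming diagonal covariances. Gradients flow to `rendered` even though
    the backbone itself is frozen."""
    f_r, f_t = _backbone(rendered), _backbone(real)   # (B, 512) features
    mu_r, mu_t = f_r.mean(0), f_t.mean(0)
    var_r, var_t = f_r.var(0), f_t.var(0)
    return ((mu_r - mu_t) ** 2).sum() + \
           ((var_r.sqrt() - var_t.sqrt()) ** 2).sum()
```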
arXiv Detail & Related papers (2021-04-29T07:52:54Z)
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild [96.09941542587865]
We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild.
In this way, we precisely align 3D models to objects in RGB images which results in significantly improved 3D pose estimates.
We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
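A generic version of pose refinement by differentiable rendering is sketched below; it descends on a pose parameterization through an assumed differentiable `render` callable and compares raw pixels, whereas the paper instead compares learned geometric correspondence fields.

```python
import torch

def refine_pose(render, image, pose_init, steps=100, lr=1e-2):
    """Generic differentiable-rendering pose refinement: gradient descent on a
    6-DoF pose so the rendered model aligns with the observed image.
    `render` is assumed to be a differentiable renderer (pose -> image)."""
    pose = pose_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.l1_loss(render(pose), image)
        loss.backward()   # gradients flow through the renderer to the pose
        opt.step()
    return pose.detach()
```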
arXiv Detail & Related papers (2020-07-17T12:34:38Z)
- Exploring the Capabilities and Limits of 3D Monocular Object Detection -- A Study on Simulation and Real World Data [0.0]
3D object detection based on monocular camera data is a key enabler for autonomous driving.
Recent deep learning methods show promising results to recover depth information from single images.
In this paper, we evaluate the performance of a 3D object detection pipeline which is parameterizable with different depth estimation configurations.
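Such a parameterizable pipeline might expose its depth estimation configuration along these lines; all field names and defaults below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class DetectionPipelineConfig:
    """Hypothetical illustration of a detection pipeline parameterized by its
    depth estimation configuration, as the study evaluates."""
    depth_estimator: str = "monodepth2"   # which single-image depth network
    depth_resolution: tuple = (640, 192)  # input resolution for depth
    use_ground_truth_depth: bool = False  # upper-bound / ablation setting
    detector: str = "frustum_pointnet"    # downstream 3D detector
```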
arXiv Detail & Related papers (2020-05-15T09:05:17Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
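A simplified sketch of the described fusion-and-conditioning pattern, with naive feature averaging standing in for the paper's geometry-aware fusion; the layer sizes and interfaces are assumptions.

```python
import torch
import torch.nn as nn

class MultiViewPoseNet(nn.Module):
    """Per-view features are fused (here by simple averaging) into one
    view-independent pose code, and a decoder conditioned on each camera's
    projection matrix produces per-view 2D detections."""
    def __init__(self, feat_dim=128, n_joints=17):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim))
        self.decoder = nn.Linear(feat_dim + 12, n_joints * 2)

    def forward(self, views, projections):
        # views: list of (B, C, H, W); projections: list of (B, 3, 4).
        latent = torch.stack([self.encoder(v) for v in views]).mean(0)
        return [self.decoder(torch.cat([latent, P.flatten(1)], dim=1))
                for P in projections]   # per-view 2D joint predictions
```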
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
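The stage-wise estimation could be organized roughly as below, with each stage consuming the two input photographs plus all previously predicted maps; the channel counts, stage ordering, and flash/no-flash input assumption are illustrative.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, c_out, 3, padding=1))

class StageWiseSVBRDF(nn.Module):
    """Sketch of stage-wise shape/SVBRDF estimation: later stages are
    conditioned on the maps predicted by earlier ones."""
    def __init__(self):
        super().__init__()
        self.normal_net = conv_block(6, 3)    # two photos -> normals
        self.diffuse_net = conv_block(9, 3)   # photos + normals -> albedo
        self.spec_net = conv_block(12, 4)     # + albedo -> roughness/specular

    def forward(self, flash_img, ambient_img):
        x = torch.cat([flash_img, ambient_img], dim=1)
        n = self.normal_net(x)
        d = self.diffuse_net(torch.cat([x, n], dim=1))
        s = self.spec_net(torch.cat([x, n, d], dim=1))
        return n, d, s
```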
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.