A novel approach for holographic 3D content generation without depth map
- URL: http://arxiv.org/abs/2309.14967v1
- Date: Tue, 26 Sep 2023 14:37:31 GMT
- Title: A novel approach for holographic 3D content generation without depth map
- Authors: Hakdong Kim, Minkyu Jee, Yurim Lee, Kyudam Choi, MinSung Yoon and
Cheongwon Kim
- Abstract summary: We propose a deep learning-based method to synthesize the volumetric digital holograms using only the given RGB image.
Through experiments, we demonstrate that the volumetric hologram generated through our proposed model is more accurate than that of competitive models.
- Score: 2.905273049932301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In preparation for observing holographic 3D content, acquiring a set of RGB
color and depth map images per scene is necessary to generate
computer-generated holograms (CGHs) when using the fast Fourier transform (FFT)
algorithm. However, in real-world situations, these paired formats of RGB color
and depth map images are not always fully available. We propose a deep
learning-based method to synthesize the volumetric digital holograms using only
the given RGB image, so that we can overcome environments where RGB color and
depth map images are partially provided. The proposed method uses only the
input of RGB image to estimate its depth map and then generate its CGH
sequentially. Through experiments, we demonstrate that the volumetric hologram
generated through our proposed model is more accurate than that of competitive
models, under the situation that only RGB color data can be provided.
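The paper's actual model is not reproduced here, but the second stage it describes (generating a CGH from an image plus an estimated depth map via the FFT) can be sketched with a standard layer-based angular spectrum method. Everything below is an illustrative assumption: function names, layer count, wavelength, and pixel pitch are hypothetical, and the depth map would come from the learned monocular estimator rather than being given.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field over distance z using the angular
    spectrum method (FFT-based free-space propagation)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)          # spatial frequencies [1/m]
    fy = np.fft.fftfreq(ny, d=dx)
    FX, FY = np.meshgrid(fx, fy)
    # Free-space transfer function; evanescent components are clamped.
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    H = np.exp(1j * 2 * np.pi * z / wavelength * np.sqrt(np.maximum(arg, 0.0)))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def layer_based_cgh(amplitude, depth, num_layers=8, wavelength=532e-9,
                    dx=8e-6, z0=0.1, dz=1e-3):
    """Slice the scene into depth layers and superpose each layer's
    FFT-propagated field at the hologram plane."""
    hologram = np.zeros_like(amplitude, dtype=complex)
    edges = np.linspace(depth.min(), depth.max() + 1e-9, num_layers + 1)
    for k in range(num_layers):
        mask = (depth >= edges[k]) & (depth < edges[k + 1])
        layer = amplitude * mask           # amplitude of this depth slice
        z = z0 + k * dz                    # propagation distance per layer
        hologram += angular_spectrum_propagate(layer.astype(complex),
                                               wavelength, dx, z)
    return hologram
```

In the proposed pipeline, `depth` would be predicted from the RGB input by the depth-estimation network before being passed to a CGH stage of this kind; the sketch only illustrates why paired depth maps are normally required by FFT-based CGH synthesis.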
Related papers
- Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB [48.31210455404533]
A heatmap-based 3D pose estimator is able to hallucinate depth information from the RGB frames given at inference time.
Depth information is used exclusively during training by enforcing the RGB-based hallucination network to learn features similar to a backbone pre-trained only on depth data.
arXiv Detail & Related papers (2024-09-17T11:59:34Z)
- $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
arXiv Detail & Related papers (2023-02-21T13:37:07Z)
- Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision [51.385731364529306]
This paper focuses on perceiving and navigating 3D environments using echoes and RGB images.
In particular, we perform depth estimation by fusing the RGB image with echoes received from multiple orientations.
We show that the echoes provide holistic and inexpensive information about the 3D structures, complementing the RGB image.
arXiv Detail & Related papers (2022-07-03T22:31:47Z)
- Depth-SIMS: Semi-Parametric Image and Depth Synthesis [23.700034054124604]
We present a method that generates RGB canvases with well aligned segmentation maps and sparse depth maps, coupled with an in-painting network that transforms the RGB canvases into high quality RGB images.
We benchmark our method in terms of structural alignment and image quality, showing an increase in mIoU over SOTA by 3.7 percentage points and a highly competitive FID.
We analyse the quality of the generated data as training data for semantic segmentation and depth completion, and show that our approach is more suited for this purpose than other methods.
arXiv Detail & Related papers (2022-03-07T13:58:32Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- RGB-D Image Inpainting Using Generative Adversarial Network with a Late Fusion Approach [14.06830052027649]
Diminished reality is a technology that aims to remove objects from video images and fill in the missing regions with plausible pixels.
We propose an RGB-D image inpainting method using a generative adversarial network.
arXiv Detail & Related papers (2021-10-14T14:44:01Z)
- Colored Point Cloud to Image Alignment [15.828285556159026]
We introduce a differential optimization method that aligns a colored point cloud to a given color image via iterative geometric and color matching.
We find the transformation between the camera image and the point cloud colors by iterating between matching the relative location of the point cloud and matching colors.
arXiv Detail & Related papers (2021-10-07T08:12:56Z)
- Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild [48.44194221801609]
We propose a new lightweight and end-to-end learning-based framework to tackle this challenge.
We progressively spread the differences between input RGB images and re-projected RGB images from recovered HS images via effective camera spectral response function estimation.
Our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings.
arXiv Detail & Related papers (2021-08-15T05:19:44Z)
- Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images has been an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z)
- NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image [34.79657678041356]
We propose a fast adversarial learning-based method to reconstruct the complete and detailed 3D human from a single RGB-D image.
Given a consumer RGB-D sensor, NormalGAN can generate complete and detailed 3D human reconstruction results at 20 fps.
arXiv Detail & Related papers (2020-07-30T09:35:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.