Learning to compose 6-DoF omnidirectional videos using multi-sphere images
- URL: http://arxiv.org/abs/2103.05842v1
- Date: Wed, 10 Mar 2021 03:09:55 GMT
- Title: Learning to compose 6-DoF omnidirectional videos using multi-sphere images
- Authors: Jisheng Li, Yuze He, Yubin Hu, Yuxing Han, Jiangtao Wen
- Abstract summary: We propose a system that uses a 3D ConvNet to generate a multi-sphere image (MSI) representation that can be experienced in 6-DoF VR.
The system utilizes conventional omnidirectional VR camera footage directly without the need for a depth map or segmentation mask.
A ground-truth generation approach for high-quality, artifact-free 6-DoF content is proposed and can be used by the research and development community.
- Score: 16.423725132964776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Omnidirectional video is an essential component of Virtual Reality. Although
various methods have been proposed to generate content that can be viewed with
six degrees of freedom (6-DoF), existing systems usually involve complex depth
estimation, image in-painting, or stitching pre-processing. In this paper, we
propose a system that uses a 3D ConvNet to generate a multi-sphere image (MSI)
representation that can be experienced in 6-DoF VR. The system utilizes
conventional omnidirectional VR camera footage directly without the need for a
depth map or segmentation mask, thereby significantly simplifying the overall
complexity of the 6-DoF omnidirectional video composition. By using a newly
designed weighted sphere sweep volume (WSSV) fusing technique, our approach is
compatible with most panoramic VR camera setups. A ground-truth generation
approach for high-quality, artifact-free 6-DoF content is proposed and can be
used by the research and development community for 6-DoF content generation.
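As a rough illustration of the sphere-sweep idea behind the WSSV named above, the sketch below warps several equirectangular captures onto concentric hypothesis spheres around a reference viewpoint and blends them with simple distance-based weights; the resulting volume is the kind of input a 3D ConvNet could turn into per-sphere color and alpha. This is a minimal sketch under assumed conventions (equirectangular layout, nearest-neighbour sampling, inverse-distance weights, and helper names such as `weighted_sphere_sweep`), not the authors' implementation.

```python
import numpy as np

def equirect_grid(h, w):
    # Unit viewing direction for every pixel of an equirectangular image.
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    return np.stack([np.cos(lat) * np.sin(lon),    # x
                     np.sin(lat),                  # y
                     np.cos(lat) * np.cos(lon)],   # z
                    axis=-1)

def sample_equirect(img, dirs):
    # Nearest-neighbour lookup of unit directions in an equirectangular image.
    h, w, _ = img.shape
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))
    u = ((lon + np.pi) / (2 * np.pi) * w).astype(int) % w
    v = ((np.pi / 2 - lat) / np.pi * h).astype(int).clip(0, h - 1)
    return img[v, u]

def weighted_sphere_sweep(images, cam_positions, radii, out_hw=(256, 512)):
    """Warp every camera onto each hypothesis sphere and blend with weights.

    images:        list of (h, w, 3) equirectangular captures
    cam_positions: (n, 3) camera centres relative to the reference viewpoint
    radii:         (d,) hypothesis sphere radii
    returns:       (d, H, W, 3) volume a 3D ConvNet could consume
    """
    H, W = out_hw
    dirs = equirect_grid(H, W)
    volume = np.zeros((len(radii), H, W, 3))
    for di, r in enumerate(radii):
        points = dirs * r                       # world points on sphere di
        acc, wsum = np.zeros((H, W, 3)), 0.0
        for img, c in zip(images, cam_positions):
            view = points - c                   # camera-to-point directions
            view /= np.linalg.norm(view, axis=-1, keepdims=True)
            weight = 1.0 / (1.0 + np.linalg.norm(c))   # assumed closeness weight
            acc += weight * sample_equirect(img, view)
            wsum += weight
        volume[di] = acc / wsum
    return volume
```

A network that maps such a volume to per-sphere RGBA layers yields an MSI; a compositing sketch for viewing an MSI appears after the MatryODShka entry below.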
Related papers
- GFlow: Recovering 4D World from Monocular Video [58.63051670458107]
We introduce GFlow, a framework that lifts a video (3D) to a 4D explicit representation, entailing a flow of Gaussian splatting through space and time.
GFlow first clusters the scene into still and moving parts, then applies a sequential optimization process.
GFlow transcends the boundaries of mere 4D reconstruction.
arXiv Detail & Related papers (2024-05-28T17:59:22Z)
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577]
We introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis.
We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
arXiv Detail & Related papers (2024-03-16T07:26:50Z)
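The MSI-NeRF entry above constructs a multi-sphere image as a cost volume from extracted and warped input features. As a generic, non-authoritative sketch of that kind of multi-view fusion (the array shapes and the mean/variance aggregation are assumptions, not the paper's architecture), one common recipe is to resample each view's features onto every hypothesis sphere, for example with a sweep like the one sketched earlier, and then aggregate across views:

```python
import numpy as np

def spherical_cost_volume(warped_feats):
    """Fuse per-view features that were warped onto hypothesis spheres.

    warped_feats: (n_views, n_spheres, H, W, C) array -- each source view's
                  features resampled onto every hypothesis sphere
    returns:      (n_spheres, H, W, 2*C) mean/variance volume; low variance
                  means the views agree at that sphere radius, which a
                  decoder can exploit to regress per-sphere colour and alpha
    """
    mean = warped_feats.mean(axis=0)
    var = ((warped_feats - mean) ** 2).mean(axis=0)
    return np.concatenate([mean, var], axis=-1)

# Illustrative usage with random placeholders for warped features.
feats = np.random.rand(4, 32, 64, 128, 16)   # 4 views, 32 spheres, 64x128, 16 channels
volume = spherical_cost_volume(feats)        # -> (32, 64, 128, 32)
```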
- Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience [28.651514326042648]
We have built a custom mobile multi-camera large-space dense light field capture system.
Our aim is to contribute to the development of popular 3D scene reconstruction algorithms.
The collected dataset is much denser than existing datasets.
arXiv Detail & Related papers (2024-03-15T02:39:44Z)
- MuRF: Multi-Baseline Radiance Fields [117.55811938988256]
We present Multi-Baseline Radiance Fields (MuRF), a feed-forward approach to solving sparse view synthesis.
MuRF achieves state-of-the-art performance across multiple different baseline settings.
We also show promising zero-shot generalization abilities on the Mip-NeRF 360 dataset.
arXiv Detail & Related papers (2023-12-07T18:59:56Z)
- PERF: Panoramic Neural Radiance Field from a Single Panorama [109.31072618058043]
PERF is a novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.
We propose a novel collaborative RGBD inpainting method and a progressive inpainting-and-erasing method to lift a 360-degree 2D scene to a 3D scene.
PERF can be widely used for real-world applications such as panorama-to-3D, text-to-3D, and 3D scene stylization.
arXiv Detail & Related papers (2023-10-25T17:59:01Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Learning to Deblur and Rotate Motion-Blurred Faces [43.673660541417995]
We train a neural network to reconstruct a 3D video representation from a single image and the corresponding face gaze.
We then provide a camera viewpoint relative to the estimated gaze and the blurry image as input to an encoder-decoder network to generate a video of sharp frames with a novel camera viewpoint.
arXiv Detail & Related papers (2021-12-14T17:51:19Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Real-time dense 3D Reconstruction from monocular video data captured by low-cost UAVs [0.3867363075280543]
Real-time 3D reconstruction enables fast dense mapping of the environment which benefits numerous applications, such as navigation or live evaluation of an emergency.
In contrast to most real-time capable approaches, our approach does not need an explicit depth sensor.
By exploiting the self-motion of the unmanned aerial vehicle (UAV) flying with oblique view around buildings, we estimate both camera trajectory and depth for selected images with enough novel content.
arXiv Detail & Related papers (2021-04-21T13:12:17Z)
- MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images [26.899767088485184]
We introduce a method to convert stereo 360° (omnidirectional stereo) imagery into a layered multi-sphere image representation for 6DoF rendering.
This representation significantly improves viewer comfort, and can be inferred and rendered in real time on modern GPU hardware.
arXiv Detail & Related papers (2020-08-14T18:33:05Z)
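MatryODShka, like the main paper above, renders novel 6-DoF views directly from MSI layers. Below is a minimal sketch of one straightforward way such a representation can be rendered: back-to-front "over" compositing of concentric RGBA spheres seen from a translated eye position. It is not the papers' real-time GPU renderer; the equirectangular layer layout and the function name are assumptions for illustration.

```python
import numpy as np

def render_from_msi(msi_rgba, radii, eye, out_hw=(256, 512)):
    """Back-to-front alpha compositing of an MSI from a translated viewpoint.

    msi_rgba: (d, h, w, 4) equirectangular RGBA layers centred at the origin
    radii:    (d,) layer radii sorted near -> far
    eye:      (3,) eye offset, assumed to stay inside the innermost sphere
    """
    eye = np.asarray(eye, dtype=float)
    H, W = out_hw
    d, h, w, _ = msi_rgba.shape
    # Viewing direction of every pixel of the output equirectangular frame.
    lon = (np.arange(W) + 0.5) / W * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(H) + 0.5) / H * np.pi
    lon, lat = np.meshgrid(lon, lat)
    rays = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)

    out = np.zeros((H, W, 3))
    for li in reversed(range(d)):                      # farthest layer first
        r = radii[li]
        # Positive root of |eye + t * ray| = r (ray-sphere intersection).
        b = np.sum(rays * eye, axis=-1)
        t = -b + np.sqrt(np.maximum(b ** 2 - eye.dot(eye) + r ** 2, 0.0))
        p = eye + t[..., None] * rays
        p /= np.linalg.norm(p, axis=-1, keepdims=True)
        # Look up the hit point in layer li's equirectangular texture.
        u = ((np.arctan2(p[..., 0], p[..., 2]) + np.pi) / (2 * np.pi) * w).astype(int) % w
        v = ((np.pi / 2 - np.arcsin(np.clip(p[..., 1], -1, 1))) / np.pi * h).astype(int).clip(0, h - 1)
        rgba = msi_rgba[li, v, u]
        alpha = rgba[..., 3:4]
        out = rgba[..., :3] * alpha + out * (1.0 - alpha)   # "over" compositing
    return out
```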
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.