VirtualCube: An Immersive 3D Video Communication System
- URL: http://arxiv.org/abs/2112.06730v1
- Date: Mon, 13 Dec 2021 15:34:08 GMT
- Title: VirtualCube: An Immersive 3D Video Communication System
- Authors: Yizhong Zhang, Jiaolong Yang, Zhen Liu, Ruicheng Wang, Guojun Chen,
Xin Tong, and Baining Guo
- Abstract summary: The VirtualCube system is a 3D video conference system that attempts to overcome some limitations of conventional technologies.
The key ingredient is VirtualCube, an abstract representation of a real-world cubicle instrumented with RGBD cameras for capturing the 3D geometry and texture of a user.
- Score: 22.603545138780287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The VirtualCube system is a 3D video conference system that attempts to
overcome some limitations of conventional technologies. The key ingredient is
VirtualCube, an abstract representation of a real-world cubicle instrumented
with RGBD cameras for capturing the 3D geometry and texture of a user. We
design VirtualCube so that the task of data capturing is standardized and
significantly simplified, and everything can be built using off-the-shelf
hardware. We use VirtualCubes as the basic building blocks of a virtual
conferencing environment, and we provide each VirtualCube user with a
surrounding display showing life-size videos of remote participants. To achieve
real-time rendering of remote participants, we develop the V-Cube View
algorithm, which uses multi-view stereo for more accurate depth estimation and
Lumi-Net rendering for better rendering quality. The VirtualCube system
correctly preserves the mutual eye gaze between participants, allowing them to
establish eye contact and be aware of who is visually paying attention to them.
The system also allows a participant to have side discussions with remote
participants as if they were in the same room. Finally, the system sheds light
on how to support the shared space of work items (e.g., documents and
applications) and track the visual attention of participants to work items.
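The abstract names the two ingredients of V-Cube View (multi-view stereo for depth, Lumi-Net for rendering quality) without implementation detail. As a rough illustration of the depth-based novel-view-synthesis recipe such a system builds on, here is a minimal sketch assuming per-camera RGBD frames with known intrinsics and extrinsics; all function names are invented for illustration, and a simple equal-weight blend stands in for the learned Lumi-Net weighting.

```python
# Illustrative sketch of depth-based novel-view synthesis: warp each RGBD
# source view into a target (virtual) camera and blend. Names and the
# equal-weight blend are placeholders, not the paper's actual method.
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) to camera-space 3D points (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) / K[0, 0] * depth
    y = (v - K[1, 2]) / K[1, 1] * depth
    return np.stack([x, y, depth], axis=-1)

def warp_to_target(color, depth, K_src, T_src_to_tgt, K_tgt, out_hw):
    """Forward-warp one source RGBD view into the target camera."""
    pts = backproject(depth, K_src).reshape(-1, 3)
    pts = pts @ T_src_to_tgt[:3, :3].T + T_src_to_tgt[:3, 3]
    z = pts[:, 2]
    valid = z > 1e-6                      # keep points in front of the camera
    uv = pts[valid] @ K_tgt.T
    u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
    v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
    h, w = out_hw
    keep = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    src = color.reshape(-1, 3)[valid][keep]
    u, v, z = u[keep], v[keep], z[valid][keep]
    order = np.argsort(-z)                # paint far-to-near: nearest point wins
    out = np.zeros((h, w, 3))
    hit = np.zeros((h, w), dtype=bool)
    out[v[order], u[order]] = src[order]
    hit[v[order], u[order]] = True
    return out, hit

def render_virtual_view(views, K_tgt, out_hw):
    """Equal-weight blend of warped views (stand-in for learned blend weights)."""
    acc = np.zeros((*out_hw, 3))
    wsum = np.zeros(out_hw)
    for color, depth, K_src, T in views:  # one tuple per RGBD camera
        img, mask = warp_to_target(color, depth, K_src, T, K_tgt, out_hw)
        acc[mask] += img[mask]
        wsum[mask] += 1.0
    return acc / np.maximum(wsum, 1.0)[..., None]
```

In the actual system, per-pixel depth would come from multi-view stereo refinement rather than raw sensor depth, and a network would predict the blend weights; the sketch only shows the geometric warping skeleton those components plug into.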
Related papers
- iControl3D: An Interactive System for Controllable 3D Scene Generation [57.048647153684485]
iControl3D is a novel interactive system that empowers users to generate and render customizable 3D scenes with precise control.
We leverage 3D meshes as an intermediary proxy to iteratively merge individual 2D diffusion-generated images into a cohesive and unified 3D scene representation.
Our neural rendering interface enables users to build a radiance field of their scene online and navigate the entire scene.
arXiv Detail & Related papers (2024-08-03T06:35:09Z)
- EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices [53.28220984270622]
We present an implicit textured surface reconstruction method on mobile devices.
Our method can reconstruct high-quality appearance and accurate mesh on both synthetic and real-world datasets.
Our method can be trained in just 1-2 hours using a single GPU and run on mobile devices at over 40 FPS (frames per second).
arXiv Detail & Related papers (2023-11-16T11:30:56Z)
- FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z)
- Virtual Reality in Metaverse over Wireless Networks with User-centered Deep Reinforcement Learning [8.513938423514636]
We introduce a multi-user VR computation-offloading scenario over wireless networks.
In addition, we devise a novel user-centered deep reinforcement learning approach to find a near-optimal solution.
arXiv Detail & Related papers (2023-03-08T03:10:41Z)
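The entry above does not spell out the decision problem, so here is a deliberately tiny illustration of the kind of choice such an agent learns: tabular Q-learning over a discretized channel state, picking between local rendering and offloading. The MDP (states, latencies, rewards) is invented for illustration; the paper's user-centered DRL formulation is multi-user and far richer.

```python
# Toy tabular Q-learning for a VR offloading decision: in each step the agent
# observes a discretized channel quality and picks LOCAL compute or OFFLOAD.
# Everything below is an invented stand-in for the paper's actual setup.
import random

N_CHANNEL_STATES = 5          # 0 = bad link ... 4 = excellent link
ACTIONS = (LOCAL, OFFLOAD) = (0, 1)
LOCAL_LATENCY = 50.0          # ms, constant cost of on-device rendering

def offload_latency(channel):
    """Offloading is cheap on a good link, expensive on a bad one."""
    return 10.0 + 80.0 / (1 + channel)

q = [[0.0, 0.0] for _ in range(N_CHANNEL_STATES)]
alpha, gamma, eps = 0.1, 0.9, 0.1
state = random.randrange(N_CHANNEL_STATES)
for step in range(20000):
    if random.random() < eps:                      # epsilon-greedy exploration
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: q[state][x])
    latency = LOCAL_LATENCY if a == LOCAL else offload_latency(state)
    reward = -latency                              # minimize latency
    nxt = max(0, min(N_CHANNEL_STATES - 1, state + random.choice((-1, 0, 1))))
    q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
    state = nxt

policy = ["OFFLOAD" if q[s][OFFLOAD] > q[s][LOCAL] else "LOCAL"
          for s in range(N_CHANNEL_STATES)]
print(policy)   # expect LOCAL on bad channels, OFFLOAD on good ones
```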
- Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects [40.59508249969956]
We present a novel solution for mimicking the human ability to perceive complete objects from partial views (amodal perception), based on a new paradigm of amodal 3D scene understanding with neural rendering for a closed scene.
We first learn the prior knowledge of the objects in a closed scene via an offline stage, which facilitates an online stage to understand the room with unseen furniture arrangement.
During the online stage, given a panoramic image of the scene in different layouts, we utilize a holistic neural-rendering-based optimization framework to efficiently estimate the correct 3D scene layout and deliver realistic free-viewpoint rendering.
arXiv Detail & Related papers (2022-05-05T15:34:09Z)
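The render-and-compare idea behind the entry above can be pictured with a toy: optimize an object's 2D position by gradient descent on photometric error through a differentiable "renderer". This is purely illustrative; the paper optimizes full 3D scene layouts through neural rendering, not a Gaussian blob.

```python
# Toy render-and-compare: recover a blob's position by minimizing photometric
# error through a differentiable renderer. Invented for illustration only.
import torch

H = W = 64
ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                        torch.arange(W, dtype=torch.float32), indexing="ij")

def render(pos, sigma=8.0):
    """Differentiable 'renderer': an isotropic Gaussian blob at pos = (x, y)."""
    return torch.exp(-((xs - pos[0]) ** 2 + (ys - pos[1]) ** 2) / (2 * sigma ** 2))

target = render(torch.tensor([40.0, 24.0]))           # ground-truth "layout"
pos = torch.tensor([30.0, 34.0], requires_grad=True)  # initial guess
opt = torch.optim.Adam([pos], lr=1.0)
for _ in range(300):
    opt.zero_grad()
    loss = ((render(pos) - target) ** 2).mean()       # photometric error
    loss.backward()
    opt.step()
print(pos.detach())   # should move close to (40, 24)
```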
- Attention based Occlusion Removal for Hybrid Telepresence Systems [5.006086647446482]
We propose a novel attention-enabled encoder-decoder architecture for HMD de-occlusion.
We report superior qualitative and quantitative results over state-of-the-art methods.
We also present applications of this approach to hybrid video teleconferencing using existing animation and 3D face reconstruction pipelines.
arXiv Detail & Related papers (2021-12-02T10:18:22Z)
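The entry above gives no architectural specifics, so as a loose illustration of an attention-enabled encoder-decoder, the following minimal PyTorch sketch gates a skip connection with a learned attention mask. All layer sizes and the gate design are invented, not the paper's architecture.

```python
# Minimal attention-gated encoder-decoder for image-to-image translation
# (e.g., inpainting an HMD-occluded face region). Illustrative only.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Scale skip features by a mask computed from decoder context."""
    def __init__(self, ch):
        super().__init__()
        self.mask = nn.Sequential(nn.Conv2d(2 * ch, ch, 1), nn.ReLU(),
                                  nn.Conv2d(ch, 1, 1), nn.Sigmoid())
    def forward(self, skip, gate):
        return skip * self.mask(torch.cat([skip, gate], dim=1))

class DeOccNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1),
                                  nn.ReLU())
        self.up = nn.ConvTranspose2d(ch, ch, 2, stride=2)
        self.att = AttentionGate(ch)
        self.dec = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(ch, 3, 3, padding=1))
    def forward(self, x):
        s1 = self.enc1(x)              # full-resolution skip features
        bottleneck = self.enc2(s1)     # downsampled context
        g = self.up(bottleneck)        # back to full resolution
        s1 = self.att(s1, g)           # attend: keep useful skip features
        return self.dec(torch.cat([s1, g], dim=1))

net = DeOccNet()
out = net(torch.randn(1, 3, 64, 64))
print(out.shape)   # torch.Size([1, 3, 64, 64])
```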
- Pixel Codec Avatars [99.36561532588831]
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, five avatars are rendered in real time in the same scene.
arXiv Detail & Related papers (2021-04-09T23:17:36Z)
- Unmasking Communication Partners: A Low-Cost AI Solution for Digitally Removing Head-Mounted Displays in VR-Based Telepresence [62.997667081978825]
Face-to-face conversation in Virtual Reality (VR) is a challenge when participants wear head-mounted displays (HMDs).
Past research has shown that high-fidelity face reconstruction with personal avatars in VR is possible under laboratory conditions with high-cost hardware.
We propose one of the first low-cost systems for this task which uses only open source, free software and affordable hardware.
arXiv Detail & Related papers (2020-11-06T23:17:12Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning multi-object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
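The entry above describes a recurrent encoder that emits shape, pose, and texture latents per object. A minimal sketch of that general pattern, with invented dimensions (not PriSMONet's actual design):

```python
# Sketch of a recurrent per-object encoder: a CNN summarizes the image and a
# GRU emits one latent code per decoding step, split into shape/pose/texture.
# Dimensions and the split are illustrative placeholders.
import torch
import torch.nn as nn

class RecurrentObjectEncoder(nn.Module):
    def __init__(self, n_objects=3, hidden=128, shape_d=32, pose_d=6, tex_d=16):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())          # (B, 64)
        self.gru = nn.GRUCell(64, hidden)
        self.head = nn.Linear(hidden, shape_d + pose_d + tex_d)
        self.n_objects = n_objects
        self.dims = (shape_d, pose_d, tex_d)

    def forward(self, img):
        feat = self.cnn(img)
        h = torch.zeros(img.size(0), self.gru.hidden_size, device=img.device)
        objects = []
        for _ in range(self.n_objects):     # one GRU step per object slot
            h = self.gru(feat, h)
            shape, pose, tex = self.head(h).split(self.dims, dim=-1)
            objects.append({"shape": shape, "pose": pose, "texture": tex})
        return objects

enc = RecurrentObjectEncoder()
objs = enc(torch.randn(2, 3, 64, 64))
print(len(objs), objs[0]["pose"].shape)     # 3 objects, pose latent (2, 6)
```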
- SAILenv: Learning in Virtual Visual Environments Made Simple [16.979621213790015]
We present a novel platform that allows researchers to experiment with visual recognition in virtual 3D scenes.
A few lines of code are needed to interface every algorithm with the virtual world, and non-3D-graphics experts can easily customize the 3D environment itself.
Our framework yields pixel-level semantic and instance labeling and depth; to the best of our knowledge, it is the only one that provides motion-related information directly inherited from the 3D engine.
arXiv Detail & Related papers (2020-07-16T09:50:23Z)
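The SAILenv entry's claim that a few lines of code suffice to hook an algorithm to the virtual world can be pictured with a generic mock of such an interface. To be clear, the class and method names below are hypothetical placeholders, not SAILenv's real API; the sketch only shows the shape of the glue code.

```python
# Generic mock of a virtual-environment client loop of the kind the SAILenv
# entry describes. All names are hypothetical, NOT SAILenv's actual API.
import numpy as np

class MockSceneClient:
    """Stands in for a connection to a 3D engine serving rendered frames."""
    def __init__(self, scene="object_view", size=(64, 64)):
        self.scene, self.size = scene, size
    def get_frame(self):
        h, w = self.size
        return {
            "rgb": np.random.rand(h, w, 3),                # color image
            "depth": np.random.rand(h, w),                 # per-pixel depth
            "semantic": np.random.randint(0, 5, (h, w)),   # class labels
            "flow": np.random.randn(h, w, 2),              # engine motion info
        }

def my_recognizer(rgb):
    """Placeholder for any visual recognition algorithm under test."""
    return rgb.mean()   # trivially "predict" mean brightness

client = MockSceneClient(scene="object_view")
for _ in range(3):                         # a few lines of glue, as promised
    frame = client.get_frame()
    print(my_recognizer(frame["rgb"]), frame["semantic"].shape)
```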
This list is automatically generated from the titles and abstracts of the papers on this site.