A Monocular SLAM-based Multi-User Positioning System with Image Occlusion in Augmented Reality
- URL: http://arxiv.org/abs/2411.10940v1
- Date: Sun, 17 Nov 2024 02:39:30 GMT
- Title: A Monocular SLAM-based Multi-User Positioning System with Image Occlusion in Augmented Reality
- Authors: Wei-Hsiang Lien, Benedictus Kent Chandra, Robin Fischer, Ya-Hui Tang, Shiann-Jang Wang, Wei-En Hsu, Li-Chen Fu
- Abstract summary: We propose a multi-user localization system based on ORB-SLAM2 that uses monocular RGB images, built on the Unity 3D game engine as the development platform.
This system not only performs user localization but also places a common virtual object on a planar surface so that every user sees the object from a proper perspective.
The positioning information is passed among every user's AR device via a central server, based on which each device presents the relative position and movement of the other users in its user's space.
- Abstract: In recent years, with the rapid development of augmented reality (AR) technology, there is an increasing demand for multi-user collaborative experiences. Unlike single-user experiences, ensuring the spatial localization of every user and maintaining synchronization and consistency of positioning and orientation across multiple users is a significant challenge. In this paper, we propose a multi-user localization system based on ORB-SLAM2 that uses monocular RGB images, built on the Unity 3D game engine as the development platform. This system not only performs user localization but also places a common virtual object on a planar surface (such as a table) in the environment so that every user sees the object from a proper perspective. These generated virtual objects serve as reference points for multi-user position synchronization. The positioning information is passed among every user's AR device via a central server, based on which the relative position and movement of the other users in a given user's space are presented via virtual avatars, all with respect to these virtual objects. In addition, we use deep learning to estimate a depth map from a single RGB image, solving occlusion problems in AR applications and making virtual objects appear more natural in AR scenes.
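Two pieces of the system lend themselves to small illustrations. First, a minimal sketch of the anchor-relative pose bookkeeping the abstract implies (hypothetical function names and 4x4 pose representation; the paper does not publish its protocol): each client expresses its camera pose in the frame of the shared virtual object, sends that through the central server, and maps the poses it receives back into its own SLAM world frame.

```python
import numpy as np

def pose_in_anchor_frame(T_world_user, T_world_anchor):
    """Express a user's 4x4 camera pose in the shared anchor's frame.

    Each client has its own SLAM world frame, but every client observes the
    same virtual anchor object, so anchor-relative poses are comparable and
    can be exchanged through the server.
    """
    return np.linalg.inv(T_world_anchor) @ T_world_user

def pose_in_local_world(T_anchor_other, T_world_anchor):
    """Map another user's received anchor-relative pose into this client's
    own world frame so that user's avatar can be rendered locally."""
    return T_world_anchor @ T_anchor_other
```

Second, a rough illustration of the occlusion step (assumed array shapes, not the paper's pipeline; note also that a monocular depth network usually predicts relative depth, which would first need scale alignment with the SLAM map): a per-pixel depth test decides where the virtual layer may be drawn.

```python
def composite_with_occlusion(camera_rgb, scene_depth, virtual_rgb, virtual_mask, virtual_depth):
    """Draw a virtual fragment only where it lies in front of the estimated
    real surface; elsewhere the real scene occludes it.

    camera_rgb:    (H, W, 3) camera frame
    scene_depth:   (H, W) depth map predicted from the single RGB frame
    virtual_rgb:   (H, W, 3) rendered virtual-object layer
    virtual_mask:  (H, W) bool, True where a virtual fragment was rendered
    virtual_depth: (H, W) depth of the rendered virtual fragments
    """
    visible = virtual_mask & (virtual_depth < scene_depth)
    out = camera_rgb.copy()
    out[visible] = virtual_rgb[visible]
    return out
```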
Related papers
- GPAvatar: Generalizable and Precise Head Avatar from Image(s) [71.555405205039]
GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
arXiv Detail & Related papers (2024-01-18T18:56:34Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- AnyDoor: Zero-shot Object-level Image Customization [63.44307304097742]
This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations.
Our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage.
arXiv Detail & Related papers (2023-07-18T17:59:02Z)
- User-centric Heterogeneous-action Deep Reinforcement Learning for Virtual Reality in the Metaverse over Wireless Networks [8.513938423514636]
In this paper, we consider a system consisting of a Metaverse server and multiple VR users.
In our multi-user VR scenario for the Metaverse, users have different characteristics and demands for Frames Per Second (FPS).
Our proposed user-centric DRL algorithm is called User-centric Critic with Heterogenous Actors (UCHA).
arXiv Detail & Related papers (2023-02-03T00:12:12Z)
- Multi-Camera Lighting Estimation for Photorealistic Front-Facing Mobile Augmented Reality [6.41726492515401]
Lighting understanding plays an important role in virtual object composition, including mobile augmented reality (AR) applications.
We propose to leverage dual-camera streaming to generate a high-quality environment map by combining multi-view lighting reconstruction and parametric directional lighting estimation.
arXiv Detail & Related papers (2023-01-15T16:52:59Z)
- The Gesture Authoring Space: Authoring Customised Hand Gestures for Grasping Virtual Objects in Immersive Virtual Environments [81.5101473684021]
This work proposes a hand gesture authoring tool for object-specific grab gestures, allowing virtual objects to be grabbed as in the real world.
The presented solution uses template matching for gesture recognition (sketched below) and requires no technical knowledge to design and create custom-tailored hand gestures.
The study showed that gestures created with the proposed approach are perceived by users as a more natural input modality than the others.
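The summary names template matching without further detail; as a toy sketch (hypothetical joint-angle feature vector and threshold, not the paper's actual gesture representation), recognition can reduce to a nearest-template search:

```python
import numpy as np

def recognize_gesture(joint_angles, templates, threshold=0.35):
    """Match a hand pose against stored gesture templates.

    joint_angles: 1-D array of the current hand's joint angles (radians)
    templates:    dict mapping gesture label -> recorded joint-angle vector
    Returns the closest template's label, or None if nothing is within
    the distance threshold.
    """
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        dist = np.linalg.norm(joint_angles - np.asarray(template))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist < threshold else None
```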
arXiv Detail & Related papers (2022-07-03T18:33:33Z)
- Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
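The stereo relation underlying this kind of absolute-scale recovery is standard (a sketch, not VRVO's code; the point is that the baseline is known exactly for virtual stereo data, which is what lets the network learn absolute scale):

```python
def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a scale-aware disparity to metric depth via the
    pinhole-stereo relation  depth = f * B / d.

    disparity:  predicted disparity in pixels
    focal_px:   camera focal length in pixels
    baseline_m: stereo baseline in metres (known exactly for virtual data)
    """
    return focal_px * baseline_m / max(disparity, eps)
```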
arXiv Detail & Related papers (2022-03-11T01:51:54Z)
- A Multi-user Oriented Live Free-viewpoint Video Streaming System Based On View Interpolation [15.575219833681635]
We introduce a CNN-based view interpolation algorithm to synthesize dense virtual views in real time.
We also build an end-to-end live free-viewpoint system with a multi-user oriented streaming strategy.
arXiv Detail & Related papers (2021-12-20T15:17:57Z)
- SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors [3.1969855247377827]
We introduce SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes.
SceneGen takes a semantically segmented scene as input and outputs positional and orientational probability maps for placing virtual content (sketched below).
We formulate a novel spatial Scene Graph representation, which encapsulates explicit topological properties between objects, object groups, and the room.
To demonstrate our system in action, we develop an Augmented Reality application in which objects can be contextually augmented in real time.
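As a toy illustration of consuming such maps (assumed shapes and discretised heading bins, not SceneGen's actual output format), one can sample a position from the positional map and take the most likely orientation at that cell:

```python
import numpy as np

def sample_placement(position_map, orientation_maps, rng=None):
    """Choose a virtual-object placement from predicted probability maps.

    position_map:     (H, W) map of placement likelihood
    orientation_maps: (K, H, W) maps, one per discretised heading bin
    Returns ((row, col), heading_bin).
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = position_map.ravel() / position_map.sum()  # normalise to sum to 1
    idx = rng.choice(flat.size, p=flat)
    row, col = np.unravel_index(idx, position_map.shape)
    heading = int(np.argmax(orientation_maps[:, row, col]))
    return (row, col), heading
```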
arXiv Detail & Related papers (2020-09-25T18:36:27Z)
- Augment Yourself: Mixed Reality Self-Augmentation Using Optical See-through Head-mounted Displays and Physical Mirrors [49.49841698372575]
Optical see-through head-mounted displays (OST HMDs) are one of the key technologies for merging virtual objects and physical scenes to provide an immersive mixed reality (MR) environment to the user.
We propose a novel concept and prototype system that combines OST HMDs and physical mirrors to enable self-augmentation and provide an immersive MR environment centered around the user.
Our system, to the best of our knowledge the first of its kind, estimates the user's pose in the virtual image generated by the mirror using an RGBD camera attached to the HMD and anchors virtual objects to the reflection rather…
arXiv Detail & Related papers (2020-07-06T16:53:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.