Using CNNs For Users Segmentation In Video See-Through Augmented
Virtuality
- URL: http://arxiv.org/abs/2001.00487v1
- Date: Thu, 2 Jan 2020 15:22:36 GMT
- Title: Using CNNs For Users Segmentation In Video See-Through Augmented
Virtuality
- Authors: Pierre-Olivier Pigny and Lionel Dominjon
- Abstract summary: We present preliminary results on the use of deep learning techniques to integrate the user's self-body and other participants into a head-mounted video see-through augmented virtuality scenario.
We propose to use a convolutional neural network for real-time semantic segmentation of users' bodies in the stereoscopic RGB video streams acquired from the perspective of the user.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present preliminary results on the use of deep learning
techniques to integrate the user's self-body and other participants into a
head-mounted video see-through augmented virtuality scenario. It has been
previously shown that seeing users' bodies in such simulations may improve the
feeling of both self and social presence in the virtual environment, as well as
user performance. We propose to use a convolutional neural network for
real-time semantic segmentation of users' bodies in the stereoscopic RGB video
streams acquired from the perspective of the user. We describe design issues as
well as implementation details of the system and demonstrate the feasibility of
using such neural networks for merging users' bodies into an augmented
virtuality simulation.
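As a rough illustration of the pipeline the abstract describes, the sketch below runs a person-segmentation CNN on a camera frame and composites the detected body pixels over the rendered virtual frame. The torchvision model and the 'person' class index are assumptions standing in for the authors' custom real-time network, not their implementation.

```python
# Minimal sketch: per-eye person segmentation and compositing for video
# see-through augmented virtuality. The pretrained torchvision model is a
# stand-in for the paper's custom lightweight network (assumption).
import torch
import torchvision
import numpy as np

PERSON_CLASS = 15  # 'person' index in torchvision's 21-class segmentation head

model = torchvision.models.segmentation.deeplabv3_mobilenet_v3_large(
    weights="DEFAULT").eval()

def composite_user_body(camera_rgb: np.ndarray, virtual_rgb: np.ndarray) -> np.ndarray:
    """Overlay the segmented user body from the camera frame onto the
    rendered virtual frame. Both frames are HxWx3 uint8 arrays."""
    x = torch.from_numpy(camera_rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        logits = model(x.unsqueeze(0))["out"][0]       # (C, H, W)
    mask = (logits.argmax(0) == PERSON_CLASS).numpy()  # boolean body mask
    out = virtual_rgb.copy()
    out[mask] = camera_rgb[mask]                       # keep real body pixels
    return out

# In a stereoscopic setup this would run once per eye on the left and
# right camera streams before the frames are submitted to the HMD.
```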
Related papers
- Learning High-Quality Navigation and Zooming on Omnidirectional Images in Virtual Reality [37.564863636844905]
We present a novel system, called OmniVR, designed to enhance visual clarity during VR navigation.
Our system enables users to effortlessly locate and zoom in on the objects of interest in VR.
arXiv Detail & Related papers (2024-05-01T07:08:24Z) - Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
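The alternation FEC describes can be sketched in a few lines. The k-means-style loop below is purely illustrative, assuming flattened per-pixel features; it is not the paper's architecture.

```python
# Hedged sketch of the cluster/update alternation described for FEC:
# group pixel features into clusters, then mix each feature with its
# cluster representative. Illustrative only, not the paper's model.
import numpy as np

def fec_step(feats: np.ndarray, k: int = 8, iters: int = 3, alpha: float = 0.5):
    """feats: (N, D) pixel features. Returns updated features and labels."""
    rng = np.random.default_rng(0)
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(iters):
        # assign each pixel feature to its nearest representative
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # recompute representatives from the current assignments
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(0)
    # update step: pull each feature toward its representative
    feats = (1 - alpha) * feats + alpha * centers[labels]
    return feats, labels
```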
arXiv Detail & Related papers (2024-03-26T06:04:50Z) - Towards emotion recognition for virtual environments: an evaluation of
EEG features on benchmark dataset [0.0]
This paper investigates features extracted from electroencephalogram signals for the purpose of affective state modelling.
It aims to provide the foundation for future work in modelling user affect to enhance interaction experience in virtual environments.
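For context, a classic family of EEG features used in affect modelling is per-channel band power. The sketch below computes it from Welch's PSD; the band boundaries are standard conventions and not necessarily the exact features this paper evaluates.

```python
# Illustrative band-power features from multi-channel EEG. The band
# definitions are common conventions, assumed here for illustration.
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_powers(eeg: np.ndarray, fs: float = 128.0) -> np.ndarray:
    """eeg: (channels, samples). Returns (channels, len(BANDS)) band powers."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs), axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        sel = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, sel].mean(axis=-1))  # mean power within the band
    return np.stack(feats, axis=-1)
```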
arXiv Detail & Related papers (2022-10-25T10:02:55Z) - Force-Aware Interface via Electromyography for Natural VR/AR Interaction [69.1332992637271]
We design a learning-based neural interface for natural and intuitive force inputs in VR/AR.
We show that our interface can decode finger-wise forces in real-time with 3.3% mean error, and generalize to new users with little calibration.
We envision our findings to push forward research towards more realistic physicality in future VR/AR.
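The kind of model behind such an interface can be sketched as a small network regressing per-finger force from a window of multi-channel EMG. Channel counts and layer sizes below are assumptions, not the paper's design.

```python
# Hypothetical EMG-to-force decoder: a compact 1D CNN mapping a window of
# surface-EMG channels to one force value per finger. Sizes are assumed.
import torch
import torch.nn as nn

class EMGForceDecoder(nn.Module):
    def __init__(self, channels: int = 8, fingers: int = 5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, fingers),  # one regressed force per finger
        )

    def forward(self, emg: torch.Tensor) -> torch.Tensor:
        """emg: (batch, channels, window) -> (batch, fingers) forces."""
        return self.net(emg)

# One way to read "little calibration": fine-tune only the final linear
# layer on a short per-user recording.
```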
arXiv Detail & Related papers (2022-10-03T20:51:25Z) - Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
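One reading of a differentiable frequency-based split, in the general spirit the title suggests, is separating feature maps into low- and high-frequency parts with an FFT mask. The cutoff and usage below are illustrative assumptions, not the paper's method.

```python
# Hedged sketch of a differentiable frequency split on feature maps:
# a low-pass mask in the Fourier domain, with the residual as the
# high-frequency part. Cutoff choice is an assumption.
import torch

def frequency_split(feat: torch.Tensor, cutoff: float = 0.25):
    """feat: (B, C, H, W). Returns (low, high) frequency components."""
    f = torch.fft.fftshift(torch.fft.fft2(feat), dim=(-2, -1))
    B, C, H, W = feat.shape
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    low_mask = ((yy ** 2 + xx ** 2).sqrt() <= cutoff).to(feat.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(f * low_mask, dim=(-2, -1))).real
    return low, feat - low  # low-frequency part and the high-frequency residual
```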
arXiv Detail & Related papers (2022-09-15T22:16:52Z) - A Multi-user Oriented Live Free-viewpoint Video Streaming System Based
On View Interpolation [15.575219833681635]
We introduce a CNN-based view interpolation algorithm to synthesize dense virtual views in real time.
We also build an end-to-end live free-viewpoint system with a multi-user oriented streaming strategy.
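A minimal sketch of CNN-based view interpolation is a network that predicts a per-pixel blend of the two nearest camera views for a virtual viewpoint. The architecture below is a placeholder, not the system's actual network.

```python
# Illustrative view interpolation: predict per-pixel blending weights
# between two neighbouring camera views. Placeholder architecture.
import torch
import torch.nn as nn

class ViewBlender(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # per-pixel weight
        )

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        """left/right: (B, 3, H, W) neighbouring views -> interpolated view."""
        w = self.net(torch.cat([left, right], dim=1))
        return w * left + (1 - w) * right
```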
arXiv Detail & Related papers (2021-12-20T15:17:57Z) - PreViTS: Contrastive Pretraining with Video Tracking Supervision [53.73237606312024]
PreViTS is a self-supervised learning (SSL) framework for selecting clips containing the same object.
PreViTS spatially constrains the frame regions to learn from and trains the model to locate meaningful objects.
We train a momentum contrastive (MoCo) encoder on VGG-Sound and Kinetics-400 datasets with PreViTS.
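The momentum-contrastive (MoCo) objective named in the summary can be sketched directly: a query embedding is pulled toward its positive key and pushed away from a queue of negatives. The tracking-based spatial constraint that PreViTS adds is omitted here.

```python
# Standard MoCo-style InfoNCE loss over a query, its positive key, and a
# queue of negative keys. This is the generic objective, not PreViTS itself.
import torch
import torch.nn.functional as F

def moco_loss(q: torch.Tensor, k_pos: torch.Tensor, queue: torch.Tensor,
              tau: float = 0.07) -> torch.Tensor:
    """q, k_pos: (B, D) L2-normalized embeddings; queue: (K, D) negatives."""
    l_pos = (q * k_pos).sum(-1, keepdim=True)  # (B, 1) positive logits
    l_neg = q @ queue.t()                      # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(len(q), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)     # positives sit at index 0
```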
arXiv Detail & Related papers (2021-12-01T19:49:57Z) - Cloud based Scalable Object Recognition from Video Streams using
Orientation Fusion and Convolutional Neural Networks [11.44782606621054]
Convolutional neural networks (CNNs) have been widely used to perform intelligent visual object recognition.
CNNs still suffer from severe accuracy degradation, particularly on illumination-variant datasets.
We propose a new CNN method based on orientation fusion for visual object recognition.
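One plausible reading of "orientation fusion" is computing gradient-orientation maps and stacking them with the RGB input before the CNN, which can reduce sensitivity to illumination. The paper's exact fusion may differ; the sketch below is illustrative only.

```python
# Hedged sketch: fuse a Sobel gradient-orientation channel with RGB as a
# 4-channel CNN input. An assumed interpretation, not the paper's method.
import numpy as np
import cv2

def fuse_orientation(rgb: np.ndarray) -> np.ndarray:
    """rgb: HxWx3 uint8. Returns HxWx4 float32 input (RGB + orientation)."""
    gray = cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    orientation = np.arctan2(gy, gx) / np.pi  # normalized to [-1, 1]
    return np.dstack([rgb.astype(np.float32) / 255.0, orientation])
```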
arXiv Detail & Related papers (2021-06-19T07:15:15Z) - AEGIS: A real-time multimodal augmented reality computer vision based
system to assist facial expression recognition for individuals with autism
spectrum disorder [93.0013343535411]
This paper presents the development of a multimodal augmented reality (AR) system which combines the use of computer vision and deep convolutional neural networks (CNNs).
The proposed system, which we call AEGIS, is an assistive technology deployable on a variety of user devices including tablets, smartphones, video conference systems, or smartglasses.
We leverage both spatial and temporal information in order to provide an accurate expression prediction, which is then converted into its corresponding visualization and drawn on top of the original video frame.
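The final overlay step described above is straightforward to sketch: draw the predicted expression label onto the original frame. The classifier call is left out as a placeholder for the system's spatio-temporal CNN.

```python
# Illustrative AR overlay: annotate a frame with a predicted expression
# label. The label would come from AEGIS's spatio-temporal classifier.
import numpy as np
import cv2

def annotate_frame(frame: np.ndarray, face_box: tuple, label: str) -> np.ndarray:
    """frame: HxWx3 BGR image; face_box: (x, y, w, h); label: expression."""
    x, y, w, h = face_box
    out = frame.copy()
    cv2.rectangle(out, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(out, label, (x, y - 8), cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (0, 255, 0), 2)
    return out
```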
arXiv Detail & Related papers (2020-10-22T17:20:38Z) - Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves performance while increasing the number of parameters by less than 1%.
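The split-transform-merge pattern the summary refers to is best known in its grouped-convolution form (as in ResNeXt). The block below shows that generic pattern, which VCRNet builds on; it is not VCRNet itself.

```python
# Generic split-transform-merge block: the 1x1 convs project and merge,
# while the grouped 3x3 conv transforms each "split" branch independently.
import torch.nn as nn

def split_transform_merge(channels: int = 256, groups: int = 32) -> nn.Module:
    """Grouped-convolution realization of split-transform-merge."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, 1, bias=False),                # split/project
        nn.Conv2d(channels, channels, 3, padding=1, groups=groups,
                  bias=False),                                       # transform
        nn.Conv2d(channels, channels, 1, bias=False),                # merge
    )
```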
arXiv Detail & Related papers (2020-08-26T20:02:40Z) - Stillleben: Realistic Scene Synthesis for Deep Learning in Robotics [33.30312206728974]
We describe a synthesis pipeline capable of producing training data for cluttered scene perception tasks.
Our approach arranges object meshes in physically realistic, dense scenes using physics simulation.
Our pipeline can be run online during training of a deep neural network.
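The core idea of physics-based scene arrangement can be sketched with an off-the-shelf physics engine: drop object meshes from random poses and let them settle into a dense, plausible pile. pybullet and its sample assets stand in here for the pipeline's own physics and asset modules.

```python
# Hedged sketch of physics-based cluttered-scene synthesis: drop objects
# and simulate until they settle. pybullet is a stand-in, not the
# pipeline's actual implementation.
import random
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                        # headless physics server
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                   # support surface

for _ in range(10):                        # drop objects from random poses
    pos = [random.uniform(-0.2, 0.2), random.uniform(-0.2, 0.2),
           random.uniform(0.3, 0.8)]
    p.loadURDF("duck_vhacd.urdf", basePosition=pos)

for _ in range(500):                       # step until the objects settle
    p.stepSimulation()
# Rendered images plus ground-truth poses and masks of the settled scene
# would then serve as training data, as the summary describes.
```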
arXiv Detail & Related papers (2020-05-12T10:11:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.