Predicting User Grasp Intentions in Virtual Reality
- URL: http://arxiv.org/abs/2508.16582v1
- Date: Tue, 05 Aug 2025 15:17:19 GMT
- Title: Predicting User Grasp Intentions in Virtual Reality
- Authors: Linghao Zeng
- Abstract summary: We evaluate classification and regression approaches across 810 trials with varied object types, sizes, and manipulations. Regression-based approaches demonstrate more robust performance, with timing errors within 0.25 seconds and distance errors around 5-20 cm. Our results underscore the potential of machine learning models to enhance VR interactions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting user intentions in virtual reality (VR) is crucial for creating immersive experiences, particularly in tasks involving complex grasping motions where accurate haptic feedback is essential. In this work, we leverage time-series data from hand movements to evaluate both classification and regression approaches across 810 trials with varied object types, sizes, and manipulations. Our findings reveal that classification models struggle to generalize across users, leading to inconsistent performance. In contrast, regression-based approaches, particularly those using Long Short-Term Memory (LSTM) networks, demonstrate more robust performance, with timing errors within 0.25 seconds and distance errors around 5-20 cm in the critical two-second window before a grasp. Despite these improvements, predicting precise hand postures remains challenging. Through a comprehensive analysis of user variability and model interpretability, we explore why certain models fail and how regression models better accommodate the dynamic and complex nature of user behavior in VR. Our results underscore the potential of machine learning models to enhance VR interactions, particularly through adaptive haptic feedback, and lay the groundwork for future advancements in real-time prediction of user actions in VR.
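As an illustrative sketch only (not the authors' actual model, which is not specified beyond "LSTM regression"), the regression setup the abstract describes could look like a small LSTM unrolled over a window of hand-motion features with a linear readout that predicts time-to-grasp. All dimensions, feature choices, and names below are assumptions; the weights are untrained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, Wx, Wh, b):
    """Unroll one LSTM layer over x of shape (T, d); return the final hidden state.
    Weight layout: Wx is (d, 4*hdim), Wh is (hdim, 4*hdim), b is (4*hdim,),
    with gate order [input | forget | candidate | output]."""
    hdim = Wh.shape[0]
    h = np.zeros(hdim)
    c = np.zeros(hdim)
    for t in range(x.shape[0]):
        gates = x[t] @ Wx + h @ Wh + b        # all four gate pre-activations at once
        i, f, g, o = np.split(gates, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)            # cell-state update
        h = o * np.tanh(c)                    # hidden-state update
    return h

def predict_time_to_grasp(seq, Wx, Wh, b, w_out, b_out):
    """Linear readout on the last hidden state -> scalar time-to-grasp (seconds)."""
    return float(lstm_forward(seq, Wx, Wh, b) @ w_out + b_out)

# Toy usage: 60 frames (~2 s at 30 Hz) of 6-D features,
# e.g. wrist position + velocity (hypothetical feature choice).
rng = np.random.default_rng(0)
T, d, hdim = 60, 6, 16
Wx = rng.normal(0.0, 0.1, (d, 4 * hdim))
Wh = rng.normal(0.0, 0.1, (hdim, 4 * hdim))
b = np.zeros(4 * hdim)
w_out = rng.normal(0.0, 0.1, hdim)
pred = predict_time_to_grasp(rng.normal(size=(T, d)), Wx, Wh, b, w_out, 0.0)
print(pred)  # a finite scalar; meaningless until the weights are trained
```

In practice the readout could equally regress the 3-D grasp location (the 5-20 cm errors reported) by making `w_out` a matrix; the recurrent core is unchanged.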
Related papers
- GazeProphetV2: Head-Movement-Based Gaze Prediction Enabling Efficient Foveated Rendering on Mobile VR
This paper introduces a multimodal approach to VR gaze prediction that combines temporal gaze patterns, head movement data, and visual scene information. Evaluations using a dataset spanning 22 VR scenes with 5.3M gaze samples show improvements in predictive accuracy when combining modalities. Cross-scene generalization testing shows consistent performance with 93.1% validation accuracy and temporal consistency in predicted gaze trajectories.
arXiv Detail & Related papers (2025-11-25T06:55:39Z)
- Behavioral Biometrics for Automatic Detection of User Familiarity in VR
A growing number of users without prior experience will engage with virtual reality (VR) systems. In this study, we explore the automatic detection of VR familiarity by analyzing hand movement patterns during a passcode-based door-opening task. Our results underline the promise of using hand movement biometrics for the real-time detection of user familiarity in critical VR applications.
arXiv Detail & Related papers (2025-10-14T21:00:05Z)
- Understanding Cognitive States from Head & Hand Motion Data
We introduce a novel dataset of head and hand motion with frame-level annotations of cognitive states collected during structured decision-making tasks. Our findings suggest that deep temporal models can infer subtle cognitive states from motion alone, achieving performance comparable to human observers. This work demonstrates that standard VR telemetry contains strong patterns related to users' internal cognitive processes, which opens the door for a new generation of adaptive virtual environments.
arXiv Detail & Related papers (2025-09-29T03:59:56Z)
- From Seeing to Experiencing: Scaling Navigation Foundation Models with Reinforcement Learning
We introduce the Seeing-to-Experiencing (S2E) framework to scale the capability of navigation foundation models with reinforcement learning. S2E combines the strengths of pre-training on videos and post-training through RL. We establish a comprehensive end-to-end evaluation benchmark, NavBench-GS, built on photorealistic 3DGS reconstructions of real-world scenes.
arXiv Detail & Related papers (2025-07-29T17:26:10Z)
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
- Towards Consumer-Grade Cybersickness Prediction: Multi-Model Alignment for Real-Time Vision-Only Inference
Cybersickness is a major obstacle to the widespread adoption of immersive virtual reality (VR). We propose a scalable, deployable framework for personalized cybersickness prediction. Our framework supports real-time applications, making it ideal for integration into consumer-grade VR platforms.
arXiv Detail & Related papers (2025-01-02T11:41:43Z)
- Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction
We aim to measure the impact of motion-signal artifacts on the reconstruction of the articulated self-avatar's full-body pose. We analyze the motion reconstruction errors using ground truth and 3D Cartesian coordinates estimated from YOLOv8 pose estimation.
arXiv Detail & Related papers (2024-04-29T12:02:06Z)
- Scaling Up Dynamic Human-Scene Interaction Modeling
TRUMANS is the most comprehensive motion-captured human-scene interaction (HSI) dataset currently available.
It intricately captures whole-body human motions and part-level object dynamics.
We devise a diffusion-based autoregressive model that efficiently generates HSI sequences of any length.
arXiv Detail & Related papers (2024-03-13T15:45:04Z)
- Toward Optimized VR/AR Ergonomics: Modeling and Predicting User Neck Muscle Contraction
We measure, model, and predict VR users' neck muscle contraction levels (MCL) while they move their heads to interact with the virtual environment.
We develop a bio-physically inspired computational model to predict neck MCL under diverse head kinematic states.
We hope this research will motivate new ergonomic-centered designs for VR/AR and interactive graphics applications.
arXiv Detail & Related papers (2023-08-28T18:58:01Z)
- Force-Aware Interface via Electromyography for Natural VR/AR Interaction
We design a learning-based neural interface for natural and intuitive force inputs in VR/AR.
We show that our interface can decode finger-wise forces in real-time with 3.3% mean error, and generalize to new users with little calibration.
We envision our findings to push forward research towards more realistic physicality in future VR/AR.
arXiv Detail & Related papers (2022-10-03T20:51:25Z)
- Short-Term Trajectory Prediction for Full-Immersive Multiuser Virtual Reality with Redirected Walking
Full-immersive multiuser Virtual Reality (VR) envisions supporting unconstrained mobility of the users in the virtual worlds.
We show that Gated Recurrent Unit (GRU) networks, another candidate from the RNN family, generally outperform the traditionally utilized LSTMs.
Second, we show that context from a virtual world can enhance the accuracy of the prediction if used as an additional input feature.
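The GRU-versus-LSTM finding above has a concrete structural basis: a GRU cell uses two gates (update, reset) plus a candidate state, versus the LSTM's three gates plus candidate, giving roughly 25% fewer parameters at the same hidden size, which matters for real-time trajectory prediction. Below is a minimal NumPy sketch of a single GRU step, with all sizes and the head-position feature choice being illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h, Wx, Wh, b):
    """One GRU step. Weight layout: columns [update | reset | candidate],
    so Wx is (d, 3*hdim), Wh is (hdim, 3*hdim), b is (3*hdim,)."""
    hdim = h.shape[0]
    z = sigmoid(x_t @ Wx[:, :hdim] + h @ Wh[:, :hdim] + b[:hdim])                    # update gate
    r = sigmoid(x_t @ Wx[:, hdim:2*hdim] + h @ Wh[:, hdim:2*hdim] + b[hdim:2*hdim])  # reset gate
    n = np.tanh(x_t @ Wx[:, 2*hdim:] + (r * h) @ Wh[:, 2*hdim:] + b[2*hdim:])        # candidate state
    return (1.0 - z) * h + z * n  # interpolate between old state and candidate

# Toy usage: roll a 3-D head-position stream through the cell frame by frame.
rng = np.random.default_rng(1)
d, hdim = 3, 8
Wx = rng.normal(0.0, 0.1, (d, 3 * hdim))
Wh = rng.normal(0.0, 0.1, (hdim, 3 * hdim))
b = np.zeros(3 * hdim)
h = np.zeros(hdim)
for frame in rng.normal(size=(30, d)):
    h = gru_step(frame, h, Wx, Wh, b)
print(h.shape)  # (8,)
```

Per-paper context (e.g. a virtual-world occupancy feature) would simply be concatenated onto `x_t`, widening `Wx` accordingly.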
arXiv Detail & Related papers (2022-07-15T15:09:07Z)
- On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
- Social NCE: Contrastive Learning of Socially-aware Motion Representations
Experimental results show that the proposed method dramatically reduces the collision rates of recent trajectory forecasting, behavioral cloning and reinforcement learning algorithms.
Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to promote the robustness of neural motion models.
arXiv Detail & Related papers (2020-12-21T22:25:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.