EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras
- URL: http://arxiv.org/abs/2509.01786v1
- Date: Mon, 01 Sep 2025 21:32:30 GMT
- Title: EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras
- Authors: Vimal Mollyn, Chris Harrison
- Abstract summary: We demonstrate high accuracy, bare hands (i.e., no special instrumentation of the user) skin input using just an RGB camera. Our results show this approach can be accurate and robust across diverse lighting conditions. We believe these are the requisite technical ingredients to more fully unlock on-skin interfaces.
- Score: 13.852935460131896
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In augmented and virtual reality (AR/VR) experiences, a user's arms and hands can provide a convenient and tactile surface for touch input. Prior work has shown on-body input to have significant speed, accuracy, and ergonomic benefits over in-air interfaces, which are common today. In this work, we demonstrate high accuracy, bare hands (i.e., no special instrumentation of the user) skin input using just an RGB camera, like those already integrated into all modern XR headsets. Our results show this approach can be accurate, and robust across diverse lighting conditions, skin tones, and body motion (e.g., input while walking). Finally, our pipeline also provides rich input metadata including touch force, finger identification, angle of attack, and rotation. We believe these are the requisite technical ingredients to more fully unlock on-skin interfaces that have been well motivated in the HCI literature but have lacked robust and practical methods.
Related papers
- OPENTOUCH: Bringing Full-Hand Touch to Real-World Interaction [93.88239833545623]
We present OpenTouch, the first in-the-wild egocentric full-hand tactile dataset. We show that tactile signals provide a compact yet powerful cue for grasp understanding. We aim to advance multimodal egocentric perception, embodied learning, and contact-rich robotic manipulation.
arXiv Detail & Related papers (2025-12-18T18:18:17Z) - EclipseTouch: Touch Segmentation on Ad Hoc Surfaces using Worn Infrared Shadow Casting [14.237453119638516]
We propose a new headset-integrated technique called EclipseTouch to detect touch events on uninstrumented surfaces. We use a combination of a computer-triggered camera and one or more infrared emitters to create structured shadows, from which we can accurately estimate hover distance. We discuss how our technique works across a range of conditions, including surface material, interaction orientation, and environmental lighting.
arXiv Detail & Related papers (2025-09-03T15:59:28Z) - EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision [69.1005706608681]
EgoPressure is a novel egocentric dataset that captures detailed touch contact and pressure interactions. Our dataset comprises 5 hours of recorded interactions from 21 participants captured simultaneously by one head-mounted and seven stationary Kinect cameras.
arXiv Detail & Related papers (2024-09-03T18:53:32Z) - Learning In-Hand Translation Using Tactile Skin With Shear and Normal Force Sensing [43.269672740168396]
We introduce a sensor model for tactile skin that enables zero-shot sim-to-real transfer of ternary shear and binary normal forces. We conduct extensive real-world experiments to assess how tactile sensing facilitates policy adaptation to various unseen object properties.
arXiv Detail & Related papers (2024-07-10T17:52:30Z) - Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality [4.857109990499532]
Mid-air keyboard interfaces, wireless keyboards, and voice input either suffer from poor ergonomic design or limited accuracy, or are simply embarrassing to use in public.
This paper proposes and validates a deep-learning-based approach that enables AR applications to accurately predict keystrokes from the user-perspective RGB video stream.
A two-stage model, combining an off-the-shelf hand landmark extractor and a novel adaptive Convolutional Recurrent Neural Network (C-RNN), was trained.
arXiv Detail & Related papers (2023-08-31T23:58:25Z) - Force-Aware Interface via Electromyography for Natural VR/AR Interaction [69.1332992637271]
We design a learning-based neural interface for natural and intuitive force inputs in VR/AR.
We show that our interface can decode finger-wise forces in real time with 3.3% mean error, and generalize to new users with little calibration.
We envision our findings to push forward research towards more realistic physicality in future VR/AR.
arXiv Detail & Related papers (2022-10-03T20:51:25Z) - QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z) - AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing [24.053096294334694]
We present AvatarPoser, the first learning-based method that predicts full-body poses in world coordinates using only motion input from the user's head and hands.
Our method builds on a Transformer encoder to extract deep features from the input signals and decouples global motion from the learned local joint orientations.
AvatarPoser achieved new state-of-the-art results in evaluations on large motion capture datasets.
arXiv Detail & Related papers (2022-07-27T20:52:39Z) - The Gesture Authoring Space: Authoring Customised Hand Gestures for Grasping Virtual Objects in Immersive Virtual Environments [81.5101473684021]
This work proposes a hand gesture authoring tool for object-specific grab gestures, allowing virtual objects to be grabbed as in the real world.
The presented solution uses template matching for gesture recognition and requires no technical knowledge to design and create custom-tailored hand gestures.
The study showed that gestures created with the proposed approach are perceived by users as a more natural input modality than the others.
arXiv Detail & Related papers (2022-07-03T18:33:33Z) - Unmasking Communication Partners: A Low-Cost AI Solution for Digitally Removing Head-Mounted Displays in VR-Based Telepresence [62.997667081978825]
Face-to-face conversation in Virtual Reality (VR) is a challenge when participants wear head-mounted displays (HMDs).
Past research has shown that high-fidelity face reconstruction with personal avatars in VR is possible under laboratory conditions with high-cost hardware.
We propose one of the first low-cost systems for this task, which uses only free, open-source software and affordable hardware.
arXiv Detail & Related papers (2020-11-06T23:17:12Z) - Physics-Based Dexterous Manipulations with Estimated Hand Poses and Residual Reinforcement Learning [52.37106940303246]
We learn a model that maps noisy input hand poses to target virtual poses.
The agent is trained in a residual setting by using a model-free hybrid RL+IL approach.
We test our framework in two applications that use hand pose estimates for dexterous manipulations: hand-object interactions in VR and hand-object motion reconstruction in-the-wild.
arXiv Detail & Related papers (2020-08-07T17:34:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.