Combining Vision and EMG-Based Hand Tracking for Extended Reality
Musical Instruments
- URL: http://arxiv.org/abs/2307.10203v1
- Date: Thu, 13 Jul 2023 15:15:02 GMT
- Title: Combining Vision and EMG-Based Hand Tracking for Extended Reality
Musical Instruments
- Authors: Max Graf, Mathieu Barthet
- Abstract summary: Self-occlusion remains a significant challenge for vision-based hand tracking systems.
We propose a multimodal hand tracking system that combines vision-based hand tracking with surface electromyography (sEMG) data for finger joint angle estimation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hand tracking is a critical component of natural user interactions in
extended reality (XR) environments, including extended reality musical
instruments (XRMIs). However, self-occlusion remains a significant challenge
for vision-based hand tracking systems, leading to inaccurate results and
degraded user experiences. In this paper, we propose a multimodal hand tracking
system that combines vision-based hand tracking with surface electromyography
(sEMG) data for finger joint angle estimation. We validate the effectiveness of
our system through a series of hand pose tasks designed to cover a wide range
of gestures, including those prone to self-occlusion. By comparing the
performance of our multimodal system to a baseline vision-based tracking
method, we demonstrate that our multimodal approach significantly improves
tracking accuracy for several finger joints prone to self-occlusion. These
findings suggest that our system has the potential to enhance XR experiences by
providing more accurate and robust hand tracking, even in the presence of
self-occlusion.
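For illustration only, the sketch below shows one plausible way per-joint estimates from a vision tracker and an sEMG regressor could be blended: a confidence-weighted average in which sEMG-derived angles take over for joints where the vision tracker reports low confidence (e.g. under self-occlusion). The function names, array shapes, and the weighting rule are assumptions for this sketch; the paper does not specify this particular fusion scheme.

```python
# Minimal, hypothetical sketch of vision + sEMG joint-angle fusion.
# The names (fuse_joint_angles, NUM_JOINTS) and the confidence-weighted
# blending rule are illustrative assumptions, not the authors' method.
import numpy as np

NUM_JOINTS = 20  # assumed number of tracked finger joints


def fuse_joint_angles(vision_angles: np.ndarray,
                      vision_confidence: np.ndarray,
                      emg_angles: np.ndarray) -> np.ndarray:
    """Blend per-joint angle estimates (radians).

    vision_angles     : (NUM_JOINTS,) angles from the vision-based tracker
    vision_confidence : (NUM_JOINTS,) values in [0, 1]; low values indicate
                        joints likely affected by self-occlusion
    emg_angles        : (NUM_JOINTS,) angles regressed from sEMG features
    """
    w = np.clip(vision_confidence, 0.0, 1.0)
    # Where the vision tracker is confident, trust it; where it is not
    # (e.g. self-occluded joints), fall back toward the sEMG estimate.
    return w * vision_angles + (1.0 - w) * emg_angles


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vision = rng.uniform(0.0, 1.5, NUM_JOINTS)  # placeholder vision estimates
    conf = rng.uniform(0.0, 1.0, NUM_JOINTS)    # placeholder per-joint confidence
    emg = rng.uniform(0.0, 1.5, NUM_JOINTS)     # placeholder sEMG estimates
    print(np.round(fuse_joint_angles(vision, conf, emg), 3))
```

A per-joint weight keeps well-tracked joints unaffected while letting the sEMG channel compensate exactly where vision-based tracking degrades.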
Related papers
- A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations.
We propose a unified cross-scene cross-domain benchmark for open-world drone active tracking called DAT.
We also propose a reinforcement learning-based drone tracking method called R-VAT.
arXiv Detail & Related papers (2024-12-01T09:37:46Z)
- XR-MBT: Multi-modal Full Body Tracking for XR through Self-Supervision with Learned Depth Point Cloud Registration [19.874691210555472]
XR-MBT tracks legs in XR for the first time, whereas traditional synthesis approaches based on partial body tracking are blind to the lower body.
We demonstrate how current 3-point motion synthesis models can be extended to point cloud modalities.
arXiv Detail & Related papers (2024-11-27T14:25:32Z)
- Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition [24.217068565936117]
We present a novel method for action recognition that integrates motion data from body-worn IMUs with egocentric video.
To model the complex relations among the IMU devices placed across the body, we exploit their collaborative dynamics.
Experiments show our method can achieve state-of-the-art performance on multiple public datasets.
arXiv Detail & Related papers (2024-07-09T07:53:16Z)
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs [61.143381152739046]
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach.
Our study uses LLMs and visual instruction tuning as an interface to evaluate various visual representations.
We provide model weights, code, supporting tools, datasets, and detailed instruction-tuning and evaluation recipes.
arXiv Detail & Related papers (2024-06-24T17:59:42Z)
- Learning Visuotactile Skills with Two Multifingered Hands [80.99370364907278]
We explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data.
Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data.
arXiv Detail & Related papers (2024-04-25T17:59:41Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- UltraGlove: Hand Pose Estimation with Mems-Ultrasonic Sensors [14.257535961674021]
We propose a novel and low-cost hand-tracking glove that utilizes several MEMS-ultrasonic sensors attached to the fingers.
Our experimental results demonstrate that this approach is accurate, size-agnostic, and robust to external interference.
arXiv Detail & Related papers (2023-06-22T03:41:47Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- HMD-EgoPose: Head-Mounted Display-Based Egocentric Marker-Less Tool and Hand Pose Estimation for Augmented Surgical Guidance [0.0]
We present HMD-EgoPose, a single-shot learning-based approach to hand and object pose estimation.
We demonstrate state-of-the-art performance on a benchmark dataset for marker-less hand and surgical instrument pose tracking.
arXiv Detail & Related papers (2022-02-24T04:07:34Z)
- Temporally Guided Articulated Hand Pose Tracking in Surgical Videos [22.752654546694334]
Articulated hand pose tracking is an under-explored problem that carries the potential for use in an extensive number of applications.
We propose a novel hand pose estimation model, CondPose, which improves detection and tracking accuracy by incorporating a pose prior into its prediction.
arXiv Detail & Related papers (2021-01-12T03:44:04Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.