On-device Real-time Custom Hand Gesture Recognition
- URL: http://arxiv.org/abs/2309.10858v1
- Date: Tue, 19 Sep 2023 18:05:14 GMT
- Title: On-device Real-time Custom Hand Gesture Recognition
- Authors: Esha Uboweja, David Tian, Qifei Wang, Yi-Chun Kuo, Joe Zou, Lu Wang,
George Sung, Matthias Grundmann
- Abstract summary: We present a user-friendly framework that lets users easily customize and deploy their own gesture recognition pipeline.
Our framework provides a pre-trained single-hand embedding model that can be fine-tuned for custom gesture recognition.
We also offer a low-code solution to train and deploy the custom gesture recognition model.
- Score: 5.3581349005036465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing hand gesture recognition (HGR) systems are limited to a
predefined set of gestures. However, users and developers often want to
recognize new, unseen gestures. This is challenging due to the vast diversity
of all plausible hand shapes, e.g. it is impossible for developers to include
all hand gestures in a predefined list. In this paper, we present a
user-friendly framework that lets users easily customize and deploy their own
gesture recognition pipeline. Our framework provides a pre-trained single-hand
embedding model that can be fine-tuned for custom gesture recognition. Users
can perform gestures in front of a webcam to collect a small amount of images
per gesture. We also offer a low-code solution to train and deploy the custom
gesture recognition model. This makes it easy for users with limited ML
expertise to use our framework. We further provide a no-code web front-end for
users without any ML expertise. This makes it even easier to build and test the
end-to-end pipeline. The resulting custom HGR is then ready to be run on-device
for real-time scenarios. This can be done by calling a simple function in our
open-sourced model inference API, MediaPipe Tasks. This entire process only
takes a few minutes.
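The abstract does not name the low-code trainer, so the following is a hedged sketch of what fine-tuning the pre-trained hand embedding on a folder of webcam captures could look like, assuming the MediaPipe Model Maker `gesture_recognizer` package and a `gesture_images/` directory with one sub-folder per gesture label (both are assumptions, not details from the paper):

```python
# Sketch only: assumes the `mediapipe-model-maker` package and a directory
# `gesture_images/` with one sub-folder of webcam captures per gesture label.
from mediapipe_model_maker import gesture_recognizer

# Load the small per-gesture image dataset and split it.
data = gesture_recognizer.Dataset.from_folder(
    dirname="gesture_images",
    hparams=gesture_recognizer.HandDataPreprocessingParams(),
)
train_data, rest = data.split(0.8)
validation_data, test_data = rest.split(0.5)

# Fine-tune the pre-trained single-hand embedding for the custom gestures.
options = gesture_recognizer.GestureRecognizerOptions(
    hparams=gesture_recognizer.HParams(export_dir="exported_model"),
)
model = gesture_recognizer.GestureRecognizer.create(
    train_data=train_data,
    validation_data=validation_data,
    options=options,
)

loss, accuracy = model.evaluate(test_data, batch_size=1)
print(f"test loss={loss:.4f}, accuracy={accuracy:.4f}")

# Export a .task bundle consumable by the MediaPipe Tasks inference API.
model.export_model()
```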
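On the inference side, MediaPipe Tasks is the API the abstract does name; a minimal Python sketch of running an exported custom model on a single image, where the model bundle and image paths are placeholders:

```python
# Sketch only: runs an exported custom gesture model with the MediaPipe
# Tasks Python API; "gesture_recognizer.task" and "hand.jpg" are placeholders.
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.GestureRecognizerOptions(
    base_options=python.BaseOptions(model_asset_path="gesture_recognizer.task"),
)
recognizer = vision.GestureRecognizer.create_from_options(options)

# Recognize gestures in one image; video and live-stream running modes are
# also available for the real-time on-device scenario.
image = mp.Image.create_from_file("hand.jpg")
result = recognizer.recognize(image)

if result.gestures:
    top = result.gestures[0][0]  # best category for the first detected hand
    print(f"{top.category_name}: {top.score:.2f}")
```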
Related papers
- From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces [66.85108822706489]
This paper focuses on creating agents that interact with the digital world using the same conceptual interface that humans commonly use.
It is possible for such agents to outperform human crowdworkers on the MiniWob++ benchmark of GUI-based instruction following tasks.
arXiv Detail & Related papers (2023-05-31T23:39:18Z)
- Agile gesture recognition for capacitive sensing devices: adapting on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use a machine learning technique to analyse the time-series signals and identify three features that can represent the five fingers within 500 ms.
arXiv Detail & Related papers (2023-05-12T17:24:02Z)
- GesSure -- A Robust Face-Authentication enabled Dynamic Gesture Recognition GUI Application [1.3649494534428745]
This paper aims to design a robust, face-verification-enabled gesture recognition system.
We use meaningful and relevant gestures for task operation, resulting in a better user experience.
Our prototype has successfully and intuitively executed context-dependent tasks such as save, print, video-player control, and exit, as well as context-free operating system tasks such as sleep, shut down, and unlock.
arXiv Detail & Related papers (2022-07-22T12:14:35Z)
- The Gesture Authoring Space: Authoring Customised Hand Gestures for Grasping Virtual Objects in Immersive Virtual Environments [81.5101473684021]
This work proposes a hand gesture authoring tool for object specific grab gestures allowing virtual objects to be grabbed as in the real world.
The presented solution uses template matching for gesture recognition and requires no technical knowledge to design and create custom tailored hand gestures.
The study showed that gestures created with the proposed approach are perceived by users as a more natural input modality than the others.
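The summary does not detail the template matcher, so here is an illustrative sketch of one common formulation: normalize the captured hand keypoints, then pick the nearest stored gesture template by Euclidean distance (all names and the threshold are hypothetical, not from the paper):

```python
# Illustrative sketch of keypoint template matching; not the paper's code.
import numpy as np

def normalize(keypoints: np.ndarray) -> np.ndarray:
    """Translate the wrist to the origin and scale to unit size so the
    match is invariant to hand position and distance from the camera."""
    centered = keypoints - keypoints[0]           # wrist is landmark 0
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / (scale + 1e-8)

def match_gesture(keypoints: np.ndarray,
                  templates: dict[str, np.ndarray],
                  threshold: float = 0.5) -> str | None:
    """Return the label of the closest stored template, or None if no
    template is within the distance threshold."""
    query = normalize(keypoints)
    best_label, best_dist = None, threshold
    for label, template in templates.items():
        dist = np.linalg.norm(query - normalize(template))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```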
arXiv Detail & Related papers (2022-07-03T18:33:33Z)
- HaGRID - HAnd Gesture Recognition Image Dataset [79.21033185563167]
This paper introduces an enormous dataset, HaGRID, for building hand gesture recognition systems focused on managing devices through interaction.
Although the gestures are static, they were selected specifically to enable the design of several dynamic gestures on top of them.
The HaGRID contains 554,800 images and bounding box annotations with gesture labels to solve hand detection and gesture classification tasks.
arXiv Detail & Related papers (2022-06-16T14:41:32Z)
- On-device Real-time Hand Gesture Recognition [1.4658400971135652]
We present an on-device real-time hand gesture recognition (HGR) system, which detects a set of predefined static gestures from a single RGB camera.
We use MediaPipe Hands as the basis of the hand skeleton tracker, improve the keypoint accuracy, and add the estimation of 3D keypoints in a world metric space.
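MediaPipe Hands exposes the per-frame hand skeleton this system builds on; a minimal sketch of reading its 21 landmarks with the legacy Python Solutions API (the image path is a placeholder):

```python
# Sketch: extract 21 hand landmarks with the legacy MediaPipe Hands API.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    frame = cv2.imread("hand.jpg")                       # placeholder path
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Normalized image-space landmarks; the tracker also exposes 3D
        # keypoints in a world metric space via multi_hand_world_landmarks.
        for lm in results.multi_hand_landmarks[0].landmark:
            print(f"x={lm.x:.3f} y={lm.y:.3f} z={lm.z:.3f}")
```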
arXiv Detail & Related papers (2021-10-29T18:33:25Z)
- SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild [62.450907796261646]
Recognition of hand gestures can be performed directly from the stream of hand skeletons estimated by software.
Despite the recent advancements in gesture and action recognition from skeletons, it is unclear how well the current state-of-the-art techniques can perform in a real-world scenario.
This paper presents the results of the SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild contest.
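As a hedged illustration of the skeleton-stream setting this track evaluates (not any contestant's method), a recognizer can buffer a sliding window of per-frame keypoints and label the window, e.g. by nearest neighbor against reference sequences:

```python
# Illustrative sketch of online gesture recognition from a skeleton stream;
# the window length and classifier are arbitrary choices, not from SHREC.
from collections import deque
import numpy as np

WINDOW = 30  # frames (~1 s at 30 fps)

class StreamRecognizer:
    def __init__(self, references: dict[str, np.ndarray]):
        # references maps label -> (WINDOW, num_joints, 3) keypoint sequence
        self.references = references
        self.buffer: deque = deque(maxlen=WINDOW)

    def push(self, skeleton: np.ndarray) -> str | None:
        """Add one frame of (num_joints, 3) keypoints; classify once full."""
        self.buffer.append(skeleton)
        if len(self.buffer) < WINDOW:
            return None
        window = np.stack(list(self.buffer)).ravel()
        dists = {label: np.linalg.norm(window - ref.ravel())
                 for label, ref in self.references.items()}
        return min(dists, key=dists.get)
```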
arXiv Detail & Related papers (2021-06-21T10:57:49Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- Gestop : Customizable Gesture Control of Computer Systems [0.3553493344868413]
Gestop is a framework that learns to detect gestures from demonstrations and is customizable by end-users.
It enables users to interact with computers in real time using gestures, requiring only an RGB camera.
arXiv Detail & Related papers (2020-10-25T19:13:01Z)