Fast Learning of Dynamic Hand Gesture Recognition with Few-Shot Learning Models
- URL: http://arxiv.org/abs/2212.08363v1
- Date: Fri, 16 Dec 2022 09:31:15 GMT
- Title: Fast Learning of Dynamic Hand Gesture Recognition with Few-Shot Learning Models
- Authors: Niels Schlüsener, Michael Bücker
- Abstract summary: We develop Few-Shot Learning models trained to recognize five or ten different dynamic hand gestures.
The recognized gestures are arbitrarily interchangeable by providing the model with one, two, or five examples per hand gesture.
Results show accuracy of up to 88.8% for recognition of five and up to 81.2% for ten dynamic hand gestures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop Few-Shot Learning models trained to recognize five or ten
different dynamic hand gestures, respectively, which are arbitrarily
interchangeable by providing the model with one, two, or five examples per hand
gesture. All models were built in the Few-Shot Learning architecture of the
Relation Network (RN), in which Long Short-Term Memory cells form the backbone.
The models use hand reference points extracted from RGB-video sequences of the
Jester dataset which was modified to contain 190 different types of hand
gestures. Results show accuracy of up to 88.8% for recognition of five and up to
81.2% for ten dynamic hand gestures. The research also sheds light on the
potential effort savings of using a Few-Shot Learning approach instead of a
traditional Deep Learning approach to detect dynamic hand gestures. Savings
were defined as the number of additional observations required when a Deep
Learning model is trained on new hand gestures instead of a Few-Shot Learning
model. The difference with respect to the total number of observations required
to achieve approximately the same accuracy indicates potential savings of up to
630 observations for five and up to 1260 observations for ten hand gestures to
be recognized. Since labeling video recordings of hand gestures implies
significant effort, these savings can be considered substantial.
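The pipeline the abstract describes, classifying a query gesture by comparing its sequence embedding against one, two, or five support examples per class, can be illustrated with a minimal numpy sketch. This is a sketch under stated assumptions, not the authors' implementation: random clustered vectors stand in for the LSTM embeddings of hand keypoint sequences, and cosine similarity stands in for the trained relation module of the RN.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(features_by_class, n_way=5, k_shot=5, n_query=2):
    """Build an N-way K-shot episode: k_shot support embeddings per class
    (averaged into a prototype) plus n_query held-out query embeddings."""
    classes = rng.choice(len(features_by_class), size=n_way, replace=False)
    protos, queries, labels = [], [], []
    for slot, c in enumerate(classes):
        idx = rng.choice(len(features_by_class[c]),
                         size=k_shot + n_query, replace=False)
        feats = features_by_class[c][idx]
        protos.append(feats[:k_shot].mean(axis=0))
        queries.extend(feats[k_shot:])
        labels.extend([slot] * n_query)
    return np.stack(protos), np.stack(queries), np.array(labels)

def relation_scores(protos, queries):
    """Stand-in for the learned relation module: cosine similarity
    between each query embedding and each class prototype."""
    p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    return q @ p.T

# Toy stand-ins for LSTM embeddings of keypoint sequences:
# 10 gesture classes, 20 sequences each, 32-dim, clustered per class.
means = rng.normal(size=(10, 32)) * 2.0
features = [m + rng.normal(scale=0.5, size=(20, 32)) for m in means]

protos, queries, labels = sample_episode(features, n_way=5, k_shot=5, n_query=2)
pred = relation_scores(protos, queries).argmax(axis=1)
accuracy = (pred == labels).mean()
```

Swapping in a different gesture set only requires new support examples, no retraining, which is where the observation savings described above come from.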
Related papers
- Zero-Shot Underwater Gesture Recognition [3.4078654008228924]
Hand gesture recognition allows humans to interact with machines non-verbally, which has major applications in underwater exploration using autonomous underwater vehicles.
Recently, a new gesture-based language called CADDIAN has been devised for divers, and supervised learning methods have been applied to recognize the gestures with high accuracy.
In this work, we advocate the need for zero-shot underwater gesture recognition (ZSUGR), where the objective is to train a model with visual samples of gestures from a few "seen" classes only and transfer the gained knowledge at test time to recognize semantically-similar unseen gesture classes as well.
arXiv Detail & Related papers (2024-07-19T08:16:46Z) - Wearable Sensor-Based Few-Shot Continual Learning on Hand Gestures for Motor-Impaired Individuals via Latent Embedding Exploitation [6.782362178252351]
We introduce the Latent Embedding Exploitation (LEE) mechanism in our replay-based Few-Shot Continual Learning framework.
Our method produces a diversified latent feature space by leveraging a preserved latent embedding known as gesture prior knowledge.
Our method helps motor-impaired persons leverage wearable devices, and their unique styles of movement can be learned and applied.
arXiv Detail & Related papers (2024-05-14T21:20:27Z) - HomE: Homography-Equivariant Video Representation Learning [62.89516761473129]
We propose a novel method for representation learning of multi-view videos.
Our method learns an implicit mapping between different views, culminating in a representation space that maintains the homography relationship between neighboring views.
On action classification, our method obtains 96.4% 3-fold accuracy on the UCF101 dataset, better than most state-of-the-art self-supervised learning methods.
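The homography relationship that HomE preserves between neighboring views is the standard planar projective map. A minimal sketch of applying one to point coordinates (the translation matrix here is illustrative, not taken from the paper):

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D points through a 3x3 homography: lift to homogeneous
    coordinates, transform, then divide out the projective scale."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]

# A pure-translation homography shifting every point by (2, 3).
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])
shifted = apply_homography(H, np.array([[0.0, 0.0], [1.0, 1.0]]))
```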
arXiv Detail & Related papers (2023-06-02T15:37:43Z) - Revisiting Classifier: Transferring Vision-Language Models for Video Recognition [102.93524173258487]
Transferring knowledge from task-agnostic pre-trained deep models for downstream tasks is an important topic in computer vision research.
In this study, we focus on transferring knowledge for video classification tasks.
We utilize the well-pretrained language model to generate good semantic targets for efficient transfer learning.
arXiv Detail & Related papers (2022-07-04T10:00:47Z) - HaGRID - HAnd Gesture Recognition Image Dataset [79.21033185563167]
This paper introduces an enormous dataset, HaGRID, to build a hand gesture recognition system concentrating on interaction with devices to manage them.
Although the gestures are static, they were chosen specifically for their suitability as building blocks for several dynamic gestures.
The HaGRID contains 554,800 images and bounding box annotations with gesture labels to solve hand detection and gesture classification tasks.
arXiv Detail & Related papers (2022-06-16T14:41:32Z) - Enabling hand gesture customization on wrist-worn devices [28.583516259577486]
We present a framework for gesture customization requiring minimal examples from users, all without degrading the performance of existing gesture sets.
Our approach paves the way for a future where users are no longer bound to pre-existing gestures, freeing them to creatively introduce new gestures tailored to their preferences and abilities.
arXiv Detail & Related papers (2022-03-29T05:12:32Z) - Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot Learning has been studied to mimic human visual capabilities and learn effective models without the need of exhaustive human annotation.
In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.
We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
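The second stage described above combines a classification objective with alignment to the partner's soft-anchors. A minimal sketch of such a combined loss, assuming a mean-squared-error alignment term and a weighting factor `lam` (both are illustrative assumptions; the paper's exact formulation may differ):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def stage2_loss(main_feats, soft_anchors, logits, labels, lam=0.5):
    """Sketch of a stage-2 objective: cross-entropy for classification
    plus an alignment term pulling the main encoder's features toward
    the frozen partner encoder's soft-anchors."""
    probs = softmax(logits)
    ce = -np.log(probs[np.arange(len(labels)), labels]).mean()
    align = np.mean((main_feats - soft_anchors) ** 2)
    return ce + lam * align

feats = np.zeros((4, 8))
anchors = np.zeros((4, 8))      # perfectly aligned features
logits = np.eye(4) * 5.0        # confident, correct predictions
labels = np.arange(4)
loss = stage2_loss(feats, anchors, logits, labels)
```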
arXiv Detail & Related papers (2021-09-15T22:46:19Z) - Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step on dynamics modeling in hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z) - SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild [62.450907796261646]
Recognition of hand gestures can be performed directly from the stream of hand skeletons estimated by software.
Despite the recent advancements in gesture and action recognition from skeletons, it is unclear how well the current state-of-the-art techniques can perform in a real-world scenario.
This paper presents the results of the SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild contest.
arXiv Detail & Related papers (2021-06-21T10:57:49Z) - FS-HGR: Few-shot Learning for Hand Gesture Recognition via ElectroMyography [19.795875814764116]
"Few-Shot Learning" is a variant of domain adaptation with the goal of inferring the required output based on just one or a few training examples.
The proposed approach led to 85.94% classification accuracy on new repetitions with few-shot observation (5-way 5-shot), 81.29% accuracy on new subjects with few-shot observation (5-way 5-shot), and 73.36% accuracy on new gestures with few-shot observation (5-way 5-shot).
arXiv Detail & Related papers (2020-11-11T22:33:31Z) - FineHand: Learning Hand Shapes for American Sign Language Recognition [16.862375555609667]
We present an approach for effective learning of hand shape embeddings, which are discriminative for ASL gestures.
For hand shape recognition, our method uses a mix of manually labelled hand shapes and high-confidence predictions to train a deep convolutional neural network (CNN).
We will demonstrate that higher quality hand shape models can significantly improve the accuracy of final video gesture classification.
arXiv Detail & Related papers (2020-03-04T23:32:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.