FineHand: Learning Hand Shapes for American Sign Language Recognition
- URL: http://arxiv.org/abs/2003.08753v1
- Date: Wed, 4 Mar 2020 23:32:08 GMT
- Title: FineHand: Learning Hand Shapes for American Sign Language Recognition
- Authors: Al Amin Hosain, Panneer Selvam Santhalingam, Parth Pathak, Huzefa Rangwala and Jana Kosecka
- Abstract summary: We present an approach for effective learning of hand shape embeddings, which are discriminative for ASL gestures.
For hand shape recognition, our method uses a mix of manually labelled hand shapes and high-confidence predictions to train a deep convolutional neural network (CNN).
We demonstrate that higher-quality hand shape models can significantly improve the accuracy of the final video gesture classification.
- Score: 16.862375555609667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: American Sign Language recognition is a difficult gesture recognition problem, characterized by fast, highly articulate gestures. These gestures comprise arm movements with different hand shapes, facial expressions and head movements. Among these components, hand shape is the vital, often the most discriminative, part of a gesture. In this work, we present an approach for effective learning of hand shape embeddings, which are discriminative for ASL gestures. For hand shape recognition, our method uses a mix of manually labelled hand shapes and high-confidence predictions to train a deep convolutional neural network (CNN). The sequential gesture component is captured by a recurrent neural network (RNN) trained on the embeddings learned in the first stage. We demonstrate that higher-quality hand shape models can significantly improve the accuracy of final video gesture classification in challenging conditions with a variety of speakers, different illumination and significant motion blur. We compare our model to alternative approaches exploiting different modalities and representations of the data and show improved video gesture recognition accuracy on the GMU-ASL51 benchmark dataset.
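The abstract describes a two-stage pipeline: a CNN hand-shape classifier trained on a mix of manual labels and high-confidence pseudo-labels, whose per-frame embeddings then feed a recurrent network for video-level gesture classification. The sketch below is a minimal PyTorch illustration of that structure, not the authors' implementation; the layer sizes, the 41 hand-shape classes, and the 0.9 confidence threshold are assumptions (the 51 gesture classes follow from the GMU-ASL51 dataset name).

```python
import torch
import torch.nn as nn

class HandShapeCNN(nn.Module):
    """Stage 1: per-frame hand-shape classifier whose penultimate layer
    serves as the hand-shape embedding."""
    def __init__(self, num_hand_shapes: int = 41, embed_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(embed_dim, num_hand_shapes)

    def forward(self, x):                    # x: (batch, 3, H, W) hand crops
        emb = self.features(x)               # hand-shape embedding
        return self.classifier(emb), emb

def expand_training_set(model, unlabeled_crops, threshold=0.9):
    """Mix manually labelled hand shapes with high-confidence predictions,
    a simple self-training heuristic; the threshold is an assumption."""
    model.eval()
    pseudo_labelled = []
    with torch.no_grad():
        for crop in unlabeled_crops:
            logits, _ = model(crop.unsqueeze(0))
            prob, label = logits.softmax(dim=-1).max(dim=-1)
            if prob.item() >= threshold:
                pseudo_labelled.append((crop, label.item()))
    return pseudo_labelled

class GestureRNN(nn.Module):
    """Stage 2: a recurrent network over per-frame embeddings captures the
    sequential component of the gesture."""
    def __init__(self, embed_dim=128, hidden=256, num_gestures=51):
        super().__init__()
        self.rnn = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_gestures)

    def forward(self, frame_embeddings):     # (batch, time, embed_dim)
        _, (h, _) = self.rnn(frame_embeddings)
        return self.head(h[-1])              # classify from final hidden state
```

Under this scheme, expand_training_set grows the hand-shape training set before the RNN stage is trained on the resulting per-frame embeddings, which is how higher-quality hand-shape models can propagate into better video-level classification.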
Related papers
- Sign Language Recognition Based On Facial Expression and Hand Skeleton [2.5879170041667523]
We propose a sign language recognition network that integrates hand skeleton features and facial expressions.
By incorporating facial expression information, the accuracy and robustness of sign language recognition are improved.
arXiv Detail & Related papers (2024-07-02T13:02:51Z)
- Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition [96.62264528407863]
We propose a self-supervised contrastive learning framework to excavate rich context via spatial-temporal consistency.
Inspired by the complementary property of motion and joint modalities, we first introduce first-order motion information into sign language modeling (a generic sketch of this idea appears after this list).
Our method is evaluated with extensive experiments on four public benchmarks, and achieves new state-of-the-art performance with a notable margin.
arXiv Detail & Related papers (2024-06-15T04:50:19Z)
- ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis [50.69464138626748]
We present ConvoFusion, a diffusion-based approach for multi-modal gesture synthesis.
Our method proposes two guidance objectives that allow the users to modulate the impact of different conditioning modalities.
Our method is versatile in that it can be trained to generate either monologue gestures or conversational gestures.
arXiv Detail & Related papers (2024-03-26T17:59:52Z)
- Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition [17.62840662799232]
We propose a method specifically designed for hand action recognition which uses relative angular embeddings and local Spherical Harmonics to create novel hand representations.
The use of Spherical Harmonics creates rotation-invariant representations which make hand action recognition even more robust against inter-subject differences and viewpoint changes.
arXiv Detail & Related papers (2023-08-21T08:17:42Z)
- Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble [71.97020373520922]
Sign language is commonly used by deaf or mute people to communicate.
We propose a novel Multi-modal Framework with a Global Ensemble Model (GEM) for isolated Sign Language Recognition (SLR).
Our proposed SAM-SLR-v2 framework is exceedingly effective and achieves state-of-the-art performance with significant margins.
arXiv Detail & Related papers (2021-10-12T16:57:18Z)
- Real-time Indian Sign Language (ISL) Recognition [0.45880283710344055]
This paper presents a system which can recognise hand poses and gestures from the Indian Sign Language (ISL) in real time.
The existing solutions either provide relatively low accuracy or do not work in real-time.
It can identify 33 hand poses and some gestures from the ISL.
arXiv Detail & Related papers (2021-08-24T21:49:21Z)
- Align before Fuse: Vision and Language Representation Learning with Momentum Distillation [52.40490994871753]
We introduce a contrastive loss to ALign the image and text representations BEfore Fusing (ALBEF) them through cross-modal attention.
We propose momentum distillation, a self-training method which learns from pseudo-targets produced by a momentum model.
ALBEF achieves state-of-the-art performance on multiple downstream vision-language tasks.
arXiv Detail & Related papers (2021-07-16T00:19:22Z)
- A deep-learning-based multimodal depth-aware dynamic hand gesture recognition system [5.458813674116228]
We focus on dynamic hand gesture (DHG) recognition using depth-quantized image hand skeleton joint points.
In particular, we explore the effect of using depth-quantized features in CNN and Recurrent Neural Network (RNN) based multi-modal fusion networks.
arXiv Detail & Related papers (2021-07-06T11:18:53Z)
- SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild [62.450907796261646]
Recognition of hand gestures can be performed directly from the stream of hand skeletons estimated by software.
Despite the recent advancements in gesture and action recognition from skeletons, it is unclear how well the current state-of-the-art techniques can perform in a real-world scenario.
This paper presents the results of the SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild contest.
arXiv Detail & Related papers (2021-06-21T10:57:49Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation [Jin et al., 2020], we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- Understanding the hand-gestures using Convolutional Neural Networks and Generative Adversarial Networks [0.0]
The system consists of three modules: real-time hand tracking, gesture training, and gesture recognition using Convolutional Neural Networks.
It has been tested on a vocabulary of 36 gestures, including the alphabets and digits, and the results demonstrate the effectiveness of the approach.
arXiv Detail & Related papers (2020-11-10T02:20:43Z)
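As noted in the Self-Supervised Representation Learning entry above, some of these related methods inject first-order motion information into skeleton sequences and train with a contrastive objective. The sketch below illustrates both ingredients generically, assuming skeleton sequences as input and a standard InfoNCE loss; the function names and the 0.07 temperature are illustrative and not taken from any of the listed papers.

```python
import torch
import torch.nn.functional as F

def first_order_motion(joints: torch.Tensor) -> torch.Tensor:
    """joints: (time, num_joints, coords) skeleton sequence.
    Returns frame-to-frame displacements, zero-padded to the original length."""
    motion = joints[1:] - joints[:-1]
    return torch.cat([motion, torch.zeros_like(motion[-1:])], dim=0)

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE between two batches of embeddings whose matching rows are
    positive pairs; every other row in the batch acts as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # pairwise similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```

In such frameworks, the motion stream computed by first_order_motion is typically encoded alongside the raw joint stream, and the contrastive loss pulls together embeddings of two views (e.g., augmentations or modalities) of the same sequence.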
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.