VGG Induced Deep Hand Sign Language Detection
- URL: http://arxiv.org/abs/2601.08262v1
- Date: Tue, 13 Jan 2026 06:39:29 GMT
- Title: VGG Induced Deep Hand Sign Language Detection
- Authors: Subham Sharma, Sharmila Subudhi
- Abstract summary: This work proposes a novel hand gesture recognition system for differently-abled persons. The model uses a convolutional neural network, known as VGG-16, to build a trained model on a widely used image dataset. The experimental results show that, by combining transfer learning with image data augmentation, the VGG-16 net produced around 98% accuracy.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Hand gesture recognition is an important aspect of human-computer interaction. It forms the basis of sign language for hearing- and speech-impaired people. This work proposes a novel hand gesture recognition system for differently-abled persons. The model uses a convolutional neural network, known as VGG-16, to build a trained model on a widely used image dataset using the Python and Keras libraries. Furthermore, the result is validated on the NUS dataset, consisting of 10 classes of hand gestures, which is fed to the model as the validation set. Afterwards, a testing dataset of 10 classes is built using Google's open-source Application Programming Interface (API), which captures different gestures of the human hand, and the efficacy is then measured experimentally. The experimental results show that, by combining transfer learning with image data augmentation, the VGG-16 net produced around 98% accuracy.
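The training recipe described in the abstract (a pretrained VGG-16 base, a new classification head for the 10 gesture classes, and image data augmentation in Keras) can be sketched as follows. This is a minimal illustration, not the paper's exact configuration: the head sizes, augmentation parameters, and directory names are assumptions, and `weights="imagenet"` would be used in practice where network access allows.

```python
# Minimal sketch of VGG-16 transfer learning with data augmentation in Keras,
# along the lines described in the abstract. Layer sizes, class count, and
# augmentation parameters are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 10  # e.g. the 10 hand-gesture classes of the NUS dataset

# Load the VGG-16 convolutional base without its classification head.
# Use weights="imagenet" for actual transfer learning (requires a download).
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the base so only the new head is trained

# Attach a small classification head for the gesture classes.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Image data augmentation applied to the training images.
augment = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)
# train_gen = augment.flow_from_directory("gestures/train",
#                                         target_size=(224, 224),
#                                         class_mode="categorical")
# model.fit(train_gen, epochs=10)
```

Freezing the convolutional base keeps the ImageNet features fixed while the small head adapts to the gesture classes, which is the usual way transfer learning is combined with augmentation when the target dataset is small.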
Related papers
- Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition [0.20075899678041528]
We introduce a large-scale isolated ISL dataset and a novel SL recognition model based on skeleton graph structure.
The dataset covers 2002 common words used daily in the deaf community, recorded by 20 deaf adult signers (10 male and 10 female).
We propose an SL recognition model, namely the Hierarchical Windowed Graph Attention Network (HWGAT), that utilizes the human upper-body skeleton graph.
arXiv Detail & Related papers (2024-07-19T11:48:36Z) - Online Recognition of Incomplete Gesture Data to Interface Collaborative Robots [0.0]
This paper introduces an HRI framework to classify large vocabularies of interwoven static gestures (SGs) and dynamic gestures (DGs) captured with wearable sensors.
The recognized gestures are used to teleoperate a robot in a collaborative process that consists of preparing a breakfast meal.
arXiv Detail & Related papers (2023-04-13T18:49:08Z) - Advancing 3D finger knuckle recognition via deep feature learning [51.871256510747465]
Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience.
Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks with multiple scales.
This paper advances this approach by investigating the possibility of learning a discriminative feature vector with the least possible dimension for representing 3D finger knuckle images.
arXiv Detail & Related papers (2023-01-07T20:55:16Z) - Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z) - HaGRID - HAnd Gesture Recognition Image Dataset [79.21033185563167]
This paper introduces HaGRID, an enormous dataset for building hand gesture recognition systems focused on interaction with devices in order to manage them.
Although the gestures are static, they were chosen specifically to enable the design of several dynamic gestures.
The HaGRID contains 554,800 images and bounding box annotations with gesture labels to solve hand detection and gesture classification tasks.
arXiv Detail & Related papers (2022-06-16T14:41:32Z) - Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.
Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement.
In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
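The spatial-temporal keypoint graph described above can be sketched in a few lines: nodes are (frame, joint) pairs, spatial edges follow a skeleton within each frame, and temporal edges connect the same joint across consecutive frames. The 5-joint skeleton below is a toy assumption for illustration, not the paper's actual topology.

```python
# Hypothetical sketch of a spatial-temporal keypoint graph: spatial edges
# link joints within a frame along a skeleton; temporal edges link the same
# joint in consecutive frames. SKELETON is a toy 5-joint connectivity.
SKELETON = [(0, 1), (1, 2), (1, 3), (1, 4)]

def build_st_graph(num_frames, num_joints, skeleton):
    """Return the edge list of a spatio-temporal graph over keypoints."""
    edges = []
    for t in range(num_frames):
        # Spatial edges: skeleton links within frame t.
        for (a, b) in skeleton:
            edges.append(((t, a), (t, b)))
        # Temporal edges: each joint linked to itself in the next frame.
        if t + 1 < num_frames:
            for j in range(num_joints):
                edges.append(((t, j), (t + 1, j)))
    return edges

edges = build_st_graph(num_frames=3, num_joints=5, skeleton=SKELETON)
# 3 frames * 4 spatial edges + 2 transitions * 5 temporal edges = 22 edges
print(len(edges))  # → 22
```

In practice, per-keypoint CNN features would be attached to each node and the edge list turned into an adjacency matrix for message passing.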
arXiv Detail & Related papers (2021-11-16T08:01:16Z) - Efficient sign language recognition system and dataset creation method
based on deep learning and image processing [0.0]
This work investigates techniques of digital image processing and machine learning that can be used to create a sign language dataset effectively.
Different datasets were created to test the hypotheses, containing 14 words used daily and recorded by different smartphones in the RGB color system.
We achieved an accuracy of 96.38% on the test set and 81.36% on the validation set containing more challenging conditions.
arXiv Detail & Related papers (2021-03-22T23:36:49Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation [jin2020whole], we propose recognizing sign language based on whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z) - Application of Facial Recognition using Convolutional Neural Networks for Entry Access Control [0.0]
The paper focuses on solving the supervised classification problem of taking images of people as input and classifying the person in the image as one of the authors or not.
Two approaches are proposed: (1) building and training a neural network called WoodNet from scratch and (2) leveraging transfer learning by utilizing a network pre-trained on the ImageNet database.
The results are two models classifying the individuals in the dataset with high accuracy, achieving over 99% accuracy on held-out test data.
arXiv Detail & Related papers (2020-11-23T07:55:24Z) - Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z) - A Deep Learning Framework for Recognizing both Static and Dynamic Gestures [0.8602553195689513]
We propose a unified framework that recognizes both static and dynamic gestures using simple RGB vision (without depth sensing).
We employ a pose-driven spatial attention strategy, which guides our proposed Static and Dynamic gestures Network (StaDNet).
In a number of experiments, we show that the proposed approach surpasses the state-of-the-art results on the large-scale Chalearn 2016 dataset.
arXiv Detail & Related papers (2020-06-11T10:39:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.