Mediapipe and CNNs for Real-Time ASL Gesture Recognition
- URL: http://arxiv.org/abs/2305.05296v3
- Date: Wed, 24 May 2023 06:48:01 GMT
- Title: Mediapipe and CNNs for Real-Time ASL Gesture Recognition
- Authors: Rupesh Kumar, Ashutosh Bajpai, Ayush Sinha (Galgotias College of Engineering and Technology)
- Abstract summary: This research paper describes a real-time system for identifying American Sign Language (ASL) movements.
The suggested method makes use of the Mediapipe library for feature extraction and a Convolutional Neural Network (CNN) for ASL gesture classification.
- Score: 0.1529342790344802
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This research paper describes a real-time system for identifying American Sign
Language (ASL) movements that employs modern computer vision and machine
learning approaches. The suggested method makes use of the Mediapipe library
for feature extraction and a Convolutional Neural Network (CNN) for ASL gesture
classification. The testing results show that the suggested system can detect
all letters of the ASL alphabet with an accuracy of 99.95%, indicating its potential for use
in communication devices for people with hearing impairments. The proposed
approach can also be applied to additional sign languages with similar hand
motions, potentially increasing the quality of life for people with hearing
loss. Overall, the study demonstrates the effectiveness of using Mediapipe and
CNN for real-time sign language recognition, making a significant contribution
to the field of computer vision and machine learning.
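A minimal sketch of the pipeline the abstract describes: MediaPipe extracts 21 hand landmarks per frame, and a small CNN classifies them into letters. The 1D convolutional architecture, layer sizes, and A-Z label mapping below are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch (not the paper's exact architecture): extract 21 hand
# landmarks with MediaPipe, classify them with a small 1D CNN.
import cv2
import numpy as np
import mediapipe as mp
import tensorflow as tf

NUM_CLASSES = 26  # assumes one class per letter A-Z

def extract_landmarks(frame_bgr, hands):
    """Return a (21, 3) array of normalized hand landmarks, or None."""
    result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None
    points = result.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in points], dtype=np.float32)

def build_model():
    """Illustrative 1D CNN over the 21x3 landmark matrix."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(21, 3)),
        tf.keras.layers.Conv1D(64, 3, activation="relu"),
        tf.keras.layers.Conv1D(128, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

if __name__ == "__main__":
    model = build_model()  # in practice, load trained weights instead
    cap = cv2.VideoCapture(0)
    with mp.solutions.hands.Hands(max_num_hands=1,
                                  min_detection_confidence=0.5) as hands:
        ok, frame = cap.read()
        if ok:
            landmarks = extract_landmarks(frame, hands)
            if landmarks is not None:
                probs = model.predict(landmarks[None, ...], verbose=0)
                print("predicted letter:", chr(ord("A") + int(probs.argmax())))
    cap.release()
```

Feeding landmark coordinates rather than raw pixels keeps the classifier tiny, which is what makes the real-time claim plausible on commodity hardware.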
Related papers
- Deep Neural Network-Based Sign Language Recognition: A Comprehensive Approach Using Transfer Learning with Explainability [0.0]
We suggest a novel solution that uses a deep neural network to fully automate sign language recognition.
This methodology integrates sophisticated preprocessing techniques to optimise the overall performance.
Our model's ability to provide informational clarity was assessed using the SHAP (SHapley Additive exPlanations) method.
arXiv Detail & Related papers (2024-09-11T17:17:44Z)
- Enhancing Sign Language Detection through Mediapipe and Convolutional Neural Networks (CNN) [3.192629447369627]
This research combines MediaPipe and CNNs for efficient and accurate interpretation of the ASL dataset.
The accuracy achieved by the model on ASL datasets is 99.12%.
The system will have applications in the communication, education, and accessibility domains.
arXiv Detail & Related papers (2024-06-06T04:05:12Z)
- Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition? [86.53044183309824]
We study which factor leads to the success of self-supervised learning on speaker-related tasks.
Our empirical results on the Voxceleb-1 dataset suggest that the benefit of SSL to the speaker verification (SV) task comes from a combination of masked speech prediction loss, data scale, and model size.
arXiv Detail & Related papers (2022-04-27T08:35:57Z)
- Audio Self-supervised Learning: A Survey [60.41768569891083]
Self-Supervised Learning (SSL) aims to discover general representations from large-scale data without requiring human annotations.
Its success in the fields of computer vision and natural language processing has prompted its recent adoption in the field of audio and speech processing.
arXiv Detail & Related papers (2022-03-02T15:58:29Z)
- UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training [72.004873454347]
Two methods are introduced for enhancing unsupervised speaker information extraction.
Experimental results on the SUPERB benchmark show that the proposed system achieves state-of-the-art performance.
We scale up the training dataset to 94 thousand hours of public audio data and achieve further performance improvement.
arXiv Detail & Related papers (2021-10-12T05:43:30Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages [51.0542215642794]
We propose a novel acoustics-based intent recognition system that uses discovered phonetic units for intent classification.
We present results for two language families - Indic languages and Romance languages - for two different intent recognition tasks.
arXiv Detail & Related papers (2020-11-07T00:35:31Z)
- Novel Approach to Use HU Moments with Image Processing Techniques for Real Time Sign Language Communication [0.0]
"Sign Language Communicator" (SLC) is designed to solve the language barrier between the sign language users and the rest of the world.
System is able to recognize selected Sign Language signs with the accuracy of 84% without a controlled background with small light adjustments.
arXiv Detail & Related papers (2020-07-20T03:10:18Z)
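The HU-moments entry above uses classical shape descriptors rather than learned features. A minimal sketch of that feature-extraction step, assuming an Otsu-thresholded hand silhouette and OpenCV's built-in Hu moments (the classifier on top is omitted):

```python
# Hedged sketch: Hu-moment shape features from a hand image. The Otsu
# thresholding and log-scaling are common-practice assumptions, not
# details taken from the paper.
import cv2
import numpy as np

def hu_features(image_path: str) -> np.ndarray:
    """Compute 7 log-scaled Hu moments describing the hand silhouette."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Binarize so the moments capture shape rather than texture.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    hu = cv2.HuMoments(cv2.moments(binary)).flatten()
    # Log-scale: raw Hu moments span many orders of magnitude.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)
```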
- A Deep Learning based Wearable Healthcare IoT Device for AI-enabled Hearing Assistance Automation [6.283190933140046]
This research presents a novel AI-enabled Internet of Things (IoT) device capable of assisting those with hearing impairment or deafness to communicate with others in conversation.
A server application is created that leverages Google's online speech recognition service to convert the received conversations into text, which is then shown on a micro-display attached to the glasses so that deaf users can read the conversation contents.
arXiv Detail & Related papers (2020-05-16T19:42:16Z)
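The wearable-device entry above hinges on a server loop that turns captured speech into text for the micro-display. A minimal sketch of that loop, assuming the third-party SpeechRecognition package as a stand-in for the paper's Google speech service integration; send_to_display is a hypothetical placeholder for the display link:

```python
# Hedged sketch of the speech-to-text relay described above; not the
# authors' implementation.
import speech_recognition as sr

def send_to_display(text: str) -> None:
    # Hypothetical stub: in the described device this would push the
    # transcript to the micro-display attached to the glasses.
    print(f"[display] {text}")

def transcribe_loop() -> None:
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        while True:
            audio = recognizer.listen(source, phrase_time_limit=10)
            try:
                # Google Web Speech API via the SpeechRecognition wrapper.
                send_to_display(recognizer.recognize_google(audio))
            except sr.UnknownValueError:
                continue  # unintelligible speech; keep listening
            except sr.RequestError:
                break  # service/network failure; stop the loop

if __name__ == "__main__":
    transcribe_loop()
```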
- Meta-Transfer Learning for Code-Switched Speech Recognition [72.84247387728999]
We propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting.
Our model learns to recognize individual languages and transfers that knowledge to better recognize mixed-language speech by conditioning the optimization on the code-switching data.
arXiv Detail & Related papers (2020-04-29T14:27:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.