Sign Language to Text Conversion in Real Time using Transfer Learning
- URL: http://arxiv.org/abs/2211.14446v1
- Date: Sun, 13 Nov 2022 17:20:19 GMT
- Title: Sign Language to Text Conversion in Real Time using Transfer Learning
- Authors: Shubham Thakar, Samveg Shah, Bhavya Shah, Anant V. Nimkar
- Abstract summary: We propose a deep learning model trained on American Sign Language (ASL) that translates signs into text.
Accuracy improves from 94% with a plain CNN to 98.7% with transfer learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hearing-impaired people face many obstacles in communication and
often require an interpreter to be understood. Despite constant scientific
research, existing models still lack the ability to make accurate predictions.
We therefore propose a deep learning model, trained on American Sign Language
(ASL), that takes gestures performed in ASL as input and translates them into
text. For this task we use a Convolutional Neural Network based on the VGG16
architecture, built as a TensorFlow image-classification model, and improve its
accuracy by over 4% through transfer learning: accuracy rises from 94% with a
plain CNN to 98.7% with the transfer-learning model. An application with the
deep learning model integrated has also been built.
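To make the approach concrete, here is a minimal sketch of such a VGG16 transfer-learning classifier in TensorFlow/Keras; the 26-class alphabet output, input size, and classification head are illustrative assumptions rather than the paper's exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 26  # assumption: one class per ASL alphabet sign

# VGG16 pretrained on ImageNet, without its original classification head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze convolutional features; only the head is trained

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```

Freezing the pretrained base is what gives the transfer-learning gain over training a CNN from scratch: the ImageNet features are reused and only the small head is fitted to the sign images.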
Related papers
- Advanced Arabic Alphabet Sign Language Recognition Using Transfer Learning and Transformer Models [0.0]
This paper presents an Arabic Alphabet Sign Language recognition approach, using deep learning methods in conjunction with transfer learning and transformer-based models.
We study the performance of the different variants on two publicly available datasets, namely ArSL2018 and AASL.
Experimental results show that the proposed methodology achieves high recognition accuracy: up to 99.6% and 99.43% on ArSL2018 and AASL, respectively.
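As a hedged illustration of transformer-based transfer learning of this kind, the sketch below fine-tunes the head of a generic pretrained Vision Transformer with Hugging Face Transformers; the checkpoint name and the 32-class output (the ArSL2018 alphabet class count, assumed here) are illustrative, not the paper's confirmed setup:

```python
import torch
from transformers import ViTForImageClassification, ViTImageProcessor

ckpt = "google/vit-base-patch16-224-in21k"  # assumed checkpoint, not the paper's
processor = ViTImageProcessor.from_pretrained(ckpt)
model = ViTForImageClassification.from_pretrained(ckpt, num_labels=32)

# Transfer learning: freeze the backbone, train only the classification head.
for p in model.vit.parameters():
    p.requires_grad = False
optimizer = torch.optim.AdamW(model.classifier.parameters(), lr=3e-4)

# Inside a training loop:
#   inputs = processor(images=batch_images, return_tensors="pt")
#   loss = model(**inputs, labels=batch_labels).loss
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
```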
arXiv Detail & Related papers (2024-10-01T13:39:26Z) - Enhancing Sign Language Detection through Mediapipe and Convolutional Neural Networks (CNN) [3.192629447369627]
This research combines MediaPipe and CNNs for efficient and accurate interpretation of the ASL dataset.
The accuracy achieved by the model on ASL datasets is 99.12%.
The system will have applications in the communication, education, and accessibility domains.
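A rough sketch of the MediaPipe side of such a pipeline, using the Python mediapipe Hands solution; the flat 21-landmark feature vector and single-hand setting are assumptions about how the landmarks would feed the CNN:

```python
import cv2
import mediapipe as mp
import numpy as np

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def hand_landmark_vector(image_bgr):
    """Return a flat (63,) array of 21 (x, y, z) hand landmarks, or None."""
    results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    lm = results.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).flatten()

# Each 63-dim vector (or a landmark-annotated crop) then becomes the
# input feature for the CNN classifier over the ASL classes.
```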
arXiv Detail & Related papers (2024-06-06T04:05:12Z) - Language Contamination Explains the Cross-lingual Capabilities of English Pretrained Models [79.38278330678965]
We find that common English pretraining corpora contain significant amounts of non-English text.
This leads to hundreds of millions of foreign language tokens in large-scale datasets.
We then demonstrate that even these small percentages of non-English data facilitate cross-lingual transfer for models trained on them.
arXiv Detail & Related papers (2022-04-17T23:56:54Z) - VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer [76.3906723777229]
We present VidLanKD, a video-language knowledge distillation method for improving language understanding.
We train a multi-modal teacher model on a video-text dataset, and then transfer its knowledge to a student language model with a text dataset.
In our experiments, VidLanKD achieves consistent improvements over text-only language models and vokenization models.
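The teacher-to-student transfer described here is a knowledge-distillation setup; below is a sketch of the standard softened cross-entropy distillation loss that such methods build on (a generic formulation, not necessarily VidLanKD's exact objective):

```python
import torch
import torch.nn.functional as F

def soft_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Softened cross-entropy between teacher and student distributions
    (Hinton-style KD); the T^2 factor keeps gradient scale comparable
    across temperatures."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    return -(t * t) * (teacher_probs * student_logp).sum(dim=-1).mean()
```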
arXiv Detail & Related papers (2021-07-06T15:41:32Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation [Jin et al., 2020], we propose recognizing sign language based on whole-body keypoints and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z) - Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [80.446770909975]
Linguistic knowledge is of great benefit to scene text recognition.
How to effectively model linguistic rules in end-to-end deep networks remains a research challenge.
We propose an autonomous, bidirectional and iterative ABINet for scene text recognition.
arXiv Detail & Related papers (2021-03-11T06:47:45Z) - From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection [8.602181445598776]
We propose a pipeline to adapt the general-purpose RoBERTa language model to a specific text classification task: Vietnamese Hate Speech Detection.
Our experiments show that the proposed pipeline boosts performance significantly, achieving a new state of the art on the Vietnamese Hate Speech Detection campaign with an F1 score of 0.7221.
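A minimal sketch of adapting a RoBERTa-style checkpoint to a text-classification task with Hugging Face Transformers; the vinai/phobert-base checkpoint (a Vietnamese RoBERTa model) and the three-way label scheme are assumptions for illustration, not the paper's confirmed configuration:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "vinai/phobert-base"  # assumed RoBERTa-based Vietnamese checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=3)

# Note: PhoBERT expects word-segmented Vietnamese input in practice.
batch = tokenizer(["đây là một ví dụ"], padding=True, truncation=True,
                  return_tensors="pt")
loss = model(**batch, labels=torch.tensor([0])).loss
loss.backward()  # plug into any standard fine-tuning loop
```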
arXiv Detail & Related papers (2021-02-24T09:30:55Z) - Emergent Communication Pretraining for Few-Shot Machine Translation [66.48990742411033]
We pretrain neural networks via emergent communication from referential games.
Our key assumption is that grounding communication on images, as a crude approximation of real-world environments, inductively biases the model towards learning natural languages.
arXiv Detail & Related papers (2020-11-02T10:57:53Z) - Interpretation of Swedish Sign Language using Convolutional Neural Networks and Transfer Learning [2.7629216089139934]
We use Convolutional Neural Networks (CNNs) and transfer learning in order to make computers able to interpret signs of the Swedish Sign Language (SSL) hand alphabet.
Our model consists of a pre-trained InceptionV3 network, trained with the mini-batch gradient descent optimization algorithm.
The final accuracy of the model, based on 8 study subjects and 9,400 images, is 85%.
arXiv Detail & Related papers (2020-10-15T15:34:09Z) - Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision [110.66085917826648]
We develop a technique that extrapolates multimodal alignments to language-only data by contextually mapping language tokens to their related images.
"vokenization" is trained on relatively small image captioning datasets and we then apply it to generate vokens for large language corpora.
Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alternatives on multiple pure-language tasks.
arXiv Detail & Related papers (2020-10-14T02:11:51Z) - Transfer Learning for British Sign Language Modelling [0.0]
Research in minority languages, including sign languages, is hampered by the severe lack of data.
This has led to work on transfer learning methods, whereby a model developed for one language is reused as the starting point for a model on a second language.
In this paper, we examine two transfer learning techniques of fine-tuning and layer substitution for language modelling of British Sign Language.
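Fine-tuning is the familiar of the two techniques; layer substitution can be sketched roughly as below, where the vocabulary-specific embedding and output layers of a trained source-language model are replaced while the recurrent core is reused (the toy LSTM architecture, vocabulary sizes, and dimensions are illustrative assumptions, not the paper's setup):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lm(vocab_size, embed_dim=128, hidden=256):
    """Toy next-token language model: embedding -> LSTM -> softmax."""
    return models.Sequential([
        layers.Embedding(vocab_size, embed_dim),
        layers.LSTM(hidden, return_sequences=True),
        layers.Dense(vocab_size, activation="softmax"),
    ])

source_lm = build_lm(vocab_size=5000)
# ... assume source_lm has been trained on the source language ...

# Layer substitution: swap the vocabulary-specific layers for the target
# language while reusing the trained recurrent core.
target_vocab = 3000
inputs = layers.Input(shape=(None,), dtype="int32")
x = layers.Embedding(target_vocab, 128)(inputs)
x = source_lm.layers[1](x)  # shared, pre-trained LSTM layer
outputs = layers.Dense(target_vocab, activation="softmax")(x)
target_lm = models.Model(inputs, outputs)
target_lm.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```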
arXiv Detail & Related papers (2020-06-03T10:13:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.