Design of Arabic Sign Language Recognition Model
- URL: http://arxiv.org/abs/2301.02693v1
- Date: Fri, 6 Jan 2023 19:19:25 GMT
- Title: Design of Arabic Sign Language Recognition Model
- Authors: Muhammad Al-Barham, Ahmad Jamal, Musa Al-Yaman
- Abstract summary: The model is tested on ArASL2018, consisting of 54,000 images of 32 alphabet signs gathered from 40 signers.
Future work will be a model that converts Arabic sign language into Arabic text.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deaf people use sign language to communicate; it is a combination of gestures, movements, postures, and facial expressions that correspond to the alphabets and words of spoken languages. The proposed Arabic sign language recognition model helps deaf and hard-of-hearing people communicate effectively with hearing people. The recognition converts alphabet signs into letters in four stages: an image loading stage, which loads the images of Arabic sign language alphabets later used to train and test the model; a pre-processing stage, which applies image processing techniques such as normalization, image augmentation, resizing, and filtering to extract the features necessary for accurate recognition; a training stage, carried out with deep learning techniques such as a CNN; and a testing stage, which demonstrates how well the model performs on images it has not seen before. The model was built and tested mainly using the PyTorch library. It is evaluated on ArASL2018, which consists of 54,000 images of 32 alphabet signs gathered from 40 signers and is split into a training set and a testing set. We had to ensure that the system is reliable in terms of accuracy, time, and flexibility of use, as explained in detail in this report. Finally, future work will be a model that converts Arabic sign language into Arabic text.
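The four-stage pipeline described above maps naturally onto a short PyTorch script. The sketch below is only an illustration under stated assumptions: the directory layout ("arasl2018/train" and "arasl2018/test"), the grayscale 64x64 input size, the CNN architecture, and the hyperparameters are placeholders, not the authors' exact configuration.

```python
# Minimal sketch of the four stages (loading, pre-processing, CNN training,
# testing), assuming ArASL2018 images are arranged in per-class folders.
# Architecture and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Pre-processing stage: resize, light augmentation, normalization.
train_tf = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((64, 64)),
    transforms.RandomRotation(10),          # image augmentation
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])

# Image loading stage: hypothetical folder layout "arasl2018/train".
train_ds = datasets.ImageFolder("arasl2018/train", transform=train_tf)
train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)

# A small CNN with a 32-way output for the alphabet signs.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 16 * 16, 128), nn.ReLU(),
    nn.Linear(128, 32),
)

# Training stage: cross-entropy loss with Adam.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(10):
    for images, labels in train_dl:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()

# Testing stage: accuracy on held-out images (no augmentation at test time).
test_tf = transforms.Compose([
    transforms.Grayscale(), transforms.Resize((64, 64)),
    transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,)),
])
test_ds = datasets.ImageFolder("arasl2018/test", transform=test_tf)
test_dl = DataLoader(test_ds, batch_size=64)
model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_dl:
        correct += (model(images).argmax(dim=1) == labels).sum().item()
        total += labels.numel()
print(f"test accuracy: {correct / total:.3f}")
```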
Related papers
- Bukva: Russian Sign Language Alphabet [75.42794328290088]
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- Advanced Arabic Alphabet Sign Language Recognition Using Transfer Learning and Transformer Models [0.0]
This paper presents an Arabic Alphabet Sign Language recognition approach, using deep learning methods in conjunction with transfer learning and transformer-based models.
We study the performance of the different variants on two publicly available datasets, namely ArSL2018 and AASL.
Experimental results show that the suggested methodology achieves high recognition accuracy, up to 99.6% and 99.43% on ArSL2018 and AASL, respectively.
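As a rough illustration of the transfer-learning idea (not the specific transformer variants that paper evaluates), one can reuse an ImageNet-pretrained torchvision backbone and swap in a 32-way head for the Arabic alphabet signs; the backbone choice and freezing policy below are assumptions.

```python
# Hedged sketch of transfer learning for 32 Arabic alphabet signs:
# reuse a pretrained backbone, train only a new classification head.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():          # freeze the pretrained features
    p.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 32)  # new trainable head

# Only the new head's parameters are optimized during fine-tuning.
opt = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
```

A vision transformer (e.g. `models.vit_b_16`) could be substituted in the same way by replacing its classification head.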
arXiv Detail & Related papers (2024-10-01T13:39:26Z)
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Learning Cross-lingual Visual Speech Representations [108.68531445641769]
Cross-lingual self-supervised visual representation learning has been a growing research topic in the last few years.
We use the recently proposed Raw Audio-Visual Speech Encoders (RAVEn) framework to pre-train an audio-visual model with unlabelled data.
Our experiments show that multi-lingual models with more data outperform monolingual ones, but, when the amount of data is kept fixed, monolingual models tend to reach better performance.
arXiv Detail & Related papers (2023-03-14T17:05:08Z)
- Fine-tuning of sign language recognition models: a technical report [0.0]
We focus on investigating two questions: how fine-tuning on datasets from other sign languages helps improve sign recognition quality, and whether sign recognition is possible in real time without using a GPU.
We provide code for reproducing model training experiments, converting models to ONNX format, and inference for real-time gesture recognition.
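For the CPU-only, real-time inference question, the standard PyTorch route is ONNX export followed by onnxruntime; the sketch below shows that generic pattern with a stand-in model, input shape, and file name.

```python
# Hedged sketch: export a classifier to ONNX and run it on CPU with
# onnxruntime, as one would for real-time gesture recognition without a GPU.
# The model, input shape, and file name are placeholders.
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

# Stand-in classifier; in practice this would be the trained sign model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 32)).eval()

dummy = torch.randn(1, 3, 224, 224)              # assumed input shape
torch.onnx.export(model, dummy, "sign_model.onnx",
                  input_names=["input"], output_names=["logits"])

# CPU-only inference session.
sess = ort.InferenceSession("sign_model.onnx",
                            providers=["CPUExecutionProvider"])
frame = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in camera frame
logits = sess.run(None, {"input": frame})[0]
print("predicted class:", int(logits.argmax(axis=1)[0]))
```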
arXiv Detail & Related papers (2023-02-15T14:36:18Z)
- Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment [81.73717488887938]
Language-Quantized AutoEncoder (LQAE) learns to align text-image data in an unsupervised manner by leveraging pretrained language models.
LQAE learns to represent similar images with similar clusters of text tokens, thereby aligning these two modalities without the use of aligned text-image pairs.
This enables few-shot image classification with large language models (e.g., GPT-3) as well as linear classification of images based on BERT text features.
arXiv Detail & Related papers (2023-02-02T06:38:44Z)
- OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages [2.625144209319538]
We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition.
First, we propose using pose extracted through pretrained models as the standard modality of data to reduce training time and enable efficient inference.
Second, we train and release checkpoints of 4 pose-based isolated sign language recognition models across all 6 languages, providing baselines and ready checkpoints for deployment.
Third, to address the lack of labelled data, we propose self-supervised pretraining on unlabelled data.
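The pose-as-input idea can be prototyped with an off-the-shelf keypoint extractor; the sketch below uses MediaPipe Hands purely as an example and is not the OpenHands library's own API.

```python
# Hedged sketch of "pose as the standard modality": extract hand keypoints
# with MediaPipe (an example extractor, not OpenHands' API) and feed the
# flattened coordinates to a lightweight classifier.
import cv2
import numpy as np
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def hand_keypoints(image_path):
    """Return a (21 * 3,) array of normalized hand landmarks, or None."""
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    res = hands.process(img)
    if not res.multi_hand_landmarks:
        return None
    lm = res.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).ravel()

# The resulting 63-dimensional feature vector is small enough to train a
# simple MLP or SVM, which keeps training time and inference cost low.
```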
arXiv Detail & Related papers (2021-10-12T10:33:02Z)
- Align before Fuse: Vision and Language Representation Learning with Momentum Distillation [52.40490994871753]
We introduce a contrastive loss to ALign image and text representations BEfore Fusing (ALBEF) them through cross-modal attention.
We propose momentum distillation, a self-training method which learns from pseudo-targets produced by a momentum model.
ALBEF achieves state-of-the-art performance on multiple downstream vision-language tasks.
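A simplified version of such an image-text contrastive objective (without the momentum-model pseudo-targets) can be written in a few lines; this is a generic symmetric InfoNCE formulation, not ALBEF's full loss.

```python
# Hedged sketch of a symmetric image-text contrastive (InfoNCE) loss,
# a simplified stand-in for ALBEF's pre-fusion alignment objective.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img_emb = F.normalize(img_emb, dim=-1)       # (batch, dim)
    txt_emb = F.normalize(txt_emb, dim=-1)       # (batch, dim)
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matching pairs sit on the diagonal; contrast image->text and text->image.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```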
arXiv Detail & Related papers (2021-07-16T00:19:22Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features; a generic baseline over such keypoints is sketched below.
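One simple way to consume whole-body keypoints for isolated sign recognition is a recurrent model over per-frame keypoint vectors; the sketch below is a generic baseline, not the paper's actual architecture or ensemble.

```python
# Hedged sketch: classify an isolated sign from a sequence of whole-body
# keypoints with a GRU. 133 COCO-WholeBody keypoints with (x, y) coordinates
# per frame are assumed; the class count is dataset-dependent.
import torch
import torch.nn as nn

class KeypointGRU(nn.Module):
    def __init__(self, num_classes, num_keypoints=133, hidden=256):
        super().__init__()
        self.gru = nn.GRU(num_keypoints * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):            # x: (batch, frames, num_keypoints * 2)
        _, h = self.gru(x)           # h: (1, batch, hidden)
        return self.head(h[-1])

model = KeypointGRU(num_classes=100)                 # illustrative class count
logits = model(torch.randn(4, 30, 133 * 2))          # 4 clips, 30 frames each
```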
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding [0.0]
We develop an Arabic language representation model, which we name AraELECTRA.
Our model is pretrained using the replaced token detection objective on large Arabic text corpora.
We show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.
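Since AraELECTRA is an ELECTRA discriminator, it can be loaded with the Hugging Face transformers API; the checkpoint name below is the commonly published one but should be treated as an assumption.

```python
# Hedged sketch: score replaced-token-detection probabilities with an
# ELECTRA discriminator via Hugging Face transformers. The checkpoint name
# "aubmindlab/araelectra-base-discriminator" is an assumption, not verified.
import torch
from transformers import AutoTokenizer, ElectraForPreTraining

name = "aubmindlab/araelectra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(name)
model = ElectraForPreTraining.from_pretrained(name)

inputs = tokenizer("جملة عربية للتجربة", return_tensors="pt")  # an Arabic test sentence
with torch.no_grad():
    logits = model(**inputs).logits          # one score per token
probs = torch.sigmoid(logits)                # probability each token was replaced
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, probs[0].tolist())))
```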
arXiv Detail & Related papers (2020-12-31T09:35:39Z)
- A Hybrid Deep Learning Model for Arabic Text Recognition [2.064612766965483]
This paper presents a model that can recognize Arabic text that was printed using multiple font types.
The proposed model employs a hybrid DL network that can recognize Arabic printed text without the need for character segmentation.
The model achieved good results in recognizing characters and words, and it also achieved promising character recognition results when tested on unseen data.
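Segmentation-free printed-text recognition is typically built from a convolutional feature extractor, a recurrent layer over the width axis, and a CTC loss; the sketch below shows that generic CRNN pattern and is not necessarily the authors' exact hybrid network.

```python
# Hedged sketch of a generic segmentation-free text recognizer
# (CNN features -> BiLSTM over the width axis -> CTC-ready logits).
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, num_chars, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(128 * 8, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, num_chars + 1)   # +1 for the CTC blank

    def forward(self, x):                        # x: (batch, 1, 32, width)
        f = self.cnn(x)                          # (batch, 128, 8, width / 4)
        f = f.permute(0, 3, 1, 2).flatten(2)     # (batch, width / 4, 128 * 8)
        out, _ = self.rnn(f)
        return self.head(out)                    # per-timestep character logits

model = CRNN(num_chars=40)                       # illustrative character-set size
logits = model(torch.randn(2, 1, 32, 128))       # 2 line images of height 32
# (time, batch, classes) log-probabilities, ready for nn.CTCLoss.
log_probs = logits.log_softmax(-1).permute(1, 0, 2)
```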
arXiv Detail & Related papers (2020-09-04T02:49:17Z)