Word level Bangla Sign Language Dataset for Continuous BSL Recognition
- URL: http://arxiv.org/abs/2302.11559v2
- Date: Sun, 9 Apr 2023 18:48:21 GMT
- Title: Word level Bangla Sign Language Dataset for Continuous BSL Recognition
- Authors: Md Shamimul Islam, A.J.M. Akhtarujjaman Joha, Md Nur Hossain, Sohaib
Abdullah, Ibrahim Elwarfalli, Md Mahedi Hasan
- Abstract summary: We develop an attention-based Bi-GRU model that captures the temporal dynamics of pose information for individuals communicating through sign language.
The accuracy of the model is reported to be 85.64%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A robust sign language recognition system can greatly alleviate
communication barriers, particularly for people who struggle with verbal
communication. This is crucial for human growth and progress as it enables the
expression of thoughts, feelings, and ideas. However, sign recognition is a
complex task that faces numerous challenges, such as identical gesture patterns
shared by multiple signs, variations in lighting, clothing, and carrying
conditions, large pose variations, and illumination discrepancies across different views.
Additionally, the absence of an extensive Bangla sign language video dataset
makes it even more challenging to operate recognition systems, particularly
when utilizing deep learning techniques. In order to address this issue,
firstly, we created a large-scale dataset called the MVBSL-W50, which comprises
50 isolated words across 13 categories. Secondly, we developed an
attention-based Bi-GRU model that captures the temporal dynamics of pose
information for individuals communicating through sign language. The proposed
model utilizes human pose information, which has been shown to be effective in
analyzing sign language patterns. By focusing solely on movement information
and disregarding body appearance and environmental factors, the model is
simplified and runs faster. The accuracy of the model is
reported to be 85.64%.
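The abstract describes an attention-based Bi-GRU that classifies isolated signs from pose sequences alone. Below is a minimal PyTorch sketch of one plausible reading of that design: a bidirectional GRU over per-frame keypoint vectors, learned temporal attention pooling, and a 50-way word classifier. The keypoint count, hidden size, and coordinate layout are illustrative assumptions; the abstract does not specify them.

```python
# Hedged sketch of an attention-based Bi-GRU over pose-keypoint sequences.
# Keypoint count, hidden size, and pooling choice are assumptions for
# illustration, not the authors' reported configuration.
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    def __init__(self, num_keypoints=33, coords=2, hidden=128, num_classes=50):
        super().__init__()
        in_dim = num_keypoints * coords          # flattened (x, y) pose per frame
        self.gru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)     # scalar attention score per frame
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, poses):                    # poses: (batch, frames, keypoints * coords)
        h, _ = self.gru(poses)                   # (batch, frames, 2 * hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # temporal attention weights
        context = (weights * h).sum(dim=1)       # attention-weighted sum over frames
        return self.classifier(context)          # (batch, num_classes) word logits

# Example: a batch of 4 clips, 60 frames each, 33 keypoints with (x, y) coordinates.
model = AttentiveBiGRU()
logits = model(torch.randn(4, 60, 33 * 2))
print(logits.shape)  # torch.Size([4, 50])
```

The attention pooling lets the classifier weight informative frames (e.g., the stroke of a sign) more heavily than transitional ones, which is consistent with the temporal-dynamics framing in the abstract.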
Related papers
- Scaling up Multimodal Pre-training for Sign Language Understanding [96.17753464544604]
Sign language serves as the primary means of communication for the deaf-mute community.
To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied.
These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos.
arXiv Detail & Related papers (2024-08-16T06:04:25Z) - EvSign: Sign Language Recognition and Translation with Streaming Events [59.51655336911345]
Event camera could naturally perceive dynamic hand movements, providing rich manual clues for sign language tasks.
We propose an efficient transformer-based framework for event-based SLR and SLT tasks.
Our method performs favorably against existing state-of-the-art approaches at only 0.34% of the computational cost.
arXiv Detail & Related papers (2024-07-17T14:16:35Z) - Nonverbal Interaction Detection [83.40522919429337]
This work addresses a new challenge of understanding human nonverbal interaction in social contexts.
We contribute a novel large-scale dataset, called NVI, which is meticulously annotated to include bounding boxes for humans and corresponding social groups.
Second, we establish a new task, NVI-DET, for nonverbal interaction detection, which is formalized as identifying triplets of the form <individual, group, interaction> from images.
Third, we propose a nonverbal interaction detection hypergraph (NVI-DEHR), a new approach that explicitly models high-order nonverbal interactions using hypergraphs.
arXiv Detail & Related papers (2024-07-11T02:14:06Z) - Sign Language Recognition Based On Facial Expression and Hand Skeleton [2.5879170041667523]
We propose a sign language recognition network that integrates hand skeleton features and facial expressions.
By incorporating facial expression information, the accuracy and robustness of sign language recognition are improved.
arXiv Detail & Related papers (2024-07-02T13:02:51Z) - Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition [96.62264528407863]
We propose a self-supervised contrastive learning framework to excavate rich context via spatial-temporal consistency.
Inspired by the complementary property of motion and joint modalities, we first introduce first-order motion information into sign language modeling.
Our method is evaluated with extensive experiments on four public benchmarks, and achieves new state-of-the-art performance with a notable margin.
arXiv Detail & Related papers (2024-06-15T04:50:19Z) - Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation [2.6311088262657907]
This work proposes an Isolated Sign Language Recognition (ISLR) approach where body, hands, and facial landmarks are extracted throughout time and encoded as 2-D images.
We show that our method surpasses the state of the art in terms of performance metrics on two widely recognized Brazilian Sign Language (LIBRAS) datasets.
In addition to being more accurate, our method is more time-efficient and easier to train due to its reliance on a simpler network architecture and solely RGB data as input.
arXiv Detail & Related papers (2024-04-29T23:21:17Z) - SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding [132.78015553111234]
Hand gestures play a crucial role in the expression of sign language.
Current deep learning based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources.
We propose SignBERT+, the first self-supervised pre-trainable framework, which incorporates a model-aware hand prior.
arXiv Detail & Related papers (2023-05-08T17:16:38Z) - Fine-tuning of sign language recognition models: a technical report [0.0]
We focus on investigating two questions: how fine-tuning on datasets from other sign languages helps improve sign recognition quality, and whether sign recognition is possible in real time without using a GPU.
We provide code for reproducing model training experiments, converting models to ONNX format, and inference for real-time gesture recognition.
arXiv Detail & Related papers (2023-02-15T14:36:18Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.