BdSL36: A Dataset for Bangladeshi Sign Letters Recognition
- URL: http://arxiv.org/abs/2110.00869v1
- Date: Sat, 2 Oct 2021 19:52:48 GMT
- Title: BdSL36: A Dataset for Bangladeshi Sign Letters Recognition
- Authors: Oishee Bintey Hoque, Mohammad Imrul Jubair, Al-Farabi Akash, Saiful Islam
- Abstract summary: Bangladeshi Sign Language (BdSL) is a commonly used medium of communication for hearing-impaired people in Bangladesh.
In this paper, we introduce a dataset named BdSL36 which incorporates background augmentation to make the dataset versatile.
Besides, we annotate about 40,000 images with bounding boxes to exploit the potential of object detection algorithms.
- Score: 4.010701467679244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bangladeshi Sign Language (BdSL) is a commonly used medium of communication
for the hearing-impaired people in Bangladesh. A real-time BdSL interpreter
that works outside a controlled lab environment has a broad social impact and is an
interesting avenue of research as well. Also, it is a challenging task due to the variation
in different subjects (age, gender, color, etc.), complex features, and
similarities of signs and clustered backgrounds. However, existing datasets
for the BdSL classification task are mainly built in lab-friendly setups, which
limits the application of powerful deep learning technology. In this paper, we
introduce a dataset named BdSL36 which incorporates background augmentation to
make the dataset versatile and contains over four million images belonging to
36 categories. Besides, we annotate about 40,000 images with bounding boxes to
exploit the potential of object detection algorithms. Furthermore, several
intensive experiments are performed to establish the baseline performance of
our BdSL36. Moreover, we employ beta testing of our classifiers at the user
level to justify the possibilities of real-world application with this dataset.
We believe our BdSL36 will expedite future research on practical sign letter
classification. We make the datasets and all the pre-trained models available
for further research.
Related papers
- BAUST Lipi: A BdSL Dataset with Deep Learning Based Bangla Sign Language Recognition [0.5497663232622964]
Sign language research is burgeoning to enhance communication with the deaf community.
One significant barrier has been the lack of a comprehensive Bangla sign language dataset.
We introduce a new BdSL dataset comprising alphabets totaling 18,000 images, with each image being 224x224 pixels in size.
We devised a hybrid Convolutional Neural Network (CNN) model, integrating multiple convolutional layers, activation functions, dropout techniques, and LSTM layers.
arXiv Detail & Related papers (2024-08-20T03:35:42Z)
- BdSLW60: A Word-Level Bangla Sign Language Dataset [3.8631510994883254]
We create a comprehensive BdSL word-level dataset named BdSLW60 in an unconstrained and natural setting.
The dataset encompasses 60 Bangla sign words, with a significant scale of 9307 video trials provided by 18 signers under the supervision of a sign language professional.
We report the benchmarking of our BdSLW60 dataset using the Support Vector Machine (SVM) with testing accuracy up to 67.6% and an attention-based bi-LSTM with testing accuracy up to 75.1%.
arXiv Detail & Related papers (2024-02-13T18:02:58Z)
- Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition [2.624902795082451]
We present a new word-level Bangla Sign Language dataset - BdSL40 - consisting of 611 videos over 40 words.
This is the first study on word-level BdSL recognition, and the dataset was transcribed from Indian Sign Language (ISL) using the Bangla Sign Language Dictionary (1997).
The study highlights the significant lexical and semantic similarity between BdSL, West Bengal Sign Language, and ISL, and the lack of word-level datasets for BdSL in the literature.
arXiv Detail & Related papers (2024-01-22T18:52:51Z)
- Towards Generic Semi-Supervised Framework for Volumetric Medical Image Segmentation [19.09640071505051]
We develop a generic SSL framework to handle settings such as UDA and SemiDG.
We evaluate our proposed framework on four benchmark datasets for SSL, Class-imbalanced SSL, UDA and SemiDG.
The results showcase notable improvements compared to state-of-the-art methods across all four settings.
arXiv Detail & Related papers (2023-10-17T14:58:18Z)
- Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning [69.77973092264338]
We show that more powerful techniques can lead to more efficient pre-training, opening SSL to more research groups.
We propose WavLabLM, which extends WavLM's joint prediction and denoising to 40k hours of data across 136 languages.
We show that further efficiency can be achieved with a vanilla HuBERT Base model, which can maintain 94% of XLS-R's performance with only 3% of the data.
arXiv Detail & Related papers (2023-09-26T23:55:57Z)
- A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends [82.64268080902742]
Self-supervised learning (SSL) aims to learn discriminative features from unlabeled data without relying on human-annotated labels.
SSL has garnered significant attention recently, leading to the development of numerous related algorithms.
This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions.
arXiv Detail & Related papers (2023-01-13T14:41:05Z)
- Towards Realistic Semi-Supervised Learning [73.59557447798134]
We propose a novel approach to tackle SSL in open-world setting, where we simultaneously learn to classify known and unknown classes.
Our approach substantially outperforms the existing state-of-the-art on seven diverse datasets.
arXiv Detail & Related papers (2022-07-05T19:04:43Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
- Sound and Visual Representation Learning with Multiple Pretraining Tasks [104.11800812671953]
Self-supervised tasks (SSL) reveal different features from the data.
This work aims to combine Multiple SSL tasks (Multi-SSL) that generalizes well for all downstream tasks.
Experiments on sound representations demonstrate that Multi-SSL via incremental learning (IL) of SSL tasks outperforms single SSL task models.
arXiv Detail & Related papers (2022-01-04T09:09:38Z)
- BBC-Oxford British Sign Language Dataset [64.32108826673183]
We introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).
We describe the motivation for the dataset, together with statistics and available annotations.
We conduct experiments to provide baselines for the tasks of sign recognition, sign language alignment, and sign language translation.
arXiv Detail & Related papers (2021-11-05T17:35:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.