Developing Lightweight DNN Models With Limited Data For Real-Time Sign Language Recognition
- URL: http://arxiv.org/abs/2507.00248v1
- Date: Mon, 30 Jun 2025 20:34:54 GMT
- Title: Developing Lightweight DNN Models With Limited Data For Real-Time Sign Language Recognition
- Authors: Nikita Nikitin, Eugene Fomin
- Abstract summary: We present a novel framework for real-time sign language recognition using lightweight DNNs trained on limited data. Our system addresses key challenges in sign language recognition, including data scarcity, high computational costs, and discrepancies in frame rates between training and inference environments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel framework for real-time sign language recognition using lightweight DNNs trained on limited data. Our system addresses key challenges in sign language recognition, including data scarcity, high computational costs, and discrepancies in frame rates between training and inference environments. By encoding sign-language-specific parameters, such as handshape, palm orientation, movement, and location, into vectorized inputs, and by leveraging MediaPipe for landmark extraction, we achieve highly separable input data representations. Our DNN architecture, optimized for sub-10MB deployment, enables accurate classification of 343 signs with less than 10 ms latency on edge devices. The data annotation platform 'slait data' facilitates structured labeling and vector extraction. Our model achieved 92% accuracy in isolated sign recognition and has been integrated into the 'slait ai' web application, where it demonstrates stable inference.
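The pipeline the abstract outlines (MediaPipe landmark extraction feeding a compact DNN classifier) can be sketched in a few lines. The following is a minimal illustration, not the authors' released code: only the 343-sign class count and the use of MediaPipe landmarks come from the abstract, while the wrist-relative normalization, layer sizes, and helper names are assumptions made for the sketch.

```python
# Minimal sketch: MediaPipe hand landmarks -> fixed-length vector -> small MLP.
# Assumptions (not from the paper): wrist-relative normalization, layer sizes,
# and all helper names. Only NUM_SIGNS = 343 comes from the abstract.
import numpy as np
import torch
import torch.nn as nn
import mediapipe as mp

NUM_SIGNS = 343               # class count reported in the abstract
FEAT_DIM = 2 * 21 * 3         # two hands x 21 landmarks x (x, y, z)

def extract_features(rgb_frame, hands):
    """Flatten MediaPipe hand landmarks into a fixed-length, wrist-relative vector."""
    feat = np.zeros(FEAT_DIM, dtype=np.float32)
    results = hands.process(rgb_frame)        # rgb_frame: HxWx3 uint8, RGB order
    if results.multi_hand_landmarks:
        for h, hand in enumerate(results.multi_hand_landmarks[:2]):
            pts = np.array([[lm.x, lm.y, lm.z] for lm in hand.landmark],
                           dtype=np.float32)
            pts -= pts[0]                     # wrist-relative: a crude handshape/location encoding
            feat[h * 63:(h + 1) * 63] = pts.ravel()
    return feat

class SignClassifier(nn.Module):
    """Small MLP (~0.7 MB of float32 weights), well under a 10 MB budget."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, NUM_SIGNS),
        )
    def forward(self, x):
        return self.net(x)

if __name__ == "__main__":
    hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
    model = SignClassifier().eval()
    frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a camera frame
    x = torch.from_numpy(extract_features(frame, hands)).unsqueeze(0)
    with torch.no_grad():
        print("predicted sign id:", model(x).argmax(dim=1).item())
```

In a real deployment, the per-frame vectors would also be resampled to a fixed temporal rate before classification, which is one plausible way to handle the train/inference frame-rate mismatch the abstract mentions.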
Related papers
- Deep Neural Network-Based Sign Language Recognition: A Comprehensive Approach Using Transfer Learning with Explainability [0.0]
We suggest a novel solution that uses a deep neural network to fully automate sign language recognition.
This methodology integrates sophisticated preprocessing techniques to optimise overall performance.
Our model's ability to provide informational clarity was assessed using the SHAP (SHapley Additive exPlanations) method; see the brief SHAP sketch after this list.
arXiv Detail & Related papers (2024-09-11T17:17:44Z)
- Sign language recognition based on deep learning and low-cost handcrafted descriptors [0.0]
It is important to consider as many linguistic parameters as possible in gesture execution to avoid ambiguity between words.
It is essential to ensure that the chosen technology is realistic, avoiding expensive, intrusive, or low-mobility sensors.
We propose an efficient sign language recognition system that utilizes low-cost sensors and techniques.
arXiv Detail & Related papers (2024-08-14T00:56:51Z)
- EvSign: Sign Language Recognition and Translation with Streaming Events [59.51655336911345]
Event cameras can naturally perceive dynamic hand movements, providing rich manual cues for sign language tasks.
We propose an efficient transformer-based framework for event-based SLR and SLT tasks.
Our method performs favorably against existing state-of-the-art approaches with only 0.34% of the computational cost.
arXiv Detail & Related papers (2024-07-17T14:16:35Z)
- Enhancing Sign Language Detection through Mediapipe and Convolutional Neural Networks (CNN) [3.192629447369627]
This research combines MediaPipe and CNNs for efficient and accurate interpretation of an ASL dataset.
The accuracy achieved by the model on ASL datasets is 99.12%.
The system will have applications in the communication, education, and accessibility domains.
arXiv Detail & Related papers (2024-06-06T04:05:12Z)
- On the Importance of Signer Overlap for Sign Language Detection [65.26091369630547]
We argue that the current benchmark data sets for sign language detection yield overly optimistic results that do not generalize well.
We quantify this with a detailed analysis of the effect of signer overlap on current sign detection benchmark data sets.
We propose new data set partitions that are free of overlap and allow for more realistic performance assessment.
arXiv Detail & Related papers (2023-03-19T22:15:05Z)
- ArabSign: A Multi-modality Dataset and Benchmark for Continuous Arabic Sign Language Recognition [1.2691047660244335]
The ArabSign dataset consists of 9,335 samples performed by 6 signers.
The total time of the recorded sentences is around 10 hours, and the average sentence length is 3.1 signs.
We propose an encoder-decoder model for Continuous ArSL recognition.
arXiv Detail & Related papers (2022-10-08T07:36:20Z)
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource languages to low-resource languages.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
- On Addressing Practical Challenges for RNN-Transducer [72.72132048437751]
We adapt a well-trained RNN-T model to a new domain without collecting audio data.
We obtain word-level confidence scores by utilizing several types of features calculated during decoding.
The proposed time-stamping method achieves less than 50 ms word-timing difference on average.
arXiv Detail & Related papers (2021-04-27T23:31:43Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech-impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by recent developments in whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues [106.21067543021887]
We show how to use mouthing cues from signers to obtain high-quality annotations from video data.
The BSL-1K dataset is a collection of British Sign Language (BSL) signs of unprecedented scale.
arXiv Detail & Related papers (2020-07-23T16:59:01Z)
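As a pointer to how the SHAP assessment mentioned in the first related paper might look in practice, here is a hedged, self-contained toy: the model, shapes, and random data are invented for the sketch; `shap.DeepExplainer` is the real entry point in the `shap` package.

```python
# Toy SHAP example (the model and all shapes are invented for illustration).
import torch
import torch.nn as nn
import shap

# Tiny stand-in classifier: 126 landmark features -> 10 classes (toy head for speed).
model = nn.Sequential(nn.Linear(126, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
background = torch.randn(100, 126)   # reference inputs for the expectation baseline
samples = torch.randn(5, 126)        # inputs whose predictions we want to explain

explainer = shap.DeepExplainer(model, background)
# Per-feature attributions; a list of arrays or a single array, depending on shap version.
shap_values = explainer.shap_values(samples)
print(type(shap_values))
```

Large attribution values on particular landmark coordinates would indicate which parts of the hand pose drive a given prediction.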