Sign Language Recognition based on YOLOv5 Algorithm for the Telugu Sign Language
- URL: http://arxiv.org/abs/2406.10231v1
- Date: Wed, 24 Apr 2024 18:39:27 GMT
- Title: Sign Language Recognition based on YOLOv5 Algorithm for the Telugu Sign Language
- Authors: Vipul Reddy. P, Vishnu Vardhan Reddy. B, Sukriti
- Abstract summary: This paper presents a novel approach for recognizing gestures in Telugu Sign Language (TSL) using the YOLOv5 object detection framework.
A deep learning model was built on YOLOv5 to recognize and classify gestures.
The system's stability and generalizability across various TSL gestures and settings were evaluated through rigorous testing and validation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sign language recognition (SLR) technology holds enormous promise for improving communication and accessibility for people who are deaf or hard of hearing. This paper presents a novel approach for recognizing gestures in Telugu Sign Language (TSL) using the YOLOv5 object detection framework. The main goal is to create an accurate and effective method for identifying TSL gestures so that the deaf community can benefit from SLR. A deep learning model was built on YOLOv5 to recognize and classify gestures, benefiting from the architecture's high accuracy, speed, and capacity to handle complex sign language features. Using transfer learning, the YOLOv5 model was adapted to TSL gestures, and careful parameter and hyperparameter tuning was carried out during training to obtain the best results. With an F1-score of 90.5% and a mean Average Precision (mAP) of 98.1%, the YOLOv5-medium model stands out for its performance, demonstrating its efficacy in Telugu sign language recognition tasks. Notably, the model achieves these results while striking an acceptable balance between computational complexity and training time. Because it offers a convincing blend of accuracy and efficiency, the YOLOv5-medium model, trained for 200 epochs, emerges as the recommended choice for real-world deployment. The system's stability and generalizability across various TSL gestures and settings were evaluated through rigorous testing and validation, which yielded excellent accuracy. This research lays the foundation for future advances in accessible technology for linguistic communities by applying deep learning and computer vision techniques to TSL gesture recognition, and it offers useful perspectives and novel approaches to the field of sign language recognition.
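As a concrete illustration of the pipeline described above, here is a minimal sketch of fine-tuning YOLOv5-medium on a gesture dataset via transfer learning and running inference with the resulting checkpoint. The dataset config, paths, and thresholds are hypothetical placeholders, not the authors' released setup.

```python
# Hedged sketch: transfer learning YOLOv5-medium for TSL gesture
# detection. All paths and the dataset YAML are illustrative.
#
# Training via the official YOLOv5 repo CLI, starting from
# COCO-pretrained weights and running 200 epochs as in the paper:
#   python train.py --img 640 --batch 16 --epochs 200 \
#       --data tsl.yaml --weights yolov5m.pt
import torch

# Official YOLOv5 hub entry point; 'best.pt' is the fine-tuned checkpoint.
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="runs/train/exp/weights/best.pt")
model.conf = 0.25  # confidence threshold for gesture detections

results = model("gesture_frame.jpg")  # path, URL, PIL image, or ndarray
results.print()  # per-detection class, confidence, and bounding box
```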
Related papers
- SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction
We introduce SHuBERT, a self-supervised transformer encoder that learns strong representations from American Sign Language (ASL) video content.
Inspired by the success of the HuBERT speech representation model, SHuBERT adapts masked prediction for multi-stream visual sign language input.
SHuBERT achieves state-of-the-art performance across multiple benchmarks.
arXiv Detail & Related papers (2024-11-25T03:13:08Z)
- Deep Neural Network-Based Sign Language Recognition: A Comprehensive Approach Using Transfer Learning with Explainability
We suggest a novel solution that uses a deep neural network to fully automate sign language recognition.
This approach integrates sophisticated preprocessing techniques to optimize overall performance.
Our model's ability to provide informational clarity was assessed using the SHAP (SHapley Additive exPlanations) method (a usage sketch follows this entry).
arXiv Detail & Related papers (2024-09-11T17:17:44Z)
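A minimal sketch of SHAP attribution for an image classifier, in the spirit of the entry above; the stand-in CNN and random data are assumptions, and shap's return shapes vary across versions.

```python
# Hedged sketch: SHAP pixel attributions for a small PyTorch sign
# classifier. The model and data are stand-ins, not the paper's code.
import shap
import torch
import torch.nn as nn

# Stand-in classifier: 26 letter classes over 64x64 grayscale crops.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3), nn.ReLU(), nn.Flatten(),
    nn.Linear(8 * 62 * 62, 26),
).eval()

background = torch.randn(16, 1, 64, 64)  # reference distribution
test_batch = torch.randn(2, 1, 64, 64)   # inputs to explain

# GradientExplainer estimates each pixel's contribution to each class.
explainer = shap.GradientExplainer(model, background)
shap_values = explainer.shap_values(test_batch)
```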
- Sign language recognition based on deep learning and low-cost handcrafted descriptors
It is important to consider as many linguistic parameters as possible in gesture execution to avoid ambiguity between words.
It is essential to ensure that the chosen technology is realistic, avoiding expensive, intrusive, or low-mobility sensors.
We propose an efficient sign language recognition system that utilizes low-cost sensors and techniques.
arXiv Detail & Related papers (2024-08-14T00:56:51Z)
- PenSLR: Persian end-to-end Sign Language Recognition Using Ensembling
PenSLR is a glove-based sign language system consisting of an Inertial Measurement Unit (IMU) and five flexible sensors, powered by a deep learning framework.
We propose a novel ensembling technique that leverages a multiple sequence alignment algorithm known as Star Alignment (a sketch follows this entry).
Our evaluations show that PenSLR achieves remarkable word accuracies of 94.58% and 96.70% in subject-independent and subject-dependent setups, respectively.
arXiv Detail & Related papers (2024-06-24T07:59:34Z)
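A minimal, illustrative reconstruction of star-alignment-style ensembling over predicted word sequences; PenSLR's exact procedure may differ. The center sequence minimizes total edit distance to all members, and the vote is a simplified per-position majority.

```python
# Hedged sketch: star-alignment ensembling of per-model predictions.
# Illustrative reconstruction only, not PenSLR's released code.
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (x != y))
    return dp[-1]

def star_ensemble(predictions):
    """predictions: list of token lists, one per ensemble member."""
    # Center = sequence with the smallest summed distance to all others.
    center = min(predictions,
                 key=lambda p: sum(edit_distance(p, q) for q in predictions))
    # Simplified voting: majority token at each center position.
    return [Counter(p[i] for p in predictions if i < len(p)).most_common(1)[0][0]
            for i in range(len(center))]

# Example: three noisy predictions of the same signed sentence.
print(star_ensemble([["I", "eat", "rice"],
                     ["I", "ate", "rice"],
                     ["I", "eat", "rice", "now"]]))  # ['I', 'eat', 'rice']
```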
- Enhancing Sign Language Detection through Mediapipe and Convolutional Neural Networks (CNN)
This research combines MediaPipe and CNNs for efficient and accurate interpretation of an ASL dataset (a landmark-extraction sketch follows this entry).
The accuracy achieved by the model on ASL datasets is 99.12%.
The system will have applications in the communication, education, and accessibility domains.
arXiv Detail & Related papers (2024-06-06T04:05:12Z)
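A minimal sketch of the landmark-extraction stage of such a MediaPipe + CNN pipeline; the single-hand setting and the downstream classifier shape are assumptions.

```python
# Hedged sketch: extract 21 hand landmarks per image with MediaPipe;
# the downstream CNN/MLP classifier is assumed, not shown.
import cv2
import mediapipe as mp
import numpy as np

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def extract_landmarks(image_bgr):
    """Return a (21, 3) array of hand landmarks, or None if no hand."""
    result = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None
    lm = result.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32)

# The flattened 63-dim vector (21 landmarks x 3 coords) would then be
# fed to the classifier trained on the ASL dataset.
```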
- YOLO-World: Real-Time Open-Vocabulary Object Detection
We introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities.
Our method excels at detecting a wide range of objects in a zero-shot manner with high efficiency (a prompting sketch follows this entry).
YOLO-World achieves 35.4 AP with 52.0 FPS on V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed.
arXiv Detail & Related papers (2024-01-30T18:59:38Z)
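A minimal sketch of prompting YOLO-World through the ultralytics package; the weights file and prompt classes are examples, not tied to the benchmark numbers above.

```python
# Hedged sketch: open-vocabulary detection with text prompts.
from ultralytics import YOLOWorld

model = YOLOWorld("yolov8s-world.pt")      # pretrained open-vocabulary weights
model.set_classes(["person", "backpack"])  # free-form text prompts
results = model.predict("street.jpg")      # boxes only for prompted classes
results[0].show()
```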
- SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding
Hand gestures play a crucial role in the expression of sign language.
Current deep learning based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources.
We propose the first self-supervised pre-trainable SignBERT+ framework with model-aware hand prior incorporated.
arXiv Detail & Related papers (2023-05-08T17:16:38Z)
- Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3 billion parameter model sets new state-of-the-art results on 4/9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Fine-tuning of sign language recognition models: a technical report
We focus on investigating two questions: how fine-tuning on datasets from other sign languages helps improve sign recognition quality, and whether sign recognition is possible in real-time without using GPU.
We provide code for reproducing model training experiments, converting models to ONNX format, and running inference for real-time gesture recognition (an illustrative inference sketch follows this entry).
arXiv Detail & Related papers (2023-02-15T14:36:18Z)
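A minimal sketch of GPU-free inference with ONNX Runtime, in the spirit of the report above; the model file, input shape, and preprocessing are assumptions.

```python
# Hedged sketch: CPU real-time inference with an exported ONNX model.
# The file name and (1, 3, 224, 224) input layout are illustrative.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("sign_model.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def predict(frame: np.ndarray) -> int:
    """frame: preprocessed (1, 3, 224, 224) float32 batch; returns class id."""
    logits = session.run(None, {input_name: frame})[0]
    return int(np.argmax(logits, axis=-1)[0])

# A PyTorch export producing such a file would look like:
#   torch.onnx.export(model, dummy_input, "sign_model.onnx")
```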
- UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training
Two methods are introduced for enhancing unsupervised speaker information extraction.
Experiment results on SUPERB benchmark show that the proposed system achieves state-of-the-art performance.
We scale up the training dataset to 94 thousand hours of public audio data and achieve further performance improvements.
arXiv Detail & Related papers (2021-10-12T05:43:30Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation (Jin et al., 2020), we propose recognizing sign language based on whole-body keypoints and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)