Generation of Indian Sign Language Letters, Numbers, and Words
- URL: http://arxiv.org/abs/2508.09522v1
- Date: Wed, 13 Aug 2025 06:10:20 GMT
- Title: Generation of Indian Sign Language Letters, Numbers, and Words
- Authors: Ajeet Kumar Yadav, Nishant Kumar, Rathna G N,
- Abstract summary: We develop a Generative Adversarial Network (GAN) variant to generate high-resolution, class-conditional sign language images.<n>Our modified Attention-based model generates high-quality images of Indian Sign Language letters, numbers, and words.
- Score: 0.9080472817672264
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sign language, which contains hand movements, facial expressions and bodily gestures, is a significant medium for communicating with hard-of-hearing people. A well-trained sign language community communicates easily, but those who don't know sign language face significant challenges. Recognition and generation are basic communication methods between hearing and hard-of-hearing individuals. Despite progress in recognition, sign language generation still needs to be explored. The Progressive Growing of Generative Adversarial Network (ProGAN) excels at producing high-quality images, while the Self-Attention Generative Adversarial Network (SAGAN) generates feature-rich images at medium resolutions. Balancing resolution and detail is crucial for sign language image generation. We are developing a Generative Adversarial Network (GAN) variant that combines both models to generate feature-rich, high-resolution, and class-conditional sign language images. Our modified Attention-based model generates high-quality images of Indian Sign Language letters, numbers, and words, outperforming the traditional ProGAN in Inception Score (IS) and Fr\'echet Inception Distance (FID), with improvements of 3.2 and 30.12, respectively. Additionally, we are publishing a large dataset incorporating high-quality images of Indian Sign Language alphabets, numbers, and 129 words.
Related papers
- Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator [55.94334001112357]
We introduce a multilingual sign language model, Signs as Tokens (SOKE), which can generate 3D sign avatars autoregressively from text inputs.<n>We propose a retrieval-enhanced SLG approach, which incorporates external sign dictionaries to provide accurate word-level signs.
arXiv Detail & Related papers (2024-11-26T18:28:09Z) - SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction [65.1590372072555]
SHuBERT (Sign Hidden-Unit BERT) is a self-supervised contextual representation model learned from 1,000 hours of American Sign Language video.<n>SHuBERT adapts masked token prediction objectives to multi-stream visual sign language input, learning to predict multiple targets corresponding to clustered hand, face, and body pose streams.<n>SHuBERT achieves state-of-the-art performance across multiple tasks including sign language translation, isolated sign language recognition, and fingerspelling detection.
arXiv Detail & Related papers (2024-11-25T03:13:08Z) - Scaling up Multimodal Pre-training for Sign Language Understanding [96.17753464544604]
Sign language serves as the primary meaning of communication for the deaf-mute community.
To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied.
These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos.
arXiv Detail & Related papers (2024-08-16T06:04:25Z) - EvSign: Sign Language Recognition and Translation with Streaming Events [59.51655336911345]
Event camera could naturally perceive dynamic hand movements, providing rich manual clues for sign language tasks.
We propose efficient transformer-based framework for event-based SLR and SLT tasks.
Our method performs favorably against existing state-of-the-art approaches with only 0.34% computational cost.
arXiv Detail & Related papers (2024-07-17T14:16:35Z) - Neural Sign Actors: A diffusion model for 3D sign language production from text [51.81647203840081]
Sign Languages (SL) serve as the primary mode of communication for the Deaf and Hard of Hearing communities.
This work makes an important step towards realistic neural sign avatars, bridging the communication gap between Deaf and hearing communities.
arXiv Detail & Related papers (2023-12-05T12:04:34Z) - Image-based Indian Sign Language Recognition: A Practical Review using
Deep Neural Networks [0.0]
This model is to develop a real-time word-level sign language recognition system that would translate sign language to text.
For this analysis, the user must be able to take pictures of hand movements using a web camera.
Our model is trained using a convolutional neural network (CNN), which is then utilized to recognize the images.
arXiv Detail & Related papers (2023-04-28T09:27:04Z) - Indian Sign Language Recognition Using Mediapipe Holistic [0.0]
We will create a robust system for sign language recognition in order to convert Indian Sign Language to text or speech.
The creation of a text-to-sign language paradigm is essential since it will enhance the sign language-dependent deaf and hard-of-hearing population's communication skills.
arXiv Detail & Related papers (2023-04-20T12:25:47Z) - Word level Bangla Sign Language Dataset for Continuous BSL Recognition [0.0]
We develop an attention-based Bi-GRU model that captures the temporal dynamics of pose information for individuals communicating through sign language.
The accuracy of the model is reported to be 85.64%.
arXiv Detail & Related papers (2023-02-22T18:55:54Z) - Fine-tuning of sign language recognition models: a technical report [0.0]
We focus on investigating two questions: how fine-tuning on datasets from other sign languages helps improve sign recognition quality, and whether sign recognition is possible in real-time without using GPU.
We provide code for reproducing model training experiments, converting models to ONNX format, and inference for real-time gesture recognition.
arXiv Detail & Related papers (2023-02-15T14:36:18Z) - Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular that it can be further ensembled with RGB-D based method to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation citejin 2020whole, we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z) - Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign
Language Video [43.45785951443149]
To be truly understandable by Deaf communities, an automatic Sign Language Production system must generate a photo-realistic signer.
We propose SignGAN, the first SLP model to produce photo-realistic continuous sign language videos directly from spoken language.
A pose-conditioned human synthesis model is then introduced to generate a photo-realistic sign language video from the skeletal pose sequence.
arXiv Detail & Related papers (2020-11-19T14:31:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.