Slovo: Russian Sign Language Dataset
- URL: http://arxiv.org/abs/2305.14527v3
- Date: Tue, 12 Mar 2024 14:45:02 GMT
- Title: Slovo: Russian Sign Language Dataset
- Authors: Alexander Kapitanov, Karina Kvanchiani, Alexander Nagaev, Elizaveta Petrova
- Abstract summary: This paper presents the Russian Sign Language (RSL) video dataset Slovo, produced using crowdsourcing platforms.
The dataset contains 20,000 FullHD recordings, divided into 1,000 classes of isolated RSL gestures recorded by 194 signers.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: One of the main challenges of the sign language recognition task is
the difficulty of collecting a suitable dataset, owing to the gap between
hard-of-hearing and hearing communities. In addition, sign languages differ
significantly from country to country, which requires creating new data for
each of them. This paper presents Slovo, a Russian Sign Language (RSL) video
dataset produced using crowdsourcing platforms. The dataset contains 20,000
FullHD recordings, divided into 1,000 classes of isolated RSL gestures recorded
by 194 signers. We also provide the entire dataset creation pipeline, from data
collection to video annotation, along with a demo application. Several neural
networks are trained and evaluated on Slovo to demonstrate its suitability for
model training. The proposed data and pre-trained models are publicly
available.
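Since the dataset and pre-trained models are public, a typical use is fine-tuning a video classifier on the 1,000 gesture classes. Below is a minimal sketch of such a setup; the annotation layout (a tab-separated file of video path and class name) and the choice of backbone are assumptions for illustration, not the released format.

```python
# Minimal sketch, assuming a hypothetical annotation file with lines of
# "video_path<TAB>class_name"; the released Slovo archive may differ.
import csv

import cv2    # pip install opencv-python
import torch
from torch.utils.data import Dataset
from torchvision.models.video import r3d_18


class IsolatedGestureDataset(Dataset):
    """Yields (clip, class_id) pairs for isolated-gesture classification."""

    def __init__(self, annotations_tsv, num_frames=16, size=224):
        with open(annotations_tsv, newline="") as f:
            self.samples = list(csv.reader(f, delimiter="\t"))
        classes = sorted({name for _, name in self.samples})
        self.class_to_idx = {c: i for i, c in enumerate(classes)}
        self.num_frames = num_frames
        self.size = size

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, class_name = self.samples[idx]
        cap = cv2.VideoCapture(path)
        frames = []
        ok, frame = cap.read()
        while ok:
            frames.append(cv2.resize(frame, (self.size, self.size)))
            ok, frame = cap.read()
        cap.release()
        # Uniformly sample a fixed number of frames across the clip.
        picks = torch.linspace(0, len(frames) - 1, self.num_frames).long()
        clip = torch.stack([torch.from_numpy(frames[i]) for i in picks.tolist()])
        clip = clip.permute(3, 0, 1, 2).float() / 255.0   # (C, T, H, W)
        return clip, self.class_to_idx[class_name]


# Example head: any video backbone with a 1,000-way classifier, one unit per
# RSL gesture class (r3d_18 is an arbitrary stand-in, not the paper's model).
model = r3d_18(weights=None)   # or a pretrained checkpoint
model.fc = torch.nn.Linear(model.fc.in_features, 1000)
```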
Related papers
- Bukva: Russian Sign Language Alphabet
This paper investigates the recognition of the Russian fingerspelling alphabet, also known as the Russian Sign Language (RSL) dactyl.
Dactyl is a component of sign languages where distinct hand movements represent individual letters of a written language.
We provide Bukva, the first full-fledged open-source video dataset for RSL dactyl recognition.
arXiv Detail & Related papers (2024-10-11T09:59:48Z)
- Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition
We introduce a large-scale isolated ISL dataset and a novel SL recognition model based on a skeleton graph structure.
The dataset covers 2,002 common words used daily in the deaf community, recorded by 20 deaf adult signers (10 male, 10 female).
We propose an SL recognition model, the Hierarchical Windowed Graph Attention Network (HWGAT), which operates on the human upper-body skeleton graph; a minimal sketch of the windowed graph-attention idea follows this list.
arXiv Detail & Related papers (2024-07-19T11:48:36Z)
- ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition
Sign languages are used as a primary language by approximately 70 million D/deaf people worldwide.
To help tackle the scarcity of training data for sign language recognition, we release ASL Citizen, the first crowdsourced Isolated Sign Language Recognition dataset.
We propose that this dataset be used for sign language dictionary retrieval for American Sign Language (ASL), where a user demonstrates a sign to their webcam to retrieve matching signs from a dictionary; a minimal retrieval sketch follows this list.
arXiv Detail & Related papers (2023-04-12T15:52:53Z)
- Learning from What is Already Out There: Few-shot Sign Language Recognition with Online Dictionaries
We open-source the UWB-SL-Wild few-shot dataset, the first training resource of its kind, consisting of dictionary-scraped videos.
We introduce a novel approach to training sign language recognition models in a few-shot scenario, resulting in state-of-the-art results.
arXiv Detail & Related papers (2023-01-10T03:21:01Z)
- LSA-T: The first continuous Argentinian Sign Language dataset for Sign Language Translation
Sign language translation (SLT) is an active field of study that encompasses human-computer interaction, computer vision, natural language processing and machine learning.
This paper presents the first continuous Argentinian Sign Language (LSA) dataset.
It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with labels and keypoint annotations for each signer.
arXiv Detail & Related papers (2022-11-14T14:46:44Z)
- ASR2K: Speech Recognition for Around 2000 Languages without Audio
We present a speech recognition pipeline that does not require any audio for the target language.
The pipeline consists of three components: acoustic, pronunciation, and language models.
We build speech recognition for 1,909 languages by combining the pipeline with Crubadan, a large n-gram database of endangered languages; a toy illustration of the three-component decoding follows this list.
arXiv Detail & Related papers (2022-09-06T22:48:29Z)
- A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation
Existing sign language datasets contain only about 10K-20K pairs of sign videos, gloss annotations and texts.
Data is thus a bottleneck for training effective sign language translation models.
The proposed simple transfer-learning baseline, which pre-trains each modality on larger general-domain datasets, surpasses the previous state-of-the-art results on two sign language translation benchmarks.
arXiv Detail & Related papers (2022-03-08T18:59:56Z)
- How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language
How2Sign is a multimodal and multiview continuous American Sign Language (ASL) dataset.
It consists of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.
A three-hour subset was recorded in the Panoptic Studio, enabling detailed 3D pose estimation.
arXiv Detail & Related papers (2020-08-18T20:22:16Z)
- BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
We show how to use mouthing cues from signers to obtain high-quality annotations from video data.
The BSL-1K dataset is a collection of British Sign Language (BSL) signs of unprecedented scale.
arXiv Detail & Related papers (2020-07-23T16:59:01Z)
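The HWGAT entry above attends over a skeleton graph inside temporal windows. The paper's exact architecture is not reproduced here; the following is a minimal sketch of one spatio-temporal windowed graph-attention layer, with a toy five-joint upper-body graph and all shapes chosen for illustration.

```python
import torch
import torch.nn.functional as F


class WindowedGraphAttention(torch.nn.Module):
    """One attention layer over skeleton-joint tokens, restricted to the
    bone adjacency and computed independently inside temporal windows."""

    def __init__(self, dim, adjacency, window=8):
        super().__init__()
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.window = window
        # A joint may attend to itself and to graph-adjacent joints.
        self.register_buffer(
            "mask", adjacency.bool() | torch.eye(adjacency.size(0), dtype=torch.bool)
        )

    def forward(self, x):                          # x: (batch, time, joints, dim)
        b, t, j, d = x.shape
        pad = (-t) % self.window
        x = F.pad(x, (0, 0, 0, 0, 0, pad))         # pad time to a window multiple
        n = (t + pad) // self.window
        x = x.reshape(b * n, self.window * j, d)   # one token per (frame, joint)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5
        # Tile the joint mask across the window: tokens may attend across
        # frames, but only along graph edges (or to the same joint).
        scores = scores.masked_fill(
            ~self.mask.repeat(self.window, self.window), float("-inf")
        )
        out = scores.softmax(dim=-1) @ v
        return out.reshape(b, t + pad, j, d)[:, :t]


# Toy upper-body graph: nose, shoulders, wrists.
adj = torch.zeros(5, 5)
for a, b in [(0, 1), (0, 2), (1, 3), (2, 4)]:
    adj[a, b] = adj[b, a] = 1
layer = WindowedGraphAttention(dim=32, adjacency=adj)
print(layer(torch.randn(2, 30, 5, 32)).shape)      # torch.Size([2, 30, 5, 32])
```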
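The ASL Citizen entry frames recognition as dictionary retrieval: embed the user's webcam clip and rank dictionary signs by similarity. Below is a minimal sketch of that setup; the pooling "encoder" and the index size are stand-ins, not the paper's model.

```python
import torch


def rank_dictionary(query_clip, index_embeddings, encoder, k=5):
    """Rank dictionary signs by cosine similarity to an embedded query clip.

    query_clip: (C, T, H, W); index_embeddings: (N, D), precomputed with the
    same encoder over the dictionary's reference videos.
    """
    with torch.no_grad():
        q = encoder(query_clip.unsqueeze(0)).squeeze(0)
    q = q / q.norm()
    refs = index_embeddings / index_embeddings.norm(dim=1, keepdim=True)
    return (refs @ q).topk(k).indices            # indices of the best matches


# Toy usage: a trivial pooling "encoder" stands in for a trained video model.
encoder = torch.nn.Sequential(
    torch.nn.AdaptiveAvgPool3d(1),               # (C, T, H, W) -> (C, 1, 1, 1)
    torch.nn.Flatten(),
    torch.nn.Linear(3, 256),
)
index = torch.randn(1000, 256)                   # one row per dictionary sign (random here)
clip = torch.randn(3, 16, 112, 112)
print(rank_dictionary(clip, index, encoder))
```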
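The ASR2K entry names a three-component pipeline (acoustic, pronunciation, and language models). The toy below only illustrates how the three scores combine at decode time; the phone hypotheses, lexicon, and probabilities are all made up, and the real system is far more elaborate.

```python
import math

# 1) Acoustic model output: candidate phone strings with log-probabilities
#    (in the paper this comes from a multilingual acoustic model).
acoustic_hyps = [("k a t", -1.2), ("k a d", -0.9)]

# 2) Pronunciation model: phone strings -> words (a hand-written stand-in;
#    ASR2K derives pronunciations without target-language audio).
lexicon = {"k a t": "cat", "k a d": "cad"}

# 3) Language model: word log-probabilities, standing in for Crubadan-style
#    n-gram statistics.
lm = {"cat": math.log(0.8), "cad": math.log(0.05)}


def decode(hyps, lexicon, lm, lm_weight=1.0):
    """Pick the word maximizing acoustic + weighted LM log-probability."""
    scored = [(lexicon[p], ac + lm_weight * lm[lexicon[p]])
              for p, ac in hyps if p in lexicon]
    return max(scored, key=lambda pair: pair[1])


print(decode(acoustic_hyps, lexicon, lm))   # ('cat', ...) despite weaker acoustics
```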