LSA-T: The first continuous Argentinian Sign Language dataset for Sign
Language Translation
- URL: http://arxiv.org/abs/2211.15481v1
- Date: Mon, 14 Nov 2022 14:46:44 GMT
- Title: LSA-T: The first continuous Argentinian Sign Language dataset for Sign
Language Translation
- Authors: Pedro Dal Bianco, Gastón Ríos, Franco Ronchetti, Facundo Quiroga, Oscar Stanchi, Waldo Hasperué and Alejandro Rosete
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sign language translation (SLT) is an active field of study that encompasses
human-computer interaction, computer vision, natural language processing and
machine learning. Progress in this field could improve the integration of deaf
people into society. This paper presents, to the best of our knowledge,
the first continuous Argentinian Sign Language (LSA) dataset. It contains
14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube
channel, with labels and keypoint annotations for each signer. We also present
a method for inferring the active signer, a detailed analysis of the
characteristics of the dataset, a visualization tool to explore the dataset and
a neural SLT model to serve as a baseline for future experiments.
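The abstract mentions a method for inferring the active signer from the per-signer keypoint annotations. The paper's actual method is not described here, but a minimal illustrative heuristic, assuming per-signer wrist keypoint tracks as (x, y) positions per frame, is to pick the signer whose wrists move the most. All names and the data layout below are assumptions for illustration, not the authors' implementation:

```python
# Illustrative sketch (not the paper's method): infer the active signer
# by picking the signer whose wrist keypoints move the most in a clip.

def movement_energy(track):
    """Sum of frame-to-frame Euclidean displacements for one keypoint track.

    track: list of (x, y) positions, one per frame.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        total += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    return total

def infer_active_signer(signer_tracks):
    """signer_tracks: dict mapping signer id -> list of (x, y) wrist positions.

    Returns the id of the signer with the highest movement energy.
    """
    return max(signer_tracks, key=lambda s: movement_energy(signer_tracks[s]))

# Toy example: signer "B" moves, signer "A" is static.
tracks = {
    "A": [(0.5, 0.5)] * 5,
    "B": [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.5, 0.5), (0.3, 0.4)],
}
print(infer_active_signer(tracks))  # -> B
```

A real system would smooth the keypoint tracks and handle missing detections, but the argmax-over-movement idea captures the basic intuition.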
Related papers
- SCOPE: Sign Language Contextual Processing with Embedding from LLMs [49.5629738637893]
Sign languages, used by around 70 million Deaf individuals globally, are visual languages that convey visual and contextual information.
Current methods in vision-based sign language recognition (SLR) and translation (SLT) struggle with dialogue scenes due to limited dataset diversity and the neglect of contextually relevant information.
We introduce SCOPE, a novel context-aware vision-based SLR and SLT framework.
arXiv Detail & Related papers (2024-09-02T08:56:12Z)
- Scaling up Multimodal Pre-training for Sign Language Understanding [96.17753464544604]
Sign language serves as the primary means of communication for the deaf-mute community.
To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied.
These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos.
arXiv Detail & Related papers (2024-08-16T06:04:25Z)
- Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets [2.512406961007489]
We use a temporal graph convolution-based sign language recognition approach to evaluate five supervised transfer learning approaches.
Experiments demonstrate that improvement over fine-tuning-based transfer learning is possible with specialized supervised transfer learning methods.
arXiv Detail & Related papers (2024-03-21T16:36:40Z) - LSA64: An Argentinian Sign Language Dataset [42.27617228521691]
This paper presents a dataset of 64 signs from Argentinian Sign Language (LSA).
The dataset, called LSA64, contains 3200 videos of 64 different LSA signs recorded by 10 subjects.
We also present a pre-processed version of the dataset, from which we computed statistics of movement, position and handshape of the signs.
arXiv Detail & Related papers (2023-10-26T14:37:01Z)
- Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining [56.26550923909137]
Gloss-Free Sign Language Translation (SLT) is a challenging task due to its cross-domain nature.
We propose a novel gloss-free SLT framework based on Visual-Language Pretraining (GFSLT-VLP).
Our approach involves two stages: (i) integrating Contrastive Language-Image Pre-training with masked self-supervised learning to create pretext tasks that bridge the semantic gap between visual and textual representations and restore masked sentences, and (ii) constructing an end-to-end architecture with an encoder-decoder-like structure that inherits the parameters of the pre-trained Visual Encoder and Text Decoder from the first stage.
arXiv Detail & Related papers (2023-07-27T10:59:18Z)
- ASL-Homework-RGBD Dataset: An annotated dataset of 45 fluent and non-fluent signers performing American Sign Language homeworks [32.3809065803553]
This dataset contains videos of fluent and non-fluent signers using American Sign Language (ASL).
A total of 45 fluent and non-fluent participants were asked to perform signing homework assignments.
The data is annotated to identify several aspects of signing including grammatical features and non-manual markers.
arXiv Detail & Related papers (2022-07-08T17:18:49Z)
- WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language [2.814213966364155]
We build a large-scale dataset of American Sign Language signs annotated with six different phonological properties.
We investigate whether data-driven end-to-end and feature-based approaches can be optimised to automatically recognise these properties.
arXiv Detail & Related papers (2022-03-11T17:21:24Z)
- A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation [54.29679610921429]
Existing sign language datasets contain only about 10K-20K pairs of sign videos, gloss annotations and texts.
Data is thus a bottleneck for training effective sign language translation models.
The proposed simple multi-modality transfer learning baseline surpasses the previous state-of-the-art results on two sign language translation benchmarks.
arXiv Detail & Related papers (2022-03-08T18:59:56Z)
- Improving Sign Language Translation with Monolingual Data by Sign Back-Translation [105.83166521438463]
We propose a sign back-translation (SignBT) approach, which incorporates massive spoken language texts into sign training.
With a text-to-gloss translation model, we first back-translate the monolingual text to its gloss sequence.
Then, the paired sign sequence is generated by splicing pieces from an estimated gloss-to-sign bank at the feature level.
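The SignBT pipeline described above (back-translate text to glosses, then splice per-gloss feature pieces into a pseudo sign sequence) can be sketched at a toy level. The gloss bank, the word-to-gloss mapping, and the feature vectors below are all illustrative stand-ins, not the authors' models or data:

```python
# Toy sketch of sign back-translation (SignBT). The "features" are short
# lists standing in for per-gloss video feature clips; the text-to-gloss
# step is a trivial uppercase mapping, a stand-in for a learned model.

GLOSS_BANK = {
    "HELLO": [[0.1, 0.2], [0.1, 0.3]],   # feature frames for the gloss HELLO
    "WORLD": [[0.5, 0.5]],               # feature frames for the gloss WORLD
}

def text_to_gloss(text):
    """Stand-in for a text-to-gloss translation model."""
    return [w.upper() for w in text.split()]

def splice_sign_sequence(glosses, bank):
    """Concatenate per-gloss feature pieces into one pseudo sign sequence."""
    frames = []
    for g in glosses:
        frames.extend(bank.get(g, []))  # skip glosses missing from the bank
    return frames

glosses = text_to_gloss("hello world")
pseudo_signs = splice_sign_sequence(glosses, GLOSS_BANK)
print(len(pseudo_signs))  # -> 3 feature frames
```

The point of the feature-level splicing is that it manufactures (pseudo sign video, text) training pairs from monolingual text alone, which is exactly how the approach sidesteps the scarcity of parallel sign language data.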
arXiv Detail & Related papers (2021-05-26T08:49:30Z)
- How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language [37.578776156503906]
How2Sign is a multimodal and multiview continuous American Sign Language (ASL) dataset.
It consists of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.
A three-hour subset was recorded in the Panoptic studio enabling detailed 3D pose estimation.
arXiv Detail & Related papers (2020-08-18T20:22:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.