Building Korean Sign Language Augmentation (KoSLA) Corpus with Data
Augmentation Technique
- URL: http://arxiv.org/abs/2207.05261v1
- Date: Tue, 12 Jul 2022 02:12:36 GMT
- Title: Building Korean Sign Language Augmentation (KoSLA) Corpus with Data
Augmentation Technique
- Authors: Changnam An, Eunkyung Han, Dongmyeong Noh, Ohkyoon Kwon, Sumi Lee,
Hyunshim Han
- Abstract summary: We present an efficient corpus-building framework for sign language translation.
By considering the linguistic features of sign language, our proposed framework is the first attempt of its kind to build a multimodal sign language augmentation corpus.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an efficient corpus-building framework for sign language
translation. Aided by a simple yet highly effective data augmentation technique,
our method converts text into annotated forms with minimal information loss.
Sign languages are composed of manual signals, non-manual signals, and iconic
features. According to professional sign language interpreters, non-manual
signals such as facial expressions and gestures play an important role in
conveying exact meaning. By considering these linguistic features of sign
language, our proposed framework is the first attempt of its kind to build a
multimodal sign language augmentation corpus (hereinafter referred to as the
KoSLA corpus) containing both manual and non-manual modalities. The corpus we
built yields reliable results in the hospital domain, with performance
improving on the augmented datasets. To overcome data scarcity, we used data
augmentation techniques such as synonym replacement to make better use of the
available data and improve our translation model, while preserving the
grammatical and semantic structure of sign language. As experimental support,
we verify the effectiveness of the data augmentation technique and the
usefulness of our corpus by performing a translation task between
spoken-language sentences and sign language annotations with two tokenizers.
The results are convincing, showing that BLEU scores obtained with the KoSLA
corpus improve significantly.
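To make the augmentation step concrete, the sketch below applies synonym replacement to the text side of parallel (sentence, annotation) pairs while leaving the annotation side untouched, which is how word order and grammatical structure stay intact. The toy lexicon, whitespace tokenization, and pair layout are illustrative assumptions, not the actual KoSLA pipeline.

```python
# Minimal sketch of synonym-replacement augmentation (illustrative only).
# The lexicon, whitespace tokenization, and pair format are assumptions,
# not the KoSLA pipeline itself.
import random

# Hypothetical lexicon; a real setup would use curated domain terms
# (e.g. hospital-domain Korean vocabulary) rather than this toy dict.
SYNONYMS = {
    "pain": ["ache", "discomfort"],
    "doctor": ["physician"],
    "medicine": ["medication", "drug"],
}

def synonym_replace(tokens, rate=0.2, rng=random):
    """Swap some tokens for a random synonym while keeping word order,
    so the sentence's grammatical structure is preserved."""
    return [
        rng.choice(SYNONYMS[tok]) if tok in SYNONYMS and rng.random() < rate else tok
        for tok in tokens
    ]

def augment_corpus(pairs, n_copies=2, rate=0.2, seed=0):
    """Expand (sentence, annotation) pairs by augmenting the text side only;
    the sign language annotation is left unchanged."""
    rng = random.Random(seed)
    augmented = list(pairs)
    for sentence, annotation in pairs:
        tokens = sentence.split()
        for _ in range(n_copies):
            augmented.append((" ".join(synonym_replace(tokens, rate, rng)), annotation))
    return augmented

if __name__ == "__main__":
    corpus = [("the doctor gave me medicine for the pain", "DOCTOR MEDICINE GIVE PAIN")]
    for sent, ann in augment_corpus(corpus):
        print(f"{sent} -> {ann}")
```

The augmented pairs can then be fed to any translation model and scored with BLEU against the original test set; the replacement rate and number of copies are the knobs that trade corpus size against noise.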
Related papers
- EvSign: Sign Language Recognition and Translation with Streaming Events [59.51655336911345]
Event cameras can naturally perceive dynamic hand movements, providing rich manual cues for sign language tasks.
We propose an efficient transformer-based framework for event-based SLR and SLT tasks.
Our method performs favorably against existing state-of-the-art approaches at only 0.34% of the computational cost.
arXiv Detail & Related papers (2024-07-17T14:16:35Z)
- Reconsidering Sentence-Level Sign Language Translation [2.099922236065961]
We show that for 33% of sentences in our sample, our fluent Deaf signer annotators were only able to understand key parts of the clip in light of discourse-level context.
These results underscore the importance of understanding and sanity checking examples when adapting machine learning to new domains.
arXiv Detail & Related papers (2024-06-16T19:19:54Z)
- Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition [96.62264528407863]
We propose a self-supervised contrastive learning framework to excavate rich context via spatial-temporal consistency.
Inspired by the complementary property of motion and joint modalities, we first introduce first-order motion information into sign language modeling.
Our method is evaluated with extensive experiments on four public benchmarks, and achieves new state-of-the-art performance with a notable margin.
arXiv Detail & Related papers (2024-06-15T04:50:19Z)
- SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale [22.49602248323602]
A persistent challenge in sign language video processing is how we learn representations of sign language.
Our proposed method focuses on just the most relevant parts in a signing video: the face, hands and body posture of the signer.
Our approach is based on learning from individual frames (rather than video sequences) and is therefore much more efficient than prior work on sign language pre-training.
arXiv Detail & Related papers (2024-06-11T03:00:41Z)
- Is context all you need? Scaling Neural Sign Language Translation to Large Domains of Discourse [34.70927441846784]
Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos.
We propose a novel multi-modal transformer architecture that tackles the translation task in a context-aware manner, as a human would.
We report significant improvements on state-of-the-art translation performance using contextual information, nearly doubling the reported BLEU-4 scores of baseline approaches.
arXiv Detail & Related papers (2023-08-18T15:27:22Z)
- Cross-modality Data Augmentation for End-to-End Sign Language Translation [66.46877279084083]
End-to-end sign language translation (SLT) aims to convert sign language videos into spoken language texts directly without intermediate representations.
It has been a challenging task due to the modality gap between sign videos and texts and the scarcity of labeled data.
We propose a novel Cross-modality Data Augmentation (XmDA) framework to transfer the powerful gloss-to-text translation capabilities to end-to-end sign language translation.
arXiv Detail & Related papers (2023-05-18T16:34:18Z)
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- Keypoint based Sign Language Translation without Glosses [7.240731862549344]
We propose a new keypoint normalization method that performs translation based on the skeleton points of the signer.
A normalization method customized to each body part contributes to the performance improvement (a rough sketch of such per-body-part normalization is given after this list).
Our method can be applied to various datasets, including those without glosses.
arXiv Detail & Related papers (2022-04-22T05:37:56Z)
- A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation [54.29679610921429]
Existing sign language datasets contain only about 10K-20K pairs of sign videos, gloss annotations and texts.
Data is thus a bottleneck for training effective sign language translation models.
This simple baseline surpasses the previous state-of-the-art results on two sign language translation benchmarks.
arXiv Detail & Related papers (2022-03-08T18:59:56Z)
- Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation [59.38247587308604]
We introduce a novel transformer based architecture that jointly learns Continuous Sign Language Recognition and Translation.
We evaluate the recognition and translation performances of our approaches on the challenging RWTH-PHOENIX-Weather-2014T dataset.
Our translation networks outperform both sign video to spoken language and gloss to spoken language translation models.
arXiv Detail & Related papers (2020-03-30T21:35:09Z)
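As a rough illustration of the per-body-part keypoint normalization mentioned in the keypoint-based entry above, the sketch below centres and scales each group of keypoints independently so that signers of different sizes and camera framings map to a comparable coordinate space. The body-part groups, index ranges, and normalization scheme are assumptions for illustration, not the authors' method.

```python
# Rough sketch of per-body-part keypoint normalization (illustrative only).
# The body-part groups, index ranges, and scheme are assumptions, not the
# method of the keypoint-based translation paper listed above.
import numpy as np

# Hypothetical grouping of pose-keypoint indices by body part.
BODY_PARTS = {
    "face": list(range(0, 10)),
    "left_hand": list(range(10, 31)),
    "right_hand": list(range(31, 52)),
    "body": list(range(52, 60)),
}

def normalize_keypoints(frames):
    """Normalize a (T, K, 2) keypoint sequence per body part: centre each part
    on its mean position and divide by its spread."""
    frames = np.asarray(frames, dtype=np.float32)
    out = frames.copy()
    for idx in BODY_PARTS.values():
        part = frames[:, idx, :]                          # (T, P, 2)
        centre = part.mean(axis=(0, 1), keepdims=True)    # (1, 1, 2)
        scale = part.std(axis=(0, 1), keepdims=True) + 1e-6
        out[:, idx, :] = (part - centre) / scale
    return out

# Example: 30 frames of 60 (x, y) keypoints.
print(normalize_keypoints(np.random.rand(30, 60, 2)).shape)  # (30, 60, 2)
```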
This list is automatically generated from the titles and abstracts of the papers on this site.