Improving Sign Language Translation with Monolingual Data by Sign
Back-Translation
- URL: http://arxiv.org/abs/2105.12397v1
- Date: Wed, 26 May 2021 08:49:30 GMT
- Authors: Hao Zhou, Wengang Zhou, Weizhen Qi, Junfu Pu, Houqiang Li
- Abstract summary: We propose a sign back-translation (SignBT) approach, which incorporates massive spoken language texts into sign training.
With a text-to-gloss translation model, we first back-translate the monolingual text to its gloss sequence.
Then, the paired sign sequence is generated by splicing pieces from an estimated gloss-to-sign bank at the feature level.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite existing pioneering works on sign language translation (SLT), there
is a non-trivial obstacle, i.e., the limited quantity of parallel sign-text
data. To tackle this parallel data bottleneck, we propose a sign
back-translation (SignBT) approach, which incorporates massive spoken language
texts into SLT training. With a text-to-gloss translation model, we first
back-translate the monolingual text to its gloss sequence. Then, the paired
sign sequence is generated by splicing pieces from an estimated gloss-to-sign
bank at the feature level. Finally, the synthetic parallel data serves as a
strong supplement for the end-to-end training of the encoder-decoder SLT
framework.
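The back-translation pipeline described above can be sketched in code. This is a minimal illustrative mock, not the authors' implementation: `text_to_gloss` stands in for their trained text-to-gloss translation model, and the gloss-to-sign bank is simplified to a dictionary mapping each gloss to candidate feature pieces (lists of floats in place of real video features).

```python
import random

def text_to_gloss(text):
    """Placeholder for the text-to-gloss back-translation model;
    here a trivial word-level uppercase mapping."""
    return [w.upper() for w in text.split()]

def splice_sign_features(glosses, gloss_bank):
    """Splice feature pieces from the estimated gloss-to-sign bank
    to form a synthetic sign feature sequence."""
    sequence = []
    for g in glosses:
        pieces = gloss_bank.get(g)
        if pieces:  # skip glosses missing from the bank
            sequence.extend(random.choice(pieces))
    return sequence

def sign_back_translate(monolingual_texts, gloss_bank):
    """Turn monolingual texts into synthetic (sign features, text)
    pairs that supplement parallel SLT training data."""
    pairs = []
    for text in monolingual_texts:
        glosses = text_to_gloss(text)
        features = splice_sign_features(glosses, gloss_bank)
        pairs.append((features, text))
    return pairs

# Toy bank: each gloss maps to one or more candidate feature pieces.
bank = {"I": [[0.1, 0.2]], "GO": [[0.3, 0.4], [0.5, 0.6]]}
print(sign_back_translate(["i go"], bank))
```

The resulting synthetic pairs would then be mixed with the genuine parallel data to train the encoder-decoder SLT model end to end.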
To promote the SLT research, we further contribute CSL-Daily, a large-scale
continuous SLT dataset. It provides both spoken language translations and
gloss-level annotations. The topics revolve around people's daily lives (e.g.,
travel, shopping, medical care), which represent the most likely SLT
application scenarios.
Extensive experimental results and analysis of SLT methods are reported on
CSL-Daily. With the proposed sign back-translation method, we obtain a
substantial improvement over previous state-of-the-art SLT methods.
Related papers
- Scaling Sign Language Translation [38.43594795927101]
Sign language translation (SLT) addresses the problem of translating information from a sign language in video to a spoken language in text.
In this paper, we push forward the frontier of SLT by scaling pretraining data, model size, and number of translation directions.
Experiments show substantial quality improvements over the vanilla baselines, surpassing the previous state-of-the-art (SOTA) by wide margins.
arXiv Detail & Related papers (2024-07-16T15:36:58Z) - Gloss-free Sign Language Translation: Improving from Visual-Language
Pretraining [56.26550923909137]
Gloss-Free Sign Language Translation (SLT) is a challenging task due to its cross-domain nature.
We propose a novel Gloss-Free SLT framework based on Visual-Language Pretraining (GFSLT-).
Our approach involves two stages: (i) integrating Contrastive Language-Image Pre-training with masked self-supervised learning to create pre-tasks that bridge the semantic gap between visual and textual representations and restore masked sentences, and (ii) constructing an end-to-end architecture with an encoder-decoder-like structure that inherits the parameters of the pre-trained Visual and Text Decoder from
arXiv Detail & Related papers (2023-07-27T10:59:18Z) - Cross-modality Data Augmentation for End-to-End Sign Language Translation [66.46877279084083]
End-to-end sign language translation (SLT) aims to convert sign language videos into spoken language texts directly without intermediate representations.
It has been a challenging task due to the modality gap between sign videos and texts and the scarcity of labeled data.
We propose a novel Cross-modality Data Augmentation (XmDA) framework to transfer the powerful gloss-to-text translation capabilities to end-to-end sign language translation.
arXiv Detail & Related papers (2023-05-18T16:34:18Z) - Better Sign Language Translation with Monolingual Data [6.845232643246564]
Sign language translation (SLT) systems rely heavily on the availability of large-scale parallel gloss-to-text (G2T) pairs.
This paper proposes a simple and efficient rule transformation method to transcribe the large-scale target monolingual data into its pseudo glosses automatically.
Empirical results show that the proposed approach can significantly improve the performance of SLT.
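The rule-transformation idea above can be sketched as follows. This is a hypothetical toy illustration of the general approach (the paper's actual rules are not given here): spoken-language text is mapped to pseudo glosses by dropping function words, stripping punctuation, and rendering tokens in gloss-style uppercase.

```python
# Function words that sign glosses typically omit (illustrative set).
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of"}

def text_to_pseudo_gloss(sentence):
    # Rule 1: lowercase and tokenize on whitespace.
    tokens = sentence.lower().split()
    # Rule 2: drop function words.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Rule 3: strip punctuation and render in gloss-style uppercase.
    return [t.strip(".,!?").upper() for t in tokens if t.strip(".,!?")]

print(text_to_pseudo_gloss("The weather is nice today."))
# → ['WEATHER', 'NICE', 'TODAY']
```

Such pseudo gloss-text pairs can then augment the scarce genuine parallel data.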
arXiv Detail & Related papers (2023-04-21T09:39:54Z) - LSA-T: The first continuous Argentinian Sign Language dataset for Sign
Language Translation [52.87578398308052]
Sign language translation (SLT) is an active field of study that encompasses human-computer interaction, computer vision, natural language processing and machine learning.
This paper presents the first continuous Argentinian Sign Language (LSA) dataset.
It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with labels and keypoint annotations for each signer.
arXiv Detail & Related papers (2022-11-14T14:46:44Z) - Scaling Back-Translation with Domain Text Generation for Sign Language
Gloss Translation [36.40377483258876]
Sign language gloss translation aims to translate the sign glosses into spoken language texts.
Back translation (BT) generates pseudo-parallel data by translating in-domain spoken language texts into sign glosses.
We propose a Prompt based domain text Generation (PGEN) approach to produce the large-scale spoken language text data.
arXiv Detail & Related papers (2022-10-13T14:25:08Z) - A Token-level Contrastive Framework for Sign Language Translation [9.185037439012952]
Sign Language Translation is a promising technology to bridge the communication gap between deaf and hearing people.
We propose ConSLT, a novel token-level Contrastive learning framework for Sign Language Translation.
arXiv Detail & Related papers (2022-04-11T07:33:26Z) - A Simple Multi-Modality Transfer Learning Baseline for Sign Language
Translation [54.29679610921429]
Existing sign language datasets contain only about 10K-20K pairs of sign videos, gloss annotations and texts.
Data is thus a bottleneck for training effective sign language translation models.
This simple baseline surpasses the previous state-of-the-art results on two sign language translation benchmarks.
arXiv Detail & Related papers (2022-03-08T18:59:56Z) - SimulSLT: End-to-End Simultaneous Sign Language Translation [55.54237194555432]
Existing sign language translation methods need to read the entire video before starting the translation.
We propose SimulSLT, the first end-to-end simultaneous sign language translation model.
SimulSLT achieves BLEU scores that exceed the latest end-to-end non-simultaneous sign language translation model.
arXiv Detail & Related papers (2021-12-08T11:04:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.