Better Sign Language Translation with Monolingual Data
- URL: http://arxiv.org/abs/2304.10844v1
- Date: Fri, 21 Apr 2023 09:39:54 GMT
- Title: Better Sign Language Translation with Monolingual Data
- Authors: Ru Peng, Yawen Zeng, Junbo Zhao
- Abstract summary: Sign language translation (SLT) systems rely heavily on the availability of large-scale parallel gloss-to-text (G2T) pairs.
This paper proposes a simple and efficient rule transformation method to transcribe the large-scale target monolingual data into its pseudo glosses automatically.
Empirical results show that the proposed approach can significantly improve the performance of SLT.
- Score: 6.845232643246564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sign language translation (SLT) systems, which are often decomposed into
video-to-gloss (V2G) recognition and gloss-to-text (G2T) translation through
the pivot gloss, rely heavily on the availability of large-scale parallel G2T
pairs. However, the manual annotation of pivot gloss, which is a sequence of
transcribed written-language words in the order in which they are signed,
further exacerbates the scarcity of data for SLT. To address this issue, this
paper proposes a simple and efficient rule transformation method to transcribe
the large-scale target monolingual data into its pseudo glosses automatically
to enhance SLT. Empirical results show that the proposed
approach can significantly improve the performance of SLT, especially achieving
state-of-the-art results on two SLT benchmark datasets, PHOENIX-WEATHER 2014T
and ASLG-PC12. Our code has been released at:
https://github.com/pengr/Mono_SLT.
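The abstract does not spell out the transformation rules, but pseudo-gloss transcription of this kind can be sketched as below. The rule set and function-word list here are hypothetical illustrations, not the paper's actual rules:

```python
import re

# Hypothetical function-word list; the paper's actual rule set is
# not specified in the abstract.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "is", "are", "and", "in"}

def text_to_pseudo_gloss(sentence: str) -> str:
    """Transcribe a spoken-language sentence into a pseudo gloss by
    stripping punctuation, dropping function words, and uppercasing
    the remainder (a common gloss convention)."""
    tokens = re.findall(r"[a-zA-Z']+", sentence.lower())
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return " ".join(t.upper() for t in content)

print(text_to_pseudo_gloss("The weather is cold in the north."))
# prints "WEATHER COLD NORTH"
```

Pseudo glosses produced this way can then be paired with the original monolingual sentences to form additional synthetic G2T training data.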
Related papers
- Select and Reorder: A Novel Approach for Neural Sign Language Production [35.35777909051466]
Sign languages, often categorised as low-resource languages, face significant challenges in achieving accurate translation.
This paper introduces Select and Reorder (S&R), a novel approach that addresses data scarcity by breaking down the translation process into two distinct steps: Gloss Selection (GS) and Gloss Reordering (GR)
We achieve state-of-the-art BLEU and ROUGE scores on the Meine DGS Annotated (mDGS) dataset, demonstrating a substantial BLEU-1 improvement of 37.88% in Text-to-Gloss (T2G) translation.
arXiv Detail & Related papers (2024-04-17T16:25:19Z) - Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining [56.26550923909137]
Gloss-Free Sign Language Translation (SLT) is a challenging task due to its cross-domain nature.
We propose a novel Gloss-Free SLT based on Visual-Language Pretraining (GFSLT-VLP).
Our approach involves two stages: (i) integrating Contrastive Language-Image Pre-training with masked self-supervised learning to create pre-tasks that bridge the semantic gap between visual and textual representations and restore masked sentences, and (ii) constructing an end-to-end architecture with an encoder-decoder-like structure that inherits the parameters of the pre-trained Visual Encoder and Text Decoder from the first stage.
arXiv Detail & Related papers (2023-07-27T10:59:18Z) - Gloss-Free End-to-End Sign Language Translation [59.28829048788345]
We design the Gloss-Free End-to-end sign language translation framework (GloFE)
Our method improves the performance of SLT in the gloss-free setting by exploiting the shared underlying semantics of signs and the corresponding spoken translation.
We obtained state-of-the-art results on large-scale datasets, including OpenASL and How2Sign.
arXiv Detail & Related papers (2023-05-22T09:57:43Z) - Cross-modality Data Augmentation for End-to-End Sign Language Translation [66.46877279084083]
End-to-end sign language translation (SLT) aims to convert sign language videos into spoken language texts directly without intermediate representations.
It has been a challenging task due to the modality gap between sign videos and texts and the data scarcity of labeled data.
We propose a novel Cross-modality Data Augmentation (XmDA) framework to transfer the powerful gloss-to-text translation capabilities to end-to-end sign language translation.
arXiv Detail & Related papers (2023-05-18T16:34:18Z) - A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation [54.29679610921429]
Existing sign language datasets contain only about 10K-20K pairs of sign videos, gloss annotations and texts.
Data is thus a bottleneck for training effective sign language translation models.
This simple baseline surpasses the previous state-of-the-art results on two sign language translation benchmarks.
arXiv Detail & Related papers (2022-03-08T18:59:56Z) - Improving Sign Language Translation with Monolingual Data by Sign Back-Translation [105.83166521438463]
We propose a sign back-translation (SignBT) approach, which incorporates massive spoken language texts into sign training.
With a text-to-gloss translation model, we first back-translate the monolingual text to its gloss sequence.
Then, the paired sign sequence is generated by splicing pieces from an estimated gloss-to-sign bank at the feature level.
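The two-step SignBT pipeline above can be sketched as follows. The gloss bank contents, feature vectors, and the stand-in text-to-gloss model are illustrative assumptions, not the paper's actual components:

```python
import random

# Hypothetical gloss-to-sign bank: maps each gloss to candidate
# feature pieces estimated from real sign videos.
SIGN_BANK = {
    "WEATHER": [[0.1, 0.2], [0.3, 0.1]],
    "COLD":    [[0.5, 0.4]],
    "NORTH":   [[0.2, 0.9], [0.8, 0.7]],
}

def back_translate(text: str) -> list[str]:
    """Stand-in for the text-to-gloss back-translation model; here
    we simply uppercase and split the monolingual sentence."""
    return text.upper().replace(".", "").split()

def splice_sign_sequence(glosses: list[str], rng: random.Random) -> list[float]:
    """Generate a pseudo sign sequence by splicing, for each gloss,
    one feature piece drawn from the gloss-to-sign bank."""
    sequence: list[float] = []
    for g in glosses:
        pieces = SIGN_BANK.get(g)
        if pieces:  # skip glosses missing from the bank
            sequence.extend(rng.choice(pieces))
    return sequence

rng = random.Random(0)
glosses = back_translate("weather cold north.")
pseudo_sign = splice_sign_sequence(glosses, rng)
print(glosses, pseudo_sign)
```

The spliced feature sequence, paired with the original text, then serves as an extra (sign, text) training example for the SLT model.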
arXiv Detail & Related papers (2021-05-26T08:49:30Z) - Data Augmentation for Sign Language Gloss Translation [115.13684506803529]
Sign language translation (SLT) is often decomposed into video-to-gloss recognition and gloss-to-text translation.
We focus here on gloss-to-text translation, which we treat as a low-resource neural machine translation (NMT) problem.
By pre-training on the thus obtained synthetic data, we improve translation from American Sign Language (ASL) to English and German Sign Language (DGS) to German by up to 3.14 and 2.20 BLEU, respectively.
arXiv Detail & Related papers (2021-05-16T16:37:36Z) - Better Sign Language Translation with STMC-Transformer [9.835743237370218]
Sign Language Translation first uses a Sign Language Recognition system to extract sign language glosses from videos.
A translation system then generates spoken language translations from the sign language glosses.
This paper introduces the STMC-Transformer, which improves on the current state of the art by over 5 and 7 BLEU on two benchmarks, respectively.
arXiv Detail & Related papers (2020-04-01T17:20:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.