Machine Translation between Spoken Languages and Signed Languages
Represented in SignWriting
- URL: http://arxiv.org/abs/2210.05404v1
- Date: Tue, 11 Oct 2022 12:28:06 GMT
- Title: Machine Translation between Spoken Languages and Signed Languages
Represented in SignWriting
- Authors: Zifan Jiang, Amit Moryossef, Mathias M\"uller, Sarah Ebling
- Abstract summary: We introduce novel methods to parse, factorize, decode, and evaluate SignWriting, leveraging ideas from neural factored MT.
We find that common MT techniques used to improve spoken language translation similarly affect the performance of sign language translation.
- Score: 5.17427644066658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents work on novel machine translation (MT) systems between
spoken and signed languages, where signed languages are represented in
SignWriting, a sign language writing system. Our work seeks to address the lack
of out-of-the-box support for signed languages in current MT systems and is
based on the SignBank dataset, which contains pairs of spoken language text and
SignWriting content. We introduce novel methods to parse, factorize, decode,
and evaluate SignWriting, leveraging ideas from neural factored MT. In a
bilingual setup--translating from American Sign Language to (American)
English--our method achieves over 30 BLEU, while in two multilingual
setups--translating in both directions between spoken languages and signed
languages--we achieve over 20 BLEU. We find that common MT techniques used to
improve spoken language translation similarly affect the performance of sign
language translation. These findings validate our use of an intermediate text
representation for signed languages to include them in natural language
processing research.
Related papers
- Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues [56.038123093599815]
Our objective is to translate continuous sign language into spoken language text.
We incorporate additional contextual cues together with the signing video.
We show that our contextual approach significantly enhances the quality of the translations.
arXiv Detail & Related papers (2025-01-16T18:59:03Z) - Signs as Tokens: An Autoregressive Multilingual Sign Language Generator [55.94334001112357]
We introduce a multilingual sign language model, Signs as Tokens (SOKE), which can generate 3D sign avatars autoregressively from text inputs.
To align sign language with the LM, we develop a decoupled tokenizer that discretizes continuous signs into token sequences representing various body parts.
These sign tokens are integrated into the raw text vocabulary of the LM, allowing for supervised fine-tuning on sign language datasets.
arXiv Detail & Related papers (2024-11-26T18:28:09Z) - SignBLEU: Automatic Evaluation of Multi-channel Sign Language Translation [3.9711029428461653]
We introduce a new task named multi-channel sign language translation (MCSLT)
We present a novel metric, SignBLEU, designed to capture multiple signal channels.
We found that SignBLEU consistently correlates better with human judgment than competing metrics.
arXiv Detail & Related papers (2024-06-10T05:01:26Z) - LLMs are Good Sign Language Translators [19.259163728870696]
Sign Language Translation is a challenging task that aims to translate sign videos into spoken language.
We propose a novel SignLLM framework to transform sign videos into a language-like representation.
We achieve state-of-the-art gloss-free results on two widely-used SLT benchmarks.
arXiv Detail & Related papers (2024-04-01T05:07:13Z) - JWSign: A Highly Multilingual Corpus of Bible Translations for more
Diversity in Sign Language Processing [2.9936326613596775]
JWSign dataset consists of 2,530 hours of Bible translations in 98 sign languages.
We train multilingual systems, including some that take into account the typological relatedness of signed or spoken languages.
arXiv Detail & Related papers (2023-11-16T20:02:44Z) - Enhancing Cross-lingual Transfer via Phonemic Transcription Integration [57.109031654219294]
PhoneXL is a framework incorporating phonemic transcriptions as an additional linguistic modality for cross-lingual transfer.
Our pilot study reveals phonemic transcription provides essential information beyond the orthography to enhance cross-lingual transfer.
arXiv Detail & Related papers (2023-07-10T06:17:33Z) - LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both situations by disentangling language-specific information.
Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating different languages in frame-level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z) - Looking for Clues of Language in Multilingual BERT to Improve
Cross-lingual Generalization [56.87201892585477]
Token embeddings in multilingual BERT (m-BERT) contain both language and semantic information.
We control the output languages of multilingual BERT by manipulating the token embeddings.
arXiv Detail & Related papers (2020-10-20T05:41:35Z) - Sign Language Transformers: Joint End-to-end Sign Language Recognition
and Translation [59.38247587308604]
We introduce a novel transformer based architecture that jointly learns Continuous Sign Language Recognition and Translation.
We evaluate the recognition and translation performances of our approaches on the challenging RWTH-PHOENIX-Weather-2014T dataset.
Our translation networks outperform both sign video to spoken language and gloss to spoken language translation models.
arXiv Detail & Related papers (2020-03-30T21:35:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.