Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation
System for the WMT22 Translation Task
- URL: http://arxiv.org/abs/2210.08742v1
- Date: Mon, 17 Oct 2022 04:34:09 GMT
- Title: Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation
System for the WMT22 Translation Task
- Authors: Zhiwei He, Xing Wang, Zhaopeng Tu, Shuming Shi, Rui Wang
- Abstract summary: This paper describes Tencent AI Lab - Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task.
We participate in the general translation task on English$\Leftrightarrow$Livonian.
Our system is based on M2M100 with novel techniques that adapt it to the target language pair.
- Score: 49.916963624249355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes Tencent AI Lab - Shanghai Jiao Tong University
(TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task. We
participate in the general translation task on
English$\Leftrightarrow$Livonian. Our system is based on M2M100 with novel
techniques that adapt it to the target language pair. (1) Cross-model word
embedding alignment: inspired by cross-lingual word embedding alignment, we
successfully transfer a pre-trained word embedding to M2M100, enabling it to
support Livonian. (2) Gradual adaptation strategy: we exploit Estonian and
Latvian as auxiliary languages for many-to-many translation training and then
adapt to English-Livonian. (3) Data augmentation: to enlarge the parallel data
for English-Livonian, we construct pseudo-parallel data with Estonian and
Latvian as pivot languages. (4) Fine-tuning: to make the most of all available
data, we fine-tune the model with the validation set and online
back-translation, further boosting the performance. In model evaluation: (1) We
find that previous work underestimated the translation performance of Livonian
due to inconsistent Unicode normalization, which may cause a discrepancy of up
to 14.9 BLEU points. (2) In addition to the standard validation set, we also
employ round-trip BLEU to evaluate the models, which we find more appropriate
for this task. Finally, our unconstrained system achieves BLEU scores of 17.0
and 30.4 for English to/from Livonian.
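
The abstract does not spell out the alignment procedure, but the general idea behind cross-lingual word embedding alignment in point (1) can be sketched as an orthogonal Procrustes fit over tokens shared by a donor embedding and M2M100. The names `donor_emb`, `m2m_emb` and `shared_pairs` below are illustrative placeholders, not the authors' code:

```python
import numpy as np

def procrustes_align(donor_emb, m2m_emb, shared_pairs):
    """Fit an orthogonal map W so that donor_emb @ W approximates m2m_emb on shared tokens."""
    X = donor_emb[[i for i, _ in shared_pairs]]  # donor vectors for shared tokens
    Y = m2m_emb[[j for _, j in shared_pairs]]    # M2M100 vectors for the same tokens
    # Orthogonal Procrustes solution: W = U @ Vt, where U, _, Vt = SVD(X^T Y)
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def project_new_tokens(donor_emb, new_token_ids, W):
    """Map embeddings of tokens M2M100 has never seen (e.g. Livonian-specific ones) into its space."""
    return donor_emb[new_token_ids] @ W
```

The learned map can then be applied to Livonian-specific tokens so that their vectors land in M2M100's embedding space before any further training.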
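For point (3), pivot-based data augmentation, a minimal sketch assuming existing English-Estonian (or English-Latvian) bitext and a `translate` function standing in for the actual NMT decoder; the concrete pipeline in the paper may differ:

```python
def build_pseudo_parallel(en_pivot_pairs, translate, pivot_lang="et"):
    """Turn (English, pivot-language) sentence pairs into pseudo (English, Livonian) pairs."""
    pseudo = []
    for en, pivot in en_pivot_pairs:
        liv = translate(pivot, src=pivot_lang, tgt="liv")  # pivot sentence -> Livonian
        pseudo.append((en, liv))
    return pseudo

# Example with Estonian as the pivot; en_et_bitext is assumed to be a list of (English, Estonian) pairs.
# pseudo_en_liv = build_pseudo_parallel(en_et_bitext, translate, pivot_lang="et")
```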
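The Unicode-normalization finding can be illustrated with a small scoring snippet: Livonian diacritics can be encoded either as precomposed characters or as base characters plus combining marks, and scoring hypotheses against references in different forms deflates BLEU. Normalizing both sides to a single form (NFC here, chosen for illustration) before calling sacrebleu avoids the mismatch:

```python
import unicodedata
import sacrebleu

def nfc(lines):
    """Normalize every line to NFC so hypotheses and references use one encoding convention."""
    return [unicodedata.normalize("NFC", line) for line in lines]

hypotheses = ["..."]    # system outputs (placeholders)
references = [["..."]]  # one reference stream, parallel to the hypotheses (placeholders)

bleu = sacrebleu.corpus_bleu(nfc(hypotheses), [nfc(ref) for ref in references])
print(f"BLEU = {bleu.score:.1f}")
```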
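Round-trip BLEU can likewise be sketched in a few lines: English input is translated into Livonian and back, and the reconstruction is scored against the original English, so no Livonian references are required. `translate` is again a placeholder for the model's decoding call:

```python
import sacrebleu

def round_trip_bleu(english_sentences, translate):
    """Score English -> Livonian -> English reconstructions against the original English."""
    livonian = [translate(s, src="en", tgt="liv") for s in english_sentences]
    back = [translate(s, src="liv", tgt="en") for s in livonian]
    return sacrebleu.corpus_bleu(back, [english_sentences]).score
```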
Related papers
- A Paradigm Shift in Machine Translation: Boosting Translation
Performance of Large Language Models [27.777372498182864]
We propose a novel fine-tuning approach for generative large language models (LLMs).
Our approach consists of two fine-tuning stages: initial fine-tuning on monolingual data followed by subsequent fine-tuning on a small set of high-quality parallel data.
Using LLaMA-2 as the underlying model, the approach achieves an average improvement of more than 12 BLEU and 12 COMET over its zero-shot performance.
arXiv Detail & Related papers (2023-09-20T22:53:15Z)
- Tackling Low-Resourced Sign Language Translation: UPC at WMT-SLT 22 [4.382973957294345]
This paper describes the system developed at the Universitat Politècnica de Catalunya for the Sign Language Translation Task of the 2022 Workshop on Machine Translation.
We use a Transformer model implemented with the Fairseq modeling toolkit.
We have experimented with the vocabulary size, data augmentation techniques and pretraining the model with the PHOENIX-14T dataset.
arXiv Detail & Related papers (2022-12-02T12:42:24Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task [16.986003476984965]
This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion.
Basically, we follow the paradigm of fine-tuning large-scale pre-trained models on the downstream task.
Given the task's restriction on the amount of training data that may be used, we follow the data augmentation strategies proposed by WeTS to boost the TS model's performance.
arXiv Detail & Related papers (2022-11-16T15:43:31Z)
- The USYD-JD Speech Translation System for IWSLT 2021 [85.64797317290349]
This paper describes the University of Sydney and JD's joint submission to the IWSLT 2021 low-resource speech translation task.
We trained our models with the officially provided ASR and MT datasets.
To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning.
arXiv Detail & Related papers (2021-07-24T09:53:34Z)
- Beyond English-Centric Multilingual Machine Translation [74.21727842163068]
We create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages.
We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining.
Our focus on non-English-centric models brings gains of more than 10 BLEU when directly translating between non-English directions, while performing competitively with the best single systems of WMT.
arXiv Detail & Related papers (2020-10-21T17:01:23Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves improvements of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 points over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
arXiv Detail & Related papers (2020-10-15T14:04:03Z)
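
As a rough, hypothetical sketch of the mining step described in the last entry above (the self-training loop and the paper's exact scoring are omitted): sentences from both languages are embedded with a multilingual encoder, and mutual nearest neighbors above a similarity threshold are kept as pseudo-parallel pairs. `encode` is a placeholder for an mBERT-based sentence encoder:

```python
import numpy as np

def mine_bitext(src_sents, tgt_sents, encode, threshold=0.9):
    """Keep mutual nearest neighbors above a cosine-similarity threshold as pseudo-parallel pairs."""
    S = encode(src_sents)  # (n_src, dim), assumed L2-normalized sentence embeddings
    T = encode(tgt_sents)  # (n_tgt, dim), assumed L2-normalized sentence embeddings
    sim = S @ T.T          # cosine similarity matrix
    best_tgt = sim.argmax(axis=1)  # nearest target index for every source sentence
    best_src = sim.argmax(axis=0)  # nearest source index for every target sentence
    pairs = []
    for i, j in enumerate(best_tgt):
        if best_src[j] == i and sim[i, j] >= threshold:  # keep mutual nearest neighbors only
            pairs.append((src_sents[i], tgt_sents[j]))
    return pairs
```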