The USYD-JD Speech Translation System for IWSLT 2021
- URL: http://arxiv.org/abs/2107.11572v1
- Date: Sat, 24 Jul 2021 09:53:34 GMT
- Title: The USYD-JD Speech Translation System for IWSLT 2021
- Authors: Liang Ding, Di Wu, Dacheng Tao
- Abstract summary: This paper describes the University of Sydney & JD's joint submission to the IWSLT 2021 low-resource speech translation task.
We trained our models with the officially provided ASR and MT datasets.
To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning.
- Score: 85.64797317290349
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper describes the University of Sydney & JD's joint submission to
the IWSLT 2021 low-resource speech translation task. We participated in the
Swahili-English direction and obtained the best sacreBLEU score (25.3) among all
participants. Our constrained system is based on a pipeline framework, i.e., ASR
and NMT. We trained our models with the officially provided ASR and MT
datasets. The ASR system is built with the open-source toolkit Kaldi, and this
work mainly explores how to make the most of the NMT models. To reduce the
punctuation errors generated by the ASR model, we employ our previous work
SlotRefine to train a punctuation correction model. To achieve better
translation performance, we explored the most recent effective strategies,
including back translation, knowledge distillation, multi-feature reranking and
transductive finetuning. For the model structure, we tried both autoregressive
and non-autoregressive models. In addition, we proposed two novel
pre-training approaches, i.e., \textit{de-noising training} and
\textit{bidirectional training} to fully exploit the data. Extensive
experiments show that adding the above techniques consistently improves the
BLEU scores, and the final submission system outperforms the baseline
(Transformer ensemble model trained with the original parallel data) by
approximately 10.8 BLEU points, achieving SOTA performance.
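As a rough illustration of the cascaded setup the abstract describes (ASR, punctuation correction, then NMT with n-best reranking), here is a minimal Python sketch. The stage functions (run_kaldi_asr, restore_punctuation, translate, rerank) are hypothetical placeholders standing in for the Kaldi ASR system, the SlotRefine-based punctuation model, the Transformer ensemble, and the multi-feature reranker; they are not the authors' actual interfaces.

```python
# Minimal sketch of the cascaded Swahili->English pipeline described above.
# All stage functions are hypothetical placeholders, stubbed so the control
# flow is runnable end to end.

def run_kaldi_asr(audio_path: str) -> str:
    """Placeholder for Kaldi decoding: returns an unpunctuated transcript."""
    return "habari ya asubuhi rais alisema"  # dummy output

def restore_punctuation(transcript: str) -> str:
    """Placeholder for the SlotRefine-style punctuation correction model."""
    return transcript.capitalize() + "."  # dummy behaviour

def translate(sentence: str, n_best: int = 5) -> list[str]:
    """Placeholder for the NMT ensemble; returns an n-best list for reranking."""
    return [f"<hyp {i}> translation of: {sentence}" for i in range(n_best)]

def rerank(hypotheses: list[str]) -> str:
    """Placeholder for multi-feature reranking over the n-best list."""
    return hypotheses[0]

def speech_translate(audio_path: str) -> str:
    transcript = run_kaldi_asr(audio_path)          # ASR
    punctuated = restore_punctuation(transcript)    # punctuation correction
    candidates = translate(punctuated)              # NMT n-best list
    return rerank(candidates)                       # final English output

if __name__ == "__main__":
    print(speech_translate("example.wav"))
```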
Related papers
- Strategies for improving low resource speech to text translation relying
on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z)
- Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation System for the WMT22 Translation Task [49.916963624249355]
This paper describes Tencent AI Lab - Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task.
We participate in the general translation task on English$\Leftrightarrow$Livonian.
Our system is based on M2M100 with novel techniques that adapt it to the target language pair.
arXiv Detail & Related papers (2022-10-17T04:34:09Z)
- End-to-End Training for Back-Translation with Categorical Reparameterization Trick [0.0]
Back-translation is an effective semi-supervised learning framework in neural machine translation (NMT).
A pre-trained NMT model translates monolingual sentences and makes synthetic bilingual sentence pairs for the training of the other NMT model.
The discrete property of translated sentences prevents gradient information from flowing between the two NMT models.
arXiv Detail & Related papers (2022-02-17T06:31:03Z)
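For context, a categorical reparameterization of this kind is commonly instantiated as the Gumbel-softmax trick, which replaces a hard token choice with a differentiable soft sample so gradients can flow between the two NMT models. The sketch below illustrates that generic idea only; it is an assumption, not the paper's exact formulation.

```python
# Generic Gumbel-softmax sketch: a differentiable "soft" sample from a
# categorical distribution over the target vocabulary, in place of a hard,
# non-differentiable argmax/sampling step. Illustrative only.
import numpy as np

def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Draw a soft one-hot sample from a categorical distribution."""
    rng = rng or np.random.default_rng(0)
    gumbel_noise = -np.log(-np.log(rng.uniform(1e-9, 1.0, size=logits.shape)))
    y = (logits + gumbel_noise) / temperature
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

# Vocabulary logits for one target position produced by the backward model.
logits = np.array([2.0, 0.5, -1.0, 0.1])
soft_token = gumbel_softmax_sample(logits, temperature=0.5)
print(soft_token, soft_token.argmax())  # soft distribution and its hard argmax
```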
- Improving Neural Machine Translation by Denoising Training [95.96569884410137]
We present a simple and effective pretraining strategy, denoising training (DoT), for neural machine translation.
We update the model parameters with source- and target-side denoising tasks at the early stage and then tune the model normally.
Experiments show DoT consistently improves the neural machine translation performance across 12 bilingual and 16 multilingual directions.
arXiv Detail & Related papers (2022-01-19T00:11:38Z)
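To make the denoising idea concrete, the sketch below builds a corrupted input / clean target pair with token masking and local shuffling. These particular noise functions are common choices assumed for illustration, not necessarily the exact ones used by DoT.

```python
# Rough sketch of denoising-style pretraining data: a clean sentence is
# corrupted (token masking plus light local shuffling) and the model learns
# to reconstruct the original. Noise functions are illustrative assumptions.
import random

def corrupt(tokens, mask_prob=0.15, shuffle_window=3, seed=0):
    rng = random.Random(seed)
    noisy = ["<mask>" if rng.random() < mask_prob else t for t in tokens]
    out = []
    for i in range(0, len(noisy), shuffle_window):
        window = noisy[i:i + shuffle_window]
        rng.shuffle(window)              # permute tokens inside a small window
        out.extend(window)
    return out

sentence = "we update the model parameters with denoising tasks".split()
print(corrupt(sentence))   # corrupted input
print(sentence)            # reconstruction target
```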
- Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present a simple and effective pretraining strategy -- bidirectional training (BiT) -- for neural machine translation.
Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally.
Experimental results show that BiT significantly improves the SOTA neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z)
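A minimal sketch of the bidirectional-training idea follows: the parallel data is doubled with the reversed direction for an early training stage before normal tuning. The details here (no direction tag, a simple two-stage split) are simplifying assumptions rather than BiT's exact recipe.

```python
# Sketch: double the parallel data with the reversed (target->source)
# direction for the early training stage, then tune on the original
# direction only. Simplified illustration, not BiT's exact schedule.
def make_bidirectional(pairs):
    forward = [(src, tgt) for src, tgt in pairs]
    backward = [(tgt, src) for src, tgt in pairs]
    return forward + backward

parallel = [("habari ya asubuhi", "good morning"),
            ("asante sana", "thank you very much")]
early_stage_data = make_bidirectional(parallel)   # used at the early stage
late_stage_data = parallel                        # normal tuning afterwards
print(len(early_stage_data), len(late_stage_data))
```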
- A Technical Report: BUT Speech Translation Systems [2.9327503320877457]
The paper describes BUT's speech translation systems.
The systems are English$\longrightarrow$German offline speech translation systems.
A large degradation is observed when translating ASR hypotheses compared to the oracle input text.
arXiv Detail & Related papers (2020-10-22T10:52:31Z)
- Jointly Trained Transformers models for Spoken Language Translation [2.3886615435250302]
This work trains SLT systems with the ASR objective as an auxiliary loss, and the two networks are connected through neural hidden representations.
This architecture improves the BLEU score from 36.8 to 44.5.
All the experiments are reported on English-Portuguese speech translation task using How2 corpus.
arXiv Detail & Related papers (2020-04-25T11:28:39Z)
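As an illustration of training with an auxiliary ASR objective, the sketch below combines the translation loss and the ASR loss into a single weighted objective; the weighting scheme and the dummy values are assumptions for illustration, not the paper's configuration.

```python
# Illustrative joint objective: total loss = translation loss plus a weighted
# auxiliary ASR loss computed from shared hidden representations.
def joint_loss(st_loss: float, asr_loss: float, asr_weight: float = 0.3) -> float:
    """Combine the ST loss with the auxiliary ASR loss."""
    return st_loss + asr_weight * asr_loss

# dummy per-batch values just to show the combination
for st, asr in [(3.2, 2.9), (2.4, 2.1), (1.9, 1.7)]:
    print(f"joint loss: {joint_loss(st, asr):.2f}")
```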
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and subsequent LU systems can be reduced significantly, by 14% relative, with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)