Improving Cascaded Unsupervised Speech Translation with Denoising
  Back-translation
- URL: http://arxiv.org/abs/2305.07455v1
- Date: Fri, 12 May 2023 13:07:51 GMT
- Title: Improving Cascaded Unsupervised Speech Translation with Denoising
  Back-translation
- Authors: Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu,
  Shinji Watanabe, Hung-yi Lee
- Abstract summary: We propose to build a cascaded speech translation system without leveraging any kind of paired data.
We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS.
- Score: 70.33052952571884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most speech translation models rely heavily on parallel data, which is
hard to collect, especially for low-resource languages. To tackle this issue, we
propose to build a cascaded speech translation system without leveraging any
kind of paired data. We use fully unpaired data to train our unsupervised
systems and evaluate our results on CoVoST 2 and CVSS. The results show that
our work is comparable with some early supervised methods on some language
pairs. Because cascaded systems suffer from severe error propagation, we
propose denoising back-translation (DBT), a novel approach to building robust
unsupervised neural machine translation (UNMT). DBT increases the BLEU score
by 0.7--0.9 in all three translation directions. Moreover, we simplify the
pipeline of our cascaded system to reduce inference latency and conduct a
comprehensive analysis of every part of our work. We also demonstrate our
unsupervised speech translation results on a website we have set up.
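The abstract does not spell out how DBT perturbs its inputs, so the sketch below is only an illustration of the general idea: corrupt back-translated (pseudo) source text with ASR-like errors before training on it, so that the UNMT component stays robust to the noisy transcripts a cascaded system feeds it. The `add_noise` function, its operation set (dropout, blanking, adjacent swaps), and all rates are assumptions for illustration, not the paper's exact recipe.

```python
import random

def add_noise(tokens, drop_p=0.1, blank_p=0.1, swap_p=0.1, blank="<unk>"):
    """Corrupt a token list with word dropout, blanking, and adjacent swaps
    (hypothetical stand-ins for ASR-style transcription errors)."""
    out = []
    for tok in tokens:
        r = random.random()
        if r < drop_p:
            continue                      # drop the word entirely
        if r < drop_p + blank_p:
            out.append(blank)             # replace it with a placeholder
        else:
            out.append(tok)
    i = 0
    while i < len(out) - 1:               # swap neighbors with prob. swap_p
        if random.random() < swap_p:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

# Hypothetical placement inside one back-translation round:
#   pseudo_src = reverse_model.translate(tgt_sentence)    # back-translate
#   noisy_src  = add_noise(pseudo_src.split())            # denoising step
#   forward_model.train_on(" ".join(noisy_src), tgt_sentence)

if __name__ == "__main__":
    random.seed(0)
    print(add_noise("the cat sat on the mat".split()))
```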

Related papers
        - Prosody in Cascade and Direct Speech-to-Text Translation: a case study
  on Korean Wh-Phrases [79.07111754406841]
 This work proposes using contrastive evaluation to measure the ability of direct S2TT systems to disambiguate utterances where prosody plays a crucial role.
Our results clearly demonstrate the value of direct translation systems over cascade translation models.
 arXiv  Detail & Related papers  (2024-02-01T14:46:35Z)
- Semi-supervised Neural Machine Translation with Consistency
  Regularization for Low-Resource Languages [3.475371300689165]
 This paper presents a simple yet effective method to tackle data scarcity for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner.
Specifically, our approach combines a cross-entropy loss for supervised learning with a KL-divergence consistency loss for unsupervised learning over pseudo and augmented target sentences (a hedged loss sketch appears after this list).
 Experimental results show that our approach significantly improves NMT baselines by 0.46--2.03 BLEU, especially on low-resource datasets.
 arXiv  Detail & Related papers  (2023-04-02T15:24:08Z)
- Simple and Effective Unsupervised Speech Translation [68.25022245914363]
 We study a simple and effective approach to building speech translation systems without labeled data.
We present an unsupervised domain adaptation technique for pre-trained speech models.
Experiments show that unsupervised speech-to-text translation outperforms the previous unsupervised state of the art.
 arXiv  Detail & Related papers  (2022-10-18T22:26:13Z)
- Unsupervised Neural Machine Translation with Generative Language Models
  Only [19.74865387759671]
 We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models.
Our method consists of three steps: few-shot amplification, distillation, and back-translation.
 arXiv  Detail & Related papers  (2021-10-11T17:35:34Z)
- Improving Multilingual Translation by Representation and Gradient
  Regularization [82.42760103045083]
 We propose a joint approach to regularizing NMT models at both the representation and gradient levels.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
 arXiv  Detail & Related papers  (2021-09-10T10:52:21Z)
- Cross-lingual Supervision Improves Unsupervised Neural Machine
  Translation [97.84871088440102]
 We introduce a multilingual unsupervised NMT framework to leverage weakly supervised signals from high-resource language pairs to zero-resource translation directions.
The method significantly improves translation quality by more than 3 BLEU on six benchmark unsupervised translation directions.
 arXiv  Detail & Related papers  (2020-04-07T05:46:49Z)
- Robust Unsupervised Neural Machine Translation with Adversarial
  Denoising Training [66.39561682517741]
 Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in how easily the large amounts of monolingual training text it requires can be collected.
In this paper, we explicitly take noisy data into consideration, for the first time, to improve the robustness of UNMT-based systems.
 arXiv  Detail & Related papers  (2020-02-28T05:17:55Z)
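As flagged in the semi-supervised entry above, here is a minimal sketch of a consistency-regularization objective of the kind that abstract describes: token-level cross-entropy on (pseudo-)parallel data plus a KL term that pulls the model's distribution on an augmented input toward its distribution on the original input. The function name, tensor shapes, and the weight `lam` are illustrative assumptions, not that paper's published formulation.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(logits_sup, labels, logits_orig, logits_aug, lam=1.0):
    """Assumed form: cross-entropy (supervised) + lam * KL (consistency)."""
    # Supervised term: token-level cross-entropy against target labels.
    ce = F.cross_entropy(logits_sup.view(-1, logits_sup.size(-1)),
                         labels.view(-1))
    # Consistency term: KL(p_orig || p_aug); F.kl_div takes log-probs first.
    kl = F.kl_div(F.log_softmax(logits_aug, dim=-1),
                  F.softmax(logits_orig, dim=-1),
                  reduction="batchmean")
    return ce + lam * kl

# Toy usage with random logits: batch=2, target length=3, vocab=5.
logits = torch.randn(2, 3, 5)
labels = torch.randint(5, (2, 3))
aug_logits = logits + 0.1 * torch.randn(2, 3, 5)
print(semi_supervised_loss(logits, labels, logits, aug_logits))
```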