Improving Cascaded Unsupervised Speech Translation with Denoising
  Back-translation
- URL: http://arxiv.org/abs/2305.07455v1
- Date: Fri, 12 May 2023 13:07:51 GMT
- Title: Improving Cascaded Unsupervised Speech Translation with Denoising
  Back-translation
- Authors: Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu,
  Shinji Watanabe, Hung-yi Lee
- Abstract summary: We propose to build a cascaded speech translation system without leveraging any kind of paired data.
We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS.
- Score: 70.33052952571884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most speech translation models rely heavily on parallel data, which is
hard to collect, especially for low-resource languages. To tackle this issue, we
propose to build a cascaded speech translation system without leveraging any
kind of paired data. We use fully unpaired data to train our unsupervised
systems and evaluate our results on CoVoST 2 and CVSS. The results show that
our work is comparable with some early supervised methods on some language
pairs. Because cascaded systems suffer from severe error propagation, we
propose denoising back-translation (DBT), a novel approach to building robust
unsupervised neural machine translation (UNMT). DBT increases the BLEU score
by 0.7--0.9 in all three translation directions. Moreover, we simplify the
pipeline of our cascaded system to reduce inference latency and conduct a
comprehensive analysis of every part of our work. We also demonstrate our
unsupervised speech translation results on a website we have set up.
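The abstract does not spell out how DBT perturbs its inputs, so the sketch below is only an illustration of the general idea: corrupt back-translated (pseudo) source text with ASR-like errors before training on it, so that the UNMT component stays robust to the noisy transcripts a cascaded system feeds it. The `add_noise` function, its operation set (dropout, blanking, adjacent swaps), and all rates are assumptions for illustration, not the paper's exact recipe.

```python
import random

def add_noise(tokens, drop_p=0.1, blank_p=0.1, swap_p=0.1, blank="<unk>"):
    """Corrupt a token list with word dropout, blanking, and adjacent swaps
    (hypothetical stand-ins for ASR-style transcription errors)."""
    out = []
    for tok in tokens:
        r = random.random()
        if r < drop_p:
            continue                      # drop the word entirely
        if r < drop_p + blank_p:
            out.append(blank)             # replace it with a placeholder
        else:
            out.append(tok)
    i = 0
    while i < len(out) - 1:               # swap neighbors with prob. swap_p
        if random.random() < swap_p:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

# Hypothetical placement inside one back-translation round:
#   pseudo_src = reverse_model.translate(tgt_sentence)    # back-translate
#   noisy_src  = add_noise(pseudo_src.split())            # denoising step
#   forward_model.train_on(" ".join(noisy_src), tgt_sentence)

if __name__ == "__main__":
    random.seed(0)
    print(add_noise("the cat sat on the mat".split()))
```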

Related papers
        - Prosody in Cascade and Direct Speech-to-Text Translation: a case study
  on Korean Wh-Phrases [79.07111754406841]
 This work proposes using contrastive evaluation to measure the ability of direct S2TT systems to disambiguate utterances where prosody plays a crucial role.
Our results clearly demonstrate the value of direct translation systems over cascade translation models.
 arXiv  Detail & Related papers  (2024-02-01T14:46:35Z)
- Semi-supervised Neural Machine Translation with Consistency
  Regularization for Low-Resource Languages [3.475371300689165]
 This paper presents a simple yet effective method to tackle data scarcity for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner.
Specifically, our approach combines a cross-entropy loss for supervised learning with a KL-divergence consistency loss for unsupervised learning over pseudo and augmented target sentences (a hedged loss sketch appears after this list).
 Experimental results show that our approach significantly improves NMT baselines by 0.46--2.03 BLEU, especially on low-resource datasets.
 arXiv  Detail & Related papers  (2023-04-02T15:24:08Z)
- Simple and Effective Unsupervised Speech Translation [68.25022245914363]
 We study a simple and effective approach to building speech translation systems without labeled data.
We present an unsupervised domain adaptation technique for pre-trained speech models.
Experiments show that unsupervised speech-to-text translation outperforms the previous unsupervised state of the art.
 arXiv  Detail & Related papers  (2022-10-18T22:26:13Z)
- Unsupervised Neural Machine Translation with Generative Language Models
  Only [19.74865387759671]
 We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models.
Our method consists of three steps: few-shot amplification, distillation, and back-translation.
 arXiv  Detail & Related papers  (2021-10-11T17:35:34Z)
- Improving Multilingual Translation by Representation and Gradient
  Regularization [82.42760103045083]
 We propose a joint approach to regularizing NMT models at both the representation and gradient levels.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
 arXiv  Detail & Related papers  (2021-09-10T10:52:21Z)
- Cross-lingual Supervision Improves Unsupervised Neural Machine
  Translation [97.84871088440102]
 We introduce a multilingual unsupervised NMT framework to leverage weakly supervised signals from high-resource language pairs to zero-resource translation directions.
The method significantly improves translation quality by more than 3 BLEU on six benchmark unsupervised translation directions.
 arXiv  Detail & Related papers  (2020-04-07T05:46:49Z)
- Robust Unsupervised Neural Machine Translation with Adversarial
  Denoising Training [66.39561682517741]
 Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in how easily the large amounts of monolingual training text it requires can be collected.
In this paper, we explicitly take noisy data into consideration, for the first time, to improve the robustness of UNMT-based systems.
 arXiv  Detail & Related papers  (2020-02-28T05:17:55Z)
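As flagged in the semi-supervised entry above, here is a minimal sketch of a consistency-regularization objective of the kind that abstract describes: token-level cross-entropy on (pseudo-)parallel data plus a KL term that pulls the model's distribution on an augmented input toward its distribution on the original input. The function name, tensor shapes, and the weight `lam` are illustrative assumptions, not that paper's published formulation.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(logits_sup, labels, logits_orig, logits_aug, lam=1.0):
    """Assumed form: cross-entropy (supervised) + lam * KL (consistency)."""
    # Supervised term: token-level cross-entropy against target labels.
    ce = F.cross_entropy(logits_sup.view(-1, logits_sup.size(-1)),
                         labels.view(-1))
    # Consistency term: KL(p_orig || p_aug); F.kl_div takes log-probs first.
    kl = F.kl_div(F.log_softmax(logits_aug, dim=-1),
                  F.softmax(logits_orig, dim=-1),
                  reduction="batchmean")
    return ce + lam * kl

# Toy usage with random logits: batch=2, target length=3, vocab=5.
logits = torch.randn(2, 3, 5)
labels = torch.randint(5, (2, 3))
aug_logits = logits + 0.1 * torch.randn(2, 3, 5)
print(semi_supervised_loss(logits, labels, logits, aug_logits))
```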