Source and Target Bidirectional Knowledge Distillation for End-to-end
Speech Translation
- URL: http://arxiv.org/abs/2104.06457v1
- Date: Tue, 13 Apr 2021 19:00:51 GMT
- Title: Source and Target Bidirectional Knowledge Distillation for End-to-end
Speech Translation
- Authors: Hirofumi Inaguma, Tatsuya Kawahara, Shinji Watanabe
- Abstract summary: We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
- Score: 88.78138830698173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A conventional approach to improving the performance of end-to-end speech
translation (E2E-ST) models is to leverage the source transcription via
pre-training and joint training with automatic speech recognition (ASR) and
neural machine translation (NMT) tasks. However, since the input modalities
differ, it is difficult to leverage the source-language text effectively. In
this work, we focus on sequence-level knowledge distillation (SeqKD) from
external text-based NMT models. To leverage the full potential of the source
language information, we propose backward SeqKD, SeqKD from a target-to-source
backward NMT model. To this end, we train a bilingual E2E-ST model to predict
paraphrased transcriptions as an auxiliary task with a single decoder. The
paraphrases are generated from the translations in bitext via back-translation.
We further propose bidirectional SeqKD in which SeqKD from both forward and
backward NMT models is combined. Experimental evaluations on both
autoregressive and non-autoregressive models show that SeqKD in each direction
consistently improves the translation performance, and the effectiveness is
complementary regardless of the model capacity.
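For concreteness, the following is a minimal sketch of how the bidirectional SeqKD training data described above could be assembled: forward SeqKD replaces the gold translation with the output of an external source-to-target NMT teacher, while backward SeqKD adds a paraphrased transcription obtained by back-translating the gold translation with a target-to-source teacher. The function names, the `<2src>`/`<2tgt>` tag tokens, and the use of prefix tags to drive a single bilingual decoder are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of bidirectional SeqKD data preparation. Assumptions: `forward_nmt`
# and `backward_nmt` are decoding functions of external text-based NMT
# teachers; <2src>/<2tgt> prefix tags select the output language of a single
# bilingual decoder.
from typing import Callable, Dict, List

Translate = Callable[[str], str]  # text in -> decoded text out

def build_bidir_seqkd_data(
    triples: List[Dict[str, str]],   # {"audio": ..., "src": transcript, "tgt": translation}
    forward_nmt: Translate,          # source -> target teacher (forward SeqKD)
    backward_nmt: Translate,         # target -> source teacher (backward SeqKD)
) -> List[Dict[str, str]]:
    """Each utterance yields two decoder targets: a distilled translation
    (forward SeqKD) and a paraphrased transcription obtained by
    back-translating the gold translation (backward SeqKD)."""
    examples = []
    for ex in triples:
        distilled_tgt = forward_nmt(ex["src"])     # pseudo translation
        paraphrased_src = backward_nmt(ex["tgt"])  # paraphrased transcript
        examples.append({
            "audio": ex["audio"],
            "target_main": "<2tgt> " + distilled_tgt,
            "target_aux": "<2src> " + paraphrased_src,
        })
    return examples
```

At training time, both targets would be decoded by the same decoder from the shared speech-encoder states, with the paraphrase direction serving as the auxiliary task and the two cross-entropy losses summed.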
Related papers
- Confidence Based Bidirectional Global Context Aware Training Framework
for Neural Machine Translation [74.99653288574892]
We propose a Confidence Based Bidirectional Global Context Aware (CBBGCA) training framework for neural machine translation (NMT).
Our proposed CBBGCA training framework significantly improves the NMT model by +1.02, +1.30 and +0.57 BLEU scores on three large-scale translation datasets.
arXiv Detail & Related papers (2022-02-28T10:24:22Z)
- Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present bidirectional training (BiT), a simple and effective pretraining strategy for neural machine translation.
Specifically, we update the model parameters bidirectionally in the early stage of training and then tune the model normally.
Experimental results show that BiT significantly improves state-of-the-art neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z)
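As a rough illustration of the BiT recipe summarized above (pre-train on both translation directions, then tune normally), the direction-flipped corpus could be built as follows. Whether direction tags or a shared vocabulary are used is omitted here and would be an assumption either way.

```python
# Sketch of bidirectional training data: pre-train on the union of
# source->target and target->source pairs, then fine-tune on the original
# direction only. The schedule and any tagging are assumptions, not the
# paper's exact recipe.
from typing import List, Tuple

def make_bidirectional_corpus(bitext: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Duplicate the parallel corpus with the translation direction flipped."""
    forward = [(src, tgt) for src, tgt in bitext]
    backward = [(tgt, src) for src, tgt in bitext]
    return forward + backward

# usage: pre-train on make_bidirectional_corpus(bitext) for the early epochs,
# then continue training on `bitext` alone ("tune the model normally").
```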
- Self-supervised and Supervised Joint Training for Resource-rich Machine Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
We propose a joint training approach, $F$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models.
arXiv Detail & Related papers (2021-06-08T02:35:40Z)
- Tight Integrated End-to-End Training for Cascaded Speech Translation [40.76367623739673]
A cascaded speech translation model relies on discrete, non-differentiable transcriptions.
Direct speech translation is an alternative method to avoid error propagation.
This work explores the feasibility of collapsing the components of the entire cascade into a single end-to-end trainable model.
arXiv Detail & Related papers (2020-11-24T15:43:49Z)
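To see why the discrete transcription breaks differentiability and how an integrated cascade can avoid it, one common trick is to feed the MT encoder an expectation over source embeddings under the ASR posterior instead of an argmax transcription. The sketch below shows only that soft link; the temperature and names are assumptions, and the paper's exact integration may differ.

```python
# Sketch: replace the discrete transcription between ASR and MT with a
# differentiable "soft" input (expected embedding under the ASR posterior),
# so gradients from the MT loss can flow back into the ASR model.
import torch

def soft_transcription_embeddings(
    asr_logits: torch.Tensor,           # (batch, src_len, vocab) ASR output scores
    mt_embedding: torch.nn.Embedding,   # MT source embedding table (vocab, dim)
    temperature: float = 1.0,           # assumed sharpening knob
) -> torch.Tensor:
    """Return (batch, src_len, dim) inputs for the MT encoder."""
    probs = torch.softmax(asr_logits / temperature, dim=-1)
    return probs @ mt_embedding.weight  # expectation over token embeddings
```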
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
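The summary above only names Dynamic Blocking, so the sketch below illustrates one plausible reading of the idea: during decoding, if the model has just emitted a source token, the token that immediately follows it in the source is temporarily blocked, discouraging verbatim copying of the input. Treat this as an assumption-laden illustration, not the paper's exact algorithm.

```python
# Illustrative "dynamic blocking"-style constraint during greedy decoding:
# forbid the continuation of any source bigram the model is about to copy.
from typing import Callable, Dict, List

def decode_with_blocking(
    step_scores: Callable[[List[str]], Dict[str, float]],  # prefix -> candidate scores
    source_tokens: List[str],
    max_len: int = 50,
    eos: str = "</s>",
) -> List[str]:
    output: List[str] = []
    for _ in range(max_len):
        scores = dict(step_scores(output))
        if output:
            last = output[-1]
            for i, tok in enumerate(source_tokens[:-1]):
                if tok == last:
                    scores.pop(source_tokens[i + 1], None)  # block the copy continuation
        if not scores:
            break
        next_tok = max(scores, key=scores.get)
        if next_tok == eos:
            break
        output.append(next_tok)
    return output
```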
- Neural Simultaneous Speech Translation Using Alignment-Based Chunking [4.224809458327515]
In simultaneous machine translation, the objective is to determine when to produce a partial translation given a continuous stream of source words.
We propose a neural machine translation (NMT) model that makes dynamic decisions about whether to keep reading input or to generate output words.
Our results on the IWSLT 2020 English-to-German task outperform a wait-k baseline by 2.6 to 3.7% BLEU absolute.
arXiv Detail & Related papers (2020-05-29T10:20:48Z)
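For reference, the wait-k baseline mentioned above follows a fixed schedule: read k source tokens, then alternate between writing one target token and reading one more source token. A minimal sketch follows, with `translate_step` standing in for the model; the paper's own policy is dynamic and alignment-based, so this covers only the baseline.

```python
# Sketch of the wait-k simultaneous decoding policy used as the baseline:
# read k source tokens first, then alternate write-one / read-one.
from typing import Callable, Iterator, List

def wait_k_decode(
    source_stream: Iterator[str],
    translate_step: Callable[[List[str], List[str]], str],  # (src read, tgt so far) -> next token
    k: int = 3,
    max_len: int = 200,
    eos: str = "</s>",
) -> List[str]:
    read: List[str] = []
    written: List[str] = []
    exhausted = False
    for _ in range(max_len):
        # READ until k more source tokens than written target tokens are available
        while not exhausted and len(read) < len(written) + k:
            try:
                read.append(next(source_stream))
            except StopIteration:
                exhausted = True
        token = translate_step(read, written)  # WRITE one target token
        if token == eos:
            break
        written.append(token)
    return written
```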
- DiscreTalk: Text-to-Speech as a Machine Translation Problem [52.33785857500754]
This paper proposes a new end-to-end text-to-speech (E2E-TTS) model based on neural machine translation (NMT).
The proposed model consists of two components: a non-autoregressive vector quantized variational autoencoder (VQ-VAE) model and an autoregressive Transformer-NMT model.
arXiv Detail & Related papers (2020-05-12T02:45:09Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework that models cross-sentence dependencies by training a neural machine translation (NMT) model to predict both the target translation and the sentences surrounding a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
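A minimal sketch of the joint objective summarized above: from the same encoded source sentence, predict the target translation and the neighbouring source sentences. The weighted loss combination and the callable signatures are assumptions for illustration, not the paper's exact setup.

```python
# Sketch of a multi-task document-context objective: translate the current
# sentence and, from the same encoder states, also predict its neighbours.
from typing import Any, Callable

Encode = Callable[[str], Any]
DecoderLoss = Callable[[Any, str], float]  # (encoder states, reference text) -> loss

def document_context_loss(
    encode: Encode,
    translation_loss: DecoderLoss,    # predict the target translation
    prev_sentence_loss: DecoderLoss,  # predict the previous source sentence
    next_sentence_loss: DecoderLoss,  # predict the next source sentence
    src: str, tgt: str, prev_src: str, next_src: str,
    aux_weight: float = 0.5,          # assumed weighting of the auxiliary tasks
) -> float:
    enc = encode(src)
    return (translation_loss(enc, tgt)
            + aux_weight * (prev_sentence_loss(enc, prev_src)
                            + next_sentence_loss(enc, next_src)))
```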