ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine
Translation
- URL: http://arxiv.org/abs/2005.00850v2
- Date: Tue, 12 May 2020 20:44:24 GMT
- Title: ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine
Translation
- Authors: Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel
- Abstract summary: We train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.
Our approach achieves state-of-the-art non-autoregressive results on the IWSLT 2014 DE-EN and WMT 2016 RO-EN datasets.
- Score: 56.59824570139266
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose to train a non-autoregressive machine translation model to
minimize the energy defined by a pretrained autoregressive model. In
particular, we view our non-autoregressive translation system as an inference
network (Tu and Gimpel, 2018) trained to minimize the autoregressive teacher
energy. This contrasts with the popular approach of training a
non-autoregressive model on a distilled corpus consisting of the beam-searched
outputs of such a teacher model. Our approach, which we call ENGINE
(ENerGy-based Inference NEtworks), achieves state-of-the-art non-autoregressive
results on the IWSLT 2014 DE-EN and WMT 2016 RO-EN datasets, approaching the
performance of autoregressive models.
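The core objective lends itself to a short sketch. Below is a minimal, illustrative PyTorch rendering of the idea (not the authors' released code): the non-autoregressive inference network emits a relaxed (soft) translation, a frozen pretrained autoregressive teacher scores it, and the resulting energy is minimized by gradient descent. The interfaces of `inference_net` and `teacher` are assumptions for illustration.

```python
# Minimal sketch of the ENGINE objective (illustrative; not the authors'
# released code). Assumed interfaces: inference_net(src, tgt_len) returns
# per-position vocabulary logits in parallel, and teacher(src, soft_tgt)
# accepts a soft (relaxed) target as mixtures over word embeddings.
import torch
import torch.nn.functional as F

def engine_loss(inference_net, teacher, src_tokens, tgt_len):
    """Energy of the inference network's output under the AR teacher."""
    # The inference network predicts every target position at once.
    logits = inference_net(src_tokens, tgt_len)      # (batch, tgt_len, vocab)
    soft_out = F.softmax(logits, dim=-1)             # relaxed translation

    # The frozen pretrained teacher scores the relaxed output.
    teacher_logits = teacher(src_tokens, soft_out)   # (batch, tgt_len, vocab)
    log_p = F.log_softmax(teacher_logits, dim=-1)

    # Energy = expected negative log-likelihood under the teacher; gradients
    # flow through soft_out into the inference network only.
    return -(soft_out * log_p).sum(dim=-1).mean()

# Freeze the teacher before training so only the inference network updates:
#   for p in teacher.parameters():
#       p.requires_grad_(False)
```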
Related papers
- Improving Non-autoregressive Translation Quality with Pretrained
Language Model, Embedding Distillation and Upsampling Strategy for CTC [57.70351255180495]
This paper introduces a series of innovative techniques to enhance the translation quality of Non-Autoregressive Translation (NAT) models.
We propose fine-tuning Pretrained Multilingual Language Models (PMLMs) with the CTC loss to train NAT models effectively.
Our model achieves a remarkable 16.35x speedup over the autoregressive model.
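As a rough illustration of the CTC-based training mentioned above, the sketch below fine-tunes an encoder (standing in for a pretrained multilingual LM) with PyTorch's built-in CTC loss; the upsampling factor, encoder interface, and blank index are assumptions, not details from the paper.

```python
# Rough sketch of CTC-based NAT training (illustrative; the upsampling
# factor, encoder interface, and blank index are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

def ctc_nat_loss(encoder, src_tokens, src_lengths, tgt_tokens, tgt_lengths,
                 upsample=2):
    # The encoder (standing in for a fine-tuned PMLM) maps an upsampled
    # source to per-position vocabulary logits: (batch, T, vocab).
    logits = encoder(src_tokens.repeat_interleave(upsample, dim=1))
    log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)  # (T, batch, vocab)
    # CTC marginalizes over all blank-augmented monotonic alignments
    # between the T output positions and the shorter target sequence.
    return ctc(log_probs, tgt_tokens, src_lengths * upsample, tgt_lengths)
```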
arXiv Detail & Related papers (2023-06-10T05:24:29Z)
- Your Autoregressive Generative Model Can be Better If You Treat It as an
Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Enriching Non-Autoregressive Transformer with Syntactic and Semantic
Structures for Neural Machine Translation [54.864148836486166]
We propose to incorporate the explicit syntactic and semantic structures of languages into a non-autoregressive Transformer.
Our model achieves significantly faster decoding while preserving translation quality, compared with several state-of-the-art non-autoregressive models.
arXiv Detail & Related papers (2021-01-22T04:12:17Z)
- A Spectral Energy Distance for Parallel Speech Synthesis [29.14723501889278]
Speech synthesis is an important practical generative modeling problem.
We propose a new learning method that allows us to train highly parallel models of speech.
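For intuition, a generalized energy distance over spectrogram features can be estimated from two independent model samples and one real waveform. The sketch below is a simplified, single-scale form assumed for illustration; the paper's actual loss is multi-scale.

```python
# Simplified sketch of a spectral energy distance (single-scale form assumed
# for illustration; the paper uses a multi-scale spectrogram loss). x1 and x2
# are two independent model samples for the same conditioning input; y is the
# real waveform.
import torch

def spec(wav, n_fft=512, hop=128):
    # Magnitude spectrogram used as the feature map for the distance.
    window = torch.hann_window(n_fft)
    return torch.stft(wav, n_fft, hop_length=hop, window=window,
                      return_complex=True).abs()

def spectral_energy_distance(x1, x2, y):
    d = lambda a, b: (spec(a) - spec(b)).norm()
    # Generalized energy distance estimator: pull samples toward the data,
    # push the two model samples apart; the repulsive term keeps the model
    # from collapsing onto the conditional mean.
    return d(x1, y) + d(x2, y) - d(x1, x2)
```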
arXiv Detail & Related papers (2020-08-03T19:56:04Z)
- Aligned Cross Entropy for Non-Autoregressive Machine Translation [120.15069387374717]
We propose aligned cross entropy (AXE) as an alternative loss function for training of non-autoregressive models.
AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks.
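To make the idea concrete, here is a simplified monotonic-alignment cross entropy in the spirit of AXE (not the paper's exact recursion): dynamic programming finds the cheapest alignment of the parallel predictions to the target, letting unmatched positions emit a blank.

```python
# Simplified monotonic-alignment cross entropy in the spirit of AXE
# (illustrative; not the paper's exact recursion). Each of the N parallel
# predictions either aligns to the next target token or emits a blank.
import torch

def aligned_xent(log_probs, target, blank_id=0):
    """log_probs: (N, vocab) per-position log-probabilities, N >= len(target).
    target: (M,) gold token ids. Returns the minimal aligned NLL as a float;
    a training version would keep tensor ops so gradients can flow."""
    N, M = log_probs.size(0), target.size(0)
    INF = float("inf")
    # dp[i][j]: best cost of aligning the first i predictions to the
    # first j target tokens.
    dp = [[INF] * (M + 1) for _ in range(N + 1)]
    dp[0][0] = 0.0
    for i in range(1, N + 1):
        for j in range(min(i, M) + 1):
            # Option 1: prediction i emits a blank (skip prediction).
            best = dp[i - 1][j] - log_probs[i - 1, blank_id].item()
            # Option 2: prediction i is aligned to target token j.
            if j > 0:
                align = dp[i - 1][j - 1] - log_probs[i - 1, target[j - 1]].item()
                best = min(best, align)
            dp[i][j] = best
    return dp[N][M]  # every target token must be consumed
```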
arXiv Detail & Related papers (2020-04-03T16:24:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.