Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional
Translation Modeling using a Two-Dimensional Grid
- URL: http://arxiv.org/abs/2011.12165v1
- Date: Tue, 24 Nov 2020 15:42:32 GMT
- Authors: Parnia Bahar, Christopher Brix and Hermann Ney
- Abstract summary: This paper proposes to build a single end-to-end bidirectional translation model using a two-dimensional grid.
Instead of training two models independently, our approach encourages a single network to jointly learn to translate in both directions.
- Score: 47.39346022004215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural translation models have proven to be effective in capturing
sufficient information from a source sentence and generating a high-quality
target sentence. However, it is not easy to achieve the best performance for
bidirectional translation, i.e., both source-to-target and target-to-source
translation, using a single model. If we exclude some pioneering attempts,
such as multilingual systems, all other bidirectional translation approaches
require training two individual models. This paper proposes to build a single
end-to-end bidirectional translation model using a two-dimensional grid, where
left-to-right decoding generates the source-to-target output and bottom-to-top
decoding produces the target-to-source output. Instead of training two models
independently, our approach encourages a single network to jointly learn to
translate in both directions. Experiments on the WMT 2018
German$\leftrightarrow$English and Turkish$\leftrightarrow$English translation
tasks show that the proposed model achieves good translation quality and has
sufficient potential to direct future research.
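As a concrete illustration of the grid idea, here is a minimal PyTorch sketch, assuming a simple recurrent grid cell; all class and variable names (TwoWayGrid, cell, to_tgt, to_src) are illustrative assumptions, not the authors' implementation. One shared network fills a grid of states h[i][j] over source prefix i and target prefix j; reading the last column left-to-right yields source-to-target predictions, and reading the last row bottom-to-top yields target-to-source predictions, so a single backward pass trains both directions jointly.

```python
# Minimal sketch of joint two-way training on a 2D grid (assumed design,
# not the paper's exact architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoWayGrid(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # one cell shared by both directions: mixes the left and bottom
        # grid neighbours with the source/target embeddings at (i, j)
        self.cell = nn.Linear(4 * dim, dim)
        self.to_tgt = nn.Linear(dim, vocab_size)  # source-to-target head
        self.to_src = nn.Linear(dim, vocab_size)  # target-to-source head

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src: (I,) and tgt: (J,) token ids of one pair, each starting
        # with a BOS token
        e_src, e_tgt = self.embed(src), self.embed(tgt)
        I, J, d = src.numel(), tgt.numel(), e_src.size(-1)
        zero = e_src.new_zeros(d)
        h = [[zero for _ in range(J + 1)] for _ in range(I + 1)]
        for i in range(1, I + 1):
            for j in range(1, J + 1):
                x = torch.cat([h[i - 1][j], h[i][j - 1],
                               e_src[i - 1], e_tgt[j - 1]])
                h[i][j] = torch.tanh(self.cell(x))
        # left-to-right readout: h[I][j] has seen the full source and the
        # first j target tokens, so it predicts tgt[j]
        s2t = torch.stack([self.to_tgt(h[I][j]) for j in range(1, J)])
        # bottom-to-top readout: h[i][J] has seen the full target and the
        # first i source tokens, so it predicts src[i]
        t2s = torch.stack([self.to_src(h[i][J]) for i in range(1, I)])
        return F.cross_entropy(s2t, tgt[1:]) + F.cross_entropy(t2s, src[1:])

model = TwoWayGrid(vocab_size=100)
src = torch.tensor([0, 5, 7, 9])  # id 0 plays the role of BOS here
tgt = torch.tensor([0, 3, 4])
loss = model(src, tgt)
loss.backward()  # gradients flow through both directions jointly
```

The paper builds on far stronger two-dimensional architectures; the point of the sketch is only the shared grid with two readout directions.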
Related papers
- Relay Decoding: Concatenating Large Language Models for Machine Translation [21.367605327742027]
We propose an innovative approach called Relay Decoding (RD), which entails concatenating two distinct large models that individually support the source and target languages.
By incorporating a simple mapping layer to facilitate the connection between these two models and utilizing a limited amount of parallel data for training, we successfully achieve superior results in the machine translation task.
arXiv Detail & Related papers (2024-05-05T13:42:25Z)
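A minimal sketch of the concatenation idea in the entry above, assuming two frozen pretrained models and a single trainable linear mapping layer between them; the wiring in the comments uses HuggingFace-style calls with placeholder names (src_model, tgt_model) as assumptions.

```python
# Sketch of relay-style concatenation (assumed reading of the abstract):
# a frozen source-side model produces hidden states, a small trainable
# mapping layer projects them into the target-side model's embedding
# space, and the frozen target-side model decodes from there.
import torch
import torch.nn as nn

class RelayBridge(nn.Module):
    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        # the only trainable component: a simple linear mapping layer
        self.mapping = nn.Linear(src_dim, tgt_dim)

    def forward(self, src_hidden: torch.Tensor) -> torch.Tensor:
        # src_hidden: (batch, seq_len, src_dim) from the source model's
        # final layer; returns soft "embeddings" for the target model
        return self.mapping(src_hidden)

# Hypothetical wiring with two frozen models (names are placeholders):
# src_states = src_model(src_ids, output_hidden_states=True).hidden_states[-1]
# tgt_inputs = bridge(src_states)
# loss = tgt_model(inputs_embeds=tgt_inputs, labels=tgt_ids).loss
bridge = RelayBridge(src_dim=1024, tgt_dim=2048)
fake_states = torch.randn(2, 16, 1024)
print(bridge(fake_states).shape)  # torch.Size([2, 16, 2048])
```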
- Duplex Diffusion Models Improve Speech-to-Speech Translation [1.4649095013539173]
Speech-to-speech translation is a sequence-to-sequence learning task that naturally has two directions.
We propose a duplex diffusion model that applies diffusion probabilistic models to both sides of a reversible duplex Conformer.
Our model enables reversible speech translation by simply flipping the input and output ends.
arXiv Detail & Related papers (2023-05-22T01:39:40Z)
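The duplex model above hinges on reversibility. As a generic illustration (an additive coupling block, not the paper's duplex Conformer), the following sketch shows how one set of parameters can be run forward for one direction and inverted exactly for the other:

```python
# Generic reversible block: the same parameters map A -> B via forward()
# and B -> A via inverse(), which is the property a duplex model exploits
# when flipping its input and output ends.
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.g = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # exact inverse: recovers the inputs from the outputs
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

block = ReversibleBlock(dim=8)
a, b = torch.randn(4, 8), torch.randn(4, 8)
y1, y2 = block(a, b)
x1, x2 = block.inverse(y1, y2)
print(torch.allclose(x1, a, atol=1e-6), torch.allclose(x2, b, atol=1e-6))
```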
- Dual-Alignment Pre-training for Cross-lingual Sentence Embedding [79.98111074307657]
We propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding.
We introduce a novel representation translation learning (RTL) task, where the model learns to use one-side contextualized token representation to reconstruct its translation counterpart.
Our approach can significantly improve sentence embedding.
arXiv Detail & Related papers (2023-05-16T03:53:30Z)
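A simplified reading of the RTL task above, sketched with assumed shapes and an assumed query mechanism: a light attention head reconstructs the translation's token ids from only the other side's contextualized representations.

```python
# Sketch of an RTL-style objective (my simplified reading, not the
# paper's exact module): one side's token representations are the only
# evidence available for reconstructing the translation counterpart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RTLHead(nn.Module):
    def __init__(self, dim: int, vocab_size: int):
        super().__init__()
        # queries for the translation attend over the source-side tokens
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src_reps, tgt_queries):
        # src_reps: (B, S, dim) one-side contextualized representations
        # tgt_queries: (B, T, dim) assumed positional queries
        ctx, _ = self.attn(tgt_queries, src_reps, src_reps)
        return self.out(ctx)  # (B, T, vocab) logits over translation tokens

dim, vocab = 64, 1000
head = RTLHead(dim, vocab)
src_reps = torch.randn(2, 10, dim)         # e.g. encoder output for side A
queries = torch.randn(2, 7, dim)           # placeholder positional queries
tgt_ids = torch.randint(0, vocab, (2, 7))  # translation token ids (side B)
logits = head(src_reps, queries)
loss = F.cross_entropy(logits.reshape(-1, vocab), tgt_ids.reshape(-1))
```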
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo-parallel data with a translated source side, but is fed natural source sentences at inference.
The source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses pseudo-parallel data (natural source, translated target) to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
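A sketch of what such an online self-training step could look like; model.translate and model.loss are assumed helper interfaces, not a real library API. Each update sees both a back-translation pair and a (natural source, translated target) pair that mimics inference.

```python
# Assumed training loop for online self-training in UNMT (illustrative
# interfaces, not the paper's code).
import torch

def self_training_step(model, optimizer, natural_src_batch, natural_tgt_batch):
    model.eval()
    with torch.no_grad():
        # back-translation pair: translated source, natural target
        synth_src = model.translate(natural_tgt_batch, direction="tgt2src")
        # self-training pair: natural source, translated target,
        # matching the natural-source condition seen at inference
        synth_tgt = model.translate(natural_src_batch, direction="src2tgt")
    model.train()
    loss = (model.loss(src=synth_src, tgt=natural_tgt_batch) +
            model.loss(src=natural_src_batch, tgt=synth_tgt))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```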
- Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation.
Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally.
Experimental results show that BiT significantly improves state-of-the-art neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z)
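The data recipe behind BiT's early stage can be sketched in a few lines, assuming the simplest reading of "bidirectionally update the model parameters": every pair is also used in the reversed direction.

```python
# Sketch of the assumed BiT data construction: double the training set
# by adding each pair in the reversed direction for the early stage,
# then continue on the normal source-to-target data only.
def make_bidirectional(pairs):
    """pairs: list of (src, tgt) sentences -> both-direction training set."""
    both = []
    for src, tgt in pairs:
        both.append((src, tgt))  # forward direction
        both.append((tgt, src))  # reversed direction, same parameters
    return both

pairs = [("guten Morgen", "good morning"), ("danke", "thank you")]
early_stage = make_bidirectional(pairs)   # used for the early updates
late_stage = pairs                        # then tune the model normally
print(len(early_stage), len(late_stage))  # 4 2
```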
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
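As one possible instantiation of representation-level regularization (my illustration, not the paper's exact objective), a term like the following pulls encoder representations of parallel sentences together, which is one way to discourage off-target translation:

```python
# Illustrative representation-level regularizer (assumed form): penalize
# the cosine distance between mean-pooled encoder states of a sentence
# and its translation.
import torch
import torch.nn.functional as F

def representation_regularizer(enc_states_a, enc_states_b):
    # enc_states_*: (B, seq_len, dim) encoder outputs for parallel sentences
    pooled_a = enc_states_a.mean(dim=1)
    pooled_b = enc_states_b.mean(dim=1)
    # 1 - cosine similarity, averaged over the batch
    return (1 - F.cosine_similarity(pooled_a, pooled_b, dim=-1)).mean()

reg = representation_regularizer(torch.randn(4, 12, 64), torch.randn(4, 9, 64))
# total_loss = translation_loss + lambda_reg * reg   (lambda_reg assumed)
```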
- Bilingual alignment transfers to multilingual alignment for unsupervised parallel text mining [3.4519649635864584]
This work presents methods for learning cross-lingual sentence representations using paired or unpaired bilingual texts.
We hypothesize that the cross-lingual alignment strategy is transferable: a model trained to align only two languages can therefore encode representations that are more aligned across many languages.
arXiv Detail & Related papers (2021-04-15T17:51:22Z)
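A standard contrastive objective is one assumed instantiation of learning aligned sentence representations from paired texts (the paper's exact loss may differ): translated pairs are positives and all other sentences in the batch are negatives.

```python
# Standard InfoNCE-style alignment loss over paired sentence embeddings
# (an assumed instantiation, not necessarily the paper's formulation).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(emb_a, emb_b, temperature: float = 0.05):
    # emb_a, emb_b: (B, dim) sentence embeddings; row i of each is a pair
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    logits = a @ b.t() / temperature   # (B, B) similarity matrix
    labels = torch.arange(a.size(0))   # the diagonal holds the positives
    return F.cross_entropy(logits, labels)

loss = contrastive_alignment_loss(torch.randn(8, 128), torch.randn(8, 128))
```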
- Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
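The core SeqKD step can be sketched as data creation, with teacher.translate as an assumed interface; the entry's bidirectional variant applies the same idea on the source side (paraphrased transcriptions) as well.

```python
# Sketch of sequence-level knowledge distillation (SeqKD) data creation:
# the text-based NMT teacher translates the transcripts, and the E2E-ST
# student trains on those outputs instead of the human references.
# `teacher.translate` is an assumed interface.
def build_seqkd_data(examples, teacher):
    """examples: list of (audio, transcript); returns distilled pairs."""
    distilled = []
    for audio, transcript in examples:
        # the teacher output replaces the human reference as the target
        teacher_translation = teacher.translate(transcript)
        distilled.append((audio, teacher_translation))
    return distilled
```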
- A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data [0.0]
Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model.
This work proposes a novel approach that enables both the backward and forward models to benefit from the monolingual target data.
arXiv Detail & Related papers (2020-11-14T22:18:45Z)
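A generic back-translation round, with assumed helper interfaces, shows how monolingual target data can reach both models; the paper's hybrid scheme refines this, so treat the sketch as background rather than the proposed method.

```python
# Generic back-translation round with assumed interfaces (translate,
# train_on); the hybrid idea is that the backward model also improves
# from the monolingual target data instead of staying fixed.
def hybrid_round(forward_model, backward_model, parallel, mono_tgt):
    # back-translate monolingual target data with the backward model
    synthetic = [(backward_model.translate(t), t) for t in mono_tgt]
    # forward model trains on real plus synthetic (source, target) pairs
    forward_model.train_on(parallel + synthetic)
    # backward model is also updated using the same monolingual data,
    # e.g. via self-learning on the synthetic pairs (assumed detail)
    backward_model.train_on([(t, s) for (s, t) in synthetic])
    return forward_model, backward_model
```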