Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional
Translation Modeling using a Two-Dimensional Grid
- URL: http://arxiv.org/abs/2011.12165v1
- Date: Tue, 24 Nov 2020 15:42:32 GMT
- Authors: Parnia Bahar, Christopher Brix and Hermann Ney
- Abstract summary: This paper proposes to build a single end-to-end bidirectional translation model using a two-dimensional grid.
Instead of training two models independently, our approach encourages a single network to jointly learn to translate in both directions.
- Score: 47.39346022004215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural translation models have proven to be effective in capturing
sufficient information from a source sentence and generating a high-quality
target sentence. However, it is not easy to achieve the best performance for
bidirectional translation, i.e., both source-to-target and target-to-source
translation, using a single model. If we exclude some pioneering attempts,
such as multilingual systems, all other bidirectional translation approaches
require training two individual models. This paper proposes to build a single
end-to-end bidirectional translation model using a two-dimensional grid, where
left-to-right decoding generates the source-to-target output and bottom-to-top
decoding produces the target-to-source output. Instead of training two models
independently, our approach encourages a single network to jointly learn to
translate in both directions. Experiments on the WMT 2018
German$\leftrightarrow$English and Turkish$\leftrightarrow$English translation
tasks show that the proposed model achieves good translation quality and has
sufficient potential to direct future research.
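As a concrete illustration of the grid idea, here is a minimal PyTorch sketch, assuming a simple recurrent grid cell; all class and variable names (TwoWayGrid, cell, to_tgt, to_src) are illustrative assumptions, not the authors' implementation. One shared network fills a grid of states h[i][j] over source prefix i and target prefix j; reading the last column left-to-right yields source-to-target predictions, and reading the last row bottom-to-top yields target-to-source predictions, so a single backward pass trains both directions jointly.

```python
# Minimal sketch of joint two-way training on a 2D grid (assumed design,
# not the paper's exact architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoWayGrid(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # one cell shared by both directions: mixes the left and bottom
        # grid neighbours with the source/target embeddings at (i, j)
        self.cell = nn.Linear(4 * dim, dim)
        self.to_tgt = nn.Linear(dim, vocab_size)  # source-to-target head
        self.to_src = nn.Linear(dim, vocab_size)  # target-to-source head

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src: (I,) and tgt: (J,) token ids of one pair, each starting
        # with a BOS token
        e_src, e_tgt = self.embed(src), self.embed(tgt)
        I, J, d = src.numel(), tgt.numel(), e_src.size(-1)
        zero = e_src.new_zeros(d)
        h = [[zero for _ in range(J + 1)] for _ in range(I + 1)]
        for i in range(1, I + 1):
            for j in range(1, J + 1):
                x = torch.cat([h[i - 1][j], h[i][j - 1],
                               e_src[i - 1], e_tgt[j - 1]])
                h[i][j] = torch.tanh(self.cell(x))
        # left-to-right readout: h[I][j] has seen the full source and the
        # first j target tokens, so it predicts tgt[j]
        s2t = torch.stack([self.to_tgt(h[I][j]) for j in range(1, J)])
        # bottom-to-top readout: h[i][J] has seen the full target and the
        # first i source tokens, so it predicts src[i]
        t2s = torch.stack([self.to_src(h[i][J]) for i in range(1, I)])
        return F.cross_entropy(s2t, tgt[1:]) + F.cross_entropy(t2s, src[1:])

model = TwoWayGrid(vocab_size=100)
src = torch.tensor([0, 5, 7, 9])  # id 0 plays the role of BOS here
tgt = torch.tensor([0, 3, 4])
loss = model(src, tgt)
loss.backward()  # gradients flow through both directions jointly
```

The paper builds on far stronger two-dimensional architectures; the point of the sketch is only the shared grid with two readout directions.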
Related papers
- Relay Decoding: Concatenating Large Language Models for Machine Translation [21.367605327742027]
We propose an innovative approach called Relay Decoding (RD), which entails concatenating two distinct large models that individually support the source and target languages.
By incorporating a simple mapping layer to facilitate the connection between these two models and utilizing a limited amount of parallel data for training, we successfully achieve superior results in the machine translation task.
arXiv Detail & Related papers (2024-05-05T13:42:25Z)
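A minimal sketch of the concatenation idea in the entry above, assuming two frozen pretrained models and a single trainable linear mapping layer between them; the wiring in the comments uses HuggingFace-style calls with placeholder names (src_model, tgt_model) as assumptions.

```python
# Sketch of relay-style concatenation (assumed reading of the abstract):
# a frozen source-side model produces hidden states, a small trainable
# mapping layer projects them into the target-side model's embedding
# space, and the frozen target-side model decodes from there.
import torch
import torch.nn as nn

class RelayBridge(nn.Module):
    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        # the only trainable component: a simple linear mapping layer
        self.mapping = nn.Linear(src_dim, tgt_dim)

    def forward(self, src_hidden: torch.Tensor) -> torch.Tensor:
        # src_hidden: (batch, seq_len, src_dim) from the source model's
        # final layer; returns soft "embeddings" for the target model
        return self.mapping(src_hidden)

# Hypothetical wiring with two frozen models (names are placeholders):
# src_states = src_model(src_ids, output_hidden_states=True).hidden_states[-1]
# tgt_inputs = bridge(src_states)
# loss = tgt_model(inputs_embeds=tgt_inputs, labels=tgt_ids).loss
bridge = RelayBridge(src_dim=1024, tgt_dim=2048)
fake_states = torch.randn(2, 16, 1024)
print(bridge(fake_states).shape)  # torch.Size([2, 16, 2048])
```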
- Duplex Diffusion Models Improve Speech-to-Speech Translation [1.4649095013539173]
Speech-to-speech translation is a sequence-to-sequence learning task that naturally has two directions.
We propose a duplex diffusion model that applies diffusion probabilistic models to both sides of a reversible duplex Conformer.
Our model enables reversible speech translation by simply flipping the input and output ends.
arXiv Detail & Related papers (2023-05-22T01:39:40Z)
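The duplex model above hinges on reversibility. As a generic illustration (an additive coupling block, not the paper's duplex Conformer), the following sketch shows how one set of parameters can be run forward for one direction and inverted exactly for the other:

```python
# Generic reversible block: the same parameters map A -> B via forward()
# and B -> A via inverse(), which is the property a duplex model exploits
# when flipping its input and output ends.
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.g = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # exact inverse: recovers the inputs from the outputs
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

block = ReversibleBlock(dim=8)
a, b = torch.randn(4, 8), torch.randn(4, 8)
y1, y2 = block(a, b)
x1, x2 = block.inverse(y1, y2)
print(torch.allclose(x1, a, atol=1e-6), torch.allclose(x2, b, atol=1e-6))
```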
- Dual-Alignment Pre-training for Cross-lingual Sentence Embedding [79.98111074307657]
We propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding.
We introduce a novel representation translation learning (RTL) task, where the model learns to use one-side contextualized token representation to reconstruct its translation counterpart.
Our approach can significantly improve sentence embedding.
arXiv Detail & Related papers (2023-05-16T03:53:30Z)
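A simplified reading of the RTL task above, sketched with assumed shapes and an assumed query mechanism: a light attention head reconstructs the translation's token ids from only the other side's contextualized representations.

```python
# Sketch of an RTL-style objective (my simplified reading, not the
# paper's exact module): one side's token representations are the only
# evidence available for reconstructing the translation counterpart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RTLHead(nn.Module):
    def __init__(self, dim: int, vocab_size: int):
        super().__init__()
        # queries for the translation attend over the source-side tokens
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src_reps, tgt_queries):
        # src_reps: (B, S, dim) one-side contextualized representations
        # tgt_queries: (B, T, dim) assumed positional queries
        ctx, _ = self.attn(tgt_queries, src_reps, src_reps)
        return self.out(ctx)  # (B, T, vocab) logits over translation tokens

dim, vocab = 64, 1000
head = RTLHead(dim, vocab)
src_reps = torch.randn(2, 10, dim)         # e.g. encoder output for side A
queries = torch.randn(2, 7, dim)           # placeholder positional queries
tgt_ids = torch.randint(0, vocab, (2, 7))  # translation token ids (side B)
logits = head(src_reps, queries)
loss = F.cross_entropy(logits.reshape(-1, vocab), tgt_ids.reshape(-1))
```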
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo-parallel data with a translated source side, but is fed natural source sentences at inference.
The source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses pseudo-parallel data (natural source, translated target) to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
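A sketch of what such an online self-training step could look like; model.translate and model.loss are assumed helper interfaces, not a real library API. Each update sees both a back-translation pair and a (natural source, translated target) pair that mimics inference.

```python
# Assumed training loop for online self-training in UNMT (illustrative
# interfaces, not the paper's code).
import torch

def self_training_step(model, optimizer, natural_src_batch, natural_tgt_batch):
    model.eval()
    with torch.no_grad():
        # back-translation pair: translated source, natural target
        synth_src = model.translate(natural_tgt_batch, direction="tgt2src")
        # self-training pair: natural source, translated target,
        # matching the natural-source condition seen at inference
        synth_tgt = model.translate(natural_src_batch, direction="src2tgt")
    model.train()
    loss = (model.loss(src=synth_src, tgt=natural_tgt_batch) +
            model.loss(src=natural_src_batch, tgt=synth_tgt))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```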
- Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation.
Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally.
Experimental results show that BiT significantly improves state-of-the-art neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z)
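The data recipe behind BiT's early stage can be sketched in a few lines, assuming the simplest reading of "bidirectionally update the model parameters": every pair is also used in the reversed direction.

```python
# Sketch of the assumed BiT data construction: double the training set
# by adding each pair in the reversed direction for the early stage,
# then continue on the normal source-to-target data only.
def make_bidirectional(pairs):
    """pairs: list of (src, tgt) sentences -> both-direction training set."""
    both = []
    for src, tgt in pairs:
        both.append((src, tgt))  # forward direction
        both.append((tgt, src))  # reversed direction, same parameters
    return both

pairs = [("guten Morgen", "good morning"), ("danke", "thank you")]
early_stage = make_bidirectional(pairs)   # used for the early updates
late_stage = pairs                        # then tune the model normally
print(len(early_stage), len(late_stage))  # 4 2
```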
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
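As one possible instantiation of representation-level regularization (my illustration, not the paper's exact objective), a term like the following pulls encoder representations of parallel sentences together, which is one way to discourage off-target translation:

```python
# Illustrative representation-level regularizer (assumed form): penalize
# the cosine distance between mean-pooled encoder states of a sentence
# and its translation.
import torch
import torch.nn.functional as F

def representation_regularizer(enc_states_a, enc_states_b):
    # enc_states_*: (B, seq_len, dim) encoder outputs for parallel sentences
    pooled_a = enc_states_a.mean(dim=1)
    pooled_b = enc_states_b.mean(dim=1)
    # 1 - cosine similarity, averaged over the batch
    return (1 - F.cosine_similarity(pooled_a, pooled_b, dim=-1)).mean()

reg = representation_regularizer(torch.randn(4, 12, 64), torch.randn(4, 9, 64))
# total_loss = translation_loss + lambda_reg * reg   (lambda_reg assumed)
```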
- Bilingual alignment transfers to multilingual alignment for unsupervised parallel text mining [3.4519649635864584]
This work presents methods for learning cross-lingual sentence representations using paired or unpaired bilingual texts.
We hypothesize that the cross-lingual alignment strategy is transferable: a model trained to align only two languages can therefore encode representations that are more aligned across many languages.
arXiv Detail & Related papers (2021-04-15T17:51:22Z)
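A standard contrastive objective is one assumed instantiation of learning aligned sentence representations from paired texts (the paper's exact loss may differ): translated pairs are positives and all other sentences in the batch are negatives.

```python
# Standard InfoNCE-style alignment loss over paired sentence embeddings
# (an assumed instantiation, not necessarily the paper's formulation).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(emb_a, emb_b, temperature: float = 0.05):
    # emb_a, emb_b: (B, dim) sentence embeddings; row i of each is a pair
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    logits = a @ b.t() / temperature   # (B, B) similarity matrix
    labels = torch.arange(a.size(0))   # the diagonal holds the positives
    return F.cross_entropy(logits, labels)

loss = contrastive_alignment_loss(torch.randn(8, 128), torch.randn(8, 128))
```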
- Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
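The core SeqKD step can be sketched as data creation, with teacher.translate as an assumed interface; the entry's bidirectional variant applies the same idea on the source side (paraphrased transcriptions) as well.

```python
# Sketch of sequence-level knowledge distillation (SeqKD) data creation:
# the text-based NMT teacher translates the transcripts, and the E2E-ST
# student trains on those outputs instead of the human references.
# `teacher.translate` is an assumed interface.
def build_seqkd_data(examples, teacher):
    """examples: list of (audio, transcript); returns distilled pairs."""
    distilled = []
    for audio, transcript in examples:
        # the teacher output replaces the human reference as the target
        teacher_translation = teacher.translate(transcript)
        distilled.append((audio, teacher_translation))
    return distilled
```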
- A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data [0.0]
Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model.
This work proposes a novel approach that enables both the backward and forward models to benefit from the monolingual target data.
arXiv Detail & Related papers (2020-11-14T22:18:45Z)
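A generic back-translation round, with assumed helper interfaces, shows how monolingual target data can reach both models; the paper's hybrid scheme refines this, so treat the sketch as background rather than the proposed method.

```python
# Generic back-translation round with assumed interfaces (translate,
# train_on); the hybrid idea is that the backward model also improves
# from the monolingual target data instead of staying fixed.
def hybrid_round(forward_model, backward_model, parallel, mono_tgt):
    # back-translate monolingual target data with the backward model
    synthetic = [(backward_model.translate(t), t) for t in mono_tgt]
    # forward model trains on real plus synthetic (source, target) pairs
    forward_model.train_on(parallel + synthetic)
    # backward model is also updated using the same monolingual data,
    # e.g. via self-learning on the synthetic pairs (assumed detail)
    backward_model.train_on([(t, s) for (s, t) in synthetic])
    return forward_model, backward_model
```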