Towards Opening the Black Box of Neural Machine Translation: Source and
Target Interpretations of the Transformer
- URL: http://arxiv.org/abs/2205.11631v1
- Date: Mon, 23 May 2022 20:59:14 GMT
- Title: Towards Opening the Black Box of Neural Machine Translation: Source and
Target Interpretations of the Transformer
- Authors: Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos
Escolano, Marta R. Costa-jussà
- Abstract summary: In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix.
Previous work on interpretability in NMT has focused solely on source sentence token attributions.
We propose an interpretability method that tracks complete input token attributions.
- Score: 1.8594711725515678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Neural Machine Translation (NMT), each token prediction is conditioned on
the source sentence and the target prefix (what has been previously translated
at a decoding step). However, previous work on interpretability in NMT has
focused solely on source sentence token attributions. Therefore, we lack a
full understanding of the influences of every input token (source sentence and
target prefix) in the model predictions. In this work, we propose an
interpretability method that tracks complete input token attributions. Our
method, which can be extended to any encoder-decoder Transformer-based model,
allows us to better comprehend the inner workings of current NMT models. We
apply the proposed method to both bilingual and multilingual Transformers and
present insights into their behaviour.
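As a rough illustration of what source and target-prefix attributions look like (this is not the paper's method, which tracks complete token contributions through every layer rather than reading attention maps), the sketch below averages attention weights from an off-the-shelf MarianMT checkpoint at a single decoding step; the checkpoint name and sentences are placeholders.

```python
# Rough sketch (not the paper's exact method): use raw attention weights from a
# Hugging Face MarianMT model as a proxy for source and target-prefix attributions
# at one decoding step. The paper instead tracks complete layer-wise token
# contributions; the checkpoint and sentences here are placeholders.
import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"  # assumed checkpoint, for illustration only
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name, output_attentions=True).eval()

src = tokenizer("The black box is open.", return_tensors="pt")
# Decoder input: start token (Marian uses <pad>) followed by the prefix decoded so far.
prefix = tokenizer("Die", return_tensors="pt", add_special_tokens=False).input_ids
decoder_input_ids = torch.cat(
    [torch.tensor([[model.config.decoder_start_token_id]]), prefix], dim=-1
)

with torch.no_grad():
    out = model(**src, decoder_input_ids=decoder_input_ids)

# Cross-attention, averaged over layers and heads: influence of each source token
# on the last target position (the token about to be predicted).
src_attr = torch.stack(out.cross_attentions).mean(dim=(0, 2))[0, -1]
# Decoder self-attention, same averaging: influence of the target prefix.
tgt_attr = torch.stack(out.decoder_attentions).mean(dim=(0, 2))[0, -1]

for tok, score in zip(tokenizer.convert_ids_to_tokens(src.input_ids[0].tolist()), src_attr.tolist()):
    print(f"source {tok:>12s}: {score:.3f}")
for tok, score in zip(tokenizer.convert_ids_to_tokens(decoder_input_ids[0].tolist()), tgt_attr.tolist()):
    print(f"prefix {tok:>12s}: {score:.3f}")
```

Attention weights are only a coarse proxy; the proposed method instead aggregates the actual token contributions across all layers, which is what allows source and target-prefix influence to be compared on an equal footing.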
Related papers
- Learning Homographic Disambiguation Representation for Neural Machine
Translation [20.242134720005467]
Homographs, words with the same spelling but different meanings, remain challenging in Neural Machine Translation (NMT).
We propose a novel approach that tackles this issue in the latent space.
We first train an encoder (a.k.a. the "homographic-encoder") to learn universal sentence representations on a natural language inference (NLI) task.
We further fine-tune the encoder using homograph-based synsets from WordNet, enabling it to learn word-set representations from sentences.
arXiv Detail & Related papers (2023-04-12T13:42:59Z)
- Transformer Feed-Forward Layers Build Predictions by Promoting Concepts
in the Vocabulary Space [49.029910567673824]
Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood.
We make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers.
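A small, hedged illustration of this kind of reverse-engineering (GPT-2 plus arbitrary layer and neuron indices, not the paper's actual experiments): project a single FFN value vector through the output embedding and list the vocabulary tokens it promotes.

```python
# Hedged illustration, not the paper's experiments: project one FFN "value vector"
# of GPT-2 into vocabulary space and list the tokens it promotes most strongly.
# Model, layer and neuron indices are arbitrary choices for demonstration.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

layer, neuron = 10, 42  # arbitrary FFN layer and hidden unit
# In GPT-2's Conv1D, c_proj.weight has shape (intermediate_dim, hidden_dim),
# so each row is the vector one FFN unit writes into the residual stream.
value_vec = model.transformer.h[layer].mlp.c_proj.weight[neuron]

# Project through the (tied) output embedding to get per-token scores.
logits = model.lm_head.weight @ value_vec
top = torch.topk(logits, k=10)
print([tokenizer.decode([i]) for i in top.indices.tolist()])
```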
arXiv Detail & Related papers (2022-03-28T12:26:00Z)
- Confidence Based Bidirectional Global Context Aware Training Framework
for Neural Machine Translation [74.99653288574892]
We propose a Confidence Based Bidirectional Global Context Aware (CBBGCA) training framework for neural machine translation (NMT).
Our proposed CBBGCA training framework significantly improves the NMT model by +1.02, +1.30 and +0.57 BLEU scores on three large-scale translation datasets.
arXiv Detail & Related papers (2022-02-28T10:24:22Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
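As a toy illustration of the kind of corruption functions being compared (the paper's exact objectives differ, and its context-based replacement variant is omitted here), the snippet below masks tokens for a denoising objective and, as an alternative, locally reorders them so the corrupted input still resembles a full sentence.

```python
# Toy illustration of seq2seq denoising objectives (not the paper's exact corruption
# functions): mask random tokens for a BART/MASS-style objective, or locally reorder
# them so the corrupted input still looks like a full sentence.
import random

MASK = "<mask>"

def mask_tokens(tokens, mask_prob=0.35, rng=random.Random(0)):
    """MLM-style corruption: replace a random subset of tokens with a mask symbol."""
    return [MASK if rng.random() < mask_prob else t for t in tokens]

def shuffle_local(tokens, window=3, rng=random.Random(0)):
    """Alternative corruption: reorder words within small windows, keeping them all."""
    out = list(tokens)
    for i in range(0, len(out), window):
        chunk = out[i:i + window]
        rng.shuffle(chunk)
        out[i:i + window] = chunk
    return out

sentence = "the black box of neural machine translation".split()
print("masked   :", " ".join(mask_tokens(sentence)))
print("reordered:", " ".join(shuffle_local(sentence)))
print("target   :", " ".join(sentence))  # the decoder reconstructs the original
```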
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Source and Target Bidirectional Knowledge Distillation for End-to-end
Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
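A minimal sketch of the sequence-level distillation step this builds on (only the data-generation side, not the paper's bidirectional source/target setup; the teacher checkpoint is a placeholder): an external text NMT model translates the transcripts, and its outputs become the training targets for the end-to-end speech translation student.

```python
# Minimal SeqKD data-generation sketch (just the distillation step this work builds
# on, not its bidirectional setup): a text NMT teacher translates the transcripts,
# and its outputs replace the references as training targets for the end-to-end
# speech translation student. The checkpoint name is a placeholder.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"  # assumed teacher checkpoint
tokenizer = MarianTokenizer.from_pretrained(name)
teacher = MarianMTModel.from_pretrained(name).eval()

transcripts = [
    "thanks for coming today",
    "the meeting starts at nine",
]

batch = tokenizer(transcripts, return_tensors="pt", padding=True)
generated = teacher.generate(**batch, num_beams=5, max_new_tokens=64)
pseudo_targets = tokenizer.batch_decode(generated, skip_special_tokens=True)

# (audio, pseudo_target) pairs would then train the E2E-ST student.
for src, tgt in zip(transcripts, pseudo_targets):
    print(f"{src!r} -> {tgt!r}")
```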
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
- Token Drop mechanism for Neural Machine Translation [12.666468105300002]
We propose Token Drop to improve generalization and avoid overfitting for the NMT model.
It is similar to word dropout, except that we replace the dropped token with a special token instead of setting the word to zero.
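A tiny sketch of the mechanism as summarized above (the drop rate and the replacement id are assumptions): selected token ids are swapped for a special token rather than having their embeddings zeroed as in plain word dropout.

```python
# Tiny sketch of the Token Drop idea as summarized (drop rate and the choice of
# replacement id are assumptions): selected token ids are swapped for a special
# token, rather than zeroing their embeddings as plain word dropout does.
import torch

def token_drop(input_ids, drop_prob=0.15, special_id=3, pad_id=0):
    """Randomly replace non-padding token ids with a special token id."""
    drop = (torch.rand(input_ids.shape) < drop_prob) & (input_ids != pad_id)
    return torch.where(drop, torch.full_like(input_ids, special_id), input_ids)

batch = torch.tensor([[17, 42, 99, 7, 0, 0]])  # toy batch, 0 = padding
print(token_drop(batch))
```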
arXiv Detail & Related papers (2020-10-21T14:02:27Z)
- Universal Vector Neural Machine Translation With Effective Attention [0.0]
We propose a single model for Neural Machine Translation based on the encoder-decoder architecture.
We introduce a neutral/universal model representation that can be used to predict more than one language.
arXiv Detail & Related papers (2020-06-09T01:13:57Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model reordering information for Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- Learning Contextualized Sentence Representations for Document-Level
Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.