Extract and Attend: Improving Entity Translation in Neural Machine
Translation
- URL: http://arxiv.org/abs/2306.02242v1
- Date: Sun, 4 Jun 2023 03:05:25 GMT
- Title: Extract and Attend: Improving Entity Translation in Neural Machine
Translation
- Authors: Zixin Zeng, Rui Wang, Yichong Leng, Junliang Guo, Xu Tan, Tao Qin,
Tie-Yan Liu
- Abstract summary: We propose an Extract-and-Attend approach to enhance entity translation in NMT.
The proposed method is effective in improving both the translation accuracy of entities and the overall translation quality.
- Score: 141.7840980565706
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While Neural Machine Translation (NMT) has achieved great progress in recent
years, it still suffers from inaccurate translation of entities (e.g.,
person/organization names, locations), due to the lack of entity training
instances. When we humans encounter an unknown entity during translation, we
usually first look it up in a dictionary and then organize the entity's translation
together with the translations of the other parts to form a smooth target sentence.
Inspired by this translation process, we propose an Extract-and-Attend approach
to enhance entity translation in NMT, where the translation candidates of
source entities are first extracted from a dictionary and then attended to by
the NMT model to generate the target sentence. Specifically, the translation
candidates are extracted by first detecting the entities in a source sentence
and then translating the entities through looking up in a dictionary. Then, the
extracted candidates are added as a prefix of the decoder input to be attended
to by the decoder when generating the target sentence through self-attention.
Experiments conducted on En-Zh and En-Ru demonstrate that the proposed method
is effective in improving both the translation accuracy of entities and the
overall translation quality, with up to a 35% reduction in entity error rate, a
0.85-point gain in BLEU, and a 13.8-point gain in COMET.
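As a rough illustration of the two steps, here is a minimal Python sketch of the extract step and the decoder-prefix construction; the substring-matching detector and the tiny En-Zh dictionary are toy stand-ins for the trained entity detector and full dictionary used in the paper.

```python
# Minimal sketch of Extract-and-Attend's candidate extraction and decoder
# prefix, assuming a toy entity detector and bilingual dictionary.
ENTITY_DICT = {  # toy En-Zh entity dictionary (assumption, not the paper's)
    "Ada Lovelace": "阿达·洛夫莱斯",
    "London": "伦敦",
}

def detect_entities(sentence: str) -> list[str]:
    """Toy detector: substring-match known entries (the paper uses NER)."""
    return [e for e in ENTITY_DICT if e in sentence]

def build_decoder_input(sentence: str, sep: str = "<sep>") -> list[str]:
    """Extract candidate translations and prepend them as a decoder prefix.

    During generation, the decoder's self-attention can attend to these
    prefix tokens while producing the target sentence after the prefix.
    """
    prefix: list[str] = []
    for entity in detect_entities(sentence):
        prefix.extend([ENTITY_DICT[entity], sep])
    return prefix

print(build_decoder_input("Ada Lovelace was born in London."))
# ['阿达·洛夫莱斯', '<sep>', '伦敦', '<sep>']
```

In the full model, these prefix tokens are embedded and fed to the decoder, so every generation step can copy from or attend to the candidates through ordinary self-attention.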
Related papers
- Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective [72.83966378613238]
Under-translation and over-translation remain two challenging problems in state-of-the-art Neural Machine Translation (NMT) systems.
We conduct an in-depth analysis on the underlying cause of under-translation in NMT, providing an explanation from the perspective of decoding objective.
We propose employing the confidence of predicting End Of Sentence (EOS) as a detector for under-translation, and strengthening the confidence-based penalty to penalize candidates with a high risk of under-translation.
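As a rough sketch of the detection-plus-penalty idea, the snippet below reranks finished beam hypotheses, assuming each carries its sequence log-probability and the model's EOS probability at its final step; the exact form of the paper's penalty is not reproduced here.

```python
import math

def rescore_with_eos_penalty(candidates, alpha=1.0):
    """Rerank beam candidates, penalizing low end-of-sentence confidence.

    `candidates` holds (log_prob, eos_prob) pairs, where a low `eos_prob`
    at termination flags a likely under-translation; `alpha` scales the
    confidence-based penalty (both names are illustrative).
    """
    return sorted(candidates,
                  key=lambda c: c[0] + alpha * math.log(c[1]),
                  reverse=True)

# Candidate B has a slightly better log-prob but ended with low EOS
# confidence, so rescoring demotes it below candidate A.
beams = [(-4.0, 0.9), (-3.8, 0.2)]
print(rescore_with_eos_penalty(beams))  # [(-4.0, 0.9), (-3.8, 0.2)]
```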
arXiv Detail & Related papers (2024-05-29T09:25:49Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
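A toy sketch of this labeled-sequence projection, assuming a `translate` callable that preserves slot markers; it stands in for CROP's trained multilingual labeled sequence translation model, whose real interface is not shown here.

```python
import re

def project_labels(src, entities, translate):
    """Wrap source entities in slot markers, translate the tagged sequence,
    then read the entity labels back off the target side."""
    tagged = src
    for i, (span, label) in enumerate(entities):
        tagged = tagged.replace(span, f"<{i}:{label}> {span} </{i}>")
    target = translate(tagged)  # stand-in for the labeled-sequence MT model
    return [(m.group(2).strip(), m.group(1).split(":")[1])
            for m in re.finditer(r"<(\d+:\w+)>(.*?)</\d+>", target)]

# Toy "model": pretends to translate En->De while keeping the markers intact.
fake_mt = lambda s: s.replace("lives in", "wohnt in")
print(project_labels("Ada lives in London.", [("London", "LOC")], fake_mt))
# [('London', 'LOC')]
```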
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Towards Debiasing Translation Artifacts [15.991970288297443]
We propose a novel approach to reducing translationese by extending an established bias-removal technique.
We use the Iterative Null-space Projection (INLP) algorithm and show, by measuring classification accuracy before and after debiasing, that translationese is reduced at both the sentence and word levels.
To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.
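INLP itself is straightforward to sketch: repeatedly fit a linear probe for the property to be removed (here, translationese vs. original) and project the embeddings onto the probe's null space until the property is no longer linearly recoverable. A compact NumPy/scikit-learn version, with toy data standing in for real sentence embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, y, n_iters=10):
    """Iterative Null-space Projection: returns the composed projection P."""
    d = X.shape[1]
    P = np.eye(d)
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X @ P, y)
        w = clf.coef_ / np.linalg.norm(clf.coef_)  # unit probe direction
        P = (np.eye(d) - w.T @ w) @ P              # compose null-space step
    return P

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))            # sentence embeddings (toy)
y = rng.integers(0, 2, size=200)          # 1 = translated, 0 = original
X_debiased = X @ inlp(X, y, n_iters=3)    # translationese signal projected out
```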
arXiv Detail & Related papers (2022-05-16T21:46:51Z)
- DEEP: DEnoising Entity Pre-training for Neural Machine Translation [123.6686940355937]
It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus.
We propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences.
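A minimal sketch of the data-construction side of such denoising pre-training, with a toy knowledge base standing in for the paper's large KB and entity detection; the seq2seq model then learns to map the noised source back to the clean sentence.

```python
import random

KB_ENTITIES = {  # toy knowledge base of substitutes (assumption)
    "Marie Curie": ["Pierre Curie", "<ent>"],
    "Warsaw": ["Paris", "<ent>"],
}

def make_denoising_pair(sentence: str) -> tuple[str, str]:
    """Return a (noised_source, clean_target) pair for pre-training."""
    noised = sentence
    for entity, substitutes in KB_ENTITIES.items():
        if entity in noised:
            noised = noised.replace(entity, random.choice(substitutes))
    return noised, sentence

random.seed(0)
print(make_denoising_pair("Marie Curie was born in Warsaw."))
# e.g. ('Pierre Curie was born in <ent>.', 'Marie Curie was born in Warsaw.')
```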
arXiv Detail & Related papers (2021-11-14T17:28:09Z)
- Unsupervised Neural Machine Translation with Generative Language Models Only [19.74865387759671]
We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models.
Our method consists of three steps: few-shot amplification, distillation, and backtranslation.
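A runnable skeleton of that three-step recipe; every helper below is a hypothetical placeholder (toy strings instead of models), not the paper's implementation.

```python
def few_shot_translate(model, sentence):   # step-1 stand-in: prompt the LM
    return f"[{model} translation of: {sentence}]"

def finetune(model, pairs):                # steps 2-3 stand-in: training
    return f"{model}+tuned_on_{len(pairs)}_pairs"

def unsupervised_nmt(lm, src_mono, tgt_mono):
    # 1) Few-shot amplification: synthesize parallel data from the LM alone.
    synthetic = [(s, few_shot_translate(lm, s)) for s in src_mono]
    # 2) Distillation: train a student model on the synthetic pairs.
    student = finetune(lm, synthetic)
    # 3) Backtranslation: the student translates target-side monolingual
    #    text back to the source, yielding pairs for another training round.
    back = [(few_shot_translate(student, t), t) for t in tgt_mono]
    return finetune(student, back)

print(unsupervised_nmt("gpt", ["hello world"], ["bonjour le monde"]))
```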
arXiv Detail & Related papers (2021-10-11T17:35:34Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
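What regularizing at both levels can look like, as a hedged sketch: the representation term penalizes misaligned encoder states across languages, and the gradient step uses a PCGrad-style projection as an illustrative stand-in for the paper's exact scheme.

```python
import numpy as np

def joint_loss(ce_loss, src_repr, tgt_repr, lam=0.1):
    """Representation-level term: pull cross-lingual encoder states together."""
    return ce_loss + lam * float(np.sum((src_repr - tgt_repr) ** 2))

def regularized_gradient(g_task, g_aux):
    """Gradient-level term (PCGrad-style stand-in): if the translation
    gradient conflicts with the auxiliary alignment gradient, drop the
    conflicting component so updates do not drift off-target."""
    dot = g_task @ g_aux
    if dot < 0:  # gradients conflict
        g_task = g_task - (dot / (g_aux @ g_aux)) * g_aux
    return g_task

print(regularized_gradient(np.array([1.0, -1.0]), np.array([0.0, 1.0])))
# [1. 0.]  -- the conflicting component along g_aux has been removed
```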
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Sentiment-based Candidate Selection for NMT [2.580271290008534]
We propose a decoder-side approach that incorporates automatic sentiment scoring into the machine translation (MT) candidate selection process.
We train separate English and Spanish sentiment classifiers; then, using the n-best candidates generated by a baseline MT model with beam search, we select the candidate that minimizes the absolute difference between the sentiment score of the source sentence and that of the translation.
The results of human evaluations show that, compared to the open-source MT model on which our pipeline is built, our pipeline produces more accurate translations of colloquial, sentiment-heavy source texts.
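The selection rule itself is simple to sketch; the toy lexicon scorer below is a stand-in for the trained sentiment classifiers.

```python
def select_candidate(src_sentiment, candidates, tgt_scorer):
    """Pick the n-best translation whose sentiment score is closest to the
    source sentence's score (tgt_scorer stands in for a trained classifier)."""
    return min(candidates, key=lambda c: abs(src_sentiment - tgt_scorer(c)))

# Toy scorer: counts positive words; a real system scores with a classifier.
toy_scorer = lambda s: sum(w in {"great", "love", "happy"} for w in s.lower().split())
print(select_candidate(1.0, ["I love it", "It is acceptable"], toy_scorer))
# 'I love it'
```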
arXiv Detail & Related papers (2021-04-10T19:01:52Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)