Learning Homographic Disambiguation Representation for Neural Machine
Translation
- URL: http://arxiv.org/abs/2304.05860v2
- Date: Thu, 13 Apr 2023 00:31:20 GMT
- Title: Learning Homographic Disambiguation Representation for Neural Machine
Translation
- Authors: Weixuan Wang, Wei Peng and Qun Liu
- Abstract summary: Homographs, words with the same spelling but different meanings, remain challenging in Neural Machine Translation (NMT).
We propose a novel approach to tackle homographic issues of NMT in the latent space.
We first train an encoder (aka "HDR-encoder") to learn universal sentence representations in a natural language inference (NLI) task.
We further fine-tune the encoder using homograph-based synset sentences from WordNet, enabling it to learn word-level homographic disambiguation representations (HDR).
- Score: 20.242134720005467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Homographs, words with the same spelling but different meanings, remain
challenging in Neural Machine Translation (NMT). While recent works leverage
various word embedding approaches to differentiate word sense in NMT, they do
not focus on the pivotal components in resolving ambiguities of homographs in
NMT: the hidden states of an encoder. In this paper, we propose a novel
approach to tackle homographic issues of NMT in the latent space. We first
train an encoder (aka "HDR-encoder") to learn universal sentence
representations in a natural language inference (NLI) task. We further
fine-tune the encoder using homograph-based synset sentences from WordNet,
enabling it to learn word-level homographic disambiguation representations
(HDR). The pre-trained HDR-encoder is subsequently integrated with a
transformer-based NMT in various schemes to improve translation accuracy.
Experiments on four translation directions demonstrate the effectiveness of the
proposed method in enhancing the performance of NMT systems in the BLEU scores
(up to +2.3 compared to a solid baseline). The effects can be verified by other
metrics (F1, precision, and recall) of translation accuracy in an additional
disambiguation task. Visualization methods like heatmaps, T-SNE and translation
examples are also utilized to demonstrate the effects of the proposed method.
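The abstract describes a two-stage pipeline (pre-training the HDR-encoder, then integrating it with a Transformer-based NMT model) but this page does not spell out the integration schemes. Below is a minimal, purely illustrative sketch of one way such an integration could look, assuming a gated mixture of the NMT encoder's and the HDR-encoder's hidden states; the class name, dimensions, and gating design are assumptions, not the paper's actual schemes.

```python
import torch
import torch.nn as nn

class GatedHDRFusion(nn.Module):
    """Illustrative fusion of NMT encoder states with pre-trained HDR-encoder states.

    A sketch of *one* possible integration scheme; it does not reproduce any of
    the schemes evaluated in the paper.
    """

    def __init__(self, d_model: int, d_hdr: int):
        super().__init__()
        self.project = nn.Linear(d_hdr, d_model)      # map HDR space into the NMT model space
        self.gate = nn.Linear(2 * d_model, d_model)   # per-position mixing gate

    def forward(self, nmt_states: torch.Tensor, hdr_states: torch.Tensor) -> torch.Tensor:
        # nmt_states: (batch, src_len, d_model) from the Transformer NMT encoder
        # hdr_states: (batch, src_len, d_hdr) from the frozen, pre-trained HDR-encoder
        hdr = self.project(hdr_states)
        gate = torch.sigmoid(self.gate(torch.cat([nmt_states, hdr], dim=-1)))
        return gate * nmt_states + (1.0 - gate) * hdr  # gated mixture per source position

# Toy usage with random tensors standing in for real encoder outputs.
fusion = GatedHDRFusion(d_model=512, d_hdr=768)
fused = fusion(torch.randn(2, 10, 512), torch.randn(2, 10, 768))
print(fused.shape)  # torch.Size([2, 10, 512])
```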
Related papers
- Code-Switching with Word Senses for Pretraining in Neural Machine
Translation [107.23743153715799]
We introduce Word Sense Pretraining for Neural Machine Translation (WSP-NMT).
WSP-NMT is an end-to-end approach for pretraining multilingual NMT models leveraging word sense-specific information from Knowledge Bases.
Our experiments show significant improvements in overall translation quality.
arXiv Detail & Related papers (2023-10-21T16:13:01Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning [24.64700139151659]
Current neural machine translation (NMT) systems suffer from a lack of reliability.
We present a consistency-aware meta-learning (CAML) framework derived from the model-agnostic meta-learning (MAML) algorithm to address it.
We conduct experiments on the NIST Chinese to English task, three WMT translation tasks, and the TED M2O task.
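CAML is derived from MAML. As background only, the sketch below shows a first-order MAML-style inner/outer loop on a toy regression model standing in for an NMT system; the task sampler, model, and hyperparameters are hypothetical, and CAML's consistency-aware objective is not reproduced.

```python
import copy
import torch
import torch.nn as nn

# Toy stand-in for an NMT model; MAML itself is model-agnostic.
model = nn.Linear(4, 1)
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
inner_lr = 0.01

def sample_task():
    """Hypothetical task sampler returning (support, query) toy batches."""
    w = torch.randn(4, 1)
    xs, xq = torch.randn(8, 4), torch.randn(8, 4)
    return (xs, xs @ w), (xq, xq @ w)

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):  # tasks per meta-batch
        (xs, ys), (xq, yq) = sample_task()
        fast = copy.deepcopy(model)            # task-specific copy for the inner loop
        fast_params = list(fast.parameters())
        # Inner loop: one SGD step on the support set.
        grads = torch.autograd.grad(loss_fn(fast(xs), ys), fast_params)
        with torch.no_grad():
            for p, g in zip(fast_params, grads):
                p -= inner_lr * g
        # Outer loop (first-order approximation): query-set gradients of the
        # adapted model are accumulated onto the original parameters.
        task_grads = torch.autograd.grad(loss_fn(fast(xq), yq), fast_params)
        for p, g in zip(model.parameters(), task_grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_opt.step()
```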
arXiv Detail & Related papers (2023-03-20T09:41:28Z)
- Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
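The Multi-TM contrastive objective is only named in this summary. Purely as an illustration, the sketch below implements a generic InfoNCE-style contrastive loss over k retrieved translation memories (TMs) per example; the function name, shapes, and temperature are assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def tm_contrastive_loss(target_emb, tm_embs, positive_idx, temperature=0.1):
    """Generic InfoNCE-style loss: pull the most relevant TM toward the target
    sentence representation and push the other retrieved TMs away.

    target_emb:   (batch, dim)      target-sentence representations
    tm_embs:      (batch, k, dim)   k retrieved translation memories per example
    positive_idx: (batch,)          index of the TM treated as the positive
    """
    target = F.normalize(target_emb, dim=-1).unsqueeze(1)               # (batch, 1, dim)
    tms = F.normalize(tm_embs, dim=-1)                                   # (batch, k, dim)
    logits = (tms @ target.transpose(1, 2)).squeeze(-1) / temperature    # (batch, k)
    return F.cross_entropy(logits, positive_idx)

# Toy usage with random embeddings.
loss = tm_contrastive_loss(torch.randn(4, 256), torch.randn(4, 5, 256),
                           torch.randint(0, 5, (4,)))
print(loss.item())
```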
arXiv Detail & Related papers (2022-12-06T17:10:17Z)
- Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer [1.8594711725515678]
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix.
Previous work on interpretability in NMT has focused solely on source sentence token attributions.
We propose an interpretability method that tracks complete input token attributions.
arXiv Detail & Related papers (2022-05-23T20:59:14Z)
- When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation? [33.28706502928905]
This work proposes a word-level contrastive objective to leverage word alignments for many-to-many NMT.
Analyses reveal that in many-to-many NMT, the encoder's sentence retrieval performance highly correlates with the translation quality.
arXiv Detail & Related papers (2022-04-26T09:07:51Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
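To make the mask-and-predict idea concrete, the sketch below masks the same shared word in two overlapping sentences with a Hugging Face masked LM and compares the predicted distributions with a symmetric KL divergence. It is a simplified stand-in for the paper's NDD metric, not its exact formula, and the example sentences are invented.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def masked_distribution(sentence: str, word: str) -> torch.Tensor:
    """Return the MLM distribution at the position of `word`, which is masked out."""
    masked = sentence.replace(word, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    return F.softmax(logits, dim=-1)

# Two highly overlapped sentences; compare what the MLM expects at the shared word "mat".
p = masked_distribution("the cat sat on the mat", "mat")
q = masked_distribution("the cat slept on the mat", "mat")
divergence = 0.5 * (F.kl_div(q.log(), p, reduction="sum")
                    + F.kl_div(p.log(), q, reduction="sum"))
print(divergence.item())  # larger values suggest a larger semantic difference
```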
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
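As a rough illustration of such objectives, the toy function below produces noised inputs that still resemble full sentences by swapping adjacent tokens and replacing a few tokens. The paper's objectives pick replacements based on context (e.g., with a masked LM); the uniform vocabulary sampling here is a simplification, and all names and probabilities are invented.

```python
import random

def reorder_and_replace(tokens, vocab, swap_prob=0.1, replace_prob=0.1, seed=None):
    """Toy noising: swap some adjacent tokens, then replace some tokens with
    vocabulary words, so the output still looks like a complete sentence."""
    rng = random.Random(seed)
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if rng.random() < swap_prob:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2                      # do not move the same token twice
        else:
            i += 1
    return [rng.choice(vocab) if rng.random() < replace_prob else t for t in out]

print(reorder_and_replace("the cat sat on the mat".split(),
                          vocab=["dog", "ran", "under"], seed=0))
```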
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages [3.464656011246703]
We find that NMT encoders learn similar source syntax regardless of NMT target language.
NMT encoders outperform RNNs trained directly on several of the constituent label prediction tasks.
arXiv Detail & Related papers (2020-05-17T06:41:32Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
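For reference, the standard sinusoidal positional encoding from the original Transformer, which the first sentence of this entry refers to, is sketched below; it shows how order information is injected before self-attention, but does not reproduce the explicit reordering model proposed in the paper.

```python
import math
import torch

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Standard Transformer sinusoidal positional encoding (Vaswani et al., 2017)."""
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)       # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                         * (-math.log(10000.0) / d_model))                  # (d_model / 2,)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
    return pe

# Source embeddings receive the encoding additively before self-attention.
embeddings = torch.randn(1, 20, 512)                           # (batch, src_len, d_model)
embeddings = embeddings + sinusoidal_positional_encoding(20, 512)
print(embeddings.shape)  # torch.Size([1, 20, 512])
```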
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.