Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet
- URL: http://arxiv.org/abs/2202.10287v1
- Date: Mon, 21 Feb 2022 15:05:23 GMT
- Title: Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet
- Authors: Alexandre Diniz Costa, Mateus Coutinho Marim, Ely Edison da Silva Matos and Tiago Timponi Torrent
- Abstract summary: We present Scylla, a methodology for domain adaptation of Neural Machine Translation (NMT) systems.
Two versions of Scylla are presented: one using the source sentence as input, and another one using the target sentence.
We evaluate Scylla in comparison to a state-of-the-art commercial NMT system in an experiment in which 50 sentences from the Sports domain are translated from Brazilian Portuguese to English.
- Score: 64.0476282000118
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper we present Scylla, a methodology for domain adaptation of Neural Machine Translation (NMT) systems that makes use of a multilingual FrameNet enriched with qualia relations as an external knowledge base. Domain adaptation techniques used in NMT usually require fine-tuning and in-domain training data, which may pose difficulties for those working with lesser-resourced languages and may also lead to performance decay of the NMT system on out-of-domain sentences. Scylla does not require fine-tuning of the NMT model, avoiding the risk of model over-fitting and the consequent decrease in performance for out-of-domain translations. Two versions of Scylla are presented: one using the source sentence as input, and another using the target sentence. We evaluate Scylla against a state-of-the-art commercial NMT system in an experiment in which 50 sentences from the Sports domain are translated from Brazilian Portuguese to English. Both versions of Scylla significantly outperform the baseline commercial system in HTER.
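Since the comparison above is scored with HTER (Human-targeted Translation Edit Rate), a minimal sketch of the metric may help: HTER is the minimum number of edits needed to turn the system output into its human post-edited version, divided by the length of that post-edit. The sketch below is a simplified word-level version counting insertions, deletions, and substitutions only (full TER also counts block shifts); all names are illustrative, not from the paper.

```python
# Simplified HTER: word-level edit distance between the MT output and its
# human post-edit, normalized by the post-edit length. Block shifts, which
# full TER also counts as single edits, are omitted for brevity.
def hter(hypothesis: str, post_edit: str) -> float:
    hyp, ref = hypothesis.split(), post_edit.split()
    # Classic dynamic-programming edit distance (Levenshtein) over words.
    dp = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        dp[i][0] = i
    for j in range(len(ref) + 1):
        dp[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(hyp)][len(ref)] / max(len(ref), 1)

# Example: one substitution against a five-word post-edit -> HTER = 0.2.
print(hter("the goalkeeper defended the kick", "the goalkeeper saved the kick"))
```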
Related papers
- Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts [0.7673339435080445]
We develop a parallel corpus for Arabic-English (AR- EN) translation in the financial domain.
We fine-tune several NMT and Large Language models including ChatGPT-3.5 Turbo.
The quality of ChatGPT translations was superior to that of the other models based on automatic and human evaluations.
arXiv Detail & Related papers (2023-09-22T13:37:19Z)
- Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques [3.257358540764261]
We present a novel approach that uses a scaled similarity score between sentences, computed with a 5-gram KenLM language model, targeting related languages in particular.
Our approach yields gains of 2 BLEU points with the multi-domain approach, 3 BLEU points with fine-tuning for NMT, and 2 BLEU points with the iterative back-translation approach, as illustrated in the sketch below.
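As a rough illustration of LM-based similarity scoring (the paper's exact scaling formula is not given here), the sketch below scores candidate sentences with a 5-gram KenLM model and min-max scales their per-word log-probabilities to [0, 1]; the model path and the scaling choice are assumptions.

```python
# Hypothetical sketch: score sentences with a 5-gram KenLM model and scale
# the scores to [0, 1]. Min-max scaling of per-word log10 probabilities is
# an assumption, not the paper's exact formula.
import kenlm  # pip install https://github.com/kpu/kenlm/archive/master.zip

model = kenlm.Model("indomain.5gram.arpa")  # assumed in-domain LM path

def per_word_logprob(sentence: str) -> float:
    # model.score returns the total log10 probability; normalize by length
    # so long sentences are not penalized merely for being long.
    words = sentence.split()
    return model.score(sentence, bos=True, eos=True) / max(len(words), 1)

def scaled_similarity(sentences: list[str]) -> list[float]:
    scores = [per_word_logprob(s) for s in sentences]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero on identical scores
    return [(s - lo) / span for s in scores]
```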
arXiv Detail & Related papers (2023-03-03T09:07:30Z)
- Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [65.69742565855395]
k-Nearest-Neighbor Machine Translation (kNN-MT) has recently been proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).
arXiv Detail & Related papers (2022-04-13T05:46:31Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown a promising capability for domain adaptation by directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
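For readers unfamiliar with the mechanism shared by the two kNN-MT entries above, here is a minimal sketch of the standard decoding rule: a distance-weighted softmax over retrieved datastore neighbors is interpolated with the NMT distribution at each step. The values for k, temperature, and lambda are illustrative assumptions, not taken from either paper.

```python
# Minimal sketch of the kNN-MT decoding rule, assuming a datastore of
# (decoder-hidden-state, next-token) pairs built offline from in-domain
# data. Hyperparameter values are illustrative, not from either paper.
import numpy as np

def knn_mt_probs(query, keys, values, p_nmt, vocab_size,
                 k=8, temperature=10.0, lambda_=0.5):
    # Retrieve the k nearest datastore keys by L2 distance to the query.
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Distance-based softmax over the retrieved neighbors.
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    # Scatter the neighbor weights onto their stored target tokens.
    p_knn = np.zeros(vocab_size)
    for w, idx in zip(weights, nearest):
        p_knn[values[idx]] += w
    # Interpolate the retrieval distribution with the NMT distribution.
    return lambda_ * p_knn + (1.0 - lambda_) * p_nmt
```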
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey [9.645196221785694]
We focus on robust approaches to domain adaptation for Neural Machine Translation (NMT) models.
In particular, we look at the case where a system may need to translate sentences from multiple domains.
We highlight the benefits of domain adaptation and multi-domain adaptation techniques to other lines of NMT research.
arXiv Detail & Related papers (2021-04-14T16:21:37Z)
- Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation [19.973201669851626]
We propose a novel meta-curriculum learning approach for domain adaptation in neural machine translation (NMT).
During meta-training, the NMT model first learns the similar curricula from each domain to avoid falling into a bad local optimum early.
We show that meta-curriculum learning can improve the translation performance of both familiar and unfamiliar domains.
arXiv Detail & Related papers (2021-03-03T08:58:39Z)
- Exploiting Neural Query Translation into Cross Lingual Information Retrieval [49.167049709403166]
Existing CLIR systems mainly exploit statistical machine translation (SMT) rather than the more advanced neural machine translation (NMT).
We propose a novel data augmentation method that extracts query translation pairs according to user clickthrough data.
Experimental results reveal that the proposed approach yields better retrieval quality than strong baselines.
arXiv Detail & Related papers (2020-10-26T15:28:19Z)
- Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on domain-specific translation in low-resource settings, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2020-10-06T04:38:09Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
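As a rough illustration of how the three objectives in the last entry above might be alternated, here is a minimal skeleton; the batch iterators, loss callables, and update routine are hypothetical placeholders, not the paper's code or schedule.

```python
# Hypothetical skeleton of semi-supervised domain adaptation that cycles
# through the three objectives named above. Every callable here is a
# placeholder the caller must supply; none of this is the paper's API.
def train_semi_supervised(model, objectives, data_streams, optimize, epochs=5):
    """objectives: {'lm': loss_fn, 'bt': loss_fn, 'mt': loss_fn}
    data_streams: matching dict of batch iterators (monolingual in-domain
    text for 'lm', synthetic pairs for 'bt', parallel data for 'mt')."""
    for _ in range(epochs):
        # Alternate language modeling, back-translation, and supervised
        # translation so the model sees all three signals each epoch.
        for name in ("lm", "bt", "mt"):
            for batch in data_streams[name]:
                loss = objectives[name](model, batch)
                optimize(model, loss)  # backward pass + parameter update
```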
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.