Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine
Translation
- URL: http://arxiv.org/abs/2004.14821v2
- Date: Sat, 31 Oct 2020 09:19:25 GMT
- Title: Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine
Translation
- Authors: Shoetsu Sato, Jin Sakuma, Naoki Yoshinaga, Masashi Toyoda, Masaru
Kitsuregawa
- Abstract summary: Domain adaptation between distant domains cannot be performed effectively due to mismatches in vocabulary.
We propose vocabulary adaptation, a simple method for effective fine-tuning.
Our method improves the performance of conventional fine-tuning by 3.86 and 3.28 BLEU points in En-Ja and De-En translation.
- Score: 14.390932594872233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network methods exhibit strong performance only in a few resource-rich
domains. Practitioners, therefore, employ domain adaptation from resource-rich
domains that are, in most cases, distant from the target domain. Domain
adaptation between distant domains (e.g., movie subtitles and research papers),
however, cannot be performed effectively due to mismatches in vocabulary; it
will encounter many domain-specific words (e.g., "angstrom") and words whose
meanings shift across domains (e.g., "conductor"). In this study, aiming to
solve these vocabulary mismatches in domain adaptation for neural machine
translation (NMT), we propose vocabulary adaptation, a simple method for
effective fine-tuning that adapts embedding layers in a given pre-trained NMT
model to the target domain. Prior to fine-tuning, our method replaces the
embedding layers of the NMT model by projecting general word embeddings induced
from monolingual data in a target domain onto a source-domain embedding space.
Experimental results indicate that our method improves the performance of
conventional fine-tuning by 3.86 and 3.28 BLEU points in En-Ja and De-En
translation, respectively.
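As a concrete illustration of the projection step, the following is a minimal sketch assuming word2vec-style embeddings have already been induced from target-domain monolingual data; it uses an orthogonal Procrustes mapping fitted on words shared by the two domain vocabularies, which is one standard choice of cross-space projection rather than necessarily the exact transform used in the paper, and the function names are hypothetical.

```python
import numpy as np

def fit_projection(tgt_emb, src_emb, tgt_vocab, src_vocab):
    """Fit an orthogonal map W that sends target-domain word vectors into the
    source-domain embedding space, using words occurring in both vocabularies
    as anchors (orthogonal Procrustes). Vocabularies map word -> row index."""
    shared = [w for w in tgt_vocab if w in src_vocab]
    X = np.stack([tgt_emb[tgt_vocab[w]] for w in shared])  # target-domain vectors
    Y = np.stack([src_emb[src_vocab[w]] for w in shared])  # source-domain vectors
    # W = argmin ||X W - Y||_F  subject to  W^T W = I
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def build_target_embedding(tgt_emb, W):
    """Project every target-domain vector; the resulting matrix replaces the
    embedding layer of the pre-trained NMT model before fine-tuning."""
    return tgt_emb @ W
```

In this picture, the projected matrix overwrites the encoder/decoder (and any tied softmax) embedding weights of the pre-trained model, after which ordinary fine-tuning on the small in-domain parallel data proceeds.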
Related papers
- Meta-causal Learning for Single Domain Generalization [102.53303707563612]
Single domain generalization aims to learn a model from a single training domain (source domain) and apply it to multiple unseen test domains (target domains).
Existing methods focus on expanding the distribution of the training domain to cover the target domains, but without estimating the domain shift between the source and target domains.
We propose a new learning paradigm, namely simulate-analyze-reduce, which first simulates the domain shift by building an auxiliary domain as the target domain, then learns to analyze the causes of domain shift, and finally learns to reduce the domain shift for model adaptation.
arXiv Detail & Related papers (2023-04-07T15:46:38Z)
- SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domains [14.096170976149521]
SwitchPrompt is a novel and lightweight prompting methodology for adaptation of language models trained on datasets from the general domain to diverse low-resource domains.
Our few-shot experiments on three text classification benchmarks demonstrate the efficacy of the general-domain pre-trained language models when used with SwitchPrompt.
They often even outperform their domain-specific counterparts trained with baseline state-of-the-art prompting methods, with accuracy gains of up to 10.7%.
arXiv Detail & Related papers (2023-02-14T07:14:08Z)
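As a rough sketch of what a gated soft prompt can look like (a generic illustration, not necessarily SwitchPrompt's exact formulation), the module below keeps a general-domain and a domain-specific prompt as learnable embeddings, mixes them with a gate computed from the input representation, and prepends the result to the token embeddings of an otherwise frozen language model.

```python
import torch
import torch.nn as nn

class GatedSoftPrompt(nn.Module):
    """Learnable general-domain and domain-specific soft prompts,
    mixed by an input-dependent gate and prepended to token embeddings."""

    def __init__(self, prompt_len: int, hidden: int):
        super().__init__()
        self.general = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        self.specific = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        self.gate = nn.Linear(hidden, 1)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, hidden) from the frozen backbone.
        g = torch.sigmoid(self.gate(token_embeds.mean(dim=1)))  # (batch, 1)
        mixed = g.unsqueeze(1) * self.specific + (1 - g).unsqueeze(1) * self.general
        # Prepend the mixed prompt tokens to the input sequence.
        return torch.cat([mixed, token_embeds], dim=1)
```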
- Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL).
arXiv Detail & Related papers (2022-02-14T13:25:46Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
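For context, the core of $k$NN-MT is to interpolate the NMT model's next-token distribution with a distribution induced by retrieving the $k$ nearest (decoder state, target token) pairs from a datastore; in the framework above, that datastore is built from in-domain monolingual target-language sentences. Below is a minimal, framework-agnostic sketch of the interpolation step, using a plain array instead of an approximate-nearest-neighbor index; all names and hyperparameter values are illustrative.

```python
import numpy as np

def knn_mt_distribution(h, keys, values, p_nmt, k=8, temperature=10.0, lam=0.5):
    """Interpolate the NMT next-token distribution with a kNN distribution.

    h:      decoder hidden state at the current step, shape (d,)
    keys:   datastore hidden states, shape (N, d)
    values: datastore target-token ids, shape (N,)
    p_nmt:  model's next-token distribution, shape (V,)
    """
    # Squared L2 distance from the query state to every datastore key.
    dists = np.sum((keys - h) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    # Turn distances into retrieval weights.
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    # Scatter the retrieval weights onto the vocabulary.
    p_knn = np.zeros_like(p_nmt)
    for w, tok in zip(weights, values[nearest]):
        p_knn[tok] += w
    # Final distribution: interpolate retrieval and model probabilities.
    return lam * p_knn + (1.0 - lam) * p_nmt
```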
- Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units [14.183224769428843]
Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language Processing (NLP).
In this paper, we show through interpretation methods that such a scheme, despite its efficiency, suffers from a major limitation.
We propose to augment the pre-trained model with normalised, weighted and randomly initialised units that foster a better adaptation while maintaining the valuable source knowledge.
arXiv Detail & Related papers (2021-06-09T09:29:11Z)
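One way to picture augmenting a pre-trained layer with normalised, weighted, randomly initialised units is sketched below; this is a generic illustration under stated assumptions rather than the paper's exact architecture. The pre-trained weights are kept, extra randomly initialised output units are concatenated, and a learnable scalar (starting at zero) controls how much the new units contribute.

```python
import torch
import torch.nn as nn

class AugmentedLinear(nn.Module):
    """A pre-trained linear layer extended with randomly initialised output
    units; a learnable weight scales the new units so that source-domain
    knowledge is preserved at the start of adaptation."""

    def __init__(self, pretrained: nn.Linear, extra_units: int):
        super().__init__()
        self.pretrained = pretrained               # kept from the source model
        self.random = nn.Linear(pretrained.in_features, extra_units)
        self.norm = nn.LayerNorm(extra_units)      # normalised new units
        self.alpha = nn.Parameter(torch.zeros(1))  # new units start switched off

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        old = self.pretrained(x)
        new = self.alpha * self.norm(self.random(x))
        # Output width grows by `extra_units`; downstream layers must match.
        return torch.cat([old, new], dim=-1)
```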
- Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey [9.645196221785694]
We focus on robust approaches to domain adaptation for Neural Machine Translation (NMT) models.
In particular, we look at the case where a system may need to translate sentences from multiple domains.
We highlight the benefits of domain adaptation and multi-domain adaptation techniques to other lines of NMT research.
arXiv Detail & Related papers (2021-04-14T16:21:37Z)
- CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special-domain adaptation problem.
The system utilizes the domain-adaptive pretraining and few-shot learning technologies to help neural rankers mitigate the domain discrepancy.
Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
- Iterative Domain-Repaired Back-Translation [50.32925322697343]
In this paper, we focus on the domain-specific translation with low resources, where in-domain parallel corpora are scarce or nonexistent.
We propose a novel iterative domain-repaired back-translation framework, which introduces the Domain-Repair model to refine translations in synthetic bilingual data.
Experiments on adapting NMT models between specific domains and from the general domain to specific domains demonstrate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2020-10-06T04:38:09Z)
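The overall shape of such an iterative repair-and-back-translate loop can be sketched as below. The callables (`back_translate`, `train_repair_model`, `fine_tune`) are hypothetical stand-ins for whatever NMT toolkit is used, and the data flow is a simplified reading of the framework rather than a faithful reproduction of it.

```python
from typing import Callable, List, Tuple

def iterative_domain_repaired_bt(
    back_translate: Callable[[List[str]], List[str]],
    train_repair_model: Callable[[List[Tuple[str, str]]], Callable[[str], str]],
    fine_tune: Callable[[List[Tuple[str, str]]], None],
    tgt_mono: List[str],
    repair_pairs: List[Tuple[str, str]],
    rounds: int = 3,
) -> None:
    """Sketch: synthesize sources for in-domain target monolingual text,
    repair the synthetic side with a Domain-Repair model, fine-tune on the
    repaired bitext, and repeat for a fixed number of rounds."""
    for _ in range(rounds):
        # 1) Back-translate in-domain target sentences into synthetic sources.
        synthetic_src = back_translate(tgt_mono)
        # 2) Train (or refresh) the repair model on rough/clean sentence pairs.
        repair = train_repair_model(repair_pairs)
        # 3) Repair the synthetic source side so the bitext looks in-domain.
        repaired_src = [repair(s) for s in synthetic_src]
        # 4) Fine-tune the forward translation model on the repaired bitext.
        fine_tune(list(zip(repaired_src, tgt_mono)))
```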
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where annotated data are much scarcer in the target domain than in the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework and can thus provide different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
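The three-objective recipe above can be pictured as a single training step that combines language-modeling, back-translation, and supervised translation losses. The sketch below is schematic: the loss and generation callables stand in for the corresponding pieces of a Transformer NMT toolkit and are assumptions, not the authors' code.

```python
from typing import Callable, List, Tuple

import torch

def three_objective_step(
    lm_loss: Callable[[List[str]], torch.Tensor],
    translation_loss: Callable[[List[Tuple[str, str]]], torch.Tensor],
    back_translate: Callable[[List[str]], List[str]],
    optimizer: torch.optim.Optimizer,
    parallel: List[Tuple[str, str]],
    src_mono: List[str],
    tgt_mono: List[str],
) -> float:
    """One step combining language modeling on monolingual text,
    back-translation into synthetic bitext, and supervised translation."""
    # (1) Language modeling on both monolingual corpora.
    loss = lm_loss(src_mono) + lm_loss(tgt_mono)
    # (2) Back-translation: pair target monolingual text with synthetic sources.
    synthetic = list(zip(back_translate(tgt_mono), tgt_mono))
    loss = loss + translation_loss(synthetic)
    # (3) Supervised translation on the (possibly small) parallel corpus.
    loss = loss + translation_loss(parallel)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss.item())
```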