Domain Mismatch Doesn't Always Prevent Cross-Lingual Transfer Learning
- URL: http://arxiv.org/abs/2211.16671v1
- Date: Wed, 30 Nov 2022 01:24:33 GMT
- Title: Domain Mismatch Doesn't Always Prevent Cross-Lingual Transfer Learning
- Authors: Daniel Edmiston, Phillip Keung, Noah A. Smith
- Abstract summary: Cross-lingual transfer learning has been surprisingly effective in zero-shot cross-lingual classification.
We show that a simple initialization regimen can overcome much of the effect of domain mismatch in cross-lingual transfer.
- Score: 51.232774288403114
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Cross-lingual transfer learning without labeled target language data or
parallel text has been surprisingly effective in zero-shot cross-lingual
classification, question answering, unsupervised machine translation, etc.
However, some recent publications have claimed that domain mismatch prevents
cross-lingual transfer, and their results show that unsupervised bilingual
lexicon induction (UBLI) and unsupervised neural machine translation (UNMT) do
not work well when the underlying monolingual corpora come from different
domains (e.g., French text from Wikipedia but English text from UN
proceedings). In this work, we show that a simple initialization regimen can
overcome much of the effect of domain mismatch in cross-lingual transfer. We
pre-train word and contextual embeddings on the concatenated domain-mismatched
corpora, and use these as initializations for three tasks: MUSE UBLI, UN
Parallel UNMT, and the SemEval 2017 cross-lingual word similarity task. In all
cases, our results challenge the conclusions of prior work by showing that
proper initialization can recover a large portion of the losses incurred by
domain mismatch.
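The initialization regimen described above can be pictured as a small pipeline: rather than training each language's embeddings only on its own (mismatched) domain, train them once on the concatenation of both monolingual corpora and hand the result to the downstream tool (e.g., MUSE UBLI or a UNMT system) as its starting point. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' released code; the file names, the use of the fastText package, the 300-dimensional skip-gram settings, and the word2vec-style .vec export are all assumptions.

```python
import fasttext  # pip install fasttext; any fastText-style trainer would do


def concatenate(paths, out_path):
    # Merge the domain-mismatched monolingual corpora (e.g., French Wikipedia
    # and English UN proceedings) into a single training file.
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as f:
                for line in f:
                    out.write(line)


def train_and_export(corpus_path, vec_path, dim=300):
    # Skip-gram embeddings trained on the concatenated corpora, written out
    # in the plain word2vec text format that MUSE's scripts accept.
    model = fasttext.train_unsupervised(corpus_path, model="skipgram", dim=dim)
    words = model.get_words()
    with open(vec_path, "w", encoding="utf-8") as out:
        out.write(f"{len(words)} {dim}\n")
        for w in words:
            vec = " ".join(f"{x:.4f}" for x in model.get_word_vector(w))
            out.write(f"{w} {vec}\n")


if __name__ == "__main__":
    # Hypothetical file names for the two domain-mismatched corpora.
    concatenate(["fr.wikipedia.txt", "en.un_proceedings.txt"], "concat.txt")
    train_and_export("concat.txt", "concat.vec")
    # Per-language slices of concat.vec would then replace the separately
    # trained embeddings that MUSE UBLI normally starts from.
```

The same joint-pretraining idea extends to the contextual embeddings mentioned in the abstract, e.g., by running BERT-style pretraining over the concatenated file before using the encoder downstream.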
Related papers
- Domain Curricula for Code-Switched MT at MixMT 2022 [0.0]
We present our approach and results for the Code-mixed Machine Translation (MixMT) shared task at WMT 2022.
The task consists of two subtasks, monolingual to code-mixed machine translation (Subtask-1) and code-mixed to monolingual machine translation (Subtask-2).
We jointly learn multiple domains of text by pretraining and fine-tuning, combined with a sentence alignment objective.
arXiv Detail & Related papers (2022-10-31T16:41:57Z)
- Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages a domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z)
- So Different Yet So Alike! Constrained Unsupervised Text Style Transfer [54.4773992696361]
We introduce a method for constrained unsupervised text style transfer by adding two complementary losses to the generative adversarial network (GAN) family of models.
Unlike the competing losses used in GANs, we introduce cooperative losses where the discriminator and the generator cooperate and reduce the same loss.
We show that the complementary cooperative losses improve text quality, according to both automated and human evaluation measures.
arXiv Detail & Related papers (2022-05-09T07:46:40Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval (a toy sketch of the underlying $k$NN interpolation appears after this list).
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Generalised Unsupervised Domain Adaptation of Neural Machine Translation with Cross-Lingual Data Selection [34.90952499734384]
We propose a cross-lingual data selection method to extract in-domain sentences in the missing language side from a large generic monolingual corpus.
Our proposed method trains an adaptive layer on top of multilingual BERT by contrastive learning to align the representations of the source and target languages.
We evaluate our cross-lingual data selection method on NMT across five diverse domains in three language pairs, as well as a real-world scenario of translation for COVID-19.
arXiv Detail & Related papers (2021-09-09T14:12:12Z)
- Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation [71.77083272602525]
Unsupervised domain adaptation (UDA) attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- Sentence Alignment with Parallel Documents Helps Biomedical Machine Translation [0.5430741734728369]
This work presents a new unsupervised sentence alignment method and explores features in training biomedical neural machine translation (NMT) systems.
We use a simple but effective way to build bilingual word embeddings to evaluate bilingual word similarity.
The proposed method achieved high accuracy in both 1-to-1 and many-to-many cases.
arXiv Detail & Related papers (2021-04-17T16:09:30Z)
- Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models [9.359514457957799]
We explore how much transfer occurs when models are denied any information about word identity via random scrambling.
We find that only BERT shows high rates of transfer into our scrambled domains, and for classification but not sequence labeling tasks.
Our analyses seek to explain why transfer succeeds for some tasks but not others, to isolate the separate contributions of pretraining versus fine-tuning, and to quantify the role of word frequency.
arXiv Detail & Related papers (2021-04-17T00:14:39Z)
- Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks [6.7155846430379285]
In zero-shot cross-lingual transfer, a model trained on a supervised NLP task in one language is applied directly to another language without any additional training.
Recently introduced cross-lingual language model (XLM) pretraining highlights neural parameter sharing in Transformer-style networks as a key factor in such transfer.
In this paper, we aim to validate the hypothetically strong cross-lingual transfer properties induced by XLM pretraining.
arXiv Detail & Related papers (2021-01-26T09:21:25Z)
- Structured Domain Adaptation with Online Relation Regularization for Unsupervised Person Re-ID [62.90727103061876]
Unsupervised domain adaptation (UDA) aims at adapting the model trained on a labeled source-domain dataset to an unlabeled target-domain dataset.
We propose an end-to-end structured domain adaptation framework with an online relation-consistency regularization term.
Our proposed framework is shown to achieve state-of-the-art performance on multiple UDA tasks of person re-ID.
arXiv Detail & Related papers (2020-03-14T14:45:18Z)
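As referenced in the non-parametric NMT entry above, the core $k$NN-MT mechanism is to interpolate the base NMT model's next-token distribution with a distribution obtained by retrieving nearest neighbors from a datastore built over in-domain target-side text. The toy sketch below shows a single decoding step; the shapes, temperature, interpolation weight, and random datastore are illustrative assumptions rather than details taken from that paper.

```python
import numpy as np


def knn_mt_step(p_nmt, hidden, keys, values, vocab_size, k=8, temp=10.0, lam=0.5):
    """Interpolate the NMT next-token distribution with a kNN distribution.

    p_nmt  : (vocab_size,) probabilities from the base NMT model
    hidden : (d,) decoder hidden state used as the retrieval query
    keys   : (n, d) datastore keys (decoder states from in-domain text)
    values : (n,) datastore values (the target token that followed each key)
    """
    # Retrieve the k nearest datastore entries by L2 distance.
    dists = np.linalg.norm(keys - hidden, axis=1)
    nn = np.argsort(dists)[:k]

    # Turn negative distances into a distribution over the retrieved tokens.
    weights = np.exp(-dists[nn] / temp)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, values[nn], weights)

    # Fixed-weight interpolation of the two distributions.
    return lam * p_knn + (1.0 - lam) * p_nmt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V, d, n = 32, 16, 100            # toy vocabulary, hidden size, datastore size
    p_nmt = rng.dirichlet(np.ones(V))
    hidden = rng.normal(size=d)
    keys = rng.normal(size=(n, d))   # would come from in-domain target-side text
    values = rng.integers(0, V, size=n)
    p = knn_mt_step(p_nmt, hidden, keys, values, V)
    assert abs(p.sum() - 1.0) < 1e-6
```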
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.