Multilingual Domain Adaptation for NMT: Decoupling Language and Domain
Information with Adapters
- URL: http://arxiv.org/abs/2110.09574v1
- Date: Mon, 18 Oct 2021 18:55:23 GMT
- Title: Multilingual Domain Adaptation for NMT: Decoupling Language and Domain
Information with Adapters
- Authors: Asa Cooper Stickland, Alexandre Bérard, Vassilina Nikoulina
- Abstract summary: We study the compositionality of language and domain adapters in the context of Machine Translation.
We find that in the partial-resource scenario a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages.
- Score: 66.7986513246294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapter layers are lightweight, learnable units inserted between transformer
layers. Recent work explores using such layers for neural machine translation
(NMT), to adapt pre-trained models to new domains or language pairs, training
only a small set of parameters for each new setting (language pair or domain).
In this work we study the compositionality of language and domain adapters in
the context of Machine Translation. We aim to study: 1) parameter-efficient
adaptation to multiple domains and languages simultaneously (full-resource
scenario) and 2) cross-lingual transfer in domains where parallel data is
unavailable for certain language pairs (partial-resource scenario). We find
that in the partial-resource scenario a naive combination of domain-specific
and language-specific adapters often results in 'catastrophic forgetting' of
the missing languages. We study other ways to combine the adapters to alleviate
this issue and maximize cross-lingual transfer. With our best adapter
combinations, we obtain improvements of 3-4 BLEU on average for source
languages that do not have in-domain data. For target languages without
in-domain data, we achieve a similar improvement by combining adapters with
back-translation. Supplementary material is available at
https://tinyurl.com/r66stbxj
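As a rough illustration of the setup described in the abstract, the following is a minimal PyTorch sketch of a bottleneck adapter and of stacking a language adapter and a domain adapter on top of a frozen transformer layer. The hidden size, bottleneck width, initialisation, and stacking order are assumptions made for illustration, not the paper's exact configuration.

# Minimal sketch (assumed configuration): bottleneck adapters added to a frozen
# transformer layer, with a language adapter stacked with a domain adapter.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: LayerNorm -> down-projection -> ReLU -> up-projection, plus a residual connection."""
    def __init__(self, d_model: int = 512, bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        # Zero-initialise the up-projection so the adapter starts as an identity
        # map and does not perturb the pre-trained model before training.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(self.norm(x))))

class AdaptedLayer(nn.Module):
    """A frozen transformer layer followed by stacked language and domain adapters."""
    def __init__(self, base_layer: nn.Module, d_model: int = 512):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False              # only adapter parameters are trained
        self.lang_adapter = Adapter(d_model)     # language-specific parameters
        self.domain_adapter = Adapter(d_model)   # domain-specific parameters

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.base_layer(x)
        h = self.lang_adapter(h)       # naive stacking: language adapter first,
        return self.domain_adapter(h)  # then domain adapter

# Usage: wrap an existing encoder layer; the frozen layer keeps its pre-trained
# weights while only the two small adapters receive gradients.
layer = AdaptedLayer(nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True))
out = layer(torch.randn(2, 10, 512))   # (batch, sequence length, d_model)

In the paper's partial-resource setting, this kind of naive stacking is what leads to catastrophic forgetting of languages without in-domain data, which motivates the alternative adapter combinations the authors study.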
Related papers
- The Impact of Language Adapters in Cross-Lingual Transfer for NLU [0.8702432681310401]
We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets.
Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models.
Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
arXiv Detail & Related papers (2024-01-31T20:07:43Z)
- AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models [127.04370753583261]
Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains.
A solution is to use a related-domain adapter for the novel domain at test time.
We introduce AdapterSoup, an approach that performs weight-space averaging of adapters trained on different domains (see the sketch after this list).
arXiv Detail & Related papers (2023-02-14T13:09:23Z)
- $m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter [128.69723410769586]
Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time.
When an MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically.
We propose $m^4Adapter$, which combines domain and language knowledge using meta-learning with adapters.
arXiv Detail & Related papers (2022-10-21T12:25:05Z)
- Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z)
- Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval [66.69799641522133]
State-of-the-art neural (re)rankers are notoriously data hungry.
Current approaches typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders.
We show that two parameter-efficient approaches to cross-lingual transfer, namely Sparse Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more effective zero-shot transfer.
arXiv Detail & Related papers (2022-04-05T15:44:27Z)
- Efficient Test Time Adapter Ensembling for Low-resource Language Varieties [115.12997212870962]
Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models.
An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance.
In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
arXiv Detail & Related papers (2021-09-10T13:44:46Z)
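The AdapterSoup entry above describes weight-space averaging of adapters trained on different domains. Below is a minimal sketch of such uniform averaging, assuming the adapters are available as plain PyTorch state dicts; the checkpoint paths and the loading step are hypothetical, not the paper's released code.

# Uniformly average matching parameter tensors across several domain adapters
# (weight-space averaging in the spirit of AdapterSoup).
from typing import Dict, List
import torch

def average_adapters(states: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Element-wise mean of adapter state dicts that share identical keys and shapes."""
    return {
        key: torch.stack([state[key].float() for state in states]).mean(dim=0)
        for key in states[0]
    }

# Usage (hypothetical checkpoint paths and loading step):
# souped = average_adapters([torch.load(p, map_location="cpu") for p in adapter_paths])
# model.load_state_dict(souped, strict=False)   # apply the averaged adapter weights at test time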
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.