Cross-Lingual Transfer with Target Language-Ready Task Adapters
- URL: http://arxiv.org/abs/2306.02767v1
- Date: Mon, 5 Jun 2023 10:46:33 GMT
- Title: Cross-Lingual Transfer with Target Language-Ready Task Adapters
- Authors: Marinela Parović, Alan Ansell, Ivan Vulić, Anna Korhonen
- Abstract summary: BAD-X, an extension of the MAD-X framework, achieves improved transfer at the cost of MAD-X's modularity.
We aim to take the best of both worlds by fine-tuning task adapters adapted to the target language.
- Score: 66.5336029324059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapters have emerged as a modular and parameter-efficient approach to
(zero-shot) cross-lingual transfer. The established MAD-X framework employs
separate language and task adapters which can be arbitrarily combined to
perform the transfer of any task to any target language. Subsequently, BAD-X,
an extension of the MAD-X framework, achieves improved transfer at the cost of
MAD-X's modularity by creating "bilingual" adapters specific to the
source-target language pair. In this work, we aim to take the best of both
worlds by (i) fine-tuning task adapters adapted to the target language(s)
(so-called "target language-ready" (TLR) adapters) to maintain high transfer
performance, but (ii) without sacrificing the highly modular design of MAD-X.
The main idea of "target language-ready" adapters is to resolve the
training-vs-inference discrepancy of MAD-X: the task adapter "sees" the target
language adapter for the very first time during inference, and thus might not
be fully compatible with it. We address this mismatch by exposing the task
adapter to the target language adapter during training, and empirically
validate several variants of the idea: in the simplest form, we alternate
between using the source and target language adapters during task adapter
training, which can be generalized to cycling over any set of language
adapters. We evaluate different TLR-based transfer configurations with varying
degrees of generality across a suite of standard cross-lingual benchmarks, and
find that the most general (and thus most modular) configuration consistently
outperforms MAD-X and BAD-X on most tasks and languages.
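To make the alternation scheme concrete, the sketch below shows how a MAD-X-style task-adapter training loop might cycle over several language adapters so that the task adapter is exposed to each of them before inference. This is a minimal Python/PyTorch sketch, not the authors' code: it assumes a Hugging Face-style model interface, and `set_active_language_adapter` and the adapter names are hypothetical placeholders for however language adapters are switched in a given adapter library.

```python
import itertools
import torch

def train_tlr_task_adapter(model, task_adapter_params, language_adapters,
                           dataloader, num_epochs=3, lr=1e-4):
    """Train a task adapter while cycling over several language adapters.

    `language_adapters` is a list of adapter names, e.g. ["en", "sw", "qu"]:
    the source language plus the target language(s) the task adapter should
    be "ready" for. Only `task_adapter_params` receive gradient updates; the
    language adapters and the backbone stay frozen, as in MAD-X.
    NOTE: `model.set_active_language_adapter` is a hypothetical hook, and the
    `outputs.loss` convention assumes a Hugging Face-style forward pass.
    """
    optimizer = torch.optim.AdamW(task_adapter_params, lr=lr)
    # Round-robin over language adapters: batch 1 with "en", batch 2 with "sw", ...
    lang_cycle = itertools.cycle(language_adapters)

    model.train()
    for _ in range(num_epochs):
        for batch in dataloader:
            lang = next(lang_cycle)
            model.set_active_language_adapter(lang)  # hypothetical switch

            outputs = model(**batch)  # labelled source-language task data
            loss = outputs.loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```

With `language_adapters=["source", "target"]` this reduces to the simplest bilingual TLR variant described in the abstract; passing a larger set of language adapters gives the more general, fully modular configuration that the paper finds to perform best.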
Related papers
- AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging [96.39773974044041]
Cross-lingual transfer is an effective alternative to the direct fine-tuning on target tasks in specific languages.
We propose a new cross-lingual transfer method called AdaMergeX that utilizes adaptive adapter merging.
Our empirical results demonstrate that our approach yields new and effective cross-lingual transfer, outperforming existing methods across all settings.
arXiv Detail & Related papers (2024-02-29T07:11:24Z)
- The Impact of Language Adapters in Cross-Lingual Transfer for NLU [0.8702432681310401]
We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets.
Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models.
Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
arXiv Detail & Related papers (2024-01-31T20:07:43Z)
- Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters [66.7986513246294]
We study the compositionality of language and domain adapters in the context of Machine Translation.
We find that in the partial resource scenario, a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages.
arXiv Detail & Related papers (2021-10-18T18:55:23Z)
- Efficient Test Time Adapter Ensembling for Low-resource Language Varieties [115.12997212870962]
Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models.
An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance.
In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
arXiv Detail & Related papers (2021-09-10T13:44:46Z)
- Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer [43.92142759245696]
Orthoadapters are trained to encode language- and task-specific information that is complementary to the knowledge already stored in the pretrained transformer's parameters.
Our zero-shot cross-lingual transfer experiments, involving three tasks (POS-tagging, NER, NLI) and a set of 10 diverse languages, 1) point to the usefulness of orthoadapters in cross-lingual transfer, especially for the most complex NLI task, but also 2) indicate that the optimal adapter configuration highly depends on the task and the target language.
arXiv Detail & Related papers (2020-12-11T16:32:41Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer [136.09386219006123]
We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages.
MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning.
arXiv Detail & Related papers (2020-04-30T18:54:43Z)
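Since MAD-X is the baseline that the target language-ready adapters modify, a brief reconstruction of its adapter stacking may help. The sketch below shows one Transformer layer with a language adapter and a task adapter inserted after the frozen pretrained sublayers; it is an illustrative approximation under assumed module names and a tensor-in/tensor-out layer interface, not the reference implementation.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class MadXLayerBlock(nn.Module):
    """One Transformer layer with MAD-X-style adapter stacking (illustrative).

    The wrapped `transformer_layer` is assumed to map hidden states to hidden
    states and stays frozen; only the adapters hold trainable parameters.
    """
    def __init__(self, transformer_layer: nn.Module, hidden_size: int):
        super().__init__()
        self.layer = transformer_layer                 # frozen pretrained layer
        self.language_adapter = Adapter(hidden_size)   # one per language
        self.task_adapter = Adapter(hidden_size)       # one per task

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        hidden_states = self.layer(hidden_states)
        hidden_states = self.language_adapter(hidden_states)  # swapped at inference
        hidden_states = self.task_adapter(hidden_states)      # kept across languages
        return hidden_states
```

The training-vs-inference mismatch that TLR adapters target arises exactly at this stacking point: during MAD-X training the task adapter only ever sits on top of the source-language adapter, whereas at inference the language adapter is swapped for the target language, which the task adapter has never seen.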