Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer
- URL: http://arxiv.org/abs/2012.06460v1
- Date: Fri, 11 Dec 2020 16:32:41 GMT
- Title: Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer
- Authors: Marko Vidoni, Ivan Vulić, Goran Glavaš
- Abstract summary: Orthoadapters are trained to encode language- and task-specific information that is complementary to the knowledge already stored in the pretrained transformer's parameters.
Our zero-shot cross-lingual transfer experiments, involving three tasks (POS-tagging, NER, NLI) and a set of 10 diverse languages, 1) point to the usefulness of orthoadapters in cross-lingual transfer, especially for the most complex NLI task, but also 2) indicate that the optimal adapter configuration highly depends on the task and the target language.
- Score: 43.92142759245696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adapter modules, additional trainable parameters that enable efficient
fine-tuning of pretrained transformers, have recently been used for language
specialization of multilingual transformers, improving downstream zero-shot
cross-lingual transfer. In this work, we propose orthogonal language and task
adapters (dubbed orthoadapters) for cross-lingual transfer. They are trained to
encode language- and task-specific information that is complementary (i.e.,
orthogonal) to the knowledge already stored in the pretrained transformer's
parameters. Our zero-shot cross-lingual transfer experiments, involving three
tasks (POS-tagging, NER, NLI) and a set of 10 diverse languages, 1) point to
the usefulness of orthoadapters in cross-lingual transfer, especially for the
most complex NLI task, but also 2) indicate that the optimal adapter
configuration highly depends on the task and the target language. We hope that
our work will motivate a wider investigation of the usefulness of orthogonality
constraints in language- and task-specific fine-tuning of pretrained
transformers.
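To make the idea concrete, below is a minimal PyTorch sketch of a bottleneck adapter trained with an orthogonality penalty between the adapter's residual output and the frozen transformer hidden state it modifies. This is one plausible reading of the abstract, not the paper's implementation: the module name, bottleneck size, and the exact form and weighting of the penalty are illustrative assumptions.

```python
# Hypothetical sketch of an "orthoadapter": a bottleneck adapter whose output
# is pushed to be orthogonal (complementary) to the hidden state produced by
# the frozen pretrained transformer. Where the constraint is applied and how
# it is weighted are assumptions made for illustration only.
import torch
import torch.nn as nn


class OrthoAdapter(nn.Module):
    """Bottleneck adapter with an orthogonality penalty on its residual output."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # hidden: (batch, seq_len, hidden_size) output of a frozen transformer layer
        delta = self.up(self.act(self.down(hidden)))

        # Orthogonality penalty: squared cosine similarity between the adapter
        # output and the original hidden state, driven towards zero so the
        # adapter encodes *complementary* information.
        cos = nn.functional.cosine_similarity(delta, hidden, dim=-1)
        ortho_loss = cos.pow(2).mean()

        # Residual connection, as in standard bottleneck adapter architectures.
        return hidden + delta, ortho_loss


if __name__ == "__main__":
    adapter = OrthoAdapter(hidden_size=768)
    dummy_hidden = torch.randn(2, 16, 768)  # (batch, seq_len, hidden_size)
    out, ortho_loss = adapter(dummy_hidden)
    print(out.shape, ortho_loss.item())
```

During training, the task loss would be combined with the weighted penalty (e.g., `task_loss + lam * ortho_loss`); whether the constraint is placed on the language adapter, the task adapter, or both is exactly the kind of configuration choice the abstract reports as task- and language-dependent.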
Related papers
- AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging [96.39773974044041]
Cross-lingual transfer is an effective alternative to direct fine-tuning on target tasks in specific languages.
We propose a new cross-lingual transfer method called AdaMergeX that utilizes adaptive adapter merging.
Our empirical results demonstrate that our approach yields new and effective cross-lingual transfer, outperforming existing methods across all settings.
arXiv Detail & Related papers (2024-02-29T07:11:24Z)
- The Impact of Language Adapters in Cross-Lingual Transfer for NLU [0.8702432681310401]
We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets.
Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models.
Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
arXiv Detail & Related papers (2024-01-31T20:07:43Z)
- Cross-Lingual Transfer with Target Language-Ready Task Adapters [66.5336029324059]
BAD-X, an extension of the MAD-X framework, achieves improved transfer at the cost of MAD-X's modularity.
We aim to take the best of both worlds by fine-tuning task adapters adapted to the target language.
arXiv Detail & Related papers (2023-06-05T10:46:33Z)
- Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval [66.69799641522133]
State-of-the-art neural (re)rankers are notoriously data hungry.
Current approaches typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders.
We show that two parameter-efficient approaches to cross-lingual transfer, namely Sparse Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more effective zero-shot transfer.
arXiv Detail & Related papers (2022-04-05T15:44:27Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages (a minimal sketch of this idea appears after this list).
This helps the model avoid the degenerate case of predicting masked words conditioned only on context in the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers [62.637055980148816]
Massively multilingual transformers pretrained with language modeling objectives have become the de facto default transfer paradigm for NLP.
We show that cross-lingual transfer via massively multilingual transformers is substantially less effective in resource-lean scenarios and for distant languages.
arXiv Detail & Related papers (2020-05-01T22:04:58Z)
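As referenced in the VECO entry above, the following is a minimal, hypothetical sketch of a Transformer encoder block with an added cross-attention module, so that token representations in one language can attend to a parallel sentence in another language. The class name, dimensions, and layer arrangement are illustrative assumptions and do not reproduce VECO's actual architecture.

```python
# Hypothetical sketch: an encoder block where self-attention over one language
# is followed by cross-attention into a parallel sentence in another language.
import torch
import torch.nn as nn


class CrossLingualAttentionBlock(nn.Module):
    """Self-attention over the source sentence, then cross-attention into the
    parallel sentence in the other language."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(hidden_size)
        self.norm2 = nn.LayerNorm(hidden_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src: (batch, src_len, hidden) -- sentence in one language
        # tgt: (batch, tgt_len, hidden) -- parallel sentence in another language
        h, _ = self.self_attn(src, src, src)
        src = self.norm1(src + h)
        # Cross-attention: queries come from the source language, keys/values
        # from the parallel sentence, so masked-word prediction is no longer
        # conditioned only on same-language context.
        h, _ = self.cross_attn(src, tgt, tgt)
        return self.norm2(src + h)


if __name__ == "__main__":
    block = CrossLingualAttentionBlock()
    en = torch.randn(2, 10, 768)   # e.g. English token states
    de = torch.randn(2, 12, 768)   # e.g. parallel German token states
    print(block(en, de).shape)     # torch.Size([2, 10, 768])
```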