Efficient Test Time Adapter Ensembling for Low-resource Language Varieties
- URL: http://arxiv.org/abs/2109.04877v1
- Date: Fri, 10 Sep 2021 13:44:46 GMT
- Title: Efficient Test Time Adapter Ensembling for Low-resource Language Varieties
- Authors: Xinyi Wang and Yulia Tsvetkov and Sebastian Ruder and Graham Neubig
- Abstract summary: Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models.
An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance.
In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
- Score: 115.12997212870962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapters are light-weight modules that allow parameter-efficient fine-tuning
of pretrained models. Specialized language and task adapters have recently been
proposed to facilitate cross-lingual transfer of multilingual pretrained models
(Pfeiffer et al., 2020b). However, this approach requires training a separate
language adapter for every language one wishes to support, which can be
impractical for languages with limited data. An intuitive solution is to use a
related language adapter for the new language variety, but we observe that this
solution can lead to sub-optimal performance. In this paper, we aim to improve
the robustness of language adapters to uncovered languages without training new
adapters. We find that ensembling multiple existing language adapters makes the
fine-tuned model significantly more robust to other language varieties not
included in these adapters. Building upon this observation, we propose Entropy
Minimized Ensemble of Adapters (EMEA), a method that optimizes the ensemble
weights of the pretrained language adapters for each test sentence by
minimizing the entropy of its predictions. Experiments on three diverse groups
of language varieties show that our method leads to significant improvements on
both named entity recognition and part-of-speech tagging across all languages.
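To make the method concrete, here is a minimal PyTorch-style sketch of test-time entropy-minimized adapter ensembling as described in the abstract: the pretrained model and its existing language adapters stay frozen, and only the per-sentence mixing weights over those adapters are updated by gradient descent on prediction entropy. The wrapper interface `model(batch, adapter_weights=...)`, the function name, and the hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def emea_predict(model, num_adapters, batch, steps=10, lr=1.0):
    """Sketch of Entropy Minimized Ensemble of Adapters (EMEA) at test time.

    Assumes `model(batch, adapter_weights=w)` runs the frozen multilingual
    model while mixing the outputs of its `num_adapters` language adapters
    with weights `w` at every adapter layer (hypothetical interface).
    Returns per-token label predictions for one test sentence/batch.
    """
    # Keep the pretrained model and its adapters frozen.
    for p in model.parameters():
        p.requires_grad_(False)

    # Unnormalized mixing weights; the softmax keeps the ensemble on the
    # simplex, so initializing at zero corresponds to uniform averaging.
    alpha = torch.zeros(num_adapters, requires_grad=True)
    optimizer = torch.optim.SGD([alpha], lr=lr)

    for _ in range(steps):
        weights = torch.softmax(alpha, dim=0)
        logits = model(batch, adapter_weights=weights)  # (tokens, labels)
        probs = F.softmax(logits, dim=-1)
        # Average per-token entropy of the predicted label distribution.
        entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1).mean()
        optimizer.zero_grad()
        entropy.backward()  # gradients flow only into the mixing weights
        optimizer.step()

    with torch.no_grad():
        weights = torch.softmax(alpha, dim=0)
        return model(batch, adapter_weights=weights).argmax(dim=-1)
```

Because the weights start uniform, the plain adapter-ensembling baseline mentioned in the abstract is recovered by setting `steps=0`; EMEA then adapts the weights to each test sentence at the cost of a few extra forward/backward passes.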
Related papers
- The Impact of Language Adapters in Cross-Lingual Transfer for NLU [0.8702432681310401]
We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets.
Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models.
Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
arXiv Detail & Related papers (2024-01-31T20:07:43Z)
- Adapting the adapters for code-switching in multilingual ASR [10.316724084739892]
Large pre-trained multilingual speech models have shown potential in scaling Automatic Speech Recognition to many low-resource languages.
Some of these models employ language adapters in their formulation, which helps to improve monolingual performance.
However, this formulation restricts the usability of these models on code-switched speech, where two languages are mixed together in the same utterance.
We propose ways to effectively fine-tune such models on code-switched speech, by assimilating information from both language adapters at each language adaptation point in the network.
arXiv Detail & Related papers (2023-10-11T12:15:24Z)
- Multilingual Detection of Check-Worthy Claims using World Languages and Adapter Fusion [12.269362823116225]
Resource scarcity for non-world languages and model learning costs remain major challenges for the creation of models supporting multilingual check-worthiness detection.
This paper proposes cross-training adapters on a subset of world languages, combined by adapter fusion, to detect claims emerging globally in multiple languages.
arXiv Detail & Related papers (2023-01-13T11:50:08Z)
- Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters [20.168480824057923]
A popular approach to creating a cross-language retrieval model is to substitute the monolingual pretrained language model in the retrieval model with a multilingual one.
We show that adapters trained with monolingual data are more effective than fine-tuning the entire model when transferring to a cross-language information retrieval setting.
arXiv Detail & Related papers (2022-12-20T17:25:04Z)
- Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z)
- Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z)
- Lightweight Adapter Tuning for Multilingual Speech Translation [47.89784337058167]
Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP.
This paper proposes a comprehensive analysis of adapters for multilingual speech translation.
arXiv Detail & Related papers (2021-06-02T20:51:42Z)
- Exploiting Adapters for Cross-lingual Low-resource Speech Recognition [52.40623653290499]
Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language.
We propose using adapters and investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation.
arXiv Detail & Related papers (2021-05-18T08:30:37Z)
- MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer [136.09386219006123]
We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages.
MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning.
arXiv Detail & Related papers (2020-04-30T18:54:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.