Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters
- URL: http://arxiv.org/abs/2212.10448v1
- Date: Tue, 20 Dec 2022 17:25:04 GMT
- Title: Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters
- Authors: Eugene Yang and Suraj Nair and Dawn Lawrie and James Mayfield and Douglas W. Oard
- Abstract summary: A popular approach to creating a zero-shot cross-language retrieval model is to substitute the monolingual pretrained language model in the retrieval model with a multilingual one.
We show that adapter-based dense retrieval models trained with monolingual data are more effective than fine-tuning the entire model when transferring to a Cross-Language Information Retrieval setting.
- Score: 20.168480824057923
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A popular approach to creating a zero-shot cross-language retrieval model is
to substitute a monolingual pretrained language model in the retrieval model
with a multilingual pretrained language model such as Multilingual BERT. This
multilingual model is fine-tuned to the retrieval task with monolingual data
such as English MS MARCO, using the same training recipe as the monolingual
retrieval model. However, such transferred models suffer from mismatches
in the languages of the input text during training and inference. In this work,
we propose transferring monolingual retrieval models using adapters, a
parameter-efficient component for a transformer network. By combining adapters
pretrained on language-modeling tasks for a specific language with task-specific
adapters, prior work has shown that such adapter-enhanced models perform better
than fine-tuning the entire model when transferring across languages in various
NLP tasks. By constructing dense retrieval models with adapters, we show that
models trained with monolingual data are more effective than fine-tuning the
entire model when transferring to a Cross Language Information Retrieval (CLIR)
setting. However, we found that the prior suggestion of replacing the language
adapters to match the target language at inference time is suboptimal for dense
retrieval models. We provide an in-depth analysis of this discrepancy between
CLIR and other cross-language NLP tasks.
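The setup the abstract describes, a per-language adapter stacked under a shared retrieval (task) adapter, can be illustrated with a minimal PyTorch sketch. This is an illustration under assumed module names, sizes, and language codes, not the authors' code; in practice such adapters are inserted into every layer of the multilingual encoder (e.g. Multilingual BERT) rather than used as a standalone module.
```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, apply a nonlinearity, up-project, add a residual connection."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

class AdapterStack(nn.Module):
    """A swappable language adapter stacked under a shared retrieval (task) adapter."""
    def __init__(self, hidden_size: int, languages):
        super().__init__()
        self.language_adapters = nn.ModuleDict(
            {lang: BottleneckAdapter(hidden_size) for lang in languages}
        )
        self.task_adapter = BottleneckAdapter(hidden_size)
        self.active_language = languages[0]

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        h = self.language_adapters[self.active_language](hidden_states)
        return self.task_adapter(h)

# Train with the English language adapter active (e.g. on English MS MARCO);
# swapping to the target-language adapter at inference is the prior recipe that
# this paper finds suboptimal for dense retrieval.
stack = AdapterStack(hidden_size=768, languages=["en", "fa"])   # "fa" is an illustrative target language
hidden = torch.randn(2, 16, 768)        # (batch, sequence length, hidden size)
_ = stack(hidden)                       # English adapter active during training
stack.active_language = "fa"            # swap the language adapter at inference
_ = stack(hidden)
```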
Related papers
- The Impact of Language Adapters in Cross-Lingual Transfer for NLU [0.8702432681310401]
We study the effect of including a target-language adapter in detailed ablation studies with two multilingual models and three multilingual datasets.
Our results show that the effect of target-language adapters is highly inconsistent across tasks, languages and models.
Removing the language adapter after training has only a weak negative effect, indicating that the language adapters do not have a strong impact on the predictions.
arXiv Detail & Related papers (2024-01-31T20:07:43Z) - Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR [44.949146169903074]
The heterogeneous nature and imbalanced data abundance of different languages may cause performance degradation.
Our proposed method brings 12.2% word error rate reduction on average and up to 37.5% on a single locale.
arXiv Detail & Related papers (2024-01-17T06:01:16Z) - Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z) - Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes the new embeddings on the new language's parallel data (a minimal sketch of this recipe appears after this list).
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z) - xGQA: Cross-Lingual Visual Question Answering [100.35229218735938]
xGQA is a new multilingual evaluation benchmark for the visual question answering task.
We extend the established English GQA dataset to 7 typologically diverse languages.
We propose new adapter-based approaches to adapt multimodal transformer-based models to become multilingual.
arXiv Detail & Related papers (2021-09-13T15:58:21Z) - Efficient Test Time Adapter Ensembling for Low-resource Language Varieties [115.12997212870962]
Specialized language and task adapters have been proposed to facilitate cross-lingual transfer of multilingual pretrained models.
An intuitive solution is to use a related language adapter for the new language variety, but we observe that this solution can lead to sub-optimal performance.
In this paper, we aim to improve the robustness of language adapters to uncovered languages without training new adapters.
arXiv Detail & Related papers (2021-09-10T13:44:46Z) - Adapting Monolingual Models: Data can be Scarce when Language Similarity
is High [3.249853429482705]
We investigate the performance of zero-shot transfer learning with as little data as possible.
We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties.
With high language similarity, 10MB of data appears sufficient to achieve substantial monolingual transfer performance.
arXiv Detail & Related papers (2021-05-06T17:43:40Z) - Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)
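As a rough illustration of the language-specific-embedding recipe summarized in the continual-learning entry above (2021-10-20), the sketch below freezes the original model body and trains only a new, tied embedding matrix for the added language; module names and sizes are assumptions, not that paper's code.
```python
import torch
import torch.nn as nn

class FrozenBodyWithNewEmbeddings(nn.Module):
    """Frozen transformer body plus trainable language-specific embeddings."""
    def __init__(self, body: nn.Module, hidden_size: int, new_vocab_size: int):
        super().__init__()
        self.body = body
        for p in self.body.parameters():   # original parameters stay fixed,
            p.requires_grad = False        # so the initial languages do not degrade
        self.new_embed = nn.Embedding(new_vocab_size, hidden_size)
        self.new_output = nn.Linear(hidden_size, new_vocab_size, bias=False)
        self.new_output.weight = self.new_embed.weight  # tie input and output embeddings

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.body(self.new_embed(token_ids))
        return self.new_output(hidden)     # logits over the new language's vocabulary

# Only the new embeddings receive gradients on the new language's parallel data.
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True), num_layers=2
)
model = FrozenBodyWithNewEmbeddings(body, hidden_size=512, new_vocab_size=8000)
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
token_ids = torch.randint(0, 8000, (2, 16))   # (batch, sequence length)
logits = model(token_ids)                     # shape: (2, 16, 8000)
```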