Towards Best Practices for Training Multilingual Dense Retrieval Models
- URL: http://arxiv.org/abs/2204.02363v1
- Date: Tue, 5 Apr 2022 17:12:53 GMT
- Title: Towards Best Practices for Training Multilingual Dense Retrieval Models
- Authors: Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, Jimmy Lin
- Abstract summary: We focus on the task of monolingual retrieval in a variety of typologically diverse languages using one such design.
Our study is organized as a "best practices" guide for training multilingual dense retrieval models.
- Score: 54.91016739123398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dense retrieval models using a transformer-based bi-encoder design have
emerged as an active area of research. In this work, we focus on the task of
monolingual retrieval in a variety of typologically diverse languages using one
such design. Although recent work with multilingual transformers demonstrates
that they exhibit strong cross-lingual generalization capabilities, there
remain many open research questions, which we tackle here. Our study is
organized as a "best practices" guide for training multilingual dense retrieval
models, broken down into three main scenarios: where a multilingual transformer
is available, but relevance judgments are not available in the language of
interest; where both models and training data are available; and, where
training data are available not but models. In considering these scenarios, we
gain a better understanding of the role of multi-stage fine-tuning, the
strength of cross-lingual transfer under various conditions, the usefulness of
out-of-language data, and the advantages of multilingual vs. monolingual
transformers. Our recommendations offer a guide for practitioners building
search applications, particularly for low-resource languages, and while our
work leaves open a number of research questions, we provide a solid foundation
for future work.
Related papers
- Analyzing the Evaluation of Cross-Lingual Knowledge Transfer in
Multilingual Language Models [12.662039551306632]
We show that observed high performance of multilingual models can be largely attributed to factors not requiring the transfer of actual linguistic knowledge.
More specifically, we observe what has been transferred across languages is mostly data artifacts and biases, especially for low-resource languages.
arXiv Detail & Related papers (2024-02-03T09:41:52Z) - Multilingual Multimodality: A Taxonomical Survey of Datasets,
Techniques, Challenges and Opportunities [10.721189858694396]
We study the unification of multilingual and multimodal (MultiX) streams.
We review the languages studied, gold or silver data with parallel annotations, and understand how these modalities and languages interact in modeling.
We present an account of the modeling approaches along with their strengths and weaknesses to better understand what scenarios they can be used reliably.
arXiv Detail & Related papers (2022-10-30T21:46:01Z) - On the cross-lingual transferability of multilingual prototypical models
across NLU tasks [2.44288434255221]
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications.
In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages.
This article proposes to investigate the cross-lingual transferability of using synergistically few-shot learning with prototypical neural networks and multilingual Transformers-based models.
arXiv Detail & Related papers (2022-07-19T09:55:04Z) - xGQA: Cross-Lingual Visual Question Answering [100.35229218735938]
xGQA is a new multilingual evaluation benchmark for the visual question answering task.
We extend the established English GQA dataset to 7 typologically diverse languages.
We propose new adapter-based approaches to adapt multimodal transformer-based models to become multilingual.
arXiv Detail & Related papers (2021-09-13T15:58:21Z) - Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representation or use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z) - Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z) - Are pre-trained text representations useful for multilingual and
multi-dimensional language proficiency modeling? [6.294759639481189]
This paper describes our experiments and observations about the role of pre-trained and fine-tuned multilingual embeddings in performing multi-dimensional, multilingual language proficiency classification.
Our results indicate that while fine-tuned embeddings are useful for multilingual proficiency modeling, none of the features achieve consistently best performance for all dimensions of language proficiency.
arXiv Detail & Related papers (2021-02-25T16:23:52Z) - Bridging Linguistic Typology and Multilingual Machine Translation with
Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating
Cross-lingual Generalization [128.37244072182506]
Cross-lingual TRansfer Evaluation of Multilinguals XTREME is a benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.