Evaluation of contextual embeddings on less-resourced languages
- URL: http://arxiv.org/abs/2107.10614v1
- Date: Thu, 22 Jul 2021 12:32:27 GMT
- Title: Evaluation of contextual embeddings on less-resourced languages
- Authors: Matej Ulčar, Aleš Žagar, Carlos S. Armendariz, Andraž Repar, Senja Pollak, Matthew Purver and Marko Robnik-Šikonja
- Abstract summary: This paper presents the first multilingual empirical comparison of two ELMo and several monolingual and multilingual BERT models using 14 tasks in nine languages.
In monolingual settings, monolingual BERT models generally dominate, with a few exceptions such as the dependency parsing task.
In cross-lingual settings, BERT models trained on only a few languages mostly do best, closely followed by massively multilingual BERT models.
- Score: 4.417922173735813
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The current dominance of deep neural networks in natural language processing
is based on contextual embeddings such as ELMo, BERT, and BERT derivatives.
Most existing work focuses on English; in contrast, we present here the first
multilingual empirical comparison of two ELMo and several monolingual and
multilingual BERT models using 14 tasks in nine languages. In monolingual
settings, our analysis shows that monolingual BERT models generally dominate,
with a few exceptions such as the dependency parsing task, where they are not
competitive with ELMo models trained on large corpora. In cross-lingual
settings, BERT models trained on only a few languages mostly do best, closely
followed by massively multilingual BERT models.
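As a concrete illustration of the kind of comparison evaluated here (a minimal sketch, not the authors' evaluation pipeline, and with example model identifiers rather than the models from the paper), contextual embeddings from a monolingual and a massively multilingual BERT model can be extracted for the same input and fed to identical downstream classifiers:

```python
# Minimal sketch: extract contextual embeddings from a monolingual and a
# massively multilingual BERT model for the same sentence, so that they can
# be compared with an identical downstream classifier.
# Model identifiers are illustrative choices, not taken from the paper.
import torch
from transformers import AutoModel, AutoTokenizer

SENTENCE = "Contextual embeddings capture word meaning in context."

def sentence_embedding(model_name: str, text: str) -> torch.Tensor:
    """Mean-pool the last hidden layer of a BERT-style encoder."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

mono = sentence_embedding("bert-base-cased", SENTENCE)                # monolingual
multi = sentence_embedding("bert-base-multilingual-cased", SENTENCE)  # massively multilingual
print(mono.shape, multi.shape)
```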
Related papers
- L3Cube-IndicSBERT: A simple approach for learning cross-lingual sentence representations using multilingual BERT [0.7874708385247353]
The multilingual Sentence-BERT (SBERT) models map different languages to common representation space.
We propose a simple yet effective approach to convert vanilla multilingual BERT models into multilingual sentence BERT models using synthetic corpus.
We show that multilingual BERT models are inherent cross-lingual learners and this simple baseline fine-tuning approach yields exceptional cross-lingual properties.
arXiv Detail & Related papers (2023-04-22T15:45:40Z)
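A minimal sketch of this kind of conversion, using the sentence-transformers library rather than the authors' released code; the in-line sentence pairs are toy placeholders for the synthetic parallel corpus:

```python
# Hedged sketch of the general recipe (not the authors' exact setup):
# wrap vanilla multilingual BERT in a Sentence-BERT pooling architecture and
# fine-tune it on paired sentences so that translations map to nearby vectors.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample

word = models.Transformer("bert-base-multilingual-cased", max_seq_length=128)
pool = models.Pooling(word.get_word_embedding_dimension())  # mean pooling
model = SentenceTransformer(modules=[word, pool])

pairs = [  # (sentence, its translation): placeholders for a synthetic corpus
    InputExample(texts=["The weather is nice today.", "Das Wetter ist heute schön."]),
    InputExample(texts=["I like reading books.", "Rad berem knjige."]),
]
loader = DataLoader(pairs, shuffle=True, batch_size=2)
loss = losses.MultipleNegativesRankingLoss(model)  # pulls paired sentences together

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("mbert-sbert-sketch")
```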
- High-resource Language-specific Training for Multilingual Neural Machine Translation [109.31892935605192]
We propose the multilingual translation model with the high-resource language-specific training (HLT-MT) to alleviate the negative interference.
Specifically, we first train the multilingual model only with the high-resource pairs and select the language-specific modules at the top of the decoder.
HLT-MT is further trained on all available corpora to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2022-07-11T14:33:13Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study multilingual language models to assess their capability and adaptability in the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations for code-switched text.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models [96.32118305166412]
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
arXiv Detail & Related papers (2020-12-31T14:11:00Z)
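The effect of vocabulary coverage can be inspected directly by comparing how many subword pieces different tokenizers need for the same sentence; the sketch below is an illustration with assumed model identifiers, not the paper's evaluation code:

```python
# Illustrative check of how well a language is represented in a model's
# vocabulary: compare subword fragmentation of the same text under a
# multilingual and a monolingual tokenizer. Model identifiers are assumptions.
from transformers import AutoTokenizer

text = "Väike poiss jooksis kiiresti üle silla."  # an Estonian example sentence

multi_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
mono_tok = AutoTokenizer.from_pretrained("tartuNLP/EstBERT")  # assumed monolingual checkpoint

for name, tok in [("multilingual", multi_tok), ("monolingual", mono_tok)]:
    pieces = tok.tokenize(text)
    # Fewer pieces per word usually means the language is better represented.
    print(f"{name}: {len(pieces)} subwords -> {pieces}")
```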
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
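One widely used adaptation step in this setting, shown below as a simplified illustration rather than the specific methods proposed in the paper, is to extend the tokenizer vocabulary with target-script subwords and resize the embedding matrix before continued pretraining:

```python
# Simplified illustration of adapting a multilingual model to an unseen
# script (not necessarily the paper's method): add new tokens covering the
# target script, resize the embedding matrix, then continue MLM training
# on target-language text.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Hypothetical list of frequent subwords mined from target-language text.
new_tokens = ["ኢትዮጵያ", "ባህል", "ትምህርት"]
added = tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))  # new rows are randomly initialised
print(f"Added {added} tokens; embedding matrix now has {len(tokenizer)} rows.")
```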
- Towards Fully Bilingual Deep Language Modeling [1.3455090151301572]
We consider whether it is possible to pre-train a bilingual model for two remotely related languages without compromising performance on either language.
We create a Finnish-English bilingual BERT model and evaluate its performance on datasets used to evaluate the corresponding monolingual models.
Our bilingual model performs on par with Google's original English BERT on GLUE and nearly matches the performance of monolingual Finnish BERT on a range of Finnish NLP tasks.
arXiv Detail & Related papers (2020-10-22T12:22:50Z)
- Evaluating Multilingual BERT for Estonian [0.8057006406834467]
We evaluate four multilingual models -- multilingual BERT, multilingual distilled BERT, XLM and XLM-RoBERTa -- on several NLP tasks.
Our results show that multilingual BERT models can generalise well on different Estonian NLP tasks.
arXiv Detail & Related papers (2020-10-01T14:48:31Z)
- Multilingual Translation with Extensible Multilingual Pretraining and Finetuning [77.33262578776291]
Previous work has demonstrated that machine translation systems can be created by finetuning on bitext.
We show that multilingual translation models can be created through multilingual finetuning.
We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance.
arXiv Detail & Related papers (2020-08-02T05:36:55Z)
- Mono vs Multilingual Transformer-based Models: a Comparison across Several Language Tasks [1.2691047660244335]
BERT (Bidirectional Encoder Representations from Transformers) and ALBERT (A Lite BERT) are methods for pre-training language models.
We make our trained BERT and ALBERT models for Portuguese available.
arXiv Detail & Related papers (2020-07-19T19:13:20Z)
- FinEst BERT and CroSloEngual BERT: less is more in multilingual models [0.0]
We train two trilingual BERT-like models, one for Finnish, Estonian, and English, the other for Croatian, Slovenian, and English.
We evaluate their performance on several downstream tasks (NER, POS tagging, and dependency parsing), using multilingual BERT and XLM-R as baselines.
The newly created FinEst BERT and CroSloEngual BERT improve the results on all tasks in most monolingual and cross-lingual situations.
arXiv Detail & Related papers (2020-06-14T12:54:01Z)
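Because the trilingual models are BERT-compatible, they can serve as drop-in encoders for such tasks; the sketch below assumes the released checkpoints are available on the Hugging Face hub under the identifier shown:

```python
# Minimal sketch of using a trilingual BERT as a drop-in encoder for token
# classification (e.g. NER). The hub identifier is an assumption, and the
# classification head here is untrained.
from transformers import AutoModelForTokenClassification, AutoTokenizer

MODEL_ID = "EMBEDDIA/crosloengual-bert"  # assumed hub identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(MODEL_ID, num_labels=9)

inputs = tokenizer("Janez Novak živi v Ljubljani.", return_tensors="pt")
logits = model(**inputs).logits  # (1, seq_len, num_labels)
print(logits.shape)
```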
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improve the cross-lingual ability of multilingual BERT.
arXiv Detail & Related papers (2020-04-20T11:13:16Z)