Evaluating Multilingual BERT for Estonian
- URL: http://arxiv.org/abs/2010.00454v2
- Date: Fri, 8 Jan 2021 10:00:42 GMT
- Title: Evaluating Multilingual BERT for Estonian
- Authors: Claudia Kittask, Kirill Milintsevich, Kairit Sirts
- Abstract summary: We evaluate four multilingual models -- multilingual BERT, multilingual distilled BERT, XLM and XLM-RoBERTa -- on several NLP tasks.
Our results show that multilingual BERT models can generalise well on different Estonian NLP tasks.
- Score: 0.8057006406834467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, large pre-trained language models, such as BERT, have reached
state-of-the-art performance in many natural language processing tasks, but for
many languages, including Estonian, BERT models are not yet available. However,
there exist several multilingual BERT models that can handle multiple languages
simultaneously and that have also been trained on Estonian data. In this paper,
we evaluate four multilingual models -- multilingual BERT, multilingual
distilled BERT, XLM and XLM-RoBERTa -- on several NLP tasks including POS and
morphological tagging, NER and text classification. Our aim is to establish a
comparison between these multilingual BERT models and the existing baseline
neural models for these tasks. Our results show that multilingual BERT models
can generalise well on different Estonian NLP tasks, outperforming all baseline
models for POS and morphological tagging and text classification, and reaching
a level comparable with the best baseline for NER, with XLM-RoBERTa achieving
the highest results among the multilingual models.
Related papers
- On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based Multilingual Model [49.81429697921861]
We study the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models.
We show that prompt tuning is more effective in enhancing the performance of low-resource languages than fine-tuning.
arXiv Detail & Related papers (2023-11-14T00:43:33Z)
- Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages [0.0]
We train a trilingual LitLat BERT-like model for Lithuanian, Latvian, and English, and a monolingual Est-RoBERTa model for Estonian.
We evaluate their performance on four downstream tasks: named entity recognition, dependency parsing, part-of-speech tagging, and word analogy.
arXiv Detail & Related papers (2021-12-20T14:26:40Z)
- Evaluation of contextual embeddings on less-resourced languages [4.417922173735813]
This paper presents the first multilingual empirical comparison of two ELMo and several monolingual and multilingual BERT models using 14 tasks in nine languages.
In monolingual settings, monolingual BERT models generally dominate, with a few exceptions such as the dependency parsing task.
In cross-lingual settings, BERT models trained on only a few languages mostly do best, closely followed by massively multilingual BERT models.
arXiv Detail & Related papers (2021-07-22T12:32:27Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- EstBERT: A Pretrained Language-Specific BERT for Estonian [0.3674863913115431]
This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian.
Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines.
We show that the models based on EstBERT outperform multilingual BERT models on five tasks out of six.
arXiv Detail & Related papers (2020-11-09T21:33:53Z)
- Towards Fully Bilingual Deep Language Modeling [1.3455090151301572]
We consider whether it is possible to pre-train a bilingual model for two remotely related languages without compromising performance in either language.
We create a Finnish-English bilingual BERT model and evaluate its performance on datasets used to evaluate the corresponding monolingual models.
Our bilingual model performs on par with Google's original English BERT on GLUE and nearly matches the performance of monolingual Finnish BERT on a range of Finnish NLP tasks.
arXiv Detail & Related papers (2020-10-22T12:22:50Z)
- Multilingual Translation with Extensible Multilingual Pretraining and Finetuning [77.33262578776291]
Previous work has demonstrated that machine translation systems can be created by finetuning on bitext.
We show that multilingual translation models can be created through multilingual finetuning.
We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance.
arXiv Detail & Related papers (2020-08-02T05:36:55Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
- ParsBERT: Transformer-based Model for Persian Language Understanding [0.7646713951724012]
This paper proposes a monolingual BERT for the Persian language (ParsBERT).
It shows state-of-the-art performance compared to other architectures and multilingual models.
ParsBERT obtains higher scores on all datasets, including existing ones as well as composed ones.
arXiv Detail & Related papers (2020-05-26T05:05:32Z)
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improve the cross-lingual ability of multilingual BERT.
arXiv Detail & Related papers (2020-04-20T11:13:16Z)