Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained
Language Models
- URL: http://arxiv.org/abs/2102.00894v1
- Date: Mon, 1 Feb 2021 15:07:06 GMT
- Title: Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained
Language Models
- Authors: Nora Kassner, Philipp Dufter, Hinrich Schütze
- Abstract summary: Masked sentences such as "Paris is the capital of [MASK]" are used as probes.
We translate the established benchmarks TREx and GoogleRE into 53 languages.
We find that using mBERT as a knowledge base yields varying performance across languages.
- Score: 6.166295570030645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, it has been found that monolingual English language models can be
used as knowledge bases. Instead of structured knowledge base queries, masked
sentences such as "Paris is the capital of [MASK]" are used as probes. We
translate the established benchmarks TREx and GoogleRE into 53 languages.
Working with mBERT, we investigate three questions. (i) Can mBERT be used as a
multilingual knowledge base? Most prior work only considers English. Extending
research to multiple languages is important for diversity and accessibility.
(ii) Is mBERT's performance as knowledge base language-independent or does it
vary from language to language? (iii) A multilingual model is trained on more
text, e.g., mBERT is trained on 104 Wikipedias. Can mBERT leverage this for
better performance? We find that using mBERT as a knowledge base yields varying
performance across languages and pooling predictions across languages improves
performance. Conversely, mBERT exhibits a language bias; e.g., when queried in
Italian, it tends to predict Italy as the country of origin.
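The cloze-style probing setup can be reproduced with off-the-shelf tools. The following is a minimal sketch, not the authors' released code: it queries mBERT (bert-base-multilingual-cased) through the Hugging Face transformers fill-mask pipeline and naively pools candidate probabilities across two languages by averaging. The Italian translation, the candidate set, and the averaging scheme are illustrative assumptions; the actual mLAMA benchmark additionally handles multi-token answers and typed candidate sets.

```python
# Minimal sketch of mask-based knowledge probing with mBERT (illustrative only).
from collections import defaultdict
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# The same TREx-style query in two languages. The Italian wording is an
# illustrative translation, not taken from the released mLAMA data.
prompts = {
    "en": "Paris is the capital of [MASK].",
    "it": "Parigi è la capitale della [MASK].",
}

# Candidate answers; they should be single tokens in mBERT's vocabulary,
# otherwise the pipeline falls back to their first sub-token.
candidates = ["France", "Italy", "Francia", "Italia"]

# Score each candidate per language, then pool by averaging probabilities.
pooled = defaultdict(float)
for lang, prompt in prompts.items():
    results = fill(prompt, targets=candidates)
    print(lang, [(r["token_str"], round(r["score"], 4)) for r in results])
    for r in results:
        pooled[r["token_str"]] += r["score"] / len(prompts)

print("pooled:", sorted(pooled.items(), key=lambda kv: -kv[1]))
```

Comparing the per-language rankings with the pooled one loosely mirrors the paper's two observations: predictions shift with the query language (e.g., the Italian prompt favoring Italian-related answers), while pooling across languages tends to stabilize them.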
Related papers
- Multilingual Knowledge Editing with Language-Agnostic Factual Neurons [98.73585104789217]
We investigate how large language models (LLMs) represent multilingual factual knowledge.
We find that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons.
Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge.
arXiv Detail & Related papers (2024-06-24T08:06:56Z)
- MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models [65.10456412127405]
MLaKE is a benchmark for evaluating the adaptability of knowledge editing methods across five languages.
MLaKE aggregates fact chains from Wikipedia across languages and generates questions in both free-form and multiple-choice.
We evaluate the multilingual knowledge editing generalization capabilities of existing methods on MLaKE.
arXiv Detail & Related papers (2024-04-07T15:23:28Z)
- Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ [16.637598165238934]
Large language models (LLMs) need to serve everyone, including a global majority of non-English speakers.
Recent research shows that, despite limits in their intended use, people prompt LLMs in many different languages.
We introduce MultiQ, a new silver standard benchmark for basic open-ended question answering with 27.4k test questions.
arXiv Detail & Related papers (2024-03-06T16:01:44Z)
- Factual Consistency of Multilingual Pretrained Language Models [0.0]
We investigate whether multilingual language models are more consistent than their monolingual counterparts.
We find that mBERT is as inconsistent as English BERT in English paraphrases.
Both mBERT and XLM-R exhibit a high degree of inconsistency in English and even more so for all the other 45 languages.
arXiv Detail & Related papers (2022-03-22T09:15:53Z)
- It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT [54.84185432755821]
Multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.
We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning.
arXiv Detail & Related papers (2020-10-16T09:49:32Z)
- X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models [103.75890012041366]
Language models (LMs) have proven surprisingly successful at capturing factual knowledge.
However, studies on LMs' factual representation ability have almost invariably been performed on English.
We create a benchmark of cloze-style probes for 23 typologically diverse languages.
arXiv Detail & Related papers (2020-10-13T05:29:56Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
- Are All Languages Created Equal in Multilingual BERT? [22.954688396858085]
Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks.
We explore how mBERT performs on a much wider set of languages, focusing on the quality of representation for low-resource languages.
arXiv Detail & Related papers (2020-05-18T21:15:39Z)
- Extending Multilingual BERT to Low-Resource Languages [71.0976635999159]
Multilingual BERT (M-BERT) has been a huge success in both supervised and zero-shot cross-lingual transfer learning.
We propose a simple but effective approach to extend M-BERT so that it can benefit any new language.
arXiv Detail & Related papers (2020-04-28T16:36:41Z)