Common Sense Beyond English: Evaluating and Improving Multilingual
Language Models for Commonsense Reasoning
- URL: http://arxiv.org/abs/2106.06937v1
- Date: Sun, 13 Jun 2021 07:14:03 GMT
- Title: Common Sense Beyond English: Evaluating and Improving Multilingual
Language Models for Commonsense Reasoning
- Authors: Bill Yuchen Lin, Seyeon Lee, Xiaoyang Qiao, Xiang Ren
- Abstract summary: We aim to evaluate and improve popular multilingual language models (ML-LMs) to help advance commonsense reasoning beyond English.
We collect the Mickey Corpus, consisting of 561k sentences in 11 different languages, which can be used for analyzing and improving ML-LMs.
We propose Mickey Probe, a language-agnostic probing task for fairly evaluating the common sense of popular ML-LMs across different languages.
- Score: 33.34063636400519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Commonsense reasoning research has so far been limited to English. We aim to
evaluate and improve popular multilingual language models (ML-LMs) to help
advance commonsense reasoning (CSR) beyond English. We collect the Mickey
Corpus, consisting of 561k sentences in 11 different languages, which can be
used for analyzing and improving ML-LMs. We propose Mickey Probe, a
language-agnostic probing task for fairly evaluating the common sense of
popular ML-LMs across different languages. In addition, we also create two new
datasets, X-CSQA and X-CODAH, by translating their English versions to 15 other
languages, so that we can evaluate popular ML-LMs for cross-lingual commonsense
reasoning. To improve the performance beyond English, we propose a simple yet
effective method -- multilingual contrastive pre-training (MCP). It
significantly enhances sentence representations, yielding a large performance
gain on both benchmarks.
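To make the probing setup concrete: Mickey Probe asks an ML-LM to pick the most plausible sentence out of a set of same-language candidates. One common way to score such candidates, sketched below assuming XLM-R and a masked-LM pseudo-log-likelihood recipe (not necessarily the paper's exact implementation), is to mask one token at a time and sum the log-probabilities:

```python
# Sketch: rank candidate sentences by masked-LM pseudo-log-likelihood,
# a Mickey-Probe-style evaluation. Model choice and scoring recipe are
# assumptions, not the paper's exact setup.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of token log-probs, masking one token at a time."""
    ids = tok(sentence, return_tensors="pt").input_ids[0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip <s> and </s>
        masked = ids.clone()
        masked[i] = tok.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# One truthful sentence plus a distractor, both in the same language.
candidates = [
    "Der Hund bellt den Briefträger an.",
    "Der Hund bellt den Mond in der Suppe an.",
]
print(max(candidates, key=pseudo_log_likelihood))
```

Because this scoring needs no labels or translation, the same loop runs unchanged in every language, which is what makes the probe language-agnostic.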
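MCP can likewise be pictured as a contrastive step over candidate sentences: the truthful sentence should outscore its corrupted variants. The sketch below assumes XLM-R with a mean-pooled linear scoring head and a cross-entropy loss; these details are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch: an MCP-style contrastive training step. Encoder, pooling, scoring
# head, and loss are assumptions standing in for the paper's exact objective.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)

def mcp_style_loss(candidates, true_idx):
    """Cross-entropy over per-sentence scores: the truthful sentence
    should outscore its corrupted variants."""
    batch = tok(candidates, return_tensors="pt", padding=True)
    # Mean pooling over all positions (pad tokens included, for brevity).
    hidden = encoder(**batch).last_hidden_state.mean(dim=1)
    scores = score_head(hidden).squeeze(-1)  # one scalar per sentence
    return F.cross_entropy(scores.unsqueeze(0), torch.tensor([true_idx]))

loss = mcp_style_loss(["Birds can fly.", "Birds can cook."], true_idx=0)
loss.backward()
```

Fine-tuning with an objective of this shape separates sentence representations by plausibility, which is the intuition behind the gains reported on both benchmarks.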
Related papers
- The Roles of English in Evaluating Multilingual Language Models [6.396057276543912]
We argue that these roles have different goals: task performance versus language understanding.
We recommend moving away from this imprecise method and instead focusing on furthering language understanding.
arXiv Detail & Related papers (2024-12-11T14:02:55Z)
- Improving Bilingual Capabilities of Language Models to Support Diverse Linguistic Practices in Education [3.799331337558008]
Large language models (LLMs) offer promise in generating educational content, providing instructor feedback, and reducing teacher workload on assessments.
We study the effectiveness of multilingual large language models (MLLMs) across monolingual (English-only, Spanish-only) and bilingual (Spanglish) student writing.
arXiv Detail & Related papers (2024-11-06T23:16:25Z)
- Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models [52.00446751692225]
We present a novel and simple yet effective method called Dictionary Insertion Prompting (DIP).
Given a non-English prompt, DIP looks up each word in a bilingual dictionary and inserts its English counterpart into the prompt.
This enables better translation into English and better English reasoning steps, which leads to markedly better results.
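A minimal sketch of DIP-style augmentation (the toy dictionary, the German example, and the annotation format are illustrative assumptions, not the paper's resources):

```python
# Sketch: DIP-style prompt augmentation. Dictionary and format are toy
# assumptions; the paper's dictionaries and insertion scheme may differ.
toy_dict = {"Katze": "cat", "Hund": "dog", "jagt": "chases"}

def dip_augment(prompt: str) -> str:
    """Insert the English counterpart after each word found in the dictionary."""
    out = []
    for word in prompt.split():
        key = word.strip(".,!?")
        out.append(f"{word} ({toy_dict[key]})" if key in toy_dict else word)
    return " ".join(out)

print(dip_augment("Die Katze jagt den Hund."))
# -> Die Katze (cat) jagt (chases) den Hund. (dog)
```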
arXiv Detail & Related papers (2024-11-02T05:10:50Z)
- Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Sense [30.62699081329474]
We introduce a novel benchmark for cross-lingual sense disambiguation, StingrayBench.
We collect false friends in four language pairs, namely Indonesian-Malay, Indonesian-Tagalog, Chinese-Japanese, and English-German.
In our analysis of various models, we observe they tend to be biased toward higher-resource languages.
arXiv Detail & Related papers (2024-10-28T22:09:43Z)
- Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs [13.558778781305998]
Large Language Models (LLMs) are predominantly designed with English as the primary language.
Even the few that are multilingual tend to exhibit strong English-centric biases.
This paper introduces novel automatic corpus-level metrics to assess the lexical and syntactic naturalness of multilingual outputs.
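The summary does not spell out the metrics; as one hedged illustration, a corpus-level lexical-naturalness signal could compare a model's unigram distribution against a native-speaker corpus:

```python
# Sketch: one plausible corpus-level lexical-naturalness signal (KL divergence
# between unigram distributions). An illustrative stand-in, not the paper's
# actual metric.
from collections import Counter
import math

def unigram_dist(corpus):
    counts = Counter(tok for sent in corpus for tok in sent.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def lexical_divergence(model_corpus, native_corpus, eps=1e-9):
    """KL(model || native); higher means less native-like word choice."""
    p, q = unigram_dist(model_corpus), unigram_dist(native_corpus)
    return sum(pw * math.log(pw / q.get(w, eps)) for w, pw in p.items())
```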
arXiv Detail & Related papers (2024-10-21T12:34:17Z)
- Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
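As a rough illustration of the few-shot mitigation, demonstrations written entirely in the target language can pin the reply language (the template below is an assumption, not the paper's setup):

```python
# Sketch: few-shot demonstrations that pin the response language, one way to
# reduce language confusion. Template and examples are illustrative.
few_shot = [
    ("Wie heißt die Hauptstadt von Frankreich?",
     "Die Hauptstadt von Frankreich ist Paris."),
    ("Was ist die größte Stadt Deutschlands?",
     "Die größte Stadt Deutschlands ist Berlin."),
]

def build_prompt(question: str) -> str:
    demos = "\n\n".join(f"Frage: {q}\nAntwort: {a}" for q, a in few_shot)
    return f"{demos}\n\nFrage: {question}\nAntwort:"

print(build_prompt("Wer schrieb 'Faust'?"))
```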
arXiv Detail & Related papers (2024-06-28T17:03:51Z)
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models [79.46179534911019]
Large language models (LLMs) have demonstrated multilingual capabilities, yet they are mostly English-centric due to imbalanced training corpora.
This work extends the evaluation from NLP tasks to real user queries.
For culture-related tasks that need deep language understanding, prompting in the native language tends to be more promising.
arXiv Detail & Related papers (2024-03-15T12:47:39Z)
- Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages.
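A zero-shot listwise reranker of this kind boils down to a prompt that presents the English query alongside the candidate passages and asks the LLM for an ordering. The template below is an assumption in the style of listwise LLM rerankers, not the paper's exact prompt:

```python
# Sketch: a listwise cross-lingual reranking prompt. Template wording and the
# choice of Swahili (one of the covered languages) are illustrative.
def build_rerank_prompt(query: str, passages: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        f"Query (English): {query}\n"
        f"Passages (Swahili):\n{numbered}\n"
        "Rank the passages from most to least relevant to the query. "
        "Answer only with their numbers, e.g. [2] > [1] > [3]."
    )
```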
arXiv Detail & Related papers (2023-12-26T18:38:54Z)
- Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT).
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
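C-WLT-style probing can be approximated with a fill-mask prompt that asks a PLM for a target-language translation of a word in context. The template below is an illustrative assumption, not the paper's exact wording:

```python
# Sketch: a C-WLT-style fill-mask probe for in-context word translation.
# The prompt template is an assumption; the paper's wording may differ.
from transformers import pipeline

fill = pipeline("fill-mask", model="xlm-roberta-base")
context = "She sat on the bank of the river."
prompt = f'{context} In French, "bank" means <mask>.'
for pred in fill(prompt, top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```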
arXiv Detail & Related papers (2023-04-26T19:55:52Z)
- A Primer on Pretrained Multilingual Language Models [18.943173499882885]
Multilingual Language Models (MLLMs) have emerged as a viable option for bringing the power of pretraining to a large number of languages.
We review the existing literature covering the above broad areas of research pertaining to MLLMs.
arXiv Detail & Related papers (2021-07-01T18:01:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.