Language Models' Factuality Depends on the Language of Inquiry
- URL: http://arxiv.org/abs/2502.17955v1
- Date: Tue, 25 Feb 2025 08:27:18 GMT
- Title: Language Models' Factuality Depends on the Language of Inquiry
- Authors: Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang
- Abstract summary: We introduce a benchmark of 10,000 country-related facts across 13 languages. We propose three novel metrics: Factual Recall Score, Knowledge Transferability Score, and Cross-Lingual Factual Knowledge Transferability Score. Our results reveal fundamental weaknesses in today's state-of-the-art LMs.
- Score: 36.466186024957075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual language models (LMs) are expected to recall factual knowledge consistently across languages, yet they often fail to transfer knowledge between languages even when they possess the correct information in one of the languages. For example, we find that an LM may correctly identify Rashed Al Shashai as being from Saudi Arabia when asked in Arabic, but consistently fails to do so when asked in English or Swahili. To systematically investigate this limitation, we introduce a benchmark of 10,000 country-related facts across 13 languages and propose three novel metrics (Factual Recall Score, Knowledge Transferability Score, and Cross-Lingual Factual Knowledge Transferability Score) to quantify factual recall and knowledge transferability in LMs across different languages. Our results reveal fundamental weaknesses in today's state-of-the-art LMs, particularly in cross-lingual generalization, where models fail to transfer knowledge effectively across different languages, leading to inconsistent performance that is sensitive to the language used. Our findings emphasize the need for LMs to recognize language-specific factual reliability and leverage the most trustworthy information across languages. We release our benchmark and evaluation framework to drive future research in multilingual knowledge transfer.
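The abstract does not give the formulas for the three proposed metrics, but their intent can be illustrated with a minimal sketch. The code below assumes (hypothetically) that Factual Recall Score is per-language accuracy over the benchmark facts, and that a transferability score between two languages is the conditional accuracy in the target language restricted to facts the model already recalls in the source language; the actual paper definitions may differ.

```python
# Illustrative sketch only: the metric definitions below are assumptions,
# not the paper's exact formulations.

def factual_recall_score(correct: dict[str, list[bool]], lang: str) -> float:
    """Fraction of benchmark facts recalled correctly when queried in `lang`."""
    results = correct[lang]
    return sum(results) / len(results)

def transferability_score(correct: dict[str, list[bool]],
                          src: str, tgt: str) -> float:
    """Of the facts recalled correctly in `src`, the fraction also
    recalled in `tgt` (assumed conditional-accuracy formulation)."""
    known_in_src = [i for i, ok in enumerate(correct[src]) if ok]
    if not known_in_src:
        return 0.0
    return sum(correct[tgt][i] for i in known_in_src) / len(known_in_src)

# Toy data: per-language correctness on the same five facts
# (mirrors the abstract's Arabic-vs-English asymmetry).
correct = {
    "ar": [True, True, True, False, True],
    "en": [True, False, True, False, False],
}
print(factual_recall_score(correct, "ar"))          # 0.8
print(transferability_score(correct, "ar", "en"))   # 0.5
```

A low transferability score with a high source-language recall score is exactly the failure mode the paper describes: the fact is present in the model, but it does not surface when the query language changes.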
Related papers
- ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer [42.44703812325259]
We present ECLeKTic, a multilingual closed-book QA (CBQA) dataset that Evaluates Cross-Lingual Knowledge Transfer.
We detected information with uneven coverage across languages by controlling for presence and absence of Wikipedia articles in 12 languages.
We show that SOTA models struggle to effectively share knowledge across languages, even if they can predict the answer well for queries in the same language the knowledge was acquired in.
arXiv Detail & Related papers (2025-02-28T16:59:30Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, i.e., be crosslingual?
This study evaluates state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge [16.923674220979]
This study investigates how multilingual language models (ML-LMs) acquire and represent factual knowledge.
We identify three patterns of acquiring and representing facts in ML-LMs: language-independent, cross-lingual shared and transferred.
Our findings highlight the challenge of maintaining consistent factual knowledge across languages.
arXiv Detail & Related papers (2024-03-08T10:09:57Z) - Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models? [48.88328580373103]
We propose two parameter-free Language Representation Projection modules (LRP2).
The first module converts non-English representations into English-like equivalents, while the second module reverts English-like representations back into representations of the corresponding non-English language.
Experimental results on the mLAMA dataset demonstrate that LRP2 significantly improves factual knowledge retrieval accuracy and facilitates knowledge transferability across diverse non-English languages.
arXiv Detail & Related papers (2023-11-07T08:16:16Z) - Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models [2.6626950367610402]
We study the cross-lingual consistency (CLC) of factual knowledge in various multilingual PLMs.
We propose a Ranking-based Consistency (RankC) metric to evaluate knowledge consistency across languages independently from accuracy.
arXiv Detail & Related papers (2023-10-16T13:19:17Z) - Cross-Lingual Knowledge Editing in Large Language Models [73.12622532088564]
Knowledge editing has been shown to adapt large language models to new knowledge without retraining from scratch.
The effect of editing in a source language on a different target language remains unknown.
We first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese.
arXiv Detail & Related papers (2023-09-16T11:07:52Z) - Adapters for Enhanced Modeling of Multilingual Knowledge and Text [54.02078328453149]
Language models have been extended to multilingual language models (MLLMs).
Knowledge graphs contain facts in an explicit triple format, which require careful curation and are only available in a few high-resource languages.
We propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages.
arXiv Detail & Related papers (2022-10-24T21:33:42Z) - X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models [103.75890012041366]
Language models (LMs) have proven surprisingly successful at capturing factual knowledge.
However, studies on LMs' factual representation ability have almost invariably been performed on English.
We create a benchmark of cloze-style probes for 23 typologically diverse languages.
arXiv Detail & Related papers (2020-10-13T05:29:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.