CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering
- URL: http://arxiv.org/abs/2501.18457v2
- Date: Mon, 10 Feb 2025 07:46:06 GMT
- Title: CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering
- Authors: Yumeng Wang, Zhiyuan Fan, Qingyun Wang, May Fung, Heng Ji,
- Abstract summary: Large Language Models (LLMs) are pretrained on extensive multilingual corpora to acquire both language-specific cultural knowledge and general knowledge.
We explore the Cross-Lingual Self-Aligning ability of Language Models (CALM) to align knowledge across languages.
We employ direct preference optimization (DPO) to align the model's knowledge across different languages.
- Score: 42.92810049636768
- License:
- Abstract: Large Language Models (LLMs) are pretrained on extensive multilingual corpora to acquire both language-specific cultural knowledge and general knowledge. Ideally, while LLMs should provide consistent responses to culture-independent questions across languages, we observe significant performance disparities. To address this, we explore the Cross-Lingual Self-Aligning ability of Language Models (CALM) to align knowledge across languages. Specifically, for a given question, we sample multiple responses across different languages and select the most self-consistent response as the target, leaving the remaining responses as negative examples. We then employ direct preference optimization (DPO) to align the model's knowledge across different languages. Evaluations on the MEDQA and X-CSQA datasets demonstrate CALM's effectiveness in enhancing cross-lingual knowledge question answering, both in zero-shot and retrieval-augmented settings. We also found that increasing the number of languages involved in CALM training leads to higher accuracy and consistency. We offer a qualitative analysis of how cross-lingual consistency can enhance knowledge alignment and explore the method's generalizability.
Related papers
- How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts.
We localize the role of language during recall, finding that subject enrichment is language-independent.
In decoder-only LLMs, FVs compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z) - Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models [22.859955360764275]
We introduce the MultiLingual Needle-in-a-Haystack (MLNeedle) test to assess a model's ability to retrieve relevant information.
We evaluate four state-of-the-art large language models on MLNeedle.
arXiv Detail & Related papers (2024-08-19T17:02:06Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - 1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators? [46.43162333819418]
Large Language Models (LLMs) have garnered significant attention due to their remarkable ability to process information across various languages.
Despite their capabilities, they exhibit inconsistencies in handling identical queries in different languages, presenting challenges for further advancement.
This paper introduces a method to enhance the multilingual performance of LLMs by aggregating knowledge from diverse languages.
arXiv Detail & Related papers (2024-06-20T20:32:53Z) - MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models [65.10456412127405]
MLaKE is a benchmark for the adaptability of knowledge editing methods across five languages.
MLaKE aggregates fact chains from Wikipedia across languages and generates questions in both free-form and multiple-choice.
We evaluate the multilingual knowledge editing generalization capabilities of existing methods on MLaKE.
arXiv Detail & Related papers (2024-04-07T15:23:28Z) - Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly [53.04368883943773]
Two approaches are proposed to address this, i.e., multilingual pretraining and multilingual instruction tuning.
We propose CLiKA to assess the cross-lingual knowledge alignment of LLMs in the Performance, Consistency and Conductivity levels.
Results show that while both multilingual pretraining and instruction tuning are beneficial for cross-lingual knowledge alignment, the training strategy needs to be carefully designed.
arXiv Detail & Related papers (2024-04-06T15:25:06Z) - Cross-Lingual Consistency of Factual Knowledge in Multilingual Language
Models [2.6626950367610402]
We study the cross-lingual consistency (CLC) of factual knowledge in various multilingual PLMs.
We propose a Ranking-based Consistency (RankC) metric to evaluate knowledge consistency across languages independently from accuracy.
arXiv Detail & Related papers (2023-10-16T13:19:17Z) - Cross-lingual QA: A Key to Unlocking In-context Cross-lingual Performance [2.371686365695081]
Cross-lingual QA is a cross-lingual prompting method that translates only the question and answer parts, thus reducing translation costs.
Experiments on four typologically diverse multilingual benchmarks show that Cross-lingual QA effectively stimulates models to elicit their cross-lingual knowledge.
We show that prompting open-source MLLMs with cross-lingual in-context examples enhances performance as the model scale increases.
arXiv Detail & Related papers (2023-05-24T15:14:49Z) - X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural
Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU)
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.