NatLan: Native Language Prompting Facilitates Knowledge Elicitation Through Language Trigger Provision and Domain Trigger Retention
- URL: http://arxiv.org/abs/2408.03544v3
- Date: Tue, 15 Oct 2024 08:24:42 GMT
- Title: NatLan: Native Language Prompting Facilitates Knowledge Elicitation Through Language Trigger Provision and Domain Trigger Retention
- Authors: Baixuan Li, Yunlong Fan, Tianyi Ma, Zhiqiang Gao,
- Abstract summary: We analogize the dominant language of MLLMs to the native language of humans and use two human cognitive features to interpret translate-then-answer methods.
We propose Native Language Prompting (NatLan), employing a Multi-MLLM collaboration strategy and introducing an additional role-enhanced domain-specific MLLM.
- Score: 4.051397588437236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual large language models (MLLMs) do not perform as well when answering questions in non-dominant languages as they do in their dominant languages. Although existing translate-then-answer methods alleviate this issue, the mechanisms behind their effectiveness remain unclear. In this study, we analogize the dominant language of MLLMs to the native language of humans and use two human cognitive features: the Language Trigger (LT) and the Domain Trigger (DT), to interpret the mechanisms behind translate-then-answer methods. This reveals that while sufficient LTs are provided by these methods, there remains a deficiency in DT retention. To mitigate this issue, we propose Native Language Prompting (NatLan), employing a Multi-MLLM collaboration strategy and introducing an additional role-enhanced domain-specific MLLM with stronger multilingual understanding capabilities as the translator. Across five language QA benchmarks, NatLan achieves up to a 31.28% improvement in accuracy and, compared to existing state-of-the-art methods, provides comparable or greater retention of DTs in up to 87% of cases. Our code is available at https://github.com/AnonyNLP/NatLan.
Related papers
- Language Imbalance Driven Rewarding for Multilingual Self-improving [35.1576728251478]
Large Language Models (LLMs) have achieved state-of-the-art performance across numerous tasks.
This imbalance, while limiting broader applications, generates a natural preference ranking between languages.
We propose $textitLanguage Imbalance Driven Rewarding$, where the inherent imbalance between dominant and non-dominant languages is leveraged as a reward signal.
arXiv Detail & Related papers (2024-10-11T16:32:05Z) - Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance multilingual capabilities of large language models (LLMs)
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Cross-Lingual Transfer Robustness to Lower-Resource Languages on Adversarial Datasets [4.653113033432781]
Cross-lingual transfer capabilities of Multilingual Language Models (MLLMs) are investigated.
Our research provides valuable insights into cross-lingual transfer and its implications for NLP applications.
arXiv Detail & Related papers (2024-03-29T08:47:15Z) - How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism.
LLMs initially understand the query, converting multilingual inputs into English for task-solving.
In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z) - How Vocabulary Sharing Facilitates Multilingualism in LLaMA? [19.136382859468693]
Large Language Models (LLMs) often show strong performance on English tasks, while exhibiting limitations on other languages.
This study endeavors to examine the multilingual capability of LLMs from the vocabulary sharing perspective.
arXiv Detail & Related papers (2023-11-15T16:13:14Z) - Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z) - Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual
Transfer for Open-domain Dialogue Generation [14.68491971816154]
We investigate few-shot cross-lingual transfer learning (FS-XLT) and multitask learning (MTL) in the context of open-domain dialogue generation for non-English languages with limited data.
We observed catastrophic forgetting in both FS-XLT and MTL for all 6 languages in our preliminary experiments.
We propose a simple yet effective prompt learning approach that can preserve the multilinguality of multilingual pre-trained language model.
arXiv Detail & Related papers (2023-05-12T11:41:16Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.