Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis
- URL: http://arxiv.org/abs/2409.14459v1
- Date: Sun, 22 Sep 2024 14:14:05 GMT
- Title: Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis
- Authors: Daoyang Li, Mingyu Jin, Qingcheng Zeng, Haiyan Zhao, Mengnan Du,
- Abstract summary: Probing techniques for large language models (LLMs) have primarily focused on English, overlooking the vast majority of the world's languages.
We conduct experiments on several open-source LLM models, analyzing probing accuracy, trends across layers, and similarities between probing vectors for multiple languages.
- Score: 19.37853222555255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probing techniques for large language models (LLMs) have primarily focused on English, overlooking the vast majority of the world's languages. In this paper, we extend these probing methods to a multilingual context, investigating the behaviors of LLMs across diverse languages. We conduct experiments on several open-source LLM models, analyzing probing accuracy, trends across layers, and similarities between probing vectors for multiple languages. Our key findings reveal: (1) a consistent performance gap between high-resource and low-resource languages, with high-resource languages achieving significantly higher probing accuracy; (2) divergent layer-wise accuracy trends, where high-resource languages show substantial improvement in deeper layers similar to English; and (3) higher representational similarities among high-resource languages, with low-resource languages demonstrating lower similarities both among themselves and with high-resource languages. These results highlight significant disparities in LLMs' multilingual capabilities and emphasize the need for improved modeling of low-resource languages.
Related papers
- Comparative Study of Multilingual Idioms and Similes in Large Language Models [4.581124233698535]
This study explores the effectiveness of prompt engineering strategies, including chain-of-thought, few-shot, and English translation prompts.
We extend the language of these datasets to Persian as well by building two new evaluation sets.
Our findings reveal that while prompt engineering methods are generally effective, their success varies by figurative type, language, and model.
arXiv Detail & Related papers (2024-10-21T19:40:05Z) - Targeted Multilingual Adaptation for Low-resource Language Families [17.212424929235624]
We study best practices for adapting a pre-trained model to a language family.
Our adapted models significantly outperform mono- and multilingual baselines.
Low-resource languages can be aggressively up-sampled during training at little detriment to performance in high-resource languages.
arXiv Detail & Related papers (2024-05-20T23:38:06Z) - Quantifying Multilingual Performance of Large Language Models Across Languages [48.40607157158246]
Large Language Models (LLMs) perform better on high-resource languages like English, German, and French, while their capabilities in low-resource languages remain inadequate.
We propose the Language Ranker, an intrinsic metric designed to benchmark and rank languages based on LLM performance using internal representations.
Our analysis reveals that high-resource languages exhibit higher similarity scores with English, demonstrating superior performance, while low-resource languages show lower similarity scores.
arXiv Detail & Related papers (2024-04-17T16:53:16Z) - Enhancing Multilingual Capabilities of Large Language Models through
Self-Distillation from Resource-Rich Languages [60.162717568496355]
Large language models (LLMs) have been pre-trained on multilingual corpora.
Their performance still lags behind in most languages compared to a few resource-rich languages.
arXiv Detail & Related papers (2024-02-19T15:07:32Z) - Zero-shot Sentiment Analysis in Low-Resource Languages Using a
Multilingual Sentiment Lexicon [78.12363425794214]
We focus on zero-shot sentiment analysis tasks across 34 languages, including 6 high/medium-resource languages, 25 low-resource languages, and 3 code-switching datasets.
We demonstrate that pretraining using multilingual lexicons, without using any sentence-level sentiment data, achieves superior zero-shot performance compared to models fine-tuned on English sentiment datasets.
arXiv Detail & Related papers (2024-02-03T10:41:05Z) - Hindi as a Second Language: Improving Visually Grounded Speech with
Semantically Similar Samples [89.16814518860357]
The objective of this work is to explore the learning of visually grounded speech models (VGS) from multilingual perspective.
Our key contribution in this work is to leverage the power of a high-resource language in a bilingual visually grounded speech model to improve the performance of a low-resource language.
arXiv Detail & Related papers (2023-03-30T16:34:10Z) - High-resource Language-specific Training for Multilingual Neural Machine
Translation [109.31892935605192]
We propose the multilingual translation model with the high-resource language-specific training (HLT-MT) to alleviate the negative interference.
Specifically, we first train the multilingual model only with the high-resource pairs and select the language-specific modules at the top of the decoder.
HLT-MT is further trained on all available corpora to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2022-07-11T14:33:13Z) - HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword
Expressions using Multilingual Pretrained Language Models [0.6091702876917281]
This paper describes an approach to detect idiomaticity only from the contextualized representation of a MWE over multilingual pretrained language models.
Our experiments find that larger models are usually more effective in idiomaticity detection. However, using a higher layer of the model may not guarantee a better performance.
arXiv Detail & Related papers (2022-05-27T01:55:59Z) - Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.