Locating Language-Specific Information in Contextualized Embeddings
- URL: http://arxiv.org/abs/2109.08040v1
- Date: Thu, 16 Sep 2021 15:11:55 GMT
- Title: Locating Language-Specific Information in Contextualized Embeddings
- Authors: Sheng Liang, Philipp Dufter, Hinrich Schütze
- Abstract summary: Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages.
This raises the question of whether MPLM representations are language-agnostic or whether they simply interleave well with learned task prediction heads.
We locate language-specific information in MPLMs and identify its dimensionality and the layers where this information occurs.
- Score: 2.836066255205732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual pretrained language models (MPLMs) exhibit multilinguality and
are well suited for transfer across languages. Most MPLMs are trained in an
unsupervised fashion and the relationship between their objective and
multilinguality is unclear. More specifically, the question whether MPLM
representations are language-agnostic or they simply interleave well with
learned task prediction heads arises. In this work, we locate language-specific
information in MPLMs and identify its dimensionality and the layers where this
information occurs. We show that language-specific information is scattered
across many dimensions, which can be projected into a linear subspace. Our
study contributes to a better understanding of MPLM representations, going
beyond treating them as unanalyzable blobs of information.
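As a rough illustration of the abstract's claim that language-specific information, although scattered across many dimensions, can be collected in a linear subspace, the sketch below builds such a subspace from per-language mean embeddings via SVD and projects it out. This is a minimal sketch under simple assumptions (synthetic stand-in vectors instead of real MPLM hidden states, an SVD-over-means construction), not the exact procedure used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for sentence-level MPLM embeddings (e.g. mean-pooled
# hidden states from one layer), grouped by language. Each language gets a
# random offset so that language identity is linearly encoded, mimicking the
# effect the paper describes.
dim = 768
embeddings = {
    lang: rng.normal(size=(500, dim)) + rng.normal(size=dim)
    for lang in ("en", "de", "zh")
}

# Per-language mean vectors; their spread around the global mean is where
# language identity lives if it is linearly encoded.
means = np.stack([vecs.mean(axis=0) for vecs in embeddings.values()])
global_mean = means.mean(axis=0)

# SVD of the centered means yields an (at most n_languages - 1)-dimensional
# candidate "language subspace".
_, _, vt = np.linalg.svd(means - global_mean, full_matrices=False)
language_basis = vt[: len(embeddings) - 1]  # rows span the language subspace

def remove_language_component(x):
    """Project embeddings onto the orthogonal complement of the subspace."""
    coords = (x - global_mean) @ language_basis.T
    return x - coords @ language_basis

# Distances to the language means before and after removing the subspace:
# after projection, a German sentence should no longer sit clearly closer
# to the German mean than to the other language means.
sample = embeddings["de"][0]
print("before:", np.round(np.linalg.norm(sample - means, axis=1), 2))
print("after: ", np.round(np.linalg.norm(
    remove_language_component(sample) - remove_language_component(means), axis=1), 2))
```

The centering-and-SVD step mirrors the abstract's point: even when language information is spread over many coordinates of the raw embedding, a low-rank linear projection can collect it, and removing that projection suppresses it.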
Related papers
- Multilingual Large Language Models: A Systematic Survey [38.972546467173565]
This paper provides a comprehensive survey of the latest research on multilingual large language models (MLLMs).
We first discuss the architecture and pre-training objectives of MLLMs, highlighting the key components and methodologies that contribute to their multilingual capabilities.
We present a detailed taxonomy and roadmap covering the assessment of MLLMs' cross-lingual knowledge, reasoning, alignment with human values, safety, interpretability and specialized applications.
arXiv Detail & Related papers (2024-11-17T13:21:26Z) - Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs [31.893686987768742]
Language models are inconsistent in their ability to answer the same factual question across languages.
We explore multilingual factual knowledge through two aspects: the model's ability to answer a query consistently across languages, and the ability to "store" answers in a shared representation for several languages.
arXiv Detail & Related papers (2024-08-20T08:38:30Z) - Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models [7.615938028813914]
With Retrieval Augmented Generation (RAG), Large Language Models (LLMs) are playing a pivotal role in information search.
We studied LLM's linguistic preference in a RAG-based information search setting.
We found that LLMs displayed systemic bias towards information in the same language as the query, in both information retrieval and answer generation.
arXiv Detail & Related papers (2024-07-07T21:26:36Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Towards Truthful Multilingual Large Language Models: Benchmarking and Alignment Strategies [38.3269908062146]
We construct a benchmark for truthfulness evaluation in multilingual scenarios.
We propose Fact-aware Multilingual Selective Synergy (FaMSS) to optimize the data allocation across a large number of languages.
arXiv Detail & Related papers (2024-06-20T15:59:07Z) - Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that LLMs instruction-tuned on the question translation data (i.e. without annotated answers) are able to encourage the alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z) - How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism.
LLMs initially understand the query, converting multilingual inputs into English for task-solving.
In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs (a minimal sketch of this idea appears after the related-papers list below).
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset [69.33424532827608]
Open-source large language models (LLMs) have gained significant strength across diverse fields.
In this work, we construct an open-source multilingual supervised fine-tuning dataset.
The resulting UltraLink dataset comprises approximately 1 million samples across five languages.
arXiv Detail & Related papers (2024-02-07T05:05:53Z) - Adapters for Enhanced Modeling of Multilingual Knowledge and Text [54.02078328453149]
Language models have been extended to multilingual language models (MLLMs).
Knowledge graphs contain facts in an explicit triple format, which require careful curation and are only available in a few high-resource languages.
We propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages.
arXiv Detail & Related papers (2022-10-24T21:33:42Z)
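For the language-specific-neurons entry above, the LAPE idea can be sketched in a few lines: score each neuron by the entropy of its activation probabilities across languages and keep the low-entropy ones. This is a hedged illustration with toy numbers and an assumed activation criterion (probability of a positive activation), not the paper's reference implementation; the quantile-based cutoff is likewise an assumption.

```python
import numpy as np

def lape_scores(activation_probs, eps=1e-12):
    """activation_probs: (n_languages, n_neurons) array whose entry [l, j] is
    the empirical probability that neuron j fires (activation > 0) on text in
    language l. Returns one entropy score per neuron."""
    # Normalize each neuron's probabilities over languages into a distribution.
    dist = activation_probs / (activation_probs.sum(axis=0, keepdims=True) + eps)
    # Low entropy means the neuron fires almost only for one or two languages.
    return -(dist * np.log(dist + eps)).sum(axis=0)

# Toy example: 3 languages x 5 neurons (made-up activation probabilities).
probs = np.array([
    [0.90, 0.05, 0.30, 0.33, 0.50],  # en
    [0.05, 0.80, 0.30, 0.33, 0.45],  # de
    [0.05, 0.05, 0.30, 0.34, 0.55],  # zh
])
scores = lape_scores(probs)
threshold = np.quantile(scores, 0.4)  # keep the lowest-entropy neurons (assumed cutoff)
print("LAPE per neuron:", np.round(scores, 3))
print("candidate language-specific neurons:", np.where(scores <= threshold)[0])
```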