An Isotropy Analysis in the Multilingual BERT Embedding Space
- URL: http://arxiv.org/abs/2110.04504v1
- Date: Sat, 9 Oct 2021 08:29:49 GMT
- Title: An Isotropy Analysis in the Multilingual BERT Embedding Space
- Authors: Sara Rajaee and Mohammad Taher Pilehvar
- Abstract summary: We investigate the representation degeneration problem in multilingual contextual word representations (CWRs) of BERT.
Our results show that increasing the isotropy of the multilingual embedding space can significantly improve its representation power and performance.
Our analysis indicates that although the degenerated directions vary in different languages, they encode similar linguistic knowledge, suggesting a shared linguistic space among languages.
- Score: 18.490856440975996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several studies have explored various advantages of multilingual pre-trained
models (e.g., multilingual BERT) in capturing shared linguistic knowledge.
However, their limitations have not received enough attention. In this paper,
we investigate the representation degeneration problem in multilingual
contextual word representations (CWRs) of BERT and show that the embedding
spaces of the selected languages suffer from the anisotropy problem. Our
experimental results demonstrate that, similarly to their monolingual
counterparts, increasing the isotropy of the multilingual embedding space can
significantly improve its representation power and performance. Our analysis
indicates that although the degenerated directions vary in different languages,
they encode similar linguistic knowledge, suggesting a shared linguistic space
among languages.
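The abstract reports that raising isotropy helps, but it does not spell out how isotropy is quantified or how the degenerated directions are removed. Below is a minimal, hypothetical sketch in Python/NumPy of the common recipe from the isotropy literature: an eigenvector-based isotropy estimate (the min/max partition-function ratio) and a global baseline that zero-centers the embeddings and projects out the top principal, i.e. degenerated, directions. The function names, the choice of k, and the synthetic data are illustrative assumptions, not the authors' exact procedure.

```python
# Hypothetical sketch, not the authors' code: estimating isotropy and removing
# dominant directions from a matrix of contextual word representations (CWRs).
import numpy as np


def isotropy_score(embeddings: np.ndarray) -> float:
    """Approximate isotropy as min_c Z(c) / max_c Z(c), with Z(c) = sum_i exp(c . w_i)
    and c ranging over the right singular vectors of the embedding matrix."""
    _, _, vt = np.linalg.svd(embeddings, full_matrices=False)
    z = np.exp(embeddings @ vt.T).sum(axis=0)   # partition function per direction
    return float(z.min() / z.max())             # 1.0 would be perfectly isotropic


def remove_dominant_directions(embeddings: np.ndarray, k: int = 8) -> np.ndarray:
    """Zero-center the space and project out the top-k principal components,
    a common global baseline for increasing isotropy (the value of k is an assumption)."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:k]                                # dominant (degenerated) directions
    return centered - centered @ top.T @ top    # remove their contribution


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic anisotropic toy data: isotropic noise plus one shared offset direction.
    W = rng.normal(size=(1000, 64)) + 2.0 * np.ones((1, 64))
    print("isotropy before:", isotropy_score(W))
    print("isotropy after :", isotropy_score(remove_dominant_directions(W, k=4)))
```

On real CWRs one would apply this to token embeddings extracted from multilingual BERT for each language separately; the paper's finding that degenerated directions vary across languages while encoding similar linguistic knowledge is what motivates the per-language view.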
Related papers
- Exploring Anisotropy and Outliers in Multilingual Language Models for
Cross-Lingual Semantic Sentence Similarity [64.18762301574954]
Previous work has shown that the representations output by contextual language models are more anisotropic than static type embeddings.
This seems to be true for both monolingual and multilingual models, although much less work has been done in the multilingual setting.
We investigate outlier dimensions and their relationship to anisotropy in multiple pre-trained multilingual language models.
arXiv Detail & Related papers (2023-06-01T09:01:48Z)
- When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer [15.578267998149743]
We show that the absence of sub-word overlap significantly affects zero-shot transfer when languages differ in their word order.
There is a strong correlation between transfer performance and word embedding alignment between languages.
Our results call for multilingual modeling work to focus on explicitly improving word embedding alignment between languages.
arXiv Detail & Related papers (2021-10-27T21:25:39Z)
- On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning [7.493779672689531]
The knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one.
This paper analyses the relationship between them, in the context of fine-tuning on two tasks.
arXiv Detail & Related papers (2021-09-14T19:28:31Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
- Finding Universal Grammatical Relations in Multilingual BERT [47.74015366712623]
We show that subspaces of mBERT representations recover syntactic tree distances in languages other than English.
We present an unsupervised analysis method that provides evidence mBERT learns representations of syntactic dependency labels.
arXiv Detail & Related papers (2020-05-09T20:46:02Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improve the cross-lingual ability of multilingual BERT.
arXiv Detail & Related papers (2020-04-20T11:13:16Z)