Discovering Low-rank Subspaces for Language-agnostic Multilingual
Representations
- URL: http://arxiv.org/abs/2401.05792v1
- Date: Thu, 11 Jan 2024 09:54:11 GMT
- Title: Discovering Low-rank Subspaces for Language-agnostic Multilingual
Representations
- Authors: Zhihui Xie, Handong Zhao, Tong Yu, Shuai Li
- Abstract summary: Large pretrained multilingual language models (ML-LMs) have shown remarkable capabilities of zero-shot cross-lingual transfer.
We present a novel view of projecting away language-specific factors from a multilingual embedding space.
We show that applying our method consistently leads to improvements over commonly used ML-LMs.
- Score: 38.56175462620892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large pretrained multilingual language models (ML-LMs) have shown remarkable
capabilities of zero-shot cross-lingual transfer, without direct cross-lingual
supervision. While these results are promising, follow-up works found that,
within the multilingual embedding spaces, there exists strong language identity
information which hinders the expression of linguistic factors shared across
languages. For semantic tasks like cross-lingual sentence retrieval, it is
desired to remove such language identity signals to fully leverage semantic
information. In this work, we provide a novel view of projecting away
language-specific factors from a multilingual embedding space. Specifically, we
discover that there exists a low-rank subspace that primarily encodes
information irrelevant to semantics (e.g., syntactic information). To identify
this subspace, we present a simple but effective unsupervised method based on
singular value decomposition with multiple monolingual corpora as input. Once
the subspace is found, we can directly project the original embeddings into the
null space to boost language agnosticism without finetuning. We systematically
evaluate our method on various tasks including the challenging
language-agnostic QA retrieval task. Empirical results show that applying our
method consistently leads to improvements over commonly used ML-LMs.
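
To make the projection step concrete, here is a minimal NumPy sketch. It assumes the language-specific subspace is estimated from the top right-singular vectors of globally centered per-language mean embeddings, and that its rank k is a small hyperparameter; both choices are illustrative assumptions, not necessarily the paper's exact procedure.

```python
import numpy as np

def build_language_nullspace_projector(embeddings_by_lang, k=4):
    """Illustrative sketch (not necessarily the paper's exact procedure):
    estimate a low-rank subspace carrying language identity from
    per-language mean embeddings, then project it away.

    embeddings_by_lang: dict mapping a language code to an (n_i, d) array
        of sentence embeddings computed on a monolingual corpus.
    k: assumed rank of the language-specific subspace (must be smaller
        than the number of languages).
    """
    # Per-language mean embeddings, centered by the global mean so the SVD
    # picks up directions that separate languages from one another.
    means = np.stack([X.mean(axis=0) for X in embeddings_by_lang.values()])
    means = means - means.mean(axis=0, keepdims=True)   # (L, d)

    # Top right-singular vectors span the candidate language subspace.
    _, _, vt = np.linalg.svd(means, full_matrices=False)
    basis = vt[:k].T                                     # (d, k), orthonormal columns

    def project(X):
        """Null-space projection: x' = x - V V^T x removes the component
        lying inside the language-specific subspace."""
        return X - (X @ basis) @ basis.T

    return project

# Example usage (hypothetical arrays E_en, E_de, E_zh, E_query): fit once on
# monolingual corpora, then apply to any new embeddings, e.g. queries and
# passages for cross-lingual retrieval, without finetuning the ML-LM.
# project = build_language_nullspace_projector({"en": E_en, "de": E_de, "zh": E_zh})
# E_query_agnostic = project(E_query)
```

Because the projector is fit once from unlabeled monolingual corpora and then applied directly to new embeddings, no finetuning of the underlying ML-LM is needed, which matches the claim in the abstract.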
Related papers
- Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Multilingual Entity and Relation Extraction from Unified to Language-specific Training [29.778332361215636]
Existing approaches to entity and relation extraction mainly focus on English corpora and ignore other languages.
We propose a two-stage multilingual training method and a joint model called Multilingual Entity and Relation Extraction framework (mERE) to mitigate language interference.
Our method outperforms both the monolingual and multilingual baseline methods.
arXiv Detail & Related papers (2023-01-11T12:26:53Z)
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual sentence encoders for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations [7.571549274473274]
Isolating semantic information from language identity information is an emerging research direction for multilingual representation models.
"Language Information Removal (LIR)" factors out language identity information from the semantics-related components of multilingual representations pretrained on multiple monolingual corpora.
LIR reveals that, for weakly aligned multilingual systems, the principal components of the semantic space primarily encode language identity information.
arXiv Detail & Related papers (2021-09-10T08:15:37Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
For tasks beyond simple classification, labels for the translated target-language text are unavailable during training; to tackle this, we propose an additional KL-divergence self-teaching loss, based on auto-generated soft pseudo-labels for the translated text (a minimal sketch of such a loss follows this list).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464]
Cross-lingual models that overfit the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
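
For the FILTER entry above, the following is a minimal PyTorch sketch of a KL-divergence self-teaching loss driven by auto-generated soft pseudo-labels. The teacher/student split, the temperature, and the classification-shaped logits are illustrative assumptions rather than FILTER's exact implementation.

```python
import torch
import torch.nn.functional as F

def self_teaching_kl_loss(source_logits: torch.Tensor,
                          target_logits: torch.Tensor,
                          temperature: float = 1.0) -> torch.Tensor:
    """Sketch of a KL self-teaching loss: soft pseudo-labels from the
    labeled source-language input supervise predictions on the unlabeled
    translated target-language input. Shapes are (batch, num_classes);
    these details are assumptions, not FILTER's exact recipe."""
    # Teacher distribution from source-language predictions, detached so
    # gradients flow only through the target-language branch.
    teacher = F.softmax(source_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(target_logits / temperature, dim=-1)
    # KL(teacher || student), averaged over the batch.
    return F.kl_div(student_log_probs, teacher, reduction="batchmean")
```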