Cross-lingual Alignment Methods for Multilingual BERT: A Comparative
Study
- URL: http://arxiv.org/abs/2009.14304v1
- Date: Tue, 29 Sep 2020 20:56:57 GMT
- Title: Cross-lingual Alignment Methods for Multilingual BERT: A Comparative
Study
- Authors: Saurabh Kulshreshtha, José Luis Redondo-García, Ching-Yun Chang
- Abstract summary: We analyse how different forms of cross-lingual supervision and various alignment methods influence the transfer capability of mBERT in a zero-shot setting.
We find that supervision from a parallel corpus is generally superior to dictionary-based alignment.
- Score: 2.101267270902429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual BERT (mBERT) has shown reasonable capability for zero-shot
cross-lingual transfer when fine-tuned on downstream tasks. Since mBERT is not
pre-trained with explicit cross-lingual supervision, transfer performance can
further be improved by aligning mBERT with cross-lingual signal. Prior work
proposes several approaches to align contextualised embeddings. In this paper
we analyse how different forms of cross-lingual supervision and various
alignment methods influence the transfer capability of mBERT in a zero-shot
setting. Specifically, we compare parallel corpora vs. dictionary-based
supervision and rotation-based vs. fine-tuning-based alignment methods. We
evaluate the performance of different alignment methodologies across eight
languages on two tasks: Named Entity Recognition and Semantic Slot Filling. In
addition, we
propose a novel normalisation method which consistently improves the
performance of rotation-based alignment including a notable 3% F1 improvement
for distant and typologically dissimilar languages. Importantly, we identify
the biases of the alignment methods towards the type of task and the proximity
to the transfer language. We also find that supervision from a parallel corpus
is generally superior to dictionary-based alignment.
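
The rotation-based alignment referred to above is commonly formulated as an orthogonal Procrustes problem over paired source and target vectors. The following is a minimal sketch of that idea, assuming contextual mBERT vectors have already been extracted for word pairs from a bilingual dictionary or a word-aligned parallel corpus; the mean-centering step is only a generic normalisation stand-in, not necessarily the normalisation method proposed in the paper, and all function names and data are illustrative.

```python
# Minimal sketch: rotation-based (orthogonal Procrustes) alignment of
# contextual embeddings. `src_vecs` and `tgt_vecs` are assumed to be paired
# d-dimensional mBERT vectors for dictionary or parallel-corpus word pairs;
# mean-centering is a generic normalisation stand-in, not the paper's method.
import numpy as np

def learn_rotation(src_vecs, tgt_vecs, center=True):
    """Learn an orthogonal map W from the source space to the target space."""
    src_mean = src_vecs.mean(axis=0) if center else np.zeros(src_vecs.shape[1])
    tgt_mean = tgt_vecs.mean(axis=0) if center else np.zeros(tgt_vecs.shape[1])
    X, Y = src_vecs - src_mean, tgt_vecs - tgt_mean
    # Orthogonal Procrustes: W = argmin ||XW - Y||_F subject to W^T W = I,
    # solved in closed form from the SVD of X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt, src_mean, tgt_mean

def align(vecs, W, src_mean, tgt_mean):
    """Map source-language vectors into the target-language space."""
    return (vecs - src_mean) @ W + tgt_mean

if __name__ == "__main__":
    # Toy check on synthetic data standing in for real mBERT vectors.
    rng = np.random.default_rng(0)
    src = rng.normal(size=(1000, 768))
    Q = np.linalg.qr(rng.normal(size=(768, 768)))[0]   # random orthogonal map
    tgt = src @ Q + 0.1                                # rotated and shifted "target"
    W, mu_s, mu_t = learn_rotation(src, tgt)
    aligned = align(src, W, mu_s, mu_t)
    print(np.linalg.norm(aligned - tgt) / np.linalg.norm(tgt))  # ~0 relative error
```

Since an orthogonal map can only rotate or reflect the space, any offset or scale mismatch must be handled by whatever normalisation is applied beforehand; this is one plausible reading of why a better normalisation helps rotation-based alignment most for distant, typologically dissimilar languages.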
Related papers
- How Transliterations Improve Crosslingual Alignment [48.929677368744606]
Recent studies have shown that post-aligning multilingual pretrained language models (mPLMs) using alignment objectives can improve crosslingual alignment.
This paper attempts to explicitly evaluate the crosslingual alignment and identify the key elements in transliteration-based approaches that contribute to better performance.
arXiv Detail & Related papers (2024-09-25T20:05:45Z)
- Improving Multi-lingual Alignment Through Soft Contrastive Learning [9.454626745893798]
We propose a novel method to align multi-lingual embeddings based on the similarity of sentences measured by a pre-trained mono-lingual embedding model.
Given translation sentence pairs, we train a multi-lingual model so that the similarity between its cross-lingual embeddings follows the sentence similarity measured by the mono-lingual teacher model (a generic sketch of this kind of contrastive alignment objective appears after this list).
arXiv Detail & Related papers (2024-05-25T09:46:07Z)
- Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers [0.6882042556551609]
Multilingual language models can achieve cross-lingual transfer without explicit cross-lingual training data.
One common way to improve this transfer is to perform realignment steps before fine-tuning.
However, realignment methods have been found not to always improve results across languages and tasks.
arXiv Detail & Related papers (2023-06-05T11:35:40Z)
- VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize the non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, mined via a thesaurus dictionary, while separating them from the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
- Multilingual Sentence Transformer as A Multilingual Word Aligner [15.689680887384847]
We investigate whether the multilingual sentence Transformer LaBSE is a strong multilingual word aligner.
Experiment results on seven language pairs show that our best aligner outperforms previous state-of-the-art models of all varieties.
Our aligner supports different language pairs in a single model, and even achieves new state-of-the-art results on zero-shot language pairs that do not appear in the fine-tuning process.
arXiv Detail & Related papers (2023-01-28T09:28:55Z)
- Unsupervised Alignment of Distributional Word Embeddings [0.0]
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning.
We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs.
arXiv Detail & Related papers (2022-03-09T16:39:06Z)
- Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs and fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing methods to effectively extract alignments from these fine-tuned models.
arXiv Detail & Related papers (2021-01-20T17:54:47Z)
- On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z)
- Multilingual Alignment of Contextual Word Representations [49.42244463346612]
After the proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model.
We introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer.
These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
arXiv Detail & Related papers (2020-02-10T03:27:21Z)
- Robust Cross-lingual Embeddings from Parallel Sentences [65.85468628136927]
We propose a bilingual extension of the CBOW method which leverages sentence-aligned corpora to obtain robust cross-lingual word representations.
Our approach significantly improves cross-lingual sentence retrieval performance over all other approaches.
It also achieves parity with a deep RNN method on a zero-shot cross-lingual document classification task.
arXiv Detail & Related papers (2019-12-28T16:18:33Z)
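
The contrastive alignment objectives summarised above (soft contrastive learning over translation pairs, VECO 2.0's sequence-level alignment) share the same basic shape: pull parallel sentence embeddings together and push non-parallel ones apart while the encoder is fine-tuned. The sketch below illustrates that shape with a plain InfoNCE loss over in-batch translation pairs; the checkpoint, mean pooling, temperature, and learning rate are illustrative assumptions rather than any single paper's recipe.

```python
# Generic sketch of fine-tuning-based alignment with a contrastive objective
# over translation pairs, in the spirit of the contrastive approaches listed
# above. The checkpoint, mean pooling, temperature, and learning rate are
# illustrative assumptions, not a specific paper's recipe.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.train()

def embed(sentences):
    """Mean-pool token states into one vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)          # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (B, H)

def contrastive_alignment_loss(src_sents, tgt_sents, temperature=0.05):
    """InfoNCE over in-batch translation pairs: each source sentence should be
    closer to its own translation than to any other target in the batch."""
    src = F.normalize(embed(src_sents), dim=-1)
    tgt = F.normalize(embed(tgt_sents), dim=-1)
    logits = src @ tgt.T / temperature                    # (B, B) cosine similarities
    labels = torch.arange(logits.size(0))                 # positives on the diagonal
    return F.cross_entropy(logits, labels)

# One toy realignment step on a tiny batch of translation pairs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = contrastive_alignment_loss(
    ["The cat sleeps.", "I like green tea."],
    ["Die Katze schläft.", "Ich mag grünen Tee."],
)
loss.backward()
optimizer.step()
print(float(loss))
```

In practice, realignment of this kind is typically run for a limited number of steps on a parallel corpus before task-specific fine-tuning, so that the downstream task itself can still be trained on the source language alone and transferred zero-shot.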