Looking for Clues of Language in Multilingual BERT to Improve
Cross-lingual Generalization
- URL: http://arxiv.org/abs/2010.10041v4
- Date: Mon, 1 Nov 2021 09:05:23 GMT
- Title: Looking for Clues of Language in Multilingual BERT to Improve
Cross-lingual Generalization
- Authors: Chi-Liang Liu and Tsung-Yuan Hsu and Yung-Sung Chuang and Chung-Yi Li
and Hung-yi Lee
- Abstract summary: Token embeddings in multilingual BERT (m-BERT) contain both language and semantic information.
We control the output languages of multilingual BERT by manipulating the token embeddings.
- Score: 56.87201892585477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Token embeddings in multilingual BERT (m-BERT) contain both language and
semantic information. We find that the representation of a language can be
obtained by simply averaging the embeddings of the tokens of the language.
Given this language representation, we control the output languages of
multilingual BERT by manipulating the token embeddings, thus achieving
unsupervised token translation. We further propose a computationally cheap but
effective approach to improve the cross-lingual ability of m-BERT based on this
observation.
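As a rough illustration of the averaging-and-shifting idea described in the abstract, the sketch below (not the authors' released code) builds a language vector by averaging mBERT's static input embeddings over a small word list per language, shifts a token embedding from the source language vector to the target one, and decodes by nearest-neighbour lookup in the embedding matrix. The public bert-base-multilingual-cased checkpoint, the toy word lists, and the nearest-neighbour decoding step are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of language-vector averaging and embedding shifting for
# unsupervised token translation with mBERT. Toy word lists and the
# nearest-neighbour decoding are assumptions for illustration only.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
embeddings = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden)

def language_vector(words):
    """Average the static input embeddings of a language's tokens."""
    ids = tokenizer(words, add_special_tokens=False,
                    is_split_into_words=True)["input_ids"]
    return embeddings[torch.tensor(ids)].mean(dim=0)

# Tiny monolingual word lists standing in for per-language corpora;
# the paper averages over a language's tokens at much larger scale.
en_vec = language_vector(["the", "house", "water", "good", "day"])
de_vec = language_vector(["das", "Haus", "Wasser", "gut", "Tag"])

def translate_token(word, src_vec, tgt_vec, k=5):
    """Shift a token embedding from the source to the target language region
    and return the k nearest vocabulary tokens as candidate translations."""
    wid = tokenizer.convert_tokens_to_ids(word)
    shifted = embeddings[wid] - src_vec + tgt_vec
    sims = torch.nn.functional.cosine_similarity(shifted.unsqueeze(0), embeddings)
    return tokenizer.convert_ids_to_tokens(sims.topk(k).indices.tolist())

print(translate_token("water", en_vec, de_vec))  # candidate German translations
```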
Related papers
- L3Cube-IndicSBERT: A simple approach for learning cross-lingual sentence representations using multilingual BERT [0.7874708385247353]
The multilingual Sentence-BERT (SBERT) models map different languages to a common representation space.
We propose a simple yet effective approach to convert vanilla multilingual BERT models into multilingual sentence BERT models using a synthetic corpus.
We show that multilingual BERT models are inherently cross-lingual learners, and this simple baseline fine-tuning approach yields exceptional cross-lingual properties.
arXiv Detail & Related papers (2023-04-22T15:45:40Z)
- Romanization-based Large-scale Adaptation of Multilingual Language Models [124.57923286144515]
Large multilingual pretrained language models (mPLMs) have become the de facto state of the art for cross-lingual transfer in NLP.
We study and compare a plethora of data- and parameter-efficient strategies for adapting the mPLMs to romanized and non-romanized corpora of 14 diverse low-resource languages.
Our results reveal that UROMAN-based transliteration can offer strong performance for many languages, with particular gains achieved in the most challenging setups.
arXiv Detail & Related papers (2023-04-18T09:58:34Z)
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual sentence encoders for the amount of cross-lingual lexical knowledge stored in their parameters and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- To What Degree Can Language Borders Be Blurred In BERT-based Multilingual Spoken Language Understanding? [7.245261469258502]
We show that although a BERT-based multilingual Spoken Language Understanding (SLU) model performs well even on distant language groups, there is still a gap to the ideal multilingual performance.
We propose a novel BERT-based adversarial model architecture to learn language-shared and language-specific representations for multilingual SLU.
arXiv Detail & Related papers (2020-11-10T09:59:24Z)
- It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT [54.84185432755821]
Multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages.
We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning.
arXiv Detail & Related papers (2020-10-16T09:49:32Z)
- Are All Languages Created Equal in Multilingual BERT? [22.954688396858085]
Multilingual BERT (mBERT) trained on 104 languages has shown surprisingly good cross-lingual performance on several NLP tasks.
We explore how mBERT performs on a much wider set of languages, focusing on the quality of representation for low-resource languages.
arXiv Detail & Related papers (2020-05-18T21:15:39Z)
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improve the cross-lingual ability of multilingual BERT.
arXiv Detail & Related papers (2020-04-20T11:13:16Z)