Languages You Know Influence Those You Learn: Impact of Language
Characteristics on Multi-Lingual Text-to-Text Transfer
- URL: http://arxiv.org/abs/2212.01757v1
- Date: Sun, 4 Dec 2022 07:22:21 GMT
- Title: Languages You Know Influence Those You Learn: Impact of Language
Characteristics on Multi-Lingual Text-to-Text Transfer
- Authors: Benjamin Muller, Deepanshu Gupta, Siddharth Patwardhan, Jean-Philippe
Fauconnier, David Vandyke, Sachin Agarwal
- Abstract summary: Multi-lingual language models (LM) have been remarkably successful in enabling natural language tasks in low-resource languages.
We try to better understand how such models, specifically mT5, transfer *any* linguistic and semantic knowledge across languages.
A key finding of this work is that similarity of syntax, morphology, and phonology is a good predictor of cross-lingual transfer.
- Score: 4.554080966463776
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-lingual language models (LM), such as mBERT, XLM-R, mT5, mBART, have
been remarkably successful in enabling natural language tasks in low-resource
languages through cross-lingual transfer from high-resource ones. In this work,
we try to better understand how such models, specifically mT5, transfer *any*
linguistic and semantic knowledge across languages, even though no explicit
cross-lingual signals are provided during pre-training. Rather, only
unannotated texts from each language are presented to the model separately and
independently of one another, and the model appears to implicitly learn
cross-lingual connections. This raises several questions that motivate our
study, such as: Are the cross-lingual connections between every language pair
equally strong? What properties of source and target language impact the
strength of cross-lingual transfer? Can we quantify the impact of those
properties on the cross-lingual transfer?
In our investigation, we analyze a pre-trained mT5 to discover the attributes
of cross-lingual connections learned by the model. Through a statistical
interpretation framework over 90 language pairs across three tasks, we show
that transfer performance can be modeled by a few linguistic and data-derived
features. These observations enable us to interpret the cross-lingual understanding
of the mT5 model. Building on them, one can choose the best source language for a
task and anticipate its training data demands. A key finding of this work is that
similarity of syntax, morphology, and phonology is a good predictor of cross-lingual
transfer, significantly more so than the lexical similarity of languages alone. For a
given language, we are able to predict zero-shot performance, which increases on a
logarithmic scale with the number of few-shot target-language data points.
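To make the modeling claim above concrete, here is a minimal sketch of the kind of analysis the abstract describes: an ordinary least-squares regression of transfer scores on a handful of linguistic-similarity features plus the log of the few-shot data size. The feature names, coefficients, and data below are illustrative assumptions on synthetic numbers, not the paper's actual features or results.

```python
# Hypothetical sketch: regress cross-lingual transfer scores on a few
# linguistic and data-derived features. All names and values are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_pairs = 90  # the paper analyzes 90 source/target language pairs

# Assumed pairwise similarity features, each in [0, 1].
syntactic_sim = rng.uniform(size=n_pairs)
morphological_sim = rng.uniform(size=n_pairs)
phonological_sim = rng.uniform(size=n_pairs)
lexical_sim = rng.uniform(size=n_pairs)
# Data-derived feature: number of few-shot target-language examples.
n_fewshot = rng.integers(10, 10_000, size=n_pairs)

# Simulated transfer scores shaped like the abstract's findings: syntax,
# morphology, and phonology weigh more than lexical overlap, and performance
# grows logarithmically with the amount of few-shot data.
score = (
    0.30 * syntactic_sim
    + 0.25 * morphological_sim
    + 0.20 * phonological_sim
    + 0.05 * lexical_sim
    + 0.10 * np.log10(n_fewshot)
    + rng.normal(scale=0.02, size=n_pairs)
)

X = np.column_stack(
    [syntactic_sim, morphological_sim, phonological_sim,
     lexical_sim, np.log10(n_fewshot)]
)
model = LinearRegression().fit(X, score)

for name, coef in zip(
    ["syntax", "morphology", "phonology", "lexical", "log10(few-shot size)"],
    model.coef_,
):
    print(f"{name:>22s}: {coef:+.3f}")
print(f"R^2 on synthetic data: {model.score(X, score):.3f}")
```

In the paper itself, such a framework is fit over 90 language pairs and three tasks; this sketch only mirrors the shape of that analysis, not its actual features or measurements.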
Related papers
- Cross-Lingual Transfer Learning for Phrase Break Prediction with
Multilingual Language Model [13.730152819942445]
Cross-lingual transfer learning can be particularly effective for improving performance in low-resource languages.
This suggests that cross-lingual transfer can be inexpensive and effective for developing TTS front-ends in resource-poor languages.
arXiv Detail & Related papers (2023-06-05T04:10:04Z)
- How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning [14.02101305717738]
Multilingual large language models (MLLMs) are jointly trained on data from many different languages.
It remains unclear to what extent, and under which conditions, languages rely on each other's data.
We find that MLLMs rely on data from multiple languages from the early stages of fine-tuning and that this reliance gradually increases as fine-tuning progresses.
arXiv Detail & Related papers (2023-05-22T17:47:41Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Cross-lingual Transfer Learning for Check-worthy Claim Identification over Twitter [7.601937548486356]
Misinformation spread over social media has become an undeniable infodemic.
We present a systematic study of six approaches for cross-lingual check-worthiness estimation across pairs of five diverse languages with the help of the multilingual BERT (mBERT) model.
Our results show that for some language pairs, zero-shot cross-lingual transfer is possible and can perform as well as monolingual models trained on the target language.
arXiv Detail & Related papers (2022-11-09T18:18:53Z)
- Bootstrapping Multilingual Semantic Parsers using Large Language Models [28.257114724384806]
The translate-train paradigm of transferring English datasets across multiple languages remains the key ingredient for training task-specific multilingual models.
We consider the task of multilingual semantic parsing and demonstrate the effectiveness and flexibility offered by large language models (LLMs) for translating English datasets into several languages via few-shot prompting.
arXiv Detail & Related papers (2022-10-13T19:34:14Z)
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while the composition is more crucial to the success of cross-linguistic transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improving the cross-lingual ability of multilingual BERT.
arXiv Detail & Related papers (2020-04-20T11:13:16Z)