X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity
- URL: http://arxiv.org/abs/2310.17166v1
- Date: Thu, 26 Oct 2023 05:39:49 GMT
- Title: X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity
- Authors: Taejun Yun, Jinhyeon Kim, Deokyeong Kang, Seong Hoon Lim, Jihoon Kim,
Taeuk Kim
- Abstract summary: Cross-lingual transfer (XLT) is an ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process.
We propose the utilization of sub-network similarity between two languages as a proxy for predicting the compatibility of the languages in the context of XLT.
- Score: 19.15213046428148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-lingual transfer (XLT) is an emergent ability of multilingual language
models that preserves their performance on a task to a significant extent when
evaluated in languages that were not included in the fine-tuning process. While
English, due to its widespread usage, is typically regarded as the primary
language for model adaptation in various tasks, recent studies have revealed that
the efficacy of XLT can be amplified by selecting the most appropriate source
languages based on specific conditions. In this work, we propose the
utilization of sub-network similarity between two languages as a proxy for
predicting the compatibility of the languages in the context of XLT. Our
approach is model-oriented, better reflecting the inner workings of foundation
models. In addition, it requires only a moderate amount of raw text from
candidate languages, distinguishing it from the majority of previous methods
that rely on external resources. In experiments, we demonstrate that our method
is more effective than baselines across diverse tasks. Specifically, it shows
proficiency in ranking candidates for zero-shot XLT, achieving an improvement
of 4.6% on average in terms of NDCG@3. We also provide extensive analyses that
confirm the utility of sub-networks for XLT prediction.
Related papers
- Analysis of Multi-Source Language Training in Cross-Lingual Transfer [6.992785466925966]
Cross-lingual transfer (XLT) methods have contributed to addressing the data scarcity problem faced by low-resource languages.
We show that the use of multiple source languages in XLT, a technique we term Multi-Source Language Training (MSLT), leads to increased mingling of embedding spaces for different languages.
On the other hand, we discover that using an arbitrary combination of source languages does not always guarantee better performance.
arXiv Detail & Related papers (2024-02-21T06:37:07Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings [10.871587311621974]
We experimentally demonstrate that high capacity multilingual language models applied in a zero-shot setting consistently outperform data-based cross-lingual transfer approaches.
A detailed analysis of our results suggests that this might be due to important differences in language use.
Our results also indicate that data-based cross-lingual transfer approaches remain a competitive option when high-capacity multilingual language models are not available.
arXiv Detail & Related papers (2022-10-23T05:37:35Z)
- Nearest Neighbour Few-Shot Learning for Cross-lingual Classification [2.578242050187029]
We perform cross-lingual adaptation using a simple nearest-neighbour few-shot (15 samples) inference technique for classification tasks.
Our approach consistently improves over traditional fine-tuning while using only a handful of labeled samples in target locales.
arXiv Detail & Related papers (2021-09-06T03:18:23Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
- XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation [100.09099800591822]
XGLUE is a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models.
XGLUE provides 11 diversified tasks that cover both natural language understanding and generation scenarios.
arXiv Detail & Related papers (2020-04-03T07:03:12Z)
- XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)