Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering
- URL: http://arxiv.org/abs/2310.13583v1
- Date: Fri, 20 Oct 2023 15:25:53 GMT
- Title: Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering
- Authors: Ofir Arviv, Dmitry Nikolaev, Taelin Karidi and Omri Abend
- Abstract summary: One obstacle for effective cross-lingual transfer is variability in word-order patterns.
We present a new powerful reordering method, defined in terms of Universal Dependencies.
We show that our method consistently outperforms strong baselines over different language pairs and model architectures.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the impressive growth of the abilities of multilingual language models, such as XLM-R and mT5, it has been shown that they still face difficulties when tackling typologically-distant languages, particularly in the low-resource setting. One obstacle for effective cross-lingual transfer is variability in word-order patterns. It can be potentially mitigated via source- or target-side word reordering, and numerous approaches to reordering have been proposed. However, they rely on language-specific rules, work on the level of POS tags, or only target the main clause, leaving subordinate clauses intact. To address these limitations, we present a new powerful reordering method, defined in terms of Universal Dependencies, that is able to learn fine-grained word-order patterns conditioned on the syntactic context from a small amount of annotated data and can be applied at all levels of the syntactic tree. We conduct experiments on a diverse set of tasks and show that our method consistently outperforms strong baselines over different language pairs and model architectures. This performance advantage holds true in both zero-shot and few-shot scenarios.
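The abstract leaves the mechanics implicit, so here is a minimal sketch of the underlying idea: learn, from a small treebank, where each Universal Dependencies relation prefers to sit relative to its head, then reorder every subtree recursively. The `PRECEDENCE` table and the tree encoding are illustrative assumptions, not the authors' actual estimator.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    form: str
    deprel: str  # Universal Dependencies relation to the head
    children: List["Node"] = field(default_factory=list)

# Hypothetical precedence scores: negative means the relation prefers to
# precede its head, positive means follow; magnitude orders siblings.
# A real system would estimate such preferences from a small annotated
# treebank, conditioned on richer syntactic context than the label alone.
PRECEDENCE = {"nsubj": -2.0, "advmod": -1.0, "obj": -0.5, "obl": 1.0}

def linearize(node: Node) -> List[str]:
    """Reorder recursively, so every subtree (including subordinate
    clauses) is reordered, not just the main clause."""
    key = lambda c: PRECEDENCE.get(c.deprel, 1.0)
    before = sorted((c for c in node.children if key(c) < 0), key=key)
    after = sorted((c for c in node.children if key(c) >= 0), key=key)
    words: List[str] = []
    for child in before:
        words.extend(linearize(child))
    words.append(node.form)
    for child in after:
        words.extend(linearize(child))
    return words

# English "she quickly read the report" pushed toward an SOV-like order:
root = Node("read", "root", [Node("she", "nsubj"),
                             Node("quickly", "advmod"),
                             Node("report", "obj")])
print(" ".join(linearize(root)))  # -> "she quickly report read"
```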
Related papers
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source-language PLM and the static word embeddings of a target language.
We show that, although the proposed framework is competitive with weak baselines on MoSECroT, it fails to match some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
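As a rough illustration of the relative-representation idea (not MoSECroT's exact pipeline): each word is re-encoded by its similarities to a small set of anchor words, so embedding spaces of different dimensionality become directly comparable. The embedding matrices and anchor indices below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for real embeddings: rows of a source-language PLM embedding
# matrix and a target-language static word-embedding matrix (dimensions
# need not match, which is the point of relative representations).
src_emb = rng.normal(size=(1000, 768))  # hypothetical PLM embeddings
tgt_emb = rng.normal(size=(800, 300))   # hypothetical static embeddings

# Indices of anchor word pairs from a small bilingual seed dictionary
# (assumed given; the framework relies on such anchors to relate spaces).
src_anchors, tgt_anchors = np.arange(50), np.arange(50)

def relative(emb: np.ndarray, anchor_idx: np.ndarray) -> np.ndarray:
    """Represent each vector by its cosine similarities to the anchors."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit[anchor_idx].T  # shape: (n_words, n_anchors)

src_rel = relative(src_emb, src_anchors)
tgt_rel = relative(tgt_emb, tgt_anchors)

# Both spaces now live in the same 50-dimensional "anchor-similarity"
# space, so a target word can be matched against source rows directly:
query = tgt_rel[123]
print(np.argmax(src_rel @ query))  # index of the closest source word
```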
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
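A marker-based sketch of the projection step, under the assumption that entity spans are wrapped in slot markers before translation and recovered afterwards; the hard-coded "translation" stands in for CROP's trained multilingual labeled sequence translation model.

```python
import re

def insert_markers(tokens, spans):
    """Wrap labeled entity spans (start, end inclusive, label) with slot
    markers before handing the sentence to the translation model."""
    out = []
    for i, tok in enumerate(tokens):
        for start, end, label in spans:
            if i == start:
                out.append(f"__{label}__")
        out.append(tok)
        for start, end, label in spans:
            if i == end:
                out.append(f"__/{label}__")
    return " ".join(out)

def extract_spans(translated: str):
    """Recover entity spans from the markers surviving translation."""
    spans, tokens = [], []
    open_label, start = None, None
    for tok in translated.split():
        m_open = re.fullmatch(r"__(\w+)__", tok)
        m_close = re.fullmatch(r"__/(\w+)__", tok)
        if m_open:
            open_label, start = m_open.group(1), len(tokens)
        elif m_close and open_label == m_close.group(1):
            spans.append((start, len(tokens) - 1, open_label))
            open_label = None
        else:
            tokens.append(tok)
    return tokens, spans

marked = insert_markers(["Obama", "visited", "Berlin"],
                        [(0, 0, "PER"), (2, 2, "LOC")])
# Hard-coded stand-in for the labeled sequence translation model, which
# would translate the marked sentence while keeping the markers intact:
translated = "__PER__ Obama __/PER__ besuchte __LOC__ Berlin __/LOC__"
print(extract_spans(translated))
# -> (['Obama', 'besuchte', 'Berlin'], [(0, 0, 'PER'), (2, 2, 'LOC')])
```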
- Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation [0.6882042556551609]
Some Transformer-based models can perform cross-lingual transfer learning.
We propose a word-level task-agnostic method to evaluate the alignment of contextualized representations built by such models.
arXiv Detail & Related papers (2022-07-19T05:23:18Z)
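One common task-agnostic, word-level way to quantify such alignment (not necessarily the paper's exact protocol) is cross-lingual nearest-neighbour retrieval over contextualized vectors of gold-aligned word pairs; the synthetic vectors below stand in for real encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for contextualized vectors of N word pairs that are known
# translations of each other in parallel sentences (a real evaluation
# would extract these from a multilingual encoder via word alignments).
n, d = 200, 768
src = rng.normal(size=(n, d))
tgt = src + 0.5 * rng.normal(size=(n, d))  # noisy "aligned" counterparts

def retrieval_accuracy(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of source words whose nearest target neighbour (by
    cosine similarity) is their gold-aligned counterpart."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    nearest = np.argmax(a @ b.T, axis=1)
    return float(np.mean(nearest == np.arange(len(a))))

print(f"word-level retrieval accuracy: {retrieval_accuracy(src, tgt):.2%}")
```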
- Cross-lingual Text Classification with Heterogeneous Graph Neural Network [2.6936806968297913]
Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages.
Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks.
We propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification.
arXiv Detail & Related papers (2021-05-24T12:45:42Z)
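A toy sketch of the heterogeneous-graph idea, assuming document-word edges within each language and lexicon-based word-word edges across languages; the mean-pooling propagation stands in for the trained GNN layers.

```python
import numpy as np
from collections import defaultdict

# Toy corpus: labelled English documents and an unlabelled German one.
docs = {
    "en_0": "good film great acting",
    "en_1": "bad film boring plot",
    "de_0": "guter film",
}
# Cross-lingual word links (e.g. from a bilingual lexicon) make the
# graph heterogeneous: doc-word edges within languages, word-word edges
# across languages.
translations = [("good", "guter")]

# Build a symmetric adjacency structure over documents and words.
edges = defaultdict(set)
for doc, text in docs.items():
    for w in text.split():
        edges[doc].add(w)
        edges[w].add(doc)
for a, b in translations:
    edges[a].add(b)
    edges[b].add(a)

# One-hot initial features, two rounds of mean-pooling over neighbours
# as a stand-in for the trained GNN layers of the actual model.
nodes = sorted(edges)
feat = {v: np.eye(len(nodes))[i] for i, v in enumerate(nodes)}
for _ in range(2):
    feat = {v: np.mean([feat[u] for u in edges[v]], axis=0) for v in edges}

# After propagation, the German document's representation mixes in
# signal from the English documents via shared and linked words:
sim = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(sim(feat["de_0"], feat["en_0"]), sim(feat["de_0"], feat["en_1"]))
```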
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
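The abstract does not name the methods, but a common data-efficient recipe in this line of work is to learn a fresh embedding matrix for a target-language tokenizer (warm-started from lexically overlapping tokens) while the transformer body stays frozen. The vocabularies below are toy stand-ins, and the recipe itself is an assumption about the paper's family of methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretrained model's vocabulary and embedding matrix (stand-ins).
src_vocab = ["the", "##ing", "[UNK]", "a", "of"]
src_emb = rng.normal(size=(len(src_vocab), 768)).astype(np.float32)

# New tokenizer trained on the target script: a few tokens overlap with
# the old vocabulary; most do not and would previously map to [UNK].
tgt_vocab = ["the", "ᚠᚢᚦ", "ᚨᚱᚲ", "of", "##ᛖᚾ"]

def init_target_embeddings(src_vocab, src_emb, tgt_vocab):
    """Copy vectors for lexically overlapping tokens; initialise the
    rest from the distribution of the pretrained embeddings. In the
    full recipe, only this new matrix is then fine-tuned while the
    transformer body stays frozen."""
    src_index = {tok: i for i, tok in enumerate(src_vocab)}
    mu, sigma = src_emb.mean(), src_emb.std()
    tgt_emb = rng.normal(mu, sigma, size=(len(tgt_vocab), src_emb.shape[1]))
    copied = 0
    for j, tok in enumerate(tgt_vocab):
        if tok in src_index:
            tgt_emb[j] = src_emb[src_index[tok]]
            copied += 1
    return tgt_emb.astype(np.float32), copied

tgt_emb, copied = init_target_embeddings(src_vocab, src_emb, tgt_vocab)
print(f"copied {copied}/{len(tgt_vocab)} embeddings from the source vocab")
```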
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively prevents the model from degenerating into predicting masked words conditioned only on context in the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
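A simplified sketch of an encoder layer with such a plugged-in cross-attention block; the placement, gating, and dimensions are illustrative, not VECO's exact architecture.

```python
import torch
import torch.nn as nn

class CrossLingualEncoderLayer(nn.Module):
    """Transformer encoder layer with an extra cross-attention block that
    attends to the hidden states of the parallel sentence in the other
    language."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x: torch.Tensor, other_lang: torch.Tensor) -> torch.Tensor:
        # Usual self-attention over the sentence's own tokens.
        x = self.norm1(x + self.self_attn(x, x, x)[0])
        # Cross-attention: masked-word prediction can now also condition
        # on the parallel sentence, not only on same-language context.
        x = self.norm2(x + self.cross_attn(x, other_lang, other_lang)[0])
        return self.norm3(x + self.ffn(x))

layer = CrossLingualEncoderLayer()
src = torch.randn(2, 7, 256)  # batch of source-language token states
tgt = torch.randn(2, 9, 256)  # states of the aligned target sentences
print(layer(src, tgt).shape)  # torch.Size([2, 7, 256])
```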
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
The Word-in-Context dataset (WiC) assesses the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
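The WiC task is a binary decision: does a word carry the same meaning in two contexts? A simple baseline for WiC-style data thresholds the cosine similarity of the target word's contextualized vectors; the vectors below are synthetic stand-ins for encoder outputs.

```python
import numpy as np

def same_sense(vec_a: np.ndarray, vec_b: np.ndarray,
               threshold: float = 0.5) -> bool:
    """WiC-style decision: do the two contextualized vectors of the same
    surface word (one per sentence) encode the same meaning? The
    threshold would be tuned on development data."""
    cos = vec_a @ vec_b / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))
    return bool(cos >= threshold)

rng = np.random.default_rng(0)
# Stand-ins for encoder outputs for "bank" in a money vs. river context.
bank_money = rng.normal(size=768)
bank_river = bank_money + 2.0 * rng.normal(size=768)  # drifted meaning

print(same_sense(bank_money, bank_money.copy()))  # True: identical vectors
print(same_sense(bank_money, bank_river))         # likely False: drifted
```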
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464]
Cross-lingual models fitted to the word order of the source language may fail to handle target languages whose word order differs.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
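The abstract states only the research question; one concrete way to make a model insensitive to source word order, used in this line of work, is to randomly permute training sentences while keeping token-label pairs intact. A minimal sketch:

```python
import random

def shuffle_for_order_invariance(tokens, labels, seed=0):
    """Order-agnostic fine-tuning trick: randomly permute the source
    sentence (keeping token-label pairs aligned) so the model cannot
    rely on source-language word order. Whether to shuffle fully or
    within local windows is a design choice; full shuffling is shown."""
    rng = random.Random(seed)
    paired = list(zip(tokens, labels))
    rng.shuffle(paired)
    toks, labs = zip(*paired)
    return list(toks), list(labs)

tokens = ["Obama", "visited", "Berlin", "yesterday"]
labels = ["B-PER", "O", "B-LOC", "O"]
print(shuffle_for_order_invariance(tokens, labels))
```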
This list is automatically generated from the titles and abstracts of the papers in this site.