On the Importance of Word Order Information in Cross-lingual Sequence
Labeling
- URL: http://arxiv.org/abs/2001.11164v4
- Date: Tue, 8 Dec 2020 11:04:04 GMT
- Title: On the Importance of Word Order Information in Cross-lingual Sequence
Labeling
- Authors: Zihan Liu, Genta Indra Winata, Samuel Cahyawijaya, Andrea Madotto,
Zhaojiang Lin, Pascale Fung
- Abstract summary: Cross-lingual models that fit to the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve adaptation performance in target languages.
- Score: 80.65425412067464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word order generally varies across languages. In this paper, we
hypothesize that cross-lingual models that fit to the word order of the source
language might fail to handle target languages. To verify this hypothesis, we
investigate whether making models insensitive to the word order of the source
language can improve adaptation performance in target languages. To do so, we
reduce the source-language word order information fitted to sequence encoders
and observe the resulting performance changes. In addition, based on this
hypothesis, we propose a new method for fine-tuning multilingual BERT on
downstream cross-lingual sequence labeling tasks. Experimental results on
dialogue natural language understanding, part-of-speech tagging, and named
entity recognition tasks show that reducing the word order information fitted to
the model yields better zero-shot cross-lingual performance. Furthermore, our
proposed methods can also be applied to strong cross-lingual baselines and
improve their performance.
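The abstract does not spell out how word-order information is reduced; as a purely illustrative reading, the sketch below makes mBERT fine-tuning for sequence labeling less sensitive to source-language word order by randomly permuting each training sentence's words together with their labels. The model name, label ids, and shuffling probability are assumptions for the example, not the paper's exact setup.

```python
# Hypothetical sketch (not the paper's exact method): reduce source-language
# word-order information during mBERT fine-tuning for sequence labeling by
# randomly permuting each training sentence's words together with their labels.
import random

import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=9)

def shuffle_word_order(words, labels, p=0.5):
    """With probability p, permute a sentence's words and their labels together."""
    if random.random() > p:
        return words, labels
    order = list(range(len(words)))
    random.shuffle(order)
    return [words[i] for i in order], [labels[i] for i in order]

# One illustrative training step on a single toy NER example.
words = ["EMNLP", "is", "held", "in", "Hong", "Kong"]
tags = [3, 0, 0, 0, 5, 6]                      # toy label ids
words, tags = shuffle_word_order(words, tags)

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
# Align word-level labels to sub-word pieces; special tokens get -100 (ignored by the loss).
aligned = [-100 if w is None else tags[w] for w in enc.word_ids()]
loss = model(**enc, labels=torch.tensor([aligned])).loss
loss.backward()                                # plug into the usual optimizer loop
```

A full pipeline would apply the same permutation augmentation across the source-language training set and then evaluate zero-shot on target-language data.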
Related papers
- Discovering Low-rank Subspaces for Language-agnostic Multilingual
Representations [38.56175462620892]
Large pretrained multilingual language models (ML-LMs) have shown remarkable capabilities of zero-shot cross-lingual transfer.
We present a novel view of projecting away language-specific factors from a multilingual embedding space.
We show that applying our method consistently leads to improvements over commonly used ML-LMs.
arXiv Detail & Related papers (2024-01-11T09:54:11Z)
- Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering [17.166996956587155]
One obstacle to effective cross-lingual transfer is variability in word-order patterns.
We present a new powerful reordering method, defined in terms of Universal Dependencies.
We show that our method consistently outperforms strong baselines over different language pairs and model architectures.
arXiv Detail & Related papers (2023-10-20T15:25:53Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages [3.716965622352967]
We propose new criteria to evaluate the quality of lexical representation and vocabulary overlap observed in sub-word tokenizers.
Our findings show that vocabulary overlap across languages can actually be detrimental to certain downstream tasks.
arXiv Detail & Related papers (2023-05-26T18:06:49Z)
- Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation [0.6882042556551609]
Some Transformer-based models can perform cross-lingual transfer learning.
We propose a word-level task-agnostic method to evaluate the alignment of contextualized representations built by such models.
arXiv Detail & Related papers (2022-07-19T05:23:18Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids degenerating into predicting masked words conditioned only on the context of their own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
Since labels shared from the source text become less accurate or unavailable for the translated text on more complex tasks, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a minimal sketch of such a loss appears after this list).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification [52.69730591919885]
We present a semi-supervised adversarial training process that minimizes the maximal loss for label-preserving input perturbations.
We observe significant gains in effectiveness on document and intent classification for a diverse set of languages.
arXiv Detail & Related papers (2020-07-29T19:38:35Z)
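As referenced in the FILTER entry above, the following is a minimal, hypothetical sketch of a KL-divergence self-teaching loss driven by soft pseudo-labels; the tensor shapes, temperature, and frozen-teacher setup are illustrative assumptions rather than FILTER's exact formulation.

```python
# Hypothetical sketch of a KL-divergence self-teaching loss: a teacher's soft
# pseudo-labels on translated target-language text supervise the student's
# per-token predictions. Shapes and the temperature are illustrative.
import torch
import torch.nn.functional as F

def self_teaching_kl_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over per-token label distributions, averaged over the batch."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)      # soft pseudo-labels
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Toy usage: batch of 2 sentences, 6 tokens each, 9 labels.
student_logits = torch.randn(2, 6, 9, requires_grad=True)
teacher_logits = torch.randn(2, 6, 9)        # e.g. produced by a frozen copy of the model
loss = self_teaching_kl_loss(student_logits, teacher_logits)
loss.backward()
```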
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.