FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding
- URL: http://arxiv.org/abs/2009.05166v3
- Date: Tue, 15 Dec 2020 07:11:22 GMT
- Title: FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding
- Authors: Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu
- Abstract summary: We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
Because shared labels from translation become less accurate or even unavailable for more complex tasks such as question answering, NER and POS tagging, we further propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
- Score: 85.29270319872597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale cross-lingual language models (LM), such as mBERT, Unicoder and
XLM, have achieved great success in cross-lingual representation learning.
However, when applied to zero-shot cross-lingual transfer tasks, most existing
methods use only single-language input for LM finetuning, without leveraging
the intrinsic cross-lingual alignment between different languages that proves
essential for multilingual tasks. In this paper, we propose FILTER, an enhanced
fusion method that takes cross-lingual data as input for XLM finetuning.
Specifically, FILTER first encodes text input in the source language and its
translation in the target language independently in the shallow layers, then
performs cross-language fusion to extract multilingual knowledge in the
intermediate layers, and finally performs further language-specific encoding.
During inference, the model makes predictions based on the text input in the
target language and its translation in the source language. For simple tasks
such as classification, translated text in the target language shares the same
label as the source language. However, this shared label becomes less accurate
or even unavailable for more complex tasks such as question answering, NER and
POS tagging. To tackle this issue, we further propose an additional
KL-divergence self-teaching loss for model training, based on auto-generated
soft pseudo-labels for translated text in the target language. Extensive
experiments demonstrate that FILTER achieves new state of the art on two
challenging multilingual multi-task benchmarks, XTREME and XGLUE.
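The three stages described in the abstract (independent shallow encoding, cross-lingual fusion in the intermediate layers, then further language-specific encoding) can be illustrated with a short sketch. This is not the authors' released implementation: the layer split, the module and variable names, and the use of sequence-level concatenation for fusion are assumptions made for illustration, with the backbone treated as a generic stack of transformer layers.

```python
import torch
import torch.nn as nn

class FilterStyleEncoder(nn.Module):
    """Minimal sketch of a FILTER-style three-stage forward pass."""

    def __init__(self, layers: nn.ModuleList, num_shallow: int = 4, num_fusion: int = 4):
        super().__init__()
        # Split one pretrained transformer stack into three stages
        # (the 4/4 split is an assumption, not the paper's setting).
        self.shallow = layers[:num_shallow]                          # stage 1: per-language
        self.fusion = layers[num_shallow:num_shallow + num_fusion]   # stage 2: cross-lingual
        self.specific = layers[num_shallow + num_fusion:]            # stage 3: per-language

    def forward(self, src_emb: torch.Tensor, tgt_emb: torch.Tensor):
        # Stage 1: encode the source text and its translation independently.
        h_src, h_tgt = src_emb, tgt_emb
        for layer in self.shallow:
            h_src, h_tgt = layer(h_src), layer(h_tgt)

        # Stage 2: concatenate along the sequence axis so self-attention
        # can mix information across the two languages.
        fused = torch.cat([h_src, h_tgt], dim=1)
        for layer in self.fusion:
            fused = layer(fused)
        h_src, h_tgt = fused.split([src_emb.size(1), tgt_emb.size(1)], dim=1)

        # Stage 3: further language-specific encoding on the separated streams.
        for layer in self.specific:
            h_src, h_tgt = layer(h_src), layer(h_tgt)
        return h_src, h_tgt
```

At inference time the same paired forward pass applies, with the target-language input and its translation into the source language playing the two roles.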
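For tasks where the label cannot simply be copied from the source sentence to its translation (question answering, NER, POS tagging), the abstract describes a KL-divergence self-teaching loss over auto-generated soft pseudo-labels. A hedged sketch of such a loss term follows; the temperature, the detached teacher pass, and the direction of the KL term are assumptions rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_teaching_kl_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          temperature: float = 1.0) -> torch.Tensor:
    """KL divergence between auto-generated soft pseudo-labels (teacher)
    and the model's predictions for the translated target-language text.

    Logits have shape (batch, ..., num_classes), e.g. per-token logits
    for NER or POS tagging.
    """
    # Soft pseudo-labels: the teacher pass is detached so no gradient
    # flows through the pseudo-label generation.
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(teacher || student), averaged over the batch.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
```

In training, a term like this would be added to the usual supervised loss computed on the source-language side; since the soft targets come from the model's own predictions, no gold labels in the target language are required.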
Related papers
- Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing [6.074150063191985]
Cross-Lingual Back-Parsing is a novel data augmentation methodology designed to enhance cross-lingual transfer for semantic parsing.
Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings.
arXiv Detail & Related papers (2024-10-01T08:53:38Z)
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models.
We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
- XeroAlign: Zero-Shot Cross-lingual Transformer Alignment [9.340611077939828]
We introduce a method for task-specific alignment of cross-lingual pretrained transformers such as XLM-R.
XeroAlign uses translated task data to encourage the model to generate similar sentence embeddings for different languages.
XLM-RA's text classification accuracy exceeds that of XLM-R trained with labelled data and performs on par with state-of-the-art models on a cross-lingual adversarial paraphrasing task.
arXiv Detail & Related papers (2021-05-06T07:10:00Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.