Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation
- URL: http://arxiv.org/abs/2207.09076v1
- Date: Tue, 19 Jul 2022 05:23:18 GMT
- Title: Multilingual Transformer Encoders: a Word-Level Task-Agnostic Evaluation
- Authors: F\'elix Gaschi, Fran\c{c}ois Plesse, Parisa Rastin and Yannick
Toussaint
- Abstract summary: Some Transformer-based models can perform cross-lingual transfer learning.
We propose a word-level task-agnostic method to evaluate the alignment of contextualized representations built by such models.
- Score: 0.6882042556551609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Some Transformer-based models can perform cross-lingual transfer learning:
those models can be trained on a specific task in one language and give
relatively good results on the same task in another language, despite having
been pre-trained on monolingual tasks only. But, there is no consensus yet on
whether those transformer-based models learn universal patterns across
languages. We propose a word-level task-agnostic method to evaluate the
alignment of contextualized representations built by such models. We show that
our method provides more accurate translated word pairs than previous methods
to evaluate word-level alignment. And our results show that some inner layers
of multilingual Transformer-based models outperform other explicitly aligned
representations, and even more so according to a stricter definition of
multilingual alignment.
Related papers
- T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text
Classification [50.675552118811]
Cross-lingual text classification is typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
arXiv Detail & Related papers (2023-06-08T07:33:22Z) - Beyond Contrastive Learning: A Variational Generative Model for
Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - mLUKE: The Power of Entity Representations in Multilingual Pretrained
Language Models [15.873069955407406]
We train a multilingual language model with 24 languages with entity representations.
We show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.
We also evaluate the model with a multilingual cloze prompt task with the mLAMA dataset.
arXiv Detail & Related papers (2021-10-15T15:28:38Z) - Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing with
Synthetic Data [2.225882303328135]
We propose a novel Translate-and-Fill (TaF) method to produce silver training data for a multilingual semantic parsing task.
Experimental results on three multilingual semantic parsing datasets show that data augmentation with TaF reaches accuracies competitive with similar systems.
arXiv Detail & Related papers (2021-09-09T14:51:11Z) - Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z) - VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z) - Mono vs Multilingual Transformer-based Models: a Comparison across
Several Language Tasks [1.2691047660244335]
BERT (Bidirectional Representations from Transformers) and ALBERT (A Lite BERT) are methods for pre-training language models.
We make available our trained BERT and Albert model for Portuguese.
arXiv Detail & Related papers (2020-07-19T19:13:20Z) - Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z) - On the Importance of Word Order Information in Cross-lingual Sequence
Labeling [80.65425412067464]
Cross-lingual models that fit into the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.