A Call for More Rigor in Unsupervised Cross-lingual Learning
- URL: http://arxiv.org/abs/2004.14958v1
- Date: Thu, 30 Apr 2020 17:06:23 GMT
- Title: A Call for More Rigor in Unsupervised Cross-lingual Learning
- Authors: Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko
Agirre
- Abstract summary: An existing rationale for such research is based on the lack of parallel data for many of the world's languages.
We argue that a scenario without any parallel data and abundant monolingual data is unrealistic in practice.
- Score: 76.6545568416577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We review motivations, definition, approaches, and methodology for
unsupervised cross-lingual learning and call for a more rigorous position in
each of them. An existing rationale for such research is based on the lack of
parallel data for many of the world's languages. However, we argue that a
scenario without any parallel data and abundant monolingual data is unrealistic
in practice. We also discuss different training signals that have been used in
previous work, which depart from the pure unsupervised setting. We then
describe common methodological issues in tuning and evaluation of unsupervised
cross-lingual models and present best practices. Finally, we provide a unified
outlook for different types of research in this area (i.e., cross-lingual word
embeddings, deep multilingual pretraining, and unsupervised machine
translation) and argue for comparable evaluation of these models.
Related papers
- Distilling Monolingual and Crosslingual Word-in-Context Representations [18.87665111304974]
We propose a method that distils representations of word meaning in context from a pre-trained language model in both monolingual and crosslingual settings.
Our method does not require human-annotated corpora nor updates of the parameters of the pre-trained model.
Our method learns to combine the outputs of different hidden layers of the pre-trained model using self-attention.
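The entry above gives only a high-level description. As a rough illustration of the idea, a minimal PyTorch sketch of combining a frozen encoder's layer outputs with a small self-attention module could look as follows; the model name, token index, pooling, and all hyperparameters are assumptions made for the example, not details from the paper.
```python
# Illustrative sketch only: combine the hidden layers of a frozen pretrained
# encoder with a small self-attention module. Model name and pooling are
# assumptions, not details taken from the paper.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class LayerSelfAttention(nn.Module):
    """Self-attention over the stack of per-layer vectors for one token."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads=1, batch_first=True)

    def forward(self, layer_states: torch.Tensor) -> torch.Tensor:
        # layer_states: (num_layers, hidden_size) for a single token position
        x = layer_states.unsqueeze(0)           # (1, num_layers, hidden_size)
        attended, _ = self.attn(x, x, x)        # self-attention across layers
        return attended.mean(dim=1).squeeze(0)  # pool into one context vector

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
encoder.eval()  # the pretrained model stays frozen; only the combiner is trained

inputs = tokenizer("The bank raised interest rates.", return_tensors="pt")
with torch.no_grad():
    out = encoder(**inputs, output_hidden_states=True)

target = 2  # illustrative index of the target word in the tokenized input
layer_states = torch.stack([h[0, target] for h in out.hidden_states])  # (L+1, H)
combiner = LayerSelfAttention(encoder.config.hidden_size)
word_in_context_vector = combiner(layer_states)  # trained with a distillation loss
```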
arXiv Detail & Related papers (2024-09-13T11:10:16Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
- The Impact of Syntactic and Semantic Proximity on Machine Translation with Back-Translation [7.557957450498644]
We conduct experiments with artificial languages to determine what properties of languages make back-translation an effective training method.
We find, contrary to popular belief, that (i) parallel word frequency distributions, (ii) partially shared vocabulary, and (iii) similar syntactic structure across languages are not sufficient to explain the success of back-translation.
We conjecture that rich semantic dependencies, parallel across languages, are at the root of the success of unsupervised methods based on back-translation.
arXiv Detail & Related papers (2024-03-26T18:38:14Z)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z)
- Cross-lingual Lifelong Learning [53.06904052325966]
We present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm.
We provide insights into what makes multilingual sequential learning particularly challenging.
The implications of this analysis include a recipe for how to measure and balance different cross-lingual continual learning desiderata.
arXiv Detail & Related papers (2022-05-23T09:25:43Z)
- Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised? [8.594972401685649]
We study whether pretrained multilingual language models can be employed for zero-shot cross-lingual keyword extraction on low-resource languages.
The comparison is conducted on six news article datasets covering two high-resource languages, English and Russian, and four low-resource languages.
We find that pretrained models fine-tuned on a multilingual corpus covering languages that do not appear in the test set consistently outscore unsupervised models in all six languages.
arXiv Detail & Related papers (2022-02-14T12:06:45Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
- It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning [4.200736775540874]
We design a simple approach to commonsense reasoning that trains a linear classifier using multi-head attention weights as features.
The method performs competitively with recent supervised and unsupervised approaches for commonsense reasoning.
Most of the performance comes from the same small subset of attention heads across all studied languages.
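As a rough sketch of the general recipe (not the paper's exact feature extraction), one could pool a per-head statistic from a frozen multilingual model's attention maps and fit a linear classifier on top; the model choice, pooling, and toy labels below are assumptions for illustration.
```python
# Illustrative sketch only: per-head attention statistics from a frozen
# multilingual model used as features for a linear classifier.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")
model.eval()

def head_features(sentence: str) -> torch.Tensor:
    """One scalar per (layer, head): the head's average peak attention weight,
    a crude stand-in for the paper's pairwise attention features."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)
    # out.attentions: one (batch, heads, seq, seq) tensor per layer
    per_head = [a[0].max(dim=-1).values.mean(dim=-1) for a in out.attentions]
    return torch.cat(per_head)  # shape: (num_layers * num_heads,)

# Hypothetical toy data standing in for a real commonsense benchmark:
# label 1 = plausible statement, 0 = implausible statement.
examples = [("He dropped the glass, so it broke.", 1),
            ("He dropped the glass, so it flew away.", 0)]
X = torch.stack([head_features(s) for s, _ in examples]).numpy()
y = [label for _, label in examples]
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X))
```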
arXiv Detail & Related papers (2021-06-22T21:25:43Z)
- Globetrotter: Unsupervised Multilingual Translation from Visual Alignment [24.44204156935044]
We introduce a framework that uses the visual modality to align multiple languages.
We estimate the cross-modal alignment between language and images, and use this estimate to guide the learning of cross-lingual representations.
Our language representations are trained jointly in a single model and a single training stage.
arXiv Detail & Related papers (2020-12-08T18:50:40Z)
- Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as a classifier probe on top of a fixed pre-trained language model, using annotated labels in a supervised way.
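A minimal sketch of this probing setup, assuming a frozen encoder, a placeholder model name, and hypothetical dialogue-act labels, might look like this.
```python
# Illustrative probing sketch: a small feed-forward classifier trained on top
# of a frozen pretrained encoder. Model, labels, and data are placeholders.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
for p in encoder.parameters():
    p.requires_grad = False  # the pretrained model stays fixed; only the probe trains

num_labels = 3  # hypothetical number of dialogue-act classes
probe = nn.Linear(encoder.config.hidden_size, num_labels)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical toy examples: (utterance, dialogue-act label id)
data = [("I'd like to book a table for two.", 0),
        ("What time does the restaurant open?", 1),
        ("Thanks, that's all I needed.", 2)]

for utterance, label in data:
    inputs = tokenizer(utterance, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] vector
    logits = probe(hidden)
    loss = loss_fn(logits, torch.tensor([label]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```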
arXiv Detail & Related papers (2020-10-26T21:34:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.