Cross-neutralising: Probing for joint encoding of linguistic information in multilingual models
- URL: http://arxiv.org/abs/2010.12825v2
- Date: Sat, 13 Mar 2021 16:21:09 GMT
- Title: Cross-neutralising: Probing for joint encoding of linguistic information in multilingual models
- Authors: Rochelle Choenni, Ekaterina Shutova
- Abstract summary: We study how relationships between languages are encoded in two state-of-the-art multilingual models.
The results suggest that linguistic properties are encoded jointly across typologically-similar languages.
- Score: 17.404220737977738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual sentence encoders are widely used to transfer NLP models across
languages. The success of this transfer is, however, dependent on the model's
ability to encode the patterns of cross-lingual similarity and variation. Yet,
little is known as to how these models are able to do this. We propose a simple
method to study how relationships between languages are encoded in two
state-of-the-art multilingual models (i.e. M-BERT and XLM-R). The results
provide insight into their information sharing mechanisms and suggest that
linguistic properties are encoded jointly across typologically-similar
languages in these models.
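The abstract does not spell out the probing setup; below is a minimal sketch, assuming one common design, in which a lightweight classifier is trained on frozen sentence embeddings to predict a typological property of the source language. The `embed` stand-in and the property labels are placeholders, not the paper's actual data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def embed(sentences):
    # Stand-in for a frozen multilingual encoder (e.g. M-BERT or XLM-R);
    # random vectors let the sketch run without model downloads.
    return rng.normal(size=(len(sentences), 768))

# Hypothetical probing task: predict a binary typological property
# (e.g. a WALS-style word-order feature) of each sentence's language.
sentences = [f"sentence {i}" for i in range(200)]
labels = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    embed(sentences), labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
```

With real encoder outputs, above-chance probe accuracy would indicate that the property is linearly recoverable from the representations.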
Related papers
- Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax [6.386371634323785]
We propose a novel bilingual end-to-end (E2E) modeling approach, where a single neural model can recognize both languages.
The proposed model has shared encoder and prediction networks, with language-specific joint networks that are combined via a self-attention mechanism.
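The summary leaves the combination mechanism underspecified; the PyTorch sketch below is one plausible reading, with a shared encoder state feeding two language-specific joint networks whose logits are mixed by learned attention weights. All module names and dimensions are invented for illustration.

```python
import torch
import torch.nn as nn

class BilingualJointHead(nn.Module):
    """Sketch: language-specific joint networks over a shared encoder
    state, mixed by attention weights computed from that same state."""
    def __init__(self, enc_dim=256, vocab=1000, n_langs=2):
        super().__init__()
        self.joint = nn.ModuleList(
            [nn.Linear(enc_dim, vocab) for _ in range(n_langs)])
        self.attn = nn.Linear(enc_dim, n_langs)  # one weight per language

    def forward(self, enc_out):  # enc_out: (batch, time, enc_dim)
        logits = torch.stack(
            [head(enc_out) for head in self.joint], dim=-1)  # (B, T, V, L)
        w = torch.softmax(self.attn(enc_out), dim=-1)        # (B, T, L)
        return (logits * w.unsqueeze(2)).sum(-1)             # (B, T, V)

head = BilingualJointHead()
print(head(torch.randn(4, 10, 256)).shape)  # torch.Size([4, 10, 1000])
```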
arXiv Detail & Related papers (2024-01-22T01:44:42Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
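How the prompts are retrieved is not described here; one possible reading, sketched below with invented names and shapes, is a nearest-neighbour lookup into a pool of learned soft prompts that get prepended to an instance's token embeddings.

```python
import torch
import torch.nn.functional as F

# Hypothetical pool of learned soft prompts (n_prompts x prompt_len x dim).
n_prompts, prompt_len, dim = 8, 4, 32
prompt_pool = torch.randn(n_prompts, prompt_len, dim)
prompt_keys = torch.randn(n_prompts, dim)  # one retrieval key per prompt

def retrieve_prompt(instance_repr):
    """Pick the prompt whose key is closest to this instance's summary
    vector, so each instance is encoded under its own guidance."""
    sims = F.cosine_similarity(instance_repr.unsqueeze(0), prompt_keys, dim=-1)
    return prompt_pool[sims.argmax()]

tokens = torch.randn(10, dim)   # token embeddings for one instance
query = tokens.mean(0)          # cheap instance summary
augmented = torch.cat([retrieve_prompt(query), tokens], dim=0)
print(augmented.shape)  # torch.Size([14, 32]); fed to the encoder
```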
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Exploring Representational Disparities Between Multilingual and Bilingual Translation Models [16.746335565636976]
Some language pairs perform worse in multilingual models than in bilingual models, especially in the one-to-many translation setting.
We show that for a given language pair, its multilingual model decoder representations are consistently less isotropic and occupy fewer dimensions than comparable bilingual model decoder representations.
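The summary does not state the isotropy metric used; a common proxy, sketched below, is the fraction of variance captured by the top principal directions: a high fraction means the representations occupy few effective dimensions.

```python
import numpy as np

def top_k_variance_fraction(reprs, k=10):
    """Fraction of total variance in the top-k principal directions of
    an (n_vectors x dim) matrix of decoder states; values near 1 mean
    the representations occupy few effective dimensions (low isotropy)."""
    centered = reprs - reprs.mean(0)
    s = np.linalg.svd(centered, compute_uv=False)
    var = s ** 2
    return var[:k].sum() / var.sum()

rng = np.random.default_rng(0)
isotropic = rng.normal(size=(500, 64))               # variance spread evenly
aniso = isotropic @ np.diag(np.geomspace(10, 0.01, 64))  # few dominant axes
print(round(top_k_variance_fraction(isotropic), 2))  # low, roughly 0.2
print(round(top_k_variance_fraction(aniso), 2))      # high, roughly 0.9
```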
arXiv Detail & Related papers (2023-05-23T16:46:18Z)
- Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages, allowing code generation models to be assessed in a multilingual fashion.
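Execution-based benchmarks of this kind are usually scored with pass@k; the unbiased estimator below follows the formulation popularised with HumanEval, and the per-language numbers are made up for illustration.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples,
    drawn from n generations of which c pass the tests, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-language results: (samples drawn, samples passing tests).
results = {"python": (20, 12), "java": (20, 7), "kotlin": (20, 4)}
for lang, (n, c) in results.items():
    print(f"{lang:>7} pass@1 = {pass_at_k(n, c, 1):.2f}")
```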
arXiv Detail & Related papers (2022-10-26T17:17:06Z)
- Breaking Down Multilingual Machine Translation [74.24795388967907]
We show that multilingual training is beneficial to encoders in general, while it only benefits decoders for low-resource languages (LRLs).
Our many-to-one models for high-resource languages and one-to-many models for LRLs outperform the best results reported by Aharoni et al.
arXiv Detail & Related papers (2021-10-15T14:57:12Z)
- A Multilingual Bag-of-Entities Model for Zero-Shot Cross-Lingual Text Classification [16.684856745734944]
We present a multilingual bag-of-entities model that boosts the performance of zero-shot cross-lingual text classification.
It leverages the multilingual nature of Wikidata: entities in multiple languages representing the same concept are defined with a unique identifier.
A model trained on entity features in a resource-rich language can thus be directly applied to other languages.
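A toy sketch of the idea: because Wikidata identifiers are language-independent, the same bag-of-entities features come out regardless of the input language. The tiny entity dictionary here is invented; the real model uses full Wikidata entity links.

```python
# Toy Wikidata-style dictionary: surface forms in several languages
# all map to the same language-independent QID.
ENTITY_IDS = {
    "germany": "Q183", "deutschland": "Q183", "allemagne": "Q183",
    "paris": "Q90", "python": "Q28865",
}

def bag_of_entities(text):
    """Count QIDs mentioned in the text; the feature vector is the
    same whether the input is English, German or French."""
    counts = {}
    for token in text.lower().split():
        qid = ENTITY_IDS.get(token)
        if qid:
            counts[qid] = counts.get(qid, 0) + 1
    return counts

print(bag_of_entities("Germany opened talks with Paris"))  # {'Q183': 1, 'Q90': 1}
print(bag_of_entities("Deutschland sprach mit Paris"))     # {'Q183': 1, 'Q90': 1}
```

A classifier trained on these QID features in a resource-rich language can then be applied unchanged to texts in other languages.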
arXiv Detail & Related papers (2021-10-15T01:10:50Z)
- Revisiting Language Encoding in Learning Multilingual Representations [70.01772581545103]
We propose a new approach called Cross-lingual Language Projection (XLP) to replace language embedding.
XLP projects the word embeddings into a language-specific semantic space, and the projected embeddings are then fed into the Transformer model.
Experiments show that XLP significantly boosts model performance on a wide range of multilingual benchmark datasets.
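A minimal sketch of that substitution, with dimensions and names invented: instead of adding a per-language embedding vector, each language gets its own linear projection applied to the word embeddings before the Transformer.

```python
import torch
import torch.nn as nn

class LanguageProjection(nn.Module):
    """Sketch of the XLP idea: a per-language linear map on word
    embeddings replaces the usual additive language embedding."""
    def __init__(self, n_langs, dim):
        super().__init__()
        # One projection matrix per language, initialised to identity.
        self.proj = nn.Parameter(torch.stack(
            [torch.eye(dim) for _ in range(n_langs)]))  # (L, D, D)

    def forward(self, word_emb, lang_id):  # word_emb: (batch, seq, D)
        return word_emb @ self.proj[lang_id]

xlp = LanguageProjection(n_langs=3, dim=64)
x = torch.randn(2, 8, 64)
print(xlp(x, lang_id=1).shape)  # torch.Size([2, 8, 64]); fed to the Transformer
```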
arXiv Detail & Related papers (2021-02-16T18:47:10Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
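The amalgamation step reads like standard multi-teacher distillation; the sketch below (teacher count, shapes, and temperature all illustrative) averages the KL divergence from each language-branch teacher to the single student.

```python
import torch
import torch.nn.functional as F

def multi_teacher_distill_loss(student_logits, teacher_logits_list, T=2.0):
    """Average KL divergence from each language-branch teacher's
    softened output distribution to the single student's."""
    log_p = F.log_softmax(student_logits / T, dim=-1)
    loss = 0.0
    for t_logits in teacher_logits_list:
        q = F.softmax(t_logits / T, dim=-1)
        loss = loss + F.kl_div(log_p, q, reduction="batchmean") * T * T
    return loss / len(teacher_logits_list)

student = torch.randn(4, 128)                       # e.g. span-start logits
teachers = [torch.randn(4, 128) for _ in range(3)]  # one per language branch
print(multi_teacher_distill_loss(student, teachers))
```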
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties [17.404220737977738]
We propose methods for probing sentence representations from state-of-the-art multilingual encoders.
Our results show interesting differences in encoding linguistic variation associated with different pretraining strategies.
arXiv Detail & Related papers (2020-09-27T15:00:52Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
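A sketch of that self-teaching term, with names and shapes invented: the KL divergence pushes predictions on the target-language input toward soft pseudo-labels derived from its translation.

```python
import torch
import torch.nn.functional as F

def self_teaching_loss(target_logits, translation_logits, T=1.0):
    """KL divergence from soft pseudo-labels (model outputs on the
    translated text, gradient detached) to predictions on the
    target-language text itself."""
    pseudo = F.softmax(translation_logits.detach() / T, dim=-1)
    log_p = F.log_softmax(target_logits / T, dim=-1)
    return F.kl_div(log_p, pseudo, reduction="batchmean")

tgt = torch.randn(8, 3, requires_grad=True)  # logits on target-language text
src = torch.randn(8, 3)                      # logits on its translation
print(self_teaching_loss(tgt, src))
```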
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- Language-agnostic Multilingual Modeling [23.06484126933893]
We build a language-agnostic multilingual ASR system which transforms all languages to one writing system through a many-to-one transliteration transducer.
We show with four Indic languages, namely, Hindi, Bengali, Tamil and Kannada, that the language-agnostic multilingual model achieves up to 10% relative reduction in Word Error Rate (WER) over a language-dependent multilingual model.
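The transducer itself is not shown in this summary; the toy character map below only illustrates the many-to-one idea, collapsing Devanagari and Bengali characters to one Latin-script form. A real system would use a weighted FST over full scripts.

```python
# Toy many-to-one transliteration: characters from several Indic scripts
# collapse to one Latin-script form, so a single vocabulary covers all
# languages. Illustrative only; the paper's system uses a transducer.
TO_LATIN = {
    "न": "na", "म": "ma", "स": "sa",   # Devanagari (Hindi)
    "ন": "na", "ম": "ma", "স": "sa",   # Bengali
}

def transliterate(text):
    return "".join(TO_LATIN.get(ch, ch) for ch in text)

print(transliterate("नमस"))  # "namasa" (Hindi input)
print(transliterate("নমস"))  # "namasa" (Bengali input, same output)
```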
arXiv Detail & Related papers (2020-04-20T18:57:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.