Universal and Independent: Multilingual Probing Framework for Exhaustive
Model Interpretation and Evaluation
- URL: http://arxiv.org/abs/2210.13236v1
- Date: Mon, 24 Oct 2022 13:41:17 GMT
- Title: Universal and Independent: Multilingual Probing Framework for Exhaustive
Model Interpretation and Evaluation
- Authors: Oleg Serikov, Vitaly Protasov, Ekaterina Voloshina, Viktoria
Knyazkova, Tatiana Shavrina
- Abstract summary: We present and apply a GUI-assisted framework that makes it easy to probe a massive number of languages.
Most of the regularities revealed in the mBERT model are typical of Western European languages.
Our framework can be integrated with the existing probing toolboxes, model cards, and leaderboards.
- Score: 0.04199844472131922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Linguistic analysis of language models is one of the ways to explain and
describe their reasoning, weaknesses, and limitations. In the probing part of
the model interpretability research, studies concern individual languages as
well as individual linguistic structures. The question arises: are the detected
regularities linguistically coherent, or on the contrary, do they dissonate at
the typological scale? Moreover, the majority of studies address the inherent
set of languages and linguistic structures, leaving the actual typological
diversity knowledge out of scope. In this paper, we present and apply a
GUI-assisted framework allowing us to easily probe a massive number of
languages for all the morphosyntactic features present in the Universal
Dependencies data. We show that, reflecting the Anglo-centric trend in NLP over
the past years, most of the regularities revealed in the mBERT model are
typical of Western European languages. Our framework can be integrated
with the existing probing toolboxes, model cards, and leaderboards, allowing
practitioners to use and share their standard probing methods to interpret
multilingual models. Thus we propose a toolkit to systematize the multilingual
flaws in multilingual models, providing a reproducible experimental setup for
104 languages and 80 morphosyntactic features.
https://github.com/AIRI-Institute/Probing_framework
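To make "morphosyntactic features present in the Universal Dependencies data" concrete, here is a minimal sketch of how such features can be read from the FEATS column of a CoNLL-U file and grouped into one labelled set per feature. This is not the Probing_framework API: the function names `parse_feats` and `collect_probing_examples` and the two sample sentences are hypothetical, and the framework may well build its tasks differently (for example, at the sentence level).

```python
from collections import defaultdict

# A tiny CoNLL-U fragment: 10 tab-separated columns per token line,
# with morphological features in the 6th column (FEATS).
# The sentences are made-up illustrations, not data from the paper.
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\tMood=Ind|Number=Plur|Tense=Pres\t0\troot\t_\t_\n"
    "\n"
    "# text = She sleeps\n"
    "1\tShe\tshe\tPRON\t_\tCase=Nom|Number=Sing|Person=3\t2\tnsubj\t_\t_\n"
    "2\tsleeps\tsleep\tVERB\t_\tMood=Ind|Number=Sing|Tense=Pres\t0\troot\t_\t_\n"
)


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Number=Sing' into a dict."""
    if feats == "_":  # '_' marks an empty feature set in CoNLL-U
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def collect_probing_examples(conllu_text):
    """Map each feature name to a list of (word form, feature value) pairs."""
    tasks = defaultdict(list)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, feats = cols[1], cols[5]
        for name, value in parse_feats(feats).items():
            tasks[name].append((form, value))
    return dict(tasks)


tasks = collect_probing_examples(SAMPLE)
for name in sorted(tasks):
    print(name, tasks[name])
```

Each key of `tasks` (here `Case`, `Mood`, `Number`, `Person`, `Tense`) could then serve as the label set for one probing classifier trained on model representations of the corresponding words.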
Related papers
- Investigating Language-Specific Calibration For Pruning Multilingual Large Language Models [11.421452042888523]
We compare different calibration languages for pruning multilingual models across diverse languages, tasks, models, and SotA pruning techniques.
Our results offer practical suggestions: for example, calibrating in the target language can efficiently retain the language modeling capability but does not necessarily benefit downstream tasks.
Date: 2024-08-26T16:29:13Z
- The Less the Merrier? Investigating Language Representation in Multilingual Models [8.632506864465501]
We investigate the linguistic representation of different languages in multilingual models.
We observe from our experiments that community-centered models perform better at distinguishing between languages in the same family for low-resource languages.
Date: 2023-10-20T02:26:34Z
- Language Embeddings Sometimes Contain Typological Generalizations [0.0]
We train neural models for a range of natural language processing tasks on a massively multilingual dataset of Bible translations in 1295 languages.
The learned language representations are then compared to existing typological databases as well as to a novel set of quantitative syntactic and morphological features.
We conclude that some generalizations are surprisingly close to traditional features from linguistic typology, but that most models, as well as those of previous work, do not appear to have made linguistically meaningful generalizations.
Date: 2023-01-19T15:09:59Z
- Integrating Linguistic Theory and Neural Language Models [2.870517198186329]
I present several case studies to illustrate how theoretical linguistics and neural language models are still relevant to each other.
This thesis contributes three studies that explore different aspects of the syntax-semantics interface in language models.
Date: 2022-07-20T04:20:46Z
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
Date: 2022-05-24T03:35:00Z
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and call each group a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
Date: 2021-09-01T09:32:06Z
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
Date: 2021-08-06T23:49:18Z
- Are pre-trained text representations useful for multilingual and multi-dimensional language proficiency modeling? [6.294759639481189]
This paper describes our experiments and observations about the role of pre-trained and fine-tuned multilingual embeddings in performing multi-dimensional, multilingual language proficiency classification.
Our results indicate that while fine-tuned embeddings are useful for multilingual proficiency modeling, none of the features achieve consistently best performance for all dimensions of language proficiency.
Date: 2021-02-25T16:23:52Z
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
Date: 2020-05-01T12:22:33Z
- Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
Date: 2020-04-30T21:00:53Z
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
Date: 2020-04-30T16:25:39Z
This list is automatically generated from the titles and abstracts of the papers in this site.