Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
- URL: http://arxiv.org/abs/2104.12847v1
- Date: Mon, 26 Apr 2021 19:53:00 GMT
- Title: Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
- Authors: Vladislav Mikhailov and Oleg Serikov and Ekaterina Artemova
- Abstract summary: We present Morph Call, a suite of 46 probing tasks for four Indo-European languages of different morphology: English, French, German and Russian.
We use a combination of neuron-, layer- and representation-level introspection techniques to analyze the morphosyntactic content of four multilingual transformers.
The results show that fine-tuning for POS-tagging can both improve and degrade probing performance, and can change how morphosyntactic knowledge is distributed across the model.
- Score: 2.041108289731398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The outstanding performance of transformer-based language models on a great
variety of NLP and NLU tasks has stimulated interest in exploring their inner
workings. Recent research has focused primarily on higher-level and complex
linguistic phenomena such as syntax, semantics, world knowledge, and common
sense. The majority of these studies are anglocentric, and little is known
about other languages, in particular their morphosyntactic properties. To this
end, our work presents Morph Call, a suite of 46 probing tasks for four
Indo-European languages of different morphology: English, French, German and
Russian. We propose a new type of probing task based on the detection of guided
sentence perturbations. We use a combination of neuron-, layer- and
representation-level introspection techniques to analyze the morphosyntactic
content of four multilingual transformers, including their less explored
distilled versions. In addition, we examine how fine-tuning for POS-tagging
affects the model's knowledge. The results show that fine-tuning can both
improve and degrade probing performance, and can change how morphosyntactic
knowledge is distributed across the model. The code and data are publicly
available, and we hope this work helps fill the gaps in this less studied
aspect of transformers.
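The representation-level probing described above typically trains a lightweight linear classifier on frozen model representations: if the probe can recover a morphosyntactic label from a layer's embeddings, that layer encodes the property. A minimal, self-contained sketch of this idea (not the paper's actual code; the "layers" here are synthetic stand-ins with varying noise, and all names are illustrative):

```python
import math
import random

def train_probe(X, y, lr=0.1, epochs=300):
    """Train a logistic-regression probe on frozen representations."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * d
        gb = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            err = p - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        for j in range(d):
            w[j] -= lr * gw[j] / n
        b -= lr * gb / n
    return w, b

def probe_accuracy(X, y, w, b):
    """Fraction of examples whose label the linear probe recovers."""
    correct = 0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        correct += int((z > 0) == (yi == 1))
    return correct / len(X)

# Synthetic stand-in for per-layer sentence representations: the binary
# morphosyntactic label is encoded more cleanly at some "layers" (low
# noise) than at others (high noise).
random.seed(42)
def make_layer(noise, n=200, d=4):
    X, y = [], []
    for i in range(n):
        label = i % 2
        sign = 1.0 if label else -1.0
        X.append([sign + random.gauss(0, noise) for _ in range(d)])
        y.append(label)
    return X, y

for noise in (0.3, 1.5, 5.0):  # pretend: shallow / middle / deep layers
    X, y = make_layer(noise)
    w, b = train_probe(X, y)
    print(f"noise={noise}: probe accuracy = {probe_accuracy(X, y, w, b):.2f}")
```

In a real setup, `make_layer` would be replaced by extracting hidden states from each transformer layer for a labeled corpus; comparing probe accuracy across layers then shows where in the network a property is encoded, which is the layer-level analysis the abstract refers to.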
Related papers
- Holmes: A Benchmark to Assess the Linguistic Competence of Language Models [59.627729608055006]
We introduce Holmes, a new benchmark designed to assess the linguistic competence of language models (LMs).
We use computation-based probing to examine LMs' internal representations regarding distinct linguistic phenomena.
As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities.
arXiv Detail & Related papers (2024-04-29T17:58:36Z) - Is neural language acquisition similar to natural? A chronological probing study [0.0515648410037406]
We present the chronological probing study of transformer English models such as MultiBERT and T5.
We compare the information about the language learned by the models in the process of training on corpora.
The results show that 1) linguistic information is acquired in the early stages of training, and 2) both language models demonstrate the ability to capture various features from various levels of language.
arXiv Detail & Related papers (2022-07-01T17:24:11Z) - Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z) - Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large number of differently inflected surface word forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z) - Morphology Without Borders: Clause-Level Morphological Annotation [8.559428282730021]
We propose to view morphology as a clause-level phenomenon, rather than word-level.
We deliver a novel dataset for clause-level morphology covering 4 typologically-different languages: English, German, Turkish and Hebrew.
Our experiments show that the clause-level tasks are substantially harder than the respective word-level tasks, while having comparable complexity across languages.
arXiv Detail & Related papers (2022-02-25T17:20:28Z) - Delving Deeper into Cross-lingual Visual Question Answering [115.16614806717341]
We show that simple modifications to the standard training setup can substantially reduce the transfer gap to monolingual English performance.
We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers.
arXiv Detail & Related papers (2022-02-15T18:22:18Z) - Evaluation of Morphological Embeddings for the Russian Language [0.0]
Morphology-based embeddings trained with the Skipgram objective do not outperform an existing embedding model, FastText.
A more complex but morphology-unaware model, BERT, achieves significantly greater performance on tasks that presumably require understanding of a word's morphology.
arXiv Detail & Related papers (2021-03-11T11:59:11Z) - RuSentEval: Linguistic Source, Encoder Force! [1.8160945635344525]
We introduce RuSentEval, an enhanced set of 14 probing tasks for Russian.
We apply a combination of complementary probing methods to explore the distribution of various linguistic properties in five multilingual transformers.
Our results provide intriguing findings that contradict the common understanding of how linguistic knowledge is represented.
arXiv Detail & Related papers (2021-02-28T17:43:42Z) - Verb Knowledge Injection for Multilingual Event Processing [50.27826310460763]
We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers.
We first demonstrate that injecting verb knowledge leads to performance gains in English event extraction.
We then explore the utility of verb adapters for event extraction in other languages.
arXiv Detail & Related papers (2020-12-31T03:24:34Z) - Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [71.70562795158625]
Traditional NLP has long held (supervised) syntactic parsing to be necessary for successful higher-level semantic language understanding (LU).
The recent advent of end-to-end neural models, self-supervised via language modeling (LM), and their success on a wide range of LU tasks question this belief.
We empirically investigate the usefulness of supervised parsing for semantic LU in the context of LM-pretrained transformer networks.
arXiv Detail & Related papers (2020-08-15T21:03:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.