A Systematic Analysis of Morphological Content in BERT Models for
Multiple Languages
- URL: http://arxiv.org/abs/2004.03032v1
- Date: Mon, 6 Apr 2020 22:50:27 GMT
- Title: A Systematic Analysis of Morphological Content in BERT Models for
Multiple Languages
- Authors: Daniel Edmiston
- Abstract summary: This work describes experiments which probe the hidden representations of several BERT-style models for morphological content.
The goal is to examine the extent to which discrete linguistic structure, in the form of morphological features and feature values, presents itself in the vector representations and attention distributions of pre-trained language models for five European languages.
- Score: 2.345305607613153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work describes experiments which probe the hidden representations of
several BERT-style models for morphological content. The goal is to examine the
extent to which discrete linguistic structure, in the form of morphological
features and feature values, presents itself in the vector representations and
attention distributions of pre-trained language models for five European
languages. The experiments contained herein show that (i) Transformer
architectures largely partition their embedding space into convex sub-regions
highly correlated with morphological feature value, (ii) the contextualized
nature of Transformer embeddings allows models to distinguish ambiguous
morphological forms in many, but not all, cases, and (iii) very specific
attention head/layer combinations appear to home in on subject-verb agreement.
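To make the probing methodology concrete, the sketch below trains a linear diagnostic probe on contextualized BERT embeddings to predict one morphological feature value (grammatical Number). The checkpoint, layer index, and four-sentence toy dataset are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal diagnostic-probe sketch: can a linear classifier recover a
# morphological feature value (Number) from BERT's hidden states?
# Checkpoint, layer, and data are illustrative, not the paper's setup.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = AutoModel.from_pretrained("bert-base-german-cased").eval()

# (sentence, index of the target word, Number value)
examples = [
    ("Der Hund schläft .", 1, "Sing"),
    ("Die Hunde schlafen .", 1, "Plur"),
    ("Das Kind spielt .", 1, "Sing"),
    ("Die Kinder spielen .", 1, "Plur"),
]

def embed(sentence, word_idx, layer=8):
    """Mean-pool the subword vectors of one word at a given layer."""
    enc = tokenizer(sentence.split(), is_split_into_words=True,
                    return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc, output_hidden_states=True).hidden_states[layer][0]
    sub_idx = [i for i, w in enumerate(enc.word_ids()) if w == word_idx]
    return hidden[sub_idx].mean(dim=0).numpy()

X = [embed(s, i) for s, i, _ in examples]
y = [value for _, _, value in examples]

# If feature values occupy (near-)convex sub-regions of the embedding
# space, even a linear probe should separate them cleanly.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```

In practice one would evaluate on held-out words and add control tasks to rule out probe memorization; this sketch only shows the mechanics.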
Related papers
- Linguistically Grounded Analysis of Language Models using Shapley Head Values [2.914115079173979]
We investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs).
Using the English language BLiMP dataset, we test our approach on two widely used models, BERT and RoBERTa, and compare how linguistic constructions are handled.
Our results show that SHV-based attributions reveal distinct patterns across both models, providing insights into how language models organize and process linguistic information.
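As a rough illustration of head-level attribution (which also connects to the parent paper's observation that specific attention head/layer combinations track subject-verb agreement), the sketch below masks one attention head at a time and records the change in an agreement margin. Leave-one-out ablation is only a crude stand-in for true Shapley Head Values, which average a head's contribution over coalitions of heads; the checkpoint, sentence frame, and reporting threshold are assumptions.

```python
# Leave-one-out head ablation as a simplified stand-in for Shapley-style
# head attribution: mask each head and measure the shift in the model's
# subject-verb agreement preference on one minimal pair.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def agreement_margin(head_mask):
    """Log-prob margin of 'is' over 'are' at the masked position."""
    text = f"The key to the cabinets {tok.mask_token} on the table ."
    enc = tok(text, return_tensors="pt")
    pos = (enc.input_ids[0] == tok.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = mlm(**enc, head_mask=head_mask).logits[0, pos]
    logp = logits.log_softmax(-1)
    return (logp[tok.convert_tokens_to_ids("is")]
            - logp[tok.convert_tokens_to_ids("are")]).item()

layers = mlm.config.num_hidden_layers
heads = mlm.config.num_attention_heads
full = torch.ones(layers, heads)
baseline = agreement_margin(full)

for layer in range(layers):
    for head in range(heads):
        mask = full.clone()
        mask[layer, head] = 0.0
        delta = baseline - agreement_margin(mask)
        if abs(delta) > 0.5:  # arbitrary threshold; report notable heads only
            print(f"layer {layer}, head {head}: margin drop {delta:+.2f}")
```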
arXiv Detail & Related papers (2024-10-17T09:48:08Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI make it possible to mitigate the limited interpretability of Transformers by leveraging improved explanations.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Labeled Morphological Segmentation with Semi-Markov Models [127.69031138022534]
We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks.
We additionally introduce a new hierarchy of morphotactic tagsets.
We develop a discriminative morphological segmentation system that explicitly models morphotactics.
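To make the semi-Markov formulation concrete, below is a toy dynamic program that jointly chooses segment boundaries and a label per segment under hard morphotactic ordering constraints. The tagset, lexicon, and scores are hand-set illustrations; the actual system learns a discriminative scoring function over a richer tagset hierarchy.

```python
# Toy semi-Markov Viterbi for labeled morphological segmentation:
# each step consumes a whole segment (morph) and assigns it a label.
import math

LABELS = ["PREFIX", "STEM", "SUFFIX"]
MAX_SEG = 6  # longest morph considered

def seg_score(morph, label):
    """Hand-set stand-in for a learned discriminative segment scorer."""
    lexicon = {("un", "PREFIX"): 2.0, ("break", "STEM"): 3.0,
               ("able", "SUFFIX"): 2.0}
    return lexicon.get((morph, label), -1.0)

def trans_score(prev, label):
    """Crude morphotactics: prefixes before stems, suffixes after."""
    allowed = {None: {"PREFIX", "STEM"}, "PREFIX": {"PREFIX", "STEM"},
               "STEM": {"SUFFIX"}, "SUFFIX": {"SUFFIX"}}
    return 0.0 if label in allowed[prev] else -math.inf

def viterbi(word):
    n = len(word)
    # best[i][label]: (score, backpointer) over analyses of word[:i]
    best = [{} for _ in range(n + 1)]
    best[0][None] = (0.0, None)
    for i in range(1, n + 1):
        for j in range(max(0, i - MAX_SEG), i):
            for prev, (pscore, _) in best[j].items():
                for lab in LABELS:
                    s = pscore + seg_score(word[j:i], lab) + trans_score(prev, lab)
                    if s > best[i].get(lab, (-math.inf, None))[0]:
                        best[i][lab] = (s, (j, prev))
    lab = max(best[n], key=lambda l: best[n][l][0])
    i, analysis = n, []
    while i > 0:
        _, (j, prev) = best[i][lab]
        analysis.append((word[j:i], lab))
        i, lab = j, prev
    return list(reversed(analysis))

# -> [('un', 'PREFIX'), ('break', 'STEM'), ('able', 'SUFFIX')]
print(viterbi("unbreakable"))
```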
arXiv Detail & Related papers (2024-04-13T12:51:53Z) - A Joint Matrix Factorization Analysis of Multilingual Representations [28.751144371901958]
We present an analysis tool based on joint matrix factorization for comparing latent representations of multilingual and monolingual models.
We study to what extent and how morphosyntactic features are reflected in the representations learned by multilingual pre-trained models.
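One generic way to realize such a comparison, assuming "joint matrix factorization" here means factoring two models' representation matrices over a shared token sample together (the paper's exact formulation may differ), is to factor their column-wise concatenation and inspect each model's loading on the shared components:

```python
# Toy joint factorization of two representation matrices A and B built
# over the same 200 tokens: factor [A | B] with one SVD and see how much
# each model loads on the shared components. Synthetic data for illustration.
import numpy as np

rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 5))  # latent structure visible to both models
A = shared @ rng.normal(size=(5, 64)) + 0.1 * rng.normal(size=(200, 64))
B = shared @ rng.normal(size=(5, 48)) + 0.1 * rng.normal(size=(200, 48))

U, S, Vt = np.linalg.svd(np.concatenate([A, B], axis=1), full_matrices=False)

for k in range(5):
    load_a = np.linalg.norm(Vt[k, :64])   # model A's share of component k
    load_b = np.linalg.norm(Vt[k, 64:])   # model B's share of component k
    print(f"component {k}: A load {load_a:.2f}, B load {load_b:.2f}")
```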
arXiv Detail & Related papers (2023-10-24T04:43:45Z) - Exploring Linguistic Probes for Morphological Generalization [11.568042812213712]
Testing these probes on three morphologically distinct languages, we find evidence that three leading morphological inflection systems employ distinct generalization strategies over conjugational classes and feature sets on both orthographic and phonologically transcribed inputs.
arXiv Detail & Related papers (2023-10-20T17:45:30Z) - Investigating semantic subspaces of Transformer sentence embeddings
through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
arXiv Detail & Related papers (2023-10-18T12:32:07Z) - Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
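A minimal sketch of the protocol, with random vectors standing in for latent encodings of real and generated documents (any document encoder could supply them): compare the two latent samples with a two-sample test and flag dimensions where generated text departs from the real distribution.

```python
# Latent-space model criticism, minimally: are latents of generated text
# distributed like latents of human text? Random stand-ins for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
real_z = rng.normal(0.0, 1.0, size=(500, 16))    # latents of human documents
gen_z = rng.normal(0.15, 1.1, size=(500, 16))    # latents of generated ones

# Per-dimension Kolmogorov-Smirnov tests with a Bonferroni correction;
# flagged dimensions indicate high-level structure the generator misses.
alpha = 0.05 / real_z.shape[1]
flagged = [d for d in range(real_z.shape[1])
           if stats.ks_2samp(real_z[:, d], gen_z[:, d]).pvalue < alpha]
print("dimensions with detectable mismatch:", flagged)
```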
arXiv Detail & Related papers (2022-10-16T04:35:58Z) - A Massively Multilingual Analysis of Cross-linguality in Shared
Embedding Space [61.18554842370824]
In cross-lingual language models, representations for many different languages live in the same space.
We compute a task-based measure of cross-lingual alignment in the form of bitext retrieval performance.
We examine a range of linguistic, quasi-linguistic, and training-related features as potential predictors of these alignment metrics.
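The retrieval measure itself is simple to sketch: encode a bitext with a shared multilingual encoder and count how often a sentence's nearest neighbor in the other language is its true translation. The checkpoint and three-sentence bitext below are illustrative assumptions.

```python
# Bitext retrieval as a cross-lingual alignment measure: nearest-neighbor
# accuracy over a parallel corpus in a shared embedding space.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

english = ["The cat sleeps.", "I bought bread.", "It is raining."]
german = ["Die Katze schläft.", "Ich habe Brot gekauft.", "Es regnet."]

src = encoder.encode(english, normalize_embeddings=True)
tgt = encoder.encode(german, normalize_embeddings=True)

# With unit-normalized embeddings, cosine similarity is a dot product.
sims = src @ tgt.T
predicted = sims.argmax(axis=1)
accuracy = (predicted == np.arange(len(english))).mean()
print(f"bitext retrieval accuracy: {accuracy:.2f}")
```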
arXiv Detail & Related papers (2021-09-13T21:05:37Z) - On the Transferability of Neural Models of Morphological Analogies [7.89271130004391]
In this paper, we focus on morphological tasks and we propose a deep learning approach to detect morphological analogies.
We present an empirical study of how our framework transfers across languages, highlighting interesting similarities and differences between them.
In view of these results, we also discuss the possibility of building a multilingual morphological model.
arXiv Detail & Related papers (2021-08-09T11:08:33Z) - APo-VAE: Text Generation in Hyperbolic Space [116.11974607497986]
In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations.
An Adversarial Poincaré Variational Autoencoder (APo-VAE) is presented, where both the prior and variational posterior of latent variables are defined over a Poincaré ball via wrapped normal distributions.
Experiments in language modeling and dialog-response generation tasks demonstrate the effectiveness of the proposed APo-VAE model.
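The core geometric ingredient is easy to sketch: a wrapped normal sample is a Euclidean Gaussian draw in the tangent space, pushed onto the Poincaré ball by the exponential map. The snippet below handles only the origin-centered case; the general construction also parallel-transports to a non-origin mean.

```python
# Wrapped normal sampling on the Poincare ball, origin-centered case:
# draw v ~ N(0, sigma^2 I) in the tangent space, then apply the
# exponential map exp_0(v) = tanh(sqrt(c)*|v|) * v / (sqrt(c)*|v|).
import numpy as np

def exp_map_origin(v, c=1.0):
    """Exponential map at the origin of a Poincare ball of curvature -c."""
    norm = np.linalg.norm(v)
    if norm < 1e-9:
        return np.zeros_like(v)
    return np.tanh(np.sqrt(c) * norm) * v / (np.sqrt(c) * norm)

rng = np.random.default_rng(0)
tangent = rng.normal(0.0, 0.5, size=(1000, 2))   # Gaussian in tangent space
ball = np.array([exp_map_origin(v) for v in tangent])

# Every wrapped sample lies strictly inside the unit ball (tanh < 1).
print("max norm:", np.linalg.norm(ball, axis=1).max())
```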
arXiv Detail & Related papers (2020-04-30T19:05:41Z) - Evaluating Transformer-Based Multilingual Text Classification [55.53547556060537]
We argue that NLP tools perform unequally across languages with different syntactic and morphological structures.
We calculate word order and morphological similarity indices to aid our empirical study.
arXiv Detail & Related papers (2020-04-29T03:34:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.