Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change
- URL: http://arxiv.org/abs/2204.05717v1
- Date: Tue, 12 Apr 2022 11:20:42 GMT
- Title: Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change
- Authors: Mario Giulianelli, Andrey Kutuzov, Lidia Pivovarova
- Abstract summary: We first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages.
Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages.
- Score: 6.7485485663645495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Morphological and syntactic changes in word usage (as captured, e.g., by
grammatical profiles) have been shown to be good predictors of a word's meaning
change. In this work, we explore whether large pre-trained contextualised
language models, a common tool for lexical semantic change detection, are
sensitive to such morphosyntactic changes. To this end, we first compare the
performance of grammatical profiles against that of a multilingual neural
language model (XLM-R) on 10 datasets, covering 7 languages, and then combine
the two approaches in ensembles to assess their complementarity. Our results
show that ensembling grammatical profiles with XLM-R improves semantic change
detection performance for most datasets and languages. This indicates that
language models do not fully cover the fine-grained morphological and syntactic
signals that are explicitly represented in grammatical profiles.
An interesting exception is the test sets where the time spans under
analysis are much longer than the time gap between them (for example,
century-long spans with a one-year gap). Morphosyntactic change is slow,
so grammatical profiles detect little change in such cases. In contrast, language
models, thanks to their access to lexical information, are able to detect fast
topical changes.
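To make the two signals and their combination concrete, here is a minimal Python sketch, not the authors' exact pipeline: the grammatical-profile score is taken as the Jensen-Shannon distance between a word's morphological-tag distributions in the two time periods, the language-model score as the cosine distance between its mean XLM-R-style contextual embeddings, and the ensemble as an average of z-normalised scores. The function names and the specific divergence, pooling, and combination choices are illustrative assumptions.

```python
# A minimal sketch (not the paper's exact configuration) of ensembling a
# grammatical-profile change score with a contextual-embedding change score.
from collections import Counter

import numpy as np
from scipy.spatial.distance import jensenshannon


def profile_change(tags_t1, tags_t2):
    """Jensen-Shannon distance between a word's morphological-tag
    distributions in two time periods (the grammatical-profile signal)."""
    vocab = sorted(set(tags_t1) | set(tags_t2))
    c1, c2 = Counter(tags_t1), Counter(tags_t2)
    p = np.array([c1[t] for t in vocab], dtype=float)
    q = np.array([c2[t] for t in vocab], dtype=float)
    return float(jensenshannon(p, q))  # jensenshannon normalises internally


def embedding_change(emb_t1, emb_t2):
    """Cosine distance between mean contextual embeddings of a word's
    usages in two periods (a common XLM-R-based signal; mean pooling is
    an assumption here)."""
    m1, m2 = emb_t1.mean(axis=0), emb_t2.mean(axis=0)
    cos = m1 @ m2 / (np.linalg.norm(m1) * np.linalg.norm(m2))
    return float(1.0 - cos)


def ensemble(profile_scores, lm_scores):
    """Combine the two signals across target words by averaging z-scores."""
    z = lambda s: (s - s.mean()) / (s.std() + 1e-9)
    return (z(np.asarray(profile_scores)) + z(np.asarray(lm_scores))) / 2.0
```

Because the two scores live on different scales, some normalisation (z-scores here; rank averaging is another common choice) is needed before combining them.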
Related papers
- Why do language models perform worse for morphologically complex languages? [0.913127392774573]
We find new evidence for a performance gap between agglutinative and fusional languages.
We propose three possible causes for this performance gap: morphological alignment of tokenizers, tokenization quality, and disparities in dataset sizes and measurement.
Results suggest that no language is harder or easier for a language model to learn on the basis of its morphological typology.
arXiv Detail & Related papers (2024-11-21T15:06:51Z) - Syntactic Language Change in English and German: Metrics, Parsers, and Convergences [56.47832275431858]
The current paper looks at diachronic trends in syntactic language change in both English and German, using corpora of parliamentary debates from the last c. 160 years.
We base our observations on five dependency parsers, including the widely used Stanford CoreNLP as well as four newer alternatives.
We show that changes in syntactic measures seem to be more frequent at the tails of sentence length distributions.
arXiv Detail & Related papers (2024-02-18T11:46:16Z) - Improving Temporal Generalization of Pre-trained Language Models with
Lexical Semantic Change [28.106524698188675]
Recent research has revealed that neural language models at scale suffer from poor temporal generalization capability.
We propose a simple yet effective lexical-level masking strategy to post-train a converged language model.
arXiv Detail & Related papers (2022-10-31T08:12:41Z) - Contextualized language models for semantic change detection: lessons
learned [4.436724861363513]
We present a qualitative analysis of the outputs of contextualized embedding-based methods for detecting diachronic semantic change.
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift.
Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in contextual variance.
arXiv Detail & Related papers (2022-08-31T23:35:24Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z) - Grammatical Profiling for Semantic Change Detection [6.3596637237946725]
We use grammatical profiling as an alternative method for semantic change detection.
We demonstrate that it can be used for semantic change detection and even outperforms some distributional semantic methods.
arXiv Detail & Related papers (2021-09-21T18:38:18Z) - A Massively Multilingual Analysis of Cross-linguality in Shared
Embedding Space [61.18554842370824]
In cross-lingual language models, representations for many different languages live in the same space.
We compute a task-based measure of cross-lingual alignment in the form of bitext retrieval performance (a minimal sketch of this measure follows the list below).
We examine a range of linguistic, quasi-linguistic, and training-related features as potential predictors of these alignment metrics.
arXiv Detail & Related papers (2021-09-13T21:05:37Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Word Frequency Does Not Predict Grammatical Knowledge in Language Models [2.1984302611206537]
We investigate whether there are systematic sources of variation in the language models' accuracy.
We find that certain nouns are systematically understood better than others, an effect which is robust across grammatical tasks and different language models.
We find that a novel noun's grammatical properties can be few-shot learned from various types of training data.
arXiv Detail & Related papers (2020-10-26T19:51:36Z) - Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed afterwards, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
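For the bitext-retrieval alignment measure mentioned in "A Massively Multilingual Analysis of Cross-linguality in Shared Embedding Space" above, a minimal sketch: retrieval accuracy is the fraction of source sentences whose nearest neighbour by cosine similarity among target-language embeddings is their actual translation. The encoder producing the embeddings and the exact scoring details are assumptions for illustration.

```python
# Minimal sketch of bitext retrieval as a cross-lingual alignment measure:
# for each source-sentence embedding, find the nearest target-sentence
# embedding by cosine similarity and check whether it is the aligned
# translation. Embeddings are assumed to come from a multilingual encoder.
import numpy as np


def bitext_retrieval_accuracy(src, tgt):
    """src[i] and tgt[i] are embeddings of aligned parallel sentences."""
    # L2-normalise so the dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = src @ tgt.T                 # (n_src, n_tgt) similarity matrix
    nearest = sims.argmax(axis=1)      # best target for each source sentence
    return float((nearest == np.arange(len(src))).mean())


# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 768))
print(bitext_retrieval_accuracy(src, src + 0.1 * rng.normal(size=src.shape)))
```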