A Tale of Two Laws of Semantic Change: Predicting Synonym Changes with
Distributional Semantic Models
- URL: http://arxiv.org/abs/2305.19143v1
- Date: Tue, 30 May 2023 15:50:29 GMT
- Title: A Tale of Two Laws of Semantic Change: Predicting Synonym Changes with
Distributional Semantic Models
- Authors: Bastien Liétard and Mikaela Keller and Pascal Denis
- Abstract summary: There are two competing, apparently opposite hypotheses in the historical linguistic literature regarding how synonymous words evolve.
We take a first step toward detecting whether the Law of Differentiation (LD) or the Law of Parallel Change (LPC) operates for given word pairs.
We then propose various computational approaches to the problem using Distributional Semantic Models and grounded in recent literature on Lexical Semantic Change detection.
- Score: 1.856334276134661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lexical Semantic Change is the study of how the meaning of words evolves
through time. Another related question is whether and how lexical relations
over pairs of words, such as synonymy, change over time. There are currently
two competing, apparently opposite hypotheses in the historical linguistic
literature regarding how synonymous words evolve: the Law of Differentiation
(LD) argues that synonyms tend to take on different meanings over time, whereas
the Law of Parallel Change (LPC) claims that synonyms tend to undergo the same
semantic change and therefore remain synonyms. So far, there has been little
research using distributional models to assess the extent to which these laws apply
to historical corpora. In this work, we take a first step toward detecting
whether LD or LPC operates for given word pairs. After recasting the problem
into a more tractable task, we combine two linguistic resources to propose the
first complete evaluation framework on this problem and provide empirical
evidence in favor of a dominance of LD. We then propose various computational
approaches to the problem using Distributional Semantic Models and grounded in
recent literature on Lexical Semantic Change detection. Our best approaches
achieve a balanced accuracy above 0.6 on our dataset. We discuss challenges
still faced by these approaches, such as polysemy or the potential confusion
between synonymy and hypernymy.
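To make the task concrete, here is a minimal sketch of how a synonym pair could be labelled as LD or LPC from diachronic distributional embeddings. This is not the authors' actual pipeline: the embedding dictionaries, the word pair, and the decision threshold below are hypothetical placeholders, and a real system would use corpus-trained vectors aligned across time periods (e.g. via Orthogonal Procrustes).

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def classify_pair(emb_early, emb_late, w1, w2, threshold=0.05):
    """Label a synonym pair 'LD' (differentiation) or 'LPC' (parallel change).

    emb_early / emb_late map words to vectors for two time periods and are
    assumed to live in a common, aligned semantic space. The fixed threshold
    is a simplification for illustration, not a value from the paper.
    """
    sim_early = cosine(emb_early[w1], emb_early[w2])
    sim_late = cosine(emb_late[w1], emb_late[w2])
    # If the pair drifts apart between the two periods, call it LD;
    # otherwise treat it as parallel change (LPC).
    return "LD" if sim_early - sim_late > threshold else "LPC"

# Toy example: random vectors stand in for real diachronic embeddings.
rng = np.random.default_rng(0)
emb_1850 = {"sick": rng.normal(size=50), "ill": rng.normal(size=50)}
emb_2000 = {w: v + 0.3 * rng.normal(size=50) for w, v in emb_1850.items()}
print(classify_pair(emb_1850, emb_2000, "sick", "ill"))
```

A more faithful setup would also compare the two words' change vectors, since LPC predicts that both words move in a similar direction rather than merely staying close to each other.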
Related papers
- Distributional Semantics, Holism, and the Instability of Meaning [0.0]
A standard objection to meaning holism is the charge of instability.
In this article we examine whether the instability objection poses a problem for distributional models of meaning.
arXiv Detail & Related papers (2024-05-20T14:53:25Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning correlates robustly with gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- Contextualized language models for semantic change detection: lessons learned [4.436724861363513]
We present a qualitative analysis of the outputs of contextualized embedding-based methods for detecting diachronic semantic change.
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift.
Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in contextual variance.
arXiv Detail & Related papers (2022-08-31T23:35:24Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success and semantic preservation rates while changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy [0.0]
We assess the ability of both static and contextualized models to adequately represent different lexical-semantic relations.
Experiments are performed in Galician, Portuguese, English, and Spanish.
arXiv Detail & Related papers (2021-06-25T10:54:23Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word with its correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
- NLP-CIC @ DIACR-Ita: POS and Neighbor Based Distributional Models for Lexical Semantic Change in Diachronic Italian Corpora [62.997667081978825]
We present our systems and findings on unsupervised lexical semantic change for the Italian language.
The task is to determine whether a target word has changed its meaning over time, relying only on raw text from two time-specific datasets.
We propose two models representing the target words across the periods to predict the changing words using threshold and voting schemes.
arXiv Detail & Related papers (2020-11-07T11:27:18Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of the meanings it can take (see the sketch after this list).
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
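The last entry's operationalisation of ambiguity as the entropy of a word's meanings can be illustrated with a small sketch. The paper estimates the meaning distribution from contexts with a language model; the WordNet/SemCor sense counts and the example words used below are only illustrative stand-ins, not the paper's method.

```python
import math
from nltk.corpus import wordnet as wn  # requires the WordNet data: nltk.download('wordnet')

def sense_entropy(word, pos=wn.NOUN):
    """Ambiguity proxy: entropy (in bits) of a sense distribution for `word`,
    estimated from WordNet/SemCor lemma counts with add-one smoothing."""
    counts = []
    for synset in wn.synsets(word, pos=pos):
        matches = [l for l in synset.lemmas() if l.name().lower() == word.lower()]
        counts.append((matches[0].count() if matches else 0) + 1)
    if not counts:
        return 0.0
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)

def num_synonyms(word, pos=wn.NOUN):
    """Distinct WordNet synonyms of `word` across all of its senses."""
    lemmas = {l.name().lower() for s in wn.synsets(word, pos=pos) for l in s.lemmas()}
    return len(lemmas - {word.lower()})

for w in ["bank", "letter", "car"]:
    print(f"{w}: entropy={sense_entropy(w):.2f} bits, synonyms={num_synonyms(w)}")
```

Correlating such ambiguity estimates with synonym counts over a large vocabulary would mirror the kind of analysis summarised in that entry.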
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.