Lexical Semantic Change Discovery
- URL: http://arxiv.org/abs/2106.03111v1
- Date: Sun, 6 Jun 2021 13:02:38 GMT
- Title: Lexical Semantic Change Discovery
- Authors: Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
- Abstract summary: We propose a shift from change detection to change discovery, i.e., discovering novel word senses over time from the full corpus vocabulary.
By heavily fine-tuning a type-based and a token-based approach on recently published German data, we demonstrate that both models can successfully be applied to discover new words undergoing meaning change.
- Score: 22.934650688233734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While there is a large amount of research in the field of Lexical Semantic
Change Detection, only few approaches go beyond a standard benchmark evaluation
of existing models. In this paper, we propose a shift of focus from change
detection to change discovery, i.e., discovering novel word senses over time
from the full corpus vocabulary. By heavily fine-tuning a type-based and a
token-based approach on recently published German data, we demonstrate that
both models can successfully be applied to discover new words undergoing
meaning change. Furthermore, we provide an almost fully automated framework for
both evaluation and discovery.
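The difference between detection (scoring a fixed list of target words) and discovery (scanning the full vocabulary) can be sketched as follows. This is a minimal illustration of a type-based approach, not the authors' implementation. It assumes each word has one vector per time period, that the two vector spaces are already aligned, and that a word is flagged when its cosine distance exceeds the mean plus `t` standard deviations. The function names, the threshold rule, and the toy vectors are all hypothetical.

```python
import math
from statistics import mean, stdev


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def discover_changed_words(emb_t1, emb_t2, t=1.0):
    """Score every shared vocabulary word and flag the outliers.

    emb_t1, emb_t2: dicts mapping word -> vector for each time period
                    (assumed to live in an already aligned space).
    t:              assumed threshold multiplier on the standard deviation.
    """
    shared = sorted(set(emb_t1) & set(emb_t2))
    scores = {w: cosine_distance(emb_t1[w], emb_t2[w]) for w in shared}
    threshold = mean(scores.values()) + t * stdev(scores.values())
    flagged = [w for w in shared if scores[w] > threshold]
    # Most strongly changed candidates first.
    flagged.sort(key=lambda w: -scores[w])
    return flagged, scores


# Toy data: only "word_d" points in a different direction in the later period.
emb_old = {"word_a": [1.0, 0.0], "word_b": [0.0, 1.0],
           "word_c": [1.0, 1.0], "word_d": [1.0, 0.0]}
emb_new = {"word_a": [1.0, 0.0], "word_b": [0.0, 1.0],
           "word_c": [1.0, 1.0], "word_d": [0.0, 1.0]}

changed, scores = discover_changed_words(emb_old, emb_new, t=1.0)
print(changed)
```

In a discovery setting, the flagged words would be candidates for manual inspection rather than confirmed cases of meaning change.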
Related papers
- Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation.
Our approach can be applied to existing datasets by automatically generating hard negative test captions.
Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z)
- Semantic change detection for Slovene language: a novel dataset and an approach based on optimal transport [0.0]
We focus on the detection of semantic changes in Slovene, a less resourced Slavic language with two million speakers.
We present the first Slovene dataset for evaluating semantic change detection systems.
arXiv Detail & Related papers (2024-02-26T14:27:06Z)
- Graph-based Clustering for Detecting Semantic Change Across Time and Languages [10.058655884092094]
We propose a graph-based clustering approach to capture nuanced changes in both high- and low-frequency word senses across time and languages.
Our approach substantially surpasses previous approaches in the SemEval 2020 binary classification task across four languages.
arXiv Detail & Related papers (2024-02-01T21:27:19Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Contextualized language models for semantic change detection: lessons learned [4.436724861363513]
We present a qualitative analysis of the outputs of contextualized embedding-based methods for detecting diachronic semantic change.
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift.
Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in contextual variance.
arXiv Detail & Related papers (2022-08-31T23:35:24Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both old and most recent language models.
We show that already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
- SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection [10.606357227329822]
Evaluation is currently the most pressing problem in Lexical Semantic Change detection.
No gold standards are available to the community, which hinders progress.
We present the results of the first shared task that addresses this gap.
arXiv Detail & Related papers (2020-07-22T14:37:42Z)
- Analysing Lexical Semantic Change with Contextualised Word Representations [7.071298726856781]
We propose a novel method that exploits the BERT neural language model to obtain representations of word usages.
We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements.
arXiv Detail & Related papers (2020-04-29T12:18:14Z)