UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection
- URL: http://arxiv.org/abs/2012.00004v1
- Date: Mon, 30 Nov 2020 10:47:45 GMT
- Title: UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection
- Authors: Ondřej Pražák, Pavel Přibáň, Stephen Taylor, and
Jakub Sido
- Abstract summary: We examine semantic differences between specific words in two corpora, chosen from different time periods, for English, German, Latin, and Swedish.
Our method was created for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection.
- Score: 1.2599533416395767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we describe our method for the detection of lexical semantic
change, i.e., word sense changes over time. We examine semantic differences
between specific words in two corpora, chosen from different time periods, for
English, German, Latin, and Swedish. Our method was created for the SemEval
2020 Task 1: Unsupervised Lexical Semantic Change Detection. We ranked
1st in Sub-task 1: binary change detection, and 4th in Sub-task 2:
ranked change detection. Our method is fully unsupervised and language
independent. It consists of preparing a semantic vector space for each corpus,
earlier and later; computing a linear transformation between earlier and later
spaces, using Canonical Correlation Analysis and Orthogonal Transformation; and
measuring the cosines between the transformed vector for the target word from
the earlier corpus and the vector for the target word in the later corpus.
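
The alignment-and-cosine step described in the abstract can be illustrated with a minimal sketch. It covers only the orthogonal-transformation variant (not CCA), uses random toy vectors in place of trained embeddings, and all function and variable names are hypothetical; it is not the authors' implementation.

```python
import numpy as np

def orthogonal_map(x_early, x_late):
    """Orthogonal Procrustes: find orthogonal W minimizing ||x_early @ W - x_late||_F."""
    u, _, vt = np.linalg.svd(x_early.T @ x_late)
    return u @ vt

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy setup (assumption): the "earlier" space is a rotated copy of the "later" one.
rng = np.random.default_rng(0)
dim, n_anchor = 5, 50
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

# Anchor words assumed to keep their meaning; used to fit the transformation.
late_anchors = rng.normal(size=(n_anchor, dim))
early_anchors = late_anchors @ rotation.T

w = orthogonal_map(early_anchors, late_anchors)

# A target word with unchanged meaning: same vector up to the rotation.
stable_late = rng.normal(size=dim)
stable_early = stable_late @ rotation.T

# A target word with changed meaning: unrelated vectors in the two periods.
changed_late = rng.normal(size=dim)
changed_early = rng.normal(size=dim) @ rotation.T

# Compare the transformed earlier vector with the later vector.
sim_stable = cosine(stable_early @ w, stable_late)
sim_changed = cosine(changed_early @ w, changed_late)
print(f"stable word cosine:  {sim_stable:.3f}")
print(f"changed word cosine: {sim_changed:.3f}")
```

In this toy setting a lower cosine suggests a larger change in usage. How the cosines are converted into binary decisions or a ranking is not described in the abstract, so that step is left out here.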
Related papers
- Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective [50.261681681643076]
We propose a novel metric called SemVarEffect and a benchmark named SemVarBench to evaluate the causality between semantic variations in inputs and outputs in text-to-image synthesis.
Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.
(2024-10-14T08:45:35Z)
- Can Word Sense Distribution Detect Semantic Changes of Words? [35.17635565325166]
We show that word sense distributions can be accurately used to predict semantic changes of words in English, German, Swedish and Latin.
Our experimental results on SemEval 2020 Task 1 dataset show that word sense distributions can be accurately used to predict semantic changes of words.
(2023-10-16T13:41:27Z)
- Backpack Language Models [108.65930795825416]
We present Backpacks, a new neural architecture that marries strong modeling performance with an interface for interpretability and control.
We find that, after training, sense vectors specialize, each encoding a different aspect of a word.
We present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing.
(2023-05-26T09:26:23Z)
- Simple, Interpretable and Stable Method for Detecting Words with Usage
Change across Corpora [54.757845511368814]
The problem of comparing two bodies of text and searching for words that differ in their usage arises often in digital humanities and computational social science.
This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large.
We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word.
(2021-12-28T23:46:00Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
(2021-01-30T18:59:43Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical
Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
(2020-12-02T23:56:34Z)
- UWB @ DIACR-Ita: Lexical Semantic Change Detection with CCA and
Orthogonal Transformation [1.3764085113103222]
We describe our method for detection of lexical semantic change (i.e., word sense changes over time) for the DIACR-Ita shared task.
We examine semantic differences between specific words in two Italian corpora, chosen from different time periods.
(2020-11-30T10:41:50Z)
- UoB at SemEval-2020 Task 1: Automatic Identification of Novel Word
Senses [0.6980076213134383]
This paper presents an approach to lexical semantic change detection based on Bayesian word sense induction suitable for novel word sense identification.
The same approach is also applied to a corpus gleaned from 15 years of Twitter data, the results of which are then used to identify words which may be instances of slang.
(2020-10-18T19:27:06Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in
BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements among the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
(2020-10-02T08:38:40Z)
- GloVeInit at SemEval-2020 Task 1: Using GloVe Vector Initialization for
Unsupervised Lexical Semantic Change Detection [0.0]
This paper presents a Vector Initialization approach for the SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection.
The proposed approach uses the Vector Initialization method to align GloVe embeddings.
Our model ranks 13th and 10th among 33 teams in the two subtasks.
(2020-07-10T21:35:17Z)
- Unsupervised Embedding-based Detection of Lexical Semantic Changes [1.7403133838762452]
This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1.
EmbLexChange is defined as the divergence between the embedding-based profiles of word w in the source and the target domains.
We show that using a resampling framework for the selection of reference words, we can reliably detect lexical-semantic changes in English, German, Swedish, and Latin.
(2020-05-16T13:05:47Z)