SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in
BERT-based Embedding Spaces
- URL: http://arxiv.org/abs/2010.00857v1
- Date: Fri, 2 Oct 2020 08:38:40 GMT
- Title: SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in
BERT-based Embedding Spaces
- Authors: K Vani, Sandra Mitrovic, Alessandro Antonucci, Fabio Rinaldi
- Abstract summary: We propose to identify clusters among different occurrences of each target word, treating these as representatives of different word meanings.
Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, surpassing all provided SemEval baselines.
- Score: 63.17308641484404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lexical semantic change detection (also known as semantic shift
tracing) is the task of identifying words that have changed their meaning over
time. Unsupervised semantic shift tracing, the focal point of SemEval-2020
Task 1, is particularly challenging. Given the unsupervised setup, in this work
we propose to identify clusters among different occurrences of each target
word, treating these as representatives of different word meanings. As such,
disagreements in the obtained clusters naturally allow us to quantify the level
of semantic shift for each target word in four target languages. To leverage
this idea, clustering is performed on contextualized (BERT-based) embeddings of
word occurrences. The obtained results show that our approach performs well
both measured separately (per language) and overall, where we surpass all
provided SemEval baselines.
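The abstract's pipeline (embed each occurrence of a target word, cluster the embeddings, score the disagreement between time periods) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it substitutes toy 2-d vectors for real BERT embeddings, plain k-means for whatever clustering procedure the paper actually uses, and Jensen-Shannon divergence as one plausible way to quantify cluster disagreement across corpora.

```python
import math

def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Next centroid: the vector farthest from all centroids chosen so far.
        far = max(vectors, key=lambda v: min(sqdist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each occurrence vector to its nearest centroid.
        for i, v in enumerate(vectors):
            labels[i] = min(range(k), key=lambda c: sqdist(v, centroids[c]))
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [vectors[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

def shift_score(labels, periods, k):
    """Jensen-Shannon divergence (base 2, hence in [0, 1]) between the
    cluster-membership distributions of the two time periods."""
    def distribution(period):
        counts = [0] * k
        for lab, p in zip(labels, periods):
            if p == period:
                counts[lab] += 1
        total = sum(counts)
        return [c / total for c in counts]
    p, q = distribution(0), distribution(1)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    kl = lambda x, y: sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With real data, the vectors would be contextualized BERT embeddings of each occurrence of the target word, `periods` would mark which corpus (earlier or later) each occurrence came from, and a word whose occurrences fall into clusters unevenly populated across periods would receive a high shift score.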
Related papers
- Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models [88.07940818022468]
We take an initial step on measuring the role of shared semantics among subwords in the encoder-only multilingual language models (mLMs)
We form "semantic tokens" by merging the semantically similar subwords and their embeddings.
Inspections of the grouped subwords show that they exhibit a wide range of semantic similarities.
arXiv Detail & Related papers (2024-11-07T08:38:32Z)
- Graph-based Clustering for Detecting Semantic Change Across Time and Languages [10.058655884092094]
We propose a graph-based clustering approach to capture nuanced changes in both high- and low-frequency word senses across time and languages.
Our approach substantially surpasses previous approaches in the SemEval 2020 binary classification task across four languages.
arXiv Detail & Related papers (2024-02-01T21:27:19Z)
- Can Word Sense Distribution Detect Semantic Changes of Words? [35.17635565325166]
Our experimental results on the SemEval 2020 Task 1 dataset show that word sense distributions can be accurately used to predict semantic changes of words in English, German, Swedish and Latin.
arXiv Detail & Related papers (2023-10-16T13:41:27Z)
- Breaking Down Word Semantics from Pre-trained Language Models through Layer-wise Dimension Selection [0.0]
This paper aims to disentangle semantic sense from BERT by applying a binary mask to middle outputs across the layers.
The disentangled embeddings are evaluated through binary classification to determine if the target word in two different sentences has the same meaning.
arXiv Detail & Related papers (2023-10-08T11:07:19Z)
- Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos [63.94040814459116]
Self-supervised methods have shown remarkable progress in learning high-level semantics and low-level temporal correspondence.
We propose a novel semantic-aware masked slot attention on top of the fused semantic features and correspondence maps.
We adopt semantic- and instance-level temporal consistency as self-supervision to encourage temporally coherent object-centric representations.
arXiv Detail & Related papers (2023-08-19T09:12:13Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
arXiv Detail & Related papers (2020-12-30T15:38:26Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals from distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.