UoB at SemEval-2020 Task 1: Automatic Identification of Novel Word
Senses
- URL: http://arxiv.org/abs/2010.09072v1
- Date: Sun, 18 Oct 2020 19:27:06 GMT
- Title: UoB at SemEval-2020 Task 1: Automatic Identification of Novel Word
Senses
- Authors: Eleri Sarsfield and Harish Tayyar Madabushi
- Abstract summary: This paper presents an approach to lexical semantic change detection based on Bayesian word sense induction suitable for novel word sense identification.
The same approach is also applied to a corpus gleaned from 15 years of Twitter data, the results of which are then used to identify words which may be instances of slang.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Much as the social landscape in which languages are spoken shifts, language
too evolves to suit the needs of its users. Lexical semantic change analysis is
a burgeoning field of semantic analysis which aims to trace changes in the
meanings of words over time. This paper presents an approach to lexical
semantic change detection based on Bayesian word sense induction suitable for
novel word sense identification. This approach is used for a submission to
SemEval-2020 Task 1, which demonstrates that it is capable of performing the
task. The same approach is also applied to a corpus gleaned from 15 years of
Twitter data, the results of which are then used to identify words which may be
instances of slang.
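The detection step sketched in the abstract can be illustrated with a small example. This is not the authors' implementation: it assumes a word sense induction model (such as the Bayesian WSI step described above) has already assigned a sense label to each occurrence of a target word in an earlier and a later corpus, and simply flags senses that are frequent in the later corpus but absent from the earlier one. The function name and thresholds are illustrative.

```python
from collections import Counter

def novel_senses(senses_t1, senses_t2, min_new_freq=0.05, max_old_freq=0.0):
    """Flag sense IDs that are common in the later corpus but (nearly)
    absent from the earlier one. `senses_t1` / `senses_t2` are lists of
    induced sense IDs, one per occurrence of a single target word."""
    f1, f2 = Counter(senses_t1), Counter(senses_t2)
    n1, n2 = max(len(senses_t1), 1), max(len(senses_t2), 1)
    return sorted(
        s for s in f2
        if f2[s] / n2 >= min_new_freq  # established in the later corpus
        and f1[s] / n1 <= max_old_freq  # unattested in the earlier corpus
    )

# Toy example: sense "2" only ever appears in the later corpus.
old = ["1"] * 90 + ["0"] * 10
new = ["1"] * 70 + ["2"] * 30
print(novel_senses(old, new))  # ['2']
```

A sense that merely changes in frequency (like "1" above) is not flagged; only senses with no attestation in the earlier corpus are treated as candidate novel senses.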
Related papers
- Survey in Characterization of Semantic Change (2024-02-29)
  Understanding the meaning of words is vital for interpreting texts from different cultures.
  Semantic changes can potentially impact the quality of the outcomes of computational linguistics algorithms.
- Can Word Sense Distribution Detect Semantic Changes of Words? (2023-10-16)
  We show that word sense distributions can be accurately used to predict semantic changes of words in English, German, Swedish and Latin, as demonstrated on the SemEval 2020 Task 1 dataset.
- Interpretable Word Sense Representations via Definition Generation: The Case of Semantic Change Analysis (2023-05-19)
  We propose using automatically generated natural language definitions of contextualised word usages as interpretable word and word sense representations.
  We demonstrate how the resulting sense labels can make existing approaches to semantic change analysis more interpretable.
- Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis (2022-10-18)
  SentiWSP is a sentiment-aware pre-trained language model with combined word-level and sentence-level pre-training tasks.
  SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks.
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution (2022-06-07)
  We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
  We show that the already competitive results achieved by SOTA LMs/MLMs can be substantially improved if information about the target word is injected properly.
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks (2021-01-30)
  We propose a self-supervised approach to model lexical semantic change.
  We show that our method can be used for the detection of semantic change with any alignment method.
  We illustrate the utility of our techniques using experimental results on three different datasets.
- UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection (2020-11-30)
  We examine semantic differences between specific words in two corpora, chosen from different time periods, for English, German, Latin, and Swedish.
  Our method was created for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection.
- Speakers Fill Lexical Semantic Gaps with Context (2020-10-05)
  We operationalise the lexical ambiguity of a word as the entropy of the meanings it can take.
  We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
  This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces (2020-10-02)
  We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
  Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
  Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
- Unsupervised Embedding-based Detection of Lexical Semantic Changes (2020-05-16)
  This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1.
  EmbLexChange is defined as the divergence between the embedding-based profiles of word w in the source and the target domains.
  We show that using a resampling framework for the selection of reference words, we can reliably detect lexical-semantic changes in English, German, Swedish, and Latin.
- Word Sense Disambiguation for 158 Languages using Word Embeddings Only (2020-03-14)
  Disambiguation of word senses in context is easy for humans, but a major challenge for automatic approaches.
  We present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory.
  We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings.
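Several of the entries above quantify a word's meaning distribution; for instance, Speakers Fill Lexical Semantic Gaps with Context operationalises lexical ambiguity as the entropy of the meanings a word can take. A minimal sketch of that quantity over empirical sense labels (the labels themselves are assumed to come from a sense inventory such as WordNet; the function name is illustrative):

```python
import math
from collections import Counter

def sense_entropy(sense_labels):
    """Shannon entropy (in bits) of the empirical sense distribution
    of a word: one way to operationalise its lexical ambiguity."""
    counts = Counter(sense_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A word split evenly between two senses carries one bit of ambiguity.
print(sense_entropy(["bank/river", "bank/money", "bank/money", "bank/river"]))  # 1.0
```

A monosemous word (all occurrences labelled with one sense) scores 0.0; more, and more evenly distributed, senses push the entropy higher.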
This list is automatically generated from the titles and abstracts of the papers in this site.