GloVeInit at SemEval-2020 Task 1: Using GloVe Vector Initialization for
Unsupervised Lexical Semantic Change Detection
- URL: http://arxiv.org/abs/2007.05618v1
- Date: Fri, 10 Jul 2020 21:35:17 GMT
- Title: GloVeInit at SemEval-2020 Task 1: Using GloVe Vector Initialization for
Unsupervised Lexical Semantic Change Detection
- Authors: Vaibhav Jain
- Abstract summary: This paper presents a Vector Initialization approach for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection.
The proposed approach is based on using the Vector Initialization method to align GloVe embeddings.
Our model ranks 13th and 10th among 33 teams in the two subtasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents a vector initialization approach for the SemEval-2020 Task
1: Unsupervised Lexical Semantic Change Detection. Given two corpora belonging
to different time periods and a set of target words, this task requires us to
classify whether a word gained or lost a sense over time (subtask 1) and to
rank them on the basis of the changes in their word senses (subtask 2). The
proposed approach is based on using the Vector Initialization method to align GloVe
embeddings. The idea is to consecutively train GloVe embeddings for both
corpora, while using the first model to initialize the second one. This paper
is based on the hypothesis that GloVe embeddings are more suited for the Vector
Initialization method than SGNS embeddings. It presents intuitive reasoning
behind this hypothesis and discusses the impact of various factors and
hyperparameters on the performance of the proposed approach. Our model ranks
13th and 10th among 33 teams in the two subtasks. The implementation has been
shared publicly.
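As a rough sketch of the pipeline described in the abstract: GloVe embeddings are trained on the first corpus, the second model is initialized from the first, and semantic change is then scored per target word by the cosine distance between its two vectors (ranked for subtask 2, thresholded for subtask 1). The `train_glove` helper, the threshold value, and the toy data below are illustrative placeholders, not the paper's released implementation.

```python
# Minimal sketch of the vector-initialization pipeline (illustrative only).
# `train_glove` is a hypothetical stand-in for any GloVe trainer that accepts
# an initial embedding matrix; it is NOT part of the paper's released code.
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def score_semantic_change(emb_t1, emb_t2, targets):
    """Score each target word by the cosine distance between its vectors
    from the two time-specific GloVe models (higher = more change)."""
    return {w: cosine_distance(emb_t1[w], emb_t2[w])
            for w in targets if w in emb_t1 and w in emb_t2}

# --- usage sketch -----------------------------------------------------------
# emb_t1 = train_glove(corpus_t1)                      # model for period 1
# emb_t2 = train_glove(corpus_t2, init_vectors=emb_t1) # initialized from model 1
# Here we fake two aligned models with random vectors just so the code runs.
rng = np.random.default_rng(0)
vocab = ["plane", "record", "tree"]
emb_t1 = {w: rng.normal(size=100) for w in vocab}
emb_t2 = {w: v + rng.normal(scale=0.1, size=100) for w, v in emb_t1.items()}

scores = score_semantic_change(emb_t1, emb_t2, targets=vocab)
ranking = sorted(scores, key=scores.get, reverse=True)   # subtask 2: ranking
labels = {w: int(scores[w] > 0.4) for w in scores}       # subtask 1: illustrative threshold
print(ranking, labels)
```

Because the second model starts from the first model's vectors, the two embedding spaces stay roughly aligned, so no separate post-hoc alignment step (such as orthogonal Procrustes) is needed before comparing vectors.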
Related papers
- Deep-change at AXOLOTL-24: Orchestrating WSD and WSI Models for Semantic Change Modeling [0.19116784879310028]
This paper describes our solution of the first subtask from the AXOLOTL-24 shared task on Semantic Change Modeling.
We propose and experiment with three new methods solving this task.
We develop a model that can tell if a given word usage is not described by any of the provided sense definitions.
arXiv Detail & Related papers (2024-08-09T17:15:54Z)
- TartuNLP @ AXOLOTL-24: Leveraging Classifier Output for New Sense Detection in Lexical Semantics [0.21485350418225246]
We present our submission to the AXOLOTL-24 shared task.
The task comprises two subtasks: identifying new senses that words gain with time and producing the definitions for the identified new senses.
We trained adapter-based binary classification models to match glosses with usage examples and leveraged the probability output of the models to identify novel senses.
arXiv Detail & Related papers (2024-07-04T11:46:39Z)
- Backpack Language Models [108.65930795825416]
We present Backpacks, a new neural architecture that marries strong modeling performance with an interface for interpretability and control.
We find that, after training, sense vectors specialize, each encoding a different aspect of a word.
We present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing.
arXiv Detail & Related papers (2023-05-26T09:26:23Z)
- Speaker Embedding-aware Neural Diarization: a Novel Framework for Overlapped Speech Diarization in the Meeting Scenario [51.5031673695118]
We reformulate overlapped speech diarization as a single-label prediction problem.
We propose the speaker embedding-aware neural diarization (SEND) system.
arXiv Detail & Related papers (2022-03-18T06:40:39Z)
- WOVe: Incorporating Word Order in GloVe Word Embeddings [0.0]
Defining a word as a vector makes it easy for machine learning algorithms to understand a text and extract information from it.
Word vector representations have been used in many applications, such as word synonyms, word analogy, syntactic parsing, and many others.
arXiv Detail & Related papers (2021-05-18T15:28:20Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
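A minimal sketch of the soft-voting idea described above, assuming each feature model outputs a probability of change for a target word; the feature names, scores, and 0.5 cut-off are illustrative, not taken from SChME:

```python
import numpy as np

# Hypothetical per-feature change probabilities for one target word, each in [0, 1]
# (e.g., from an embedding-distance model and a word-frequency model).
feature_votes = {"embedding_distance": 0.72, "frequency_shift": 0.35}

# Simple soft voting: average the per-feature probabilities and
# threshold for the binary decision. The 0.5 cut-off is illustrative.
p_change = float(np.mean(list(feature_votes.values())))
changed = p_change > 0.5
print(p_change, changed)
```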
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
- UWB at SemEval-2020 Task 1: Lexical Semantic Change Detection [1.2599533416395767]
We examine semantic differences between specific words in two corpora, chosen from different time periods, for English, German, Latin, and Swedish.
Our method was created for SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection.
arXiv Detail & Related papers (2020-11-30T10:47:45Z)
- NLP-CIC @ DIACR-Ita: POS and Neighbor Based Distributional Models for Lexical Semantic Change in Diachronic Italian Corpora [62.997667081978825]
We present our systems and findings on unsupervised lexical semantic change for the Italian language.
The task is to determine whether a target word has evolved its meaning with time, only relying on raw-text from two time-specific datasets.
We propose two models representing the target words across the periods to predict the changing words using threshold and voting schemes.
arXiv Detail & Related papers (2020-11-07T11:27:18Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
arXiv Detail & Related papers (2020-10-02T08:38:40Z)
- UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection [5.099262949886174]
This paper focuses on Subtask 2, ranking words by the degree of their semantic drift over time.
We find that the most effective algorithms rely on the cosine similarity between averaged token embeddings and the pairwise distances between token embeddings.
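A rough numpy sketch of the two measure families mentioned above, assuming each target word comes with one matrix of contextualised token embeddings per corpus; the function names and toy data are illustrative, not the authors' code:

```python
import numpy as np

def prt_like_score(tokens_t1, tokens_t2):
    """Cosine distance between the averaged token embeddings of the two
    periods (higher = more change)."""
    m1, m2 = tokens_t1.mean(axis=0), tokens_t2.mean(axis=0)
    return 1.0 - np.dot(m1, m2) / (np.linalg.norm(m1) * np.linalg.norm(m2))

def apd_like_score(tokens_t1, tokens_t2):
    """Average pairwise cosine distance between token embeddings across periods."""
    a = tokens_t1 / np.linalg.norm(tokens_t1, axis=1, keepdims=True)
    b = tokens_t2 / np.linalg.norm(tokens_t2, axis=1, keepdims=True)
    return float(np.mean(1.0 - a @ b.T))

# Fake contextualised embeddings for one target word, just so the code runs.
rng = np.random.default_rng(1)
usages_t1 = rng.normal(size=(40, 768))   # 40 usages in the first corpus
usages_t2 = rng.normal(size=(55, 768))   # 55 usages in the second corpus
print(prt_like_score(usages_t1, usages_t2), apd_like_score(usages_t1, usages_t2))
```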
arXiv Detail & Related papers (2020-04-30T18:43:57Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
Traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.