Word Embeddings: Stability and Semantic Change
- URL: http://arxiv.org/abs/2007.16006v1
- Date: Thu, 23 Jul 2020 16:03:50 GMT
- Title: Word Embeddings: Stability and Semantic Change
- Authors: Lucas Rettenmeier
- Abstract summary: We present an experimental study on the instability of the training process of three of the most influential embedding techniques of the last decade: word2vec, GloVe and fastText.
We propose a statistical model to describe the instability of embedding techniques and introduce a novel metric to measure the instability of the representation of an individual word.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word embeddings are computed by a class of techniques within natural language processing (NLP) that create continuous vector representations of words in a language from a large text corpus. The stochastic nature of the training process of most embedding techniques can lead to surprisingly strong instability: applying the same technique to the same data twice can produce entirely different results. In this work, we present an experimental study on the instability of the training process of three of the most influential embedding techniques of the last decade: word2vec, GloVe and fastText. Based on the experimental results, we propose a statistical model to describe the instability of embedding techniques and introduce a novel metric to measure the instability of the representation of an individual word. Finally, we propose a method to minimize the instability by computing a modified average over multiple runs, and apply it to a specific linguistic problem: the detection and quantification of semantic change, i.e. measuring changes in the meaning and usage of words over time.
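This behaviour can be probed directly. Below is a minimal Python sketch (assuming gensim, numpy and scipy are available, and a plain-text file corpus.txt as the corpus) that trains word2vec several times, scores a word's instability by the overlap of its nearest-neighbor lists across runs, and forms an averaged representation after aligning the runs with orthogonal Procrustes. The neighbor-overlap score and the Procrustes averaging are illustrative stand-ins, not necessarily the exact instability metric or the "modified average" defined in the paper.

```python
# Sketch: quantify run-to-run instability of word2vec and build an averaged model.
# Assumptions (not taken from the paper): instability is proxied by nearest-neighbor
# overlap across runs, and the "modified average" is approximated by a plain mean of
# Procrustes-aligned embedding matrices over the shared vocabulary.
import numpy as np
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess
from scipy.linalg import orthogonal_procrustes

# Toy corpus; in practice this would be a large text corpus.
corpus = [simple_preprocess(line) for line in open("corpus.txt", encoding="utf8")]

def train_run(seed):
    # Different seeds emulate the run-to-run randomness; fully reproducible runs
    # would additionally require a fixed PYTHONHASHSEED.
    return Word2Vec(corpus, vector_size=100, window=5, min_count=5,
                    workers=1, seed=seed, epochs=5)

models = [train_run(seed) for seed in range(5)]

def neighbor_overlap(word, models, k=10):
    """Average pairwise overlap of the word's k nearest neighbors across runs."""
    neighbor_sets = [set(w for w, _ in m.wv.most_similar(word, topn=k)) for m in models]
    pairs = [(a, b) for i, a in enumerate(neighbor_sets) for b in neighbor_sets[i + 1:]]
    return float(np.mean([len(a & b) / k for a, b in pairs]))

# 'language' is assumed to occur often enough to be in every run's vocabulary.
print("stability proxy for 'language':", neighbor_overlap("language", models))

# Averaging over runs: align every run to the first with orthogonal Procrustes
# on the shared vocabulary, then take the element-wise mean of the aligned matrices.
vocab = sorted(set.intersection(*(set(m.wv.index_to_key) for m in models)))
reference = np.stack([models[0].wv[w] for w in vocab])
aligned = [reference]
for m in models[1:]:
    mat = np.stack([m.wv[w] for w in vocab])
    R, _ = orthogonal_procrustes(mat, reference)   # orthogonal map of mat onto reference
    aligned.append(mat @ R)
averaged = np.mean(aligned, axis=0)   # row i is the averaged vector for vocab[i]
```

Per-word overlap scores close to 1 indicate representations that barely change across retrainings, which is the property one wants before attributing a difference between time-sliced corpora to genuine semantic change rather than training noise.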
Related papers
- Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective [50.261681681643076]
We propose a novel metric called SemVarEffect and a benchmark named SemVarBench to evaluate the causality between semantic variations in inputs and outputs in text-to-image synthesis.
Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.
arXiv Detail & Related papers (2024-10-14T08:45:35Z) - Statistical Uncertainty in Word Embeddings: GloVe-V [35.04183792123882]
We introduce a method to obtain approximate, easy-to-use, and scalable reconstruction error variance estimates for GloVe.
To demonstrate the value of embeddings with variance (GloVe-V), we illustrate how our approach enables principled hypothesis testing in core word embedding tasks.
arXiv Detail & Related papers (2024-06-18T00:35:02Z) - Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z) - Exploring Dimensionality Reduction Techniques in Multilingual Transformers [64.78260098263489]
This paper gives a comprehensive account of the impact of dimensionality reduction techniques on the performance of state-of-the-art multilingual Siamese Transformers.
It shows that it is possible to achieve an average reduction in the number of dimensions of $91.58\% \pm 2.59\%$ and $54.65\% \pm 32.20\%$, respectively.
arXiv Detail & Related papers (2022-04-18T17:20:55Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs between paired texts in natural language processing tasks such as text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the resulting metric, Neighboring Distribution Divergence (NDD), to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - Learning to Remove: Towards Isotropic Pre-trained BERT Embedding [7.765987411382461]
Research in word representation shows that isotropic embeddings can significantly improve performance on downstream tasks.
We measure and analyze the geometry of pre-trained BERT embeddings and find that they are far from isotropic.
We propose a simple yet effective method to fix this problem: remove several dominant directions of the BERT embedding with a set of learnable weights.
arXiv Detail & Related papers (2021-04-12T08:13:59Z) - Exploring the Relationship Between Algorithm Performance, Vocabulary, and Run-Time in Text Classification [2.7261840344953807]
This study examines how preprocessing techniques affect the vocabulary size, model performance, and model run-time.
We show that some individual methods can reduce run-time with no loss of accuracy, while some combinations of methods can trade 2-5% of the accuracy for up to a 65% reduction of run-time.
arXiv Detail & Related papers (2021-04-08T15:49:59Z) - Statistically significant detection of semantic shifts using contextual word embeddings [7.439525715543974]
We propose an approach to estimate semantic shifts by combining contextual word embeddings with permutation-based statistical tests; a schematic sketch of such a test appears after this list.
We demonstrate the performance of this approach in simulation, achieving consistently high precision by suppressing false positives.
We additionally analyze real-world data from SemEval-2020 Task 1 and the Liverpool FC subreddit corpus.
arXiv Detail & Related papers (2021-04-08T13:58:54Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
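Echoing the permutation-test entry above, the sketch below shows what such a test could look like at a schematic level. The contextual embeddings are assumed to be precomputed (e.g. one BERT vector per occurrence of the target word in each time period); the test statistic, the Euclidean distance between period means, and the permutation count are illustrative choices rather than the cited paper's exact procedure.

```python
# Schematic permutation test for semantic shift of one word between two periods.
# Inputs are assumed precomputed: each row is a contextual embedding (e.g. from BERT)
# of one occurrence of the target word in the respective period.
import numpy as np

def semantic_shift_pvalue(emb_period_a, emb_period_b, n_permutations=10_000, seed=None):
    """P-value for the null hypothesis that both periods share one usage distribution.

    Test statistic: Euclidean distance between the mean embeddings of the two periods.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([emb_period_a, emb_period_b])
    n_a = len(emb_period_a)

    observed = np.linalg.norm(emb_period_a.mean(axis=0) - emb_period_b.mean(axis=0))

    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        a, b = pooled[perm[:n_a]], pooled[perm[n_a:]]
        count += np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)) >= observed
    # Add-one smoothing keeps the estimate away from an impossible p-value of zero.
    return (count + 1) / (n_permutations + 1)

# Hypothetical usage with random data standing in for real contextual embeddings.
rng = np.random.default_rng(0)
old_usage = rng.normal(0.0, 1.0, size=(200, 768))
new_usage = rng.normal(0.3, 1.0, size=(150, 768))   # shifted mean, so the test should reject
print("p-value:", semantic_shift_pvalue(old_usage, new_usage, n_permutations=2000, seed=1))
```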
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.