Capturing Evolution in Word Usage: Just Add More Clusters?
- URL: http://arxiv.org/abs/2001.06629v2
- Date: Fri, 24 Jan 2020 01:58:05 GMT
- Title: Capturing Evolution in Word Usage: Just Add More Clusters?
- Authors: Matej Martinc, Syrielle Montariol, Elaine Zosa and Lidia Pivovarova
- Abstract summary: We focus on a new set of methods relying on contextualised embeddings, a type of semantic modelling that has recently revolutionised the NLP field.
We leverage the ability of the transformer-based BERT model to generate contextualised embeddings capable of detecting semantic change of words across time.
- Score: 9.209873675320834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The way words are used evolves through time, mirroring the cultural
or technological evolution of society. Semantic change detection is the task of
detecting and analysing word evolution in textual data, even over short periods
of time. In this paper we focus on a new set of methods relying on
contextualised embeddings, a type of semantic modelling that has recently
revolutionised the NLP field. We leverage the ability of the transformer-based BERT model
to generate contextualised embeddings capable of detecting semantic change of
words across time. Several approaches are compared in a common setting in order
to establish strengths and weaknesses for each of them. We also propose several
ideas for improvements, managing to drastically improve the performance of
existing approaches.
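As an illustration of the clustering-based approach the paper builds on, the sketch below pools BERT contextual embeddings of a target word from two time periods, clusters them, and compares the resulting cluster-usage distributions. It is a minimal sketch under stated assumptions, not the authors' exact pipeline: the model name (bert-base-uncased), the number of clusters, the whole-token matching of the target word, and the use of Jensen-Shannon divergence are illustrative choices.

```python
# Minimal sketch (assumptions noted above): cluster BERT contextual embeddings
# of a target word from two time periods and compare cluster-usage distributions.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.cluster import KMeans
from scipy.spatial.distance import jensenshannon

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_target(sentences, target):
    """Return one contextual vector per occurrence of `target` in `sentences`."""
    vectors = []
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
        for i, tok in enumerate(tokens):
            if tok == target:  # simplistic whole-token match; ignores word pieces
                vectors.append(hidden[i].numpy())
    return np.array(vectors)

def usage_shift(sentences_t1, sentences_t2, target, k=5):
    """Cluster pooled embeddings from both periods; return divergence between them."""
    emb1 = embed_target(sentences_t1, target)
    emb2 = embed_target(sentences_t2, target)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(
        np.vstack([emb1, emb2]))
    p = np.bincount(labels[:len(emb1)], minlength=k).astype(float)
    q = np.bincount(labels[len(emb1):], minlength=k).astype(float)
    return jensenshannon(p / p.sum(), q / q.sum())
```

A large divergence between the two cluster-usage distributions is then read as a signal that the word's dominant usages have shifted between the periods.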
Related papers
- Survey in Characterization of Semantic Change [0.1474723404975345]
Understanding the meaning of words is vital for interpreting texts from different cultures.
Semantic changes can potentially impact the quality of the outcomes of computational linguistics algorithms.
arXiv Detail & Related papers (2024-02-29T12:13:50Z)
- Graph-based Clustering for Detecting Semantic Change Across Time and Languages [10.058655884092094]
We propose a graph-based clustering approach to capture nuanced changes in both high- and low-frequency word senses across time and languages.
Our approach substantially surpasses previous approaches in the SemEval 2020 binary classification task across four languages.
arXiv Detail & Related papers (2024-02-01T21:27:19Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be substantially improved further if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- $\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text Generation [65.29170569821093]
Parallel text generation has received widespread attention due to its advantages in generation efficiency.
In this paper, we propose $\textit{latent}$-GLAT, which employs discrete latent variables to capture word categorical information.
Experiment results show that our method outperforms strong baselines without the help of an autoregressive model.
arXiv Detail & Related papers (2022-04-05T07:34:12Z)
- To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP [0.0]
We investigate three categories of text augmentation methodologies which perform changes at the syntactic level.
We compare them on part-of-speech tagging, dependency parsing and semantic role labeling for a diverse set of language families.
Our results suggest that the augmentation techniques can further improve over strong baselines based on mBERT.
arXiv Detail & Related papers (2021-11-18T10:52:48Z)
- Lexical Semantic Change Discovery [22.934650688233734]
We propose a shift from change detection to change discovery, i.e., discovering novel word senses over time from the full corpus vocabulary.
By heavily fine-tuning a type-based and a token-based approach on recently published German data, we demonstrate that both models can successfully be applied to discover new words undergoing meaning change.
arXiv Detail & Related papers (2021-06-06T13:02:38Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
- Improving Adversarial Text Generation by Modeling the Distant Future [155.83051741029732]
We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues.
We propose a novel guider network to focus on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for generator optimization.
arXiv Detail & Related papers (2020-05-04T05:45:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.