An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek
- URL: http://arxiv.org/abs/2311.00541v5
- Date: Tue, 25 Jun 2024 16:13:12 GMT
- Title: An Embedded Diachronic Sense Change Model with a Case Study from Ancient Greek
- Authors: Schyan Zafar, Geoff K. Nicholls
- Abstract summary: This paper introduces EDiSC, an Embedded DiSC model, which combines word embeddings with DiSC to provide superior model performance.
It is shown empirically that EDiSC offers improved predictive accuracy, ground-truth recovery and uncertainty quantification.
- Score: 0.4143603294943439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word meanings change over time, and word senses evolve, emerge or die out in the process. For ancient languages, where the corpora are often small and sparse, modelling such changes accurately proves challenging, and quantifying uncertainty in sense-change estimates consequently becomes important. GASC (Genre-Aware Semantic Change) and DiSC (Diachronic Sense Change) are existing generative models that have been used to analyse sense change for target words from an ancient Greek text corpus, using unsupervised learning without the help of any pre-training. These models represent the senses of a given target word such as "kosmos" (meaning decoration, order or world) as distributions over context words, and sense prevalence as a distribution over senses. The models are fitted using Markov Chain Monte Carlo (MCMC) methods to measure temporal changes in these representations. This paper introduces EDiSC, an Embedded DiSC model, which combines word embeddings with DiSC to provide superior model performance. It is shown empirically that EDiSC offers improved predictive accuracy, ground-truth recovery and uncertainty quantification, as well as better sampling efficiency and scalability properties with MCMC methods. The challenges of fitting these models are also discussed.
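The representation described in the abstract (senses as distributions over context words, sense prevalence as a distribution over senses) can be illustrated with a toy sketch. This is not the EDiSC, DiSC or GASC implementation: the vocabulary, the probability values, the period names and the function names below are all made up for illustration, the parameters are fixed by hand rather than inferred with MCMC, and no word embeddings are involved.

```python
import random

# Hypothetical toy vocabulary and senses for the target word "kosmos".
VOCAB = ["gold", "jewel", "law", "city", "heaven", "earth"]
SENSES = ["decoration", "order", "world"]

# Each sense is a probability distribution over context words (each row sums to 1).
# The values are invented for this sketch.
SENSE_WORD_PROBS = {
    "decoration": [0.40, 0.40, 0.05, 0.05, 0.05, 0.05],
    "order":      [0.05, 0.05, 0.40, 0.40, 0.05, 0.05],
    "world":      [0.05, 0.05, 0.05, 0.05, 0.40, 0.40],
}

# Sense prevalence: one distribution over senses per time period (invented values).
PREVALENCE_BY_PERIOD = {
    "early": [0.6, 0.3, 0.1],
    "late":  [0.1, 0.3, 0.6],
}


def sample_snippet(period, length, rng):
    """Draw a sense from the period's prevalence, then context words from that sense."""
    sense = rng.choices(SENSES, weights=PREVALENCE_BY_PERIOD[period], k=1)[0]
    words = rng.choices(VOCAB, weights=SENSE_WORD_PROBS[sense], k=length)
    return sense, words


def sense_posterior(period, words):
    """Posterior over senses for one snippet via Bayes' rule, with known parameters."""
    scores = []
    for sense, prior in zip(SENSES, PREVALENCE_BY_PERIOD[period]):
        likelihood = 1.0
        for word in words:
            likelihood *= SENSE_WORD_PROBS[sense][VOCAB.index(word)]
        scores.append(prior * likelihood)
    total = sum(scores)
    return [score / total for score in scores]


if __name__ == "__main__":
    rng = random.Random(0)
    sense, words = sample_snippet("late", 5, rng)
    print(sense, words, sense_posterior("late", words))
```

In the models discussed in the abstract, the sense and prevalence distributions are unknown and are estimated from the corpus; the sketch only shows how a snippet relates to those distributions once they are given.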
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - SSCAE -- Semantic, Syntactic, and Context-aware natural language Adversarial Examples generator [0.44998333629984877]
Machine learning models are vulnerable to maliciously crafted Adversarial Examples (AEs).
This paper introduces a practical and efficient adversarial attack model called SSCAE, for Semantic, Syntactic, and Context-aware natural language AEs generator.
arXiv Detail & Related papers (2024-03-18T14:45:20Z) - Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models [44.56781176879151]
State-of-the-art Natural Language Inference (NLI) models are sensitive towards minor semantics preserving surface-form variations.
We show that semantic sensitivity causes performance degradations of 12.92% and 23.71% on average over in- and out-of-domain settings.
arXiv Detail & Related papers (2024-01-25T14:47:05Z) - Calibration of Time-Series Forecasting: Detecting and Adapting Context-Driven Distribution Shift [28.73747033245012]
We introduce a universal calibration methodology for the detection and adaptation of context-driven distribution shifts.
A novel CDS detector, termed the "residual-based CDS detector" or "Reconditionor", quantifies the model's vulnerability to CDS.
A high Reconditionor score indicates a severe susceptibility, thereby necessitating model adaptation.
arXiv Detail & Related papers (2023-10-23T11:58:01Z) - Contextualized language models for semantic change detection: lessons learned [4.436724861363513]
We present a qualitative analysis of the outputs of contextualized embedding-based methods for detecting diachronic semantic change.
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift.
Our conclusion is that pre-trained contextualized language models are prone to confound changes in lexicographic senses and changes in contextual variance.
arXiv Detail & Related papers (2022-08-31T23:35:24Z) - ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, namely, Equivariance Regularizer (ER).
ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities.
The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z) - Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z) - A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
arXiv Detail & Related papers (2021-05-20T15:18:06Z) - Measuring diachronic sense change: new models and Monte Carlo methods for Bayesian inference [3.1727619150610837]
In a bag-of-words model, the senses of a word with multiple meanings are represented as probability distributions over context words.
We adapt an existing generative sense change model to develop a simpler model for the main effects of sense and time.
We carry out automatic sense-annotation of snippets containing "kosmos" using our model, and measure the time-evolution of its three senses and their prevalence.
arXiv Detail & Related papers (2021-04-14T11:40:21Z) - Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z) - SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals of distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.