A Systematic Comparison of Contextualized Word Embeddings for Lexical
Semantic Change
- URL: http://arxiv.org/abs/2402.12011v3
- Date: Fri, 8 Mar 2024 14:50:40 GMT
- Title: A Systematic Comparison of Contextualized Word Embeddings for Lexical
Semantic Change
- Authors: Francesco Periti, Nina Tahmasebi
- Abstract summary: We evaluate state-of-the-art models and approaches for Graded Change Detection (GCD).
We break the LSC problem into Word-in-Context (WiC) and Word Sense Induction (WSI) tasks, and compare models across these different levels.
Our evaluation is performed across different languages on eight available benchmarks for LSC, and shows that (i) APD outperforms other approaches for GCD; (ii) XL-LEXEME outperforms other contextualized models for WiC, WSI, and GCD, while being comparable to GPT-4.
- Score: 0.696194614504832
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextualized embeddings are the preferred tool for modeling Lexical
Semantic Change (LSC). Current evaluations typically focus on a specific task
known as Graded Change Detection (GCD). However, performance comparisons across
works are often misleading due to their reliance on diverse settings. In this
paper, we evaluate state-of-the-art models and approaches for GCD under equal
conditions. We further break the LSC problem into Word-in-Context (WiC) and
Word Sense Induction (WSI) tasks, and compare models across these different
levels. Our evaluation is performed across different languages on eight
available benchmarks for LSC, and shows that (i) APD outperforms other
approaches for GCD; (ii) XL-LEXEME outperforms other contextualized models for
WiC, WSI, and GCD, while being comparable to GPT-4; (iii) there is a clear need
for improving the modeling of word meanings, as well as a focus on how, when, and
why these meanings change, rather than solely focusing on the extent of
semantic change.
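For readers unfamiliar with APD: it is the Average Pairwise Distance between the contextualized embeddings of a word's usages drawn from two time periods, typically using cosine distance. Below is a minimal sketch (the function name and the numpy/scipy setup are our own; it assumes the usage embeddings have already been extracted):

```python
import numpy as np
from scipy.spatial.distance import cdist

def apd(usages_t1: np.ndarray, usages_t2: np.ndarray) -> float:
    """Average Pairwise (cosine) Distance between two sets of contextualized
    embeddings of one word's usages, one set per time period.
    Higher values indicate more semantic change."""
    return cdist(usages_t1, usages_t2, metric="cosine").mean()

# GCD is then scored by rank-correlating predictions with gold ratings, e.g.:
#   from scipy.stats import spearmanr
#   rho, _ = spearmanr([apd(e1, e2) for e1, e2 in word_embeddings], gold_scores)
```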
Related papers
- Investigating the Contextualised Word Embedding Dimensions Responsible for Contextual and Temporal Semantic Changes [30.563130208194977]
It remains unclear how meaning changes are encoded in the embedding space.
We compare pre-trained CWEs and their fine-tuned versions on semantic change benchmarks.
Our results reveal several novel insights, such as (a) although a small number of axes are responsible for the semantic changes of words in the pre-trained CWE space, this information becomes distributed across all dimensions after fine-tuning.
arXiv Detail & Related papers (2024-07-03T05:42:20Z) - The LSCD Benchmark: a Testbed for Diachronic Word Meaning Tasks [3.8042401909826964]
- The LSCD Benchmark: a Testbed for Diachronic Word Meaning Tasks [3.8042401909826964]
Lexical Semantic Change Detection (LSCD) is a complex, lemma-level task.
This repository reflects the task's modularity by allowing model evaluation for WiC, WSI and LSCD.
arXiv Detail & Related papers (2024-03-29T22:11:54Z) - Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized
Visual Class Discovery [69.91441987063307]
Generalized Category Discovery (GCD) aims to cluster unlabeled data from both known and unknown categories.
Current GCD methods rely only on visual cues, which neglects the multi-modal perceptive nature of human cognitive processes in discovering novel visual categories.
We propose a two-phase TextGCD framework to accomplish multi-modality GCD by exploiting powerful Visual-Language Models.
arXiv Detail & Related papers (2024-03-12T07:06:50Z) - Align, Perturb and Decouple: Toward Better Leverage of Difference
Information for RSI Change Detection [24.249552791014644]
Change detection is a widely adopted technique in remote sensing imagery (RSI) analysis.
We propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling.
arXiv Detail & Related papers (2023-05-30T03:39:53Z) - The Better Your Syntax, the Better Your Semantics? Probing Pretrained
Language Models for the English Comparative Correlative [7.03497683558609]
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics.
We investigate the capability of pretrained language models (PLMs) to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC).
Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning.
arXiv Detail & Related papers (2022-10-24T13:01:24Z) - Word Sense Induction with Hierarchical Clustering and Mutual Information
Maximization [14.997937028599255]
Word sense induction is a difficult problem in natural language processing.
We propose a novel unsupervised method based on hierarchical clustering and invariant information clustering.
We empirically demonstrate that, in certain cases, our approach outperforms prior WSI state-of-the-art methods.
arXiv Detail & Related papers (2022-10-11T13:04:06Z) - ContraCLM: Contrastive Learning For Causal Language Model [54.828635613501376]
- ContraCLM: Contrastive Learning For Causal Language Model [54.828635613501376]
We present ContraCLM, a novel contrastive learning framework at both the token and sequence levels.
We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models.
arXiv Detail & Related papers (2022-10-03T18:56:35Z) - Always Keep your Target in Mind: Studying Semantics and Improving
Performance of Neural Lexical Substitution [124.99894592871385]
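A minimal sketch of a sequence-level contrastive (InfoNCE-style) objective in the spirit of ContraCLM; the actual framework also includes a token-level term, and the two-view setup (e.g., dropout-based) is our assumption:

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over a batch: z1[i] and z2[i] are two views of sequence i;
    all other in-batch pairs serve as negatives."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature             # [batch, batch] similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)         # positives on the diagonal
```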
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - XL-WiC: A Multilingual Benchmark for Evaluating Semantic
Contextualization [98.61159823343036]
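The core "fake it" idea can be sketched as follows: graft the contexts of a donor word onto a target word to simulate a semantic shift with a known answer, then check whether a detection method finds it. The word choices and replacement rate below are illustrative, not the paper's exact procedure:

```python
import random

def inject_shift(corpus: list[str], target: str, donor: str,
                 rate: float = 0.5, seed: int = 0) -> list[str]:
    """Return a copy of the corpus in which a fraction of the donor word's
    occurrences are rewritten as the target word, so the target artificially
    acquires the donor's contexts (a self-supervised, known-answer shift)."""
    rng = random.Random(seed)
    shifted = []
    for sentence in corpus:
        tokens = [target if tok == donor and rng.random() < rate else tok
                  for tok in sentence.split()]
        shifted.append(" ".join(tokens))
    return shifted
```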
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z) - A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models [117.96628873753123]
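WiC itself reduces to a binary decision: do two occurrences of a word share a meaning? A minimal baseline, not a method from the paper, thresholds the cosine similarity of the target word's contextualized embeddings, with the threshold tuned on development data:

```python
import numpy as np

def same_meaning(u: np.ndarray, v: np.ndarray, threshold: float = 0.5) -> bool:
    """u, v: contextualized embeddings of the target word in two sentences.
    Predict 'same meaning' when their cosine similarity clears the threshold."""
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    return cos >= threshold
```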
- A Comparative Study of Lexical Substitution Approaches based on Neural Language Models [117.96628873753123]
We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
arXiv Detail & Related papers (2020-05-29T18:43:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.