Related papers: Investigating the Contextualised Word Embedding Dimensions Responsible for Contextual and Temporal Semantic Changes

Investigating the Contextualised Word Embedding Dimensions Responsible for Contextual and Temporal Semantic Changes

URL: http://arxiv.org/abs/2407.02820v1
Date: Wed, 3 Jul 2024 05:42:20 GMT
Title: Investigating the Contextualised Word Embedding Dimensions Responsible for Contextual and Temporal Semantic Changes
Authors: Taichi Aida, Danushka Bollegala,
Abstract summary: It remains unclear as to how the meaning changes are encoded in the embedding space. We compare pre-trained CWEs and their fine-tuned versions on semantic change benchmarks. Our results reveal several novel insights such as (a) although there exist a smaller number of axes that are responsible for semantic changes of words in the pre-trained CWE space, this information gets distributed across all dimensions when fine-tuned.
Score: 30.563130208194977
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Words change their meaning over time as well as in different contexts. The sense-aware contextualised word embeddings (SCWEs) such as the ones produced by XL-LEXEME by fine-tuning masked langauge models (MLMs) on Word-in-Context (WiC) data attempt to encode such semantic changes of words within the contextualised word embedding (CWE) spaces. Despite the superior performance of SCWEs in contextual/temporal semantic change detection (SCD) benchmarks, it remains unclear as to how the meaning changes are encoded in the embedding space. To study this, we compare pre-trained CWEs and their fine-tuned versions on contextual and temporal semantic change benchmarks under Principal Component Analysis (PCA) and Independent Component Analysis (ICA) transformations. Our experimental results reveal several novel insights such as (a) although there exist a smaller number of axes that are responsible for semantic changes of words in the pre-trained CWE space, this information gets distributed across all dimensions when fine-tuned, and (b) in contrast to prior work studying the geometry of CWEs, we find that PCA to better represent semantic changes than ICA. Source code is available at https://github.com/LivNLP/svp-dims .

Related papers

Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective [50.261681681643076]
We propose a novel metric called SemVarEffect and a benchmark named SemVarBench to evaluate the causality between semantic variations in inputs and outputs in text-to-image synthesis. Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.
arXiv Detail & Related papers (2024-10-14T08:45:35Z)
A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change [0.696194614504832]
We evaluate state-of-the-art models and approaches for Graded Change Detection (GCD) We break the LSC problem into Word-in-Context (WiC) and Word Sense Induction (WSI) tasks, and compare models across these different levels. Our evaluation is performed across different languages on eight available benchmarks for LSC, and shows that (i) APD outperforms other approaches for GCD; (ii) XL-LEXEME outperforms other contextualized models for WiC, WSI, and GCD, while being comparable to GPT-4.
arXiv Detail & Related papers (2024-02-19T10:04:59Z)
Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection [24.249552791014644]
Change detection is a widely adopted technique in remote sense imagery (RSI) analysis. We propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling.
arXiv Detail & Related papers (2023-05-30T03:39:53Z)
Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications. We argue that mean representations alone cannot accurately capture such semantic variations. We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
A Survey on Contextualised Semantic Shift Detection [0.0]
Semantic Shift Detection (SSD) is the task of identifying, interpreting, and assessing the possible change over time in the meanings of a target word. We propose a classification framework characterised by meaning representation, time-awareness, and learning modality dimensions.
arXiv Detail & Related papers (2023-04-04T09:50:19Z)
Joint Spatio-Temporal Modeling for the Semantic Change Detection in Remote Sensing Images [22.72105435238235]
We propose a Semantic Change (SCanFormer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIss. Then, we introduce a semantic learning scheme to leverage the Transformer-temporal constraints, which are coherent to the SCD task, to guide the learning of semantic changes. The resulting network (SCanNet) outperforms the baseline method in terms of both detection of critical semantic changes and semantic consistency in the obtained bi-temporal results.
arXiv Detail & Related papers (2022-12-10T08:49:19Z)
Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR) Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously. Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to better state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation. This paper aims to address the issue with a mask-and-predict strategy. We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions. Experiments on Semantic Textual Similarity show NDD to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones. We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains. Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z)
Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting. Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork. We show our model advances the state of the art in few-shot WSD, supports effective learning in extremely data scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings. We derive new distributional semantic similarity measures for M-SE from prior ones. We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change. We show that our method can be used for the detection of semantic change with any alignment method. We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations. To this end, we automatically generate groups of sentences which are structurally similar but semantically different. We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
Simultaneous Semantic Alignment Network for Heterogeneous Domain Adaptation [67.37606333193357]
We propose aSimultaneous Semantic Alignment Network (SSAN) to simultaneously exploit correlations among categories and align the centroids for each category across domains. By leveraging target pseudo-labels, a robust triplet-centroid alignment mechanism is explicitly applied to align feature representations for each category. Experiments on various HDA tasks across text-to-image, image-to-image and text-to-text successfully validate the superiority of our SSAN against state-of-the-art HDA methods.
arXiv Detail & Related papers (2020-08-04T16:20:37Z)
Text classification with word embedding regularization and soft similarity measure [0.20999222360659603]
Two word embedding regularization techniques were shown to reduce storage and memory costs, and to improve training speed, document processing speed, and task performance. We show 39% average $k$NN test error reduction with regularized word embeddings compared to non-regularized word embeddings. We also show that the SCM with regularized word embeddings significantly outperforms the WMD on text classification and is over 10,000 times faster.
arXiv Detail & Related papers (2020-03-10T22:07:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.