Learning Sense-Specific Static Embeddings using Contextualised Word
Embeddings as a Proxy
- URL: http://arxiv.org/abs/2110.02204v2
- Date: Wed, 6 Oct 2021 10:30:37 GMT
- Title: Learning Sense-Specific Static Embeddings using Contextualised Word
Embeddings as a Proxy
- Authors: Yi Zhou and Danushka Bollegala
- Abstract summary: We propose Context Derived Embeddings of Senses (CDES).
CDES extracts sense-related information from contextualised embeddings and injects it into static embeddings to create sense-specific static embeddings.
We show that CDES accurately learns sense-specific static embeddings, reporting performance comparable to the current state-of-the-art sense embeddings.
- Score: 26.385418377513332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextualised word embeddings generated from Neural Language Models (NLMs),
such as BERT, represent a word with a vector that considers the semantics of
the target word as well as its context. On the other hand, static word
embeddings such as GloVe represent words by relatively low-dimensional,
memory- and compute-efficient vectors, but are not sensitive to the different
senses of a word. We propose Context Derived Embeddings of Senses (CDES), a
method that extracts sense-related information from contextualised embeddings
and injects it into static embeddings to create sense-specific static
embeddings. Experimental results on multiple benchmarks for word sense
disambiguation and sense discrimination tasks show that CDES accurately
learns sense-specific static embeddings, reporting performance comparable to
the current state-of-the-art sense embeddings.
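As a rough illustration of the recipe the abstract describes, the sketch below averages contextualised BERT vectors per sense over a toy sense-annotated sample and concatenates each sense centroid with the word's static vector. The model name, the toy data, and the use of concatenation are illustrative assumptions, not the authors' exact procedure.

```python
# Hypothetical sketch: build sense-specific static vectors by combining a
# word's static vector with per-sense centroids of BERT contextual vectors.
# Model names and the toy sense-annotated data are assumptions.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

# Toy sense-annotated occurrences: (sentence, sense id of "bank").
tagged = [
    ("I deposited cash at the bank", "bank%finance"),
    ("The bank approved the loan", "bank%finance"),
    ("We sat on the river bank", "bank%river"),
]

def contextual_vector(sentence: str, word: str) -> np.ndarray:
    """Mean of the contextual vectors of `word`'s subword tokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
    ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    positions = [i for i, t in enumerate(enc["input_ids"][0].tolist()) if t in ids]
    return hidden[positions].mean(dim=0).numpy()

# Per-sense centroids of the contextual vectors.
by_sense: dict[str, list[np.ndarray]] = {}
for sent, sense in tagged:
    by_sense.setdefault(sense, []).append(contextual_vector(sent, "bank"))
centroids = {s: np.mean(vs, axis=0) for s, vs in by_sense.items()}

# Inject sense information into a static vector by concatenation.
glove_bank = np.random.rand(300)  # stand-in for the real GloVe vector
sense_embeddings = {s: np.concatenate([glove_bank, c]) for s, c in centroids.items()}
```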
Related papers
- Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings [19.12036493733793]
We propose the first-ever meta-sense embedding method -- Neighbour Preserving Meta-Sense Embeddings.
Our proposed method can combine source sense embeddings that cover different sets of word senses.
Experimental results on Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the proposed meta-sense embedding method consistently outperforms several competitive baselines.
arXiv Detail & Related papers (2023-05-30T14:53:44Z)
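A naive way to combine source sense embeddings with different coverage is zero-padded concatenation, sketched below; this is a baseline illustration only, not the neighbour-preserving method the paper proposes.

```python
# Naive meta-sense embedding by concatenation: missing senses in a source
# are padded with zeros, so sources may cover different sense sets.
import numpy as np

def combine_sources(sources: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    dims = [next(iter(s.values())).shape[0] for s in sources]
    all_senses = set().union(*(s.keys() for s in sources))
    meta = {}
    for sense in all_senses:
        parts = [s.get(sense, np.zeros(d)) for s, d in zip(sources, dims)]
        meta[sense] = np.concatenate(parts)
    return meta

# Two toy source inventories with partly disjoint sense coverage.
src_a = {"bank%finance": np.random.rand(4), "bank%river": np.random.rand(4)}
src_b = {"bank%finance": np.random.rand(6), "bank%building": np.random.rand(6)}
print({k: v.shape for k, v in combine_sources([src_a, src_b]).items()})
```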
- Semantic Specialization for Knowledge-based Word Sense Disambiguation [12.573927420408365]
A promising approach for knowledge-based Word Sense Disambiguation (WSD) is to select the sense whose contextualized embeddings are closest to those computed for a target word in a given sentence.
We propose semantic specialization for WSD, where contextualized embeddings are adapted to the WSD task using only lexical knowledge.
arXiv Detail & Related papers (2023-04-22T07:40:23Z)
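The nearest-sense selection rule this summary describes can be sketched as follows; the toy vectors stand in for real sense and context embeddings.

```python
# Knowledge-based WSD by nearest sense: pick the sense whose embedding is
# closest (by cosine similarity) to the target word's contextual vector.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def disambiguate(context_vec: np.ndarray, sense_vecs: dict[str, np.ndarray]) -> str:
    return max(sense_vecs, key=lambda s: cosine(context_vec, sense_vecs[s]))

senses = {"bank%finance": np.random.rand(8), "bank%river": np.random.rand(8)}
print(disambiguate(np.random.rand(8), senses))
```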
- Word Sense Induction with Knowledge Distillation from BERT [6.88247391730482]
This paper proposes a method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in context.
Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings.
arXiv Detail & Related papers (2023-04-20T21:05:35Z)
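A generic sketch of attention over candidate sense vectors, where softmax-normalised dot products against a contextual vector yield per-sense weights; the paper's distillation objective is not reproduced here.

```python
# Attention over sense vectors: dot products between a contextual vector and
# candidate sense vectors, softmax-normalised into per-sense weights.
import numpy as np

def sense_attention(context_vec: np.ndarray, sense_matrix: np.ndarray) -> np.ndarray:
    scores = sense_matrix @ context_vec          # (num_senses,)
    scores -= scores.max()                       # numerical stability
    return np.exp(scores) / np.exp(scores).sum()

sense_matrix = np.random.rand(3, 8)              # 3 candidate senses
weights = sense_attention(np.random.rand(8), sense_matrix)
mixture = weights @ sense_matrix                 # context-specific sense vector
```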
- Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories [47.03271152494389]
Word Sense Disambiguation aims to automatically identify the exact meaning of a word according to its context.
Existing supervised models struggle to make correct predictions on rare word senses due to limited training data.
We propose a gloss alignment algorithm that can align definition sentences with the same meaning from different sense inventories to collect rich lexical knowledge.
arXiv Detail & Related papers (2021-10-27T00:04:33Z)
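One plausible form of gloss alignment is to embed definition sentences and greedily pair the most similar ones across inventories, as sketched below; the encoder choice and the similarity threshold are illustrative assumptions.

```python
# Greedy gloss alignment across two sense inventories: embed definitions and
# pair each gloss with its most similar counterpart above a threshold.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice

inv_a = ["a financial institution that accepts deposits",
         "sloping land beside a body of water"]
inv_b = ["an establishment for the custody of money",
         "the rising ground bordering a river"]

sims = cosine_similarity(encoder.encode(inv_a), encoder.encode(inv_b))
for i, row in enumerate(sims):
    j = row.argmax()
    if row[j] > 0.5:  # assumed alignment threshold
        print(f"aligned: {inv_a[i]!r} <-> {inv_b[j]!r} ({row[j]:.2f})")
```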
- Large Scale Substitution-based Word Sense Induction [48.49573297876054]
We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora.
The result is a corpus which is sense-tagged according to a corpus-derived sense inventory and where each sense is associated with indicative words.
Evaluation on English Wikipedia that was sense-tagged using our method shows that both the induced senses and the per-instance sense assignments are of high quality, even compared to WSD methods such as Babelfy.
arXiv Detail & Related papers (2021-10-14T19:40:37Z)
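A minimal substitution-based induction sketch: represent each occurrence of a target word by its MLM substitute candidates and cluster occurrences by those substitutes. The model and the clustering choices are assumptions.

```python
# Substitution-based word sense induction sketch: collect MLM substitute
# candidates per occurrence, then cluster occurrences by substitute overlap.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import CountVectorizer
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

occurrences = [
    "I went to the [MASK] to withdraw money.",
    "The [MASK] charged high interest on the loan.",
    "We had a picnic on the [MASK] of the river.",
]

# Represent each occurrence by its top substitute words.
substitute_docs = [
    " ".join(r["token_str"] for r in fill(sent, top_k=10)) for sent in occurrences
]
X = CountVectorizer().fit_transform(substitute_docs).toarray()
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
print(list(zip(labels, occurrences)))
```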
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common subsequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions.
Experiments on Semantic Textual Similarity show the resulting neighboring distribution divergence (NDD) metric to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
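A simplified single-position illustration of the mask-and-predict idea: mask the same shared word in two texts and compare the MLM's predicted distributions with KL divergence. This is not the paper's full NDD computation.

```python
# Mask-and-predict sketch: read off the MLM's distribution at the masked
# position in each text and compare the two distributions with KL divergence.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def masked_distribution(text: str) -> torch.Tensor:
    enc = tok(text, return_tensors="pt")
    pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, pos]
    return torch.softmax(logits, dim=-1)

p = masked_distribution("He deposited money in the [MASK] yesterday.")
q = masked_distribution("He deposited cash in the [MASK] last week.")
kl = torch.sum(p * (torch.log(p) - torch.log(q)))
print(f"KL divergence at the shared position: {kl.item():.4f}")
```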
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for multi-sense embeddings (M-SE) from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
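A toy walk over WordNet's sense graph, collecting lemmas along hypernym/hyponym edges; this illustrates graph walks over senses generally, not the EDS-MEMBED procedure itself.

```python
# Toy graph walk over WordNet senses: hop along hypernym/hyponym edges from
# a starting synset and collect lemmas, yielding sense-biased neighbourhoods.
import random

from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def sense_walk(synset, steps: int = 5, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    lemmas, current = [], synset
    for _ in range(steps):
        lemmas.extend(current.lemma_names())
        neighbours = current.hypernyms() + current.hyponyms()
        if not neighbours:
            break
        current = rng.choice(neighbours)
    return lemmas

print(sense_walk(wn.synset("bank.n.01")))   # river-bank sense
print(sense_walk(wn.synset("bank.n.02")))   # financial-institution sense
```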
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
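One crude way to generate structurally similar but semantically different sentences is same-POS lexical substitution, sketched below with a hypothetical word pool; the paper's actual generation procedure differs.

```python
# Swap content words for random same-POS words, keeping the syntactic frame
# intact, so variants share structure but not semantics.
import random

import nltk  # requires: nltk.download("punkt"), nltk.download("averaged_perceptron_tagger")

POOL = {"NN": ["dog", "piano", "storm"], "VBD": ["ignored", "painted", "chased"]}

def structural_variant(sentence: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    return " ".join(rng.choice(POOL[tag]) if tag in POOL else tok for tok, tag in tagged)

print(structural_variant("The cat chased the mouse"))
```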
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely used, large-scale dataset for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Word Sense Disambiguation for 158 Languages using Word Embeddings Only [80.79437083582643]
Disambiguation of word senses in context is easy for humans, but a major challenge for automatic approaches.
We present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory.
We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings.
arXiv Detail & Related papers (2020-03-14T14:50:04Z)
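An ego-network style sketch of inducing a sense inventory from pre-trained embeddings alone: link mutually similar nearest neighbours of a target word and treat connected components as senses. The similarity threshold and the vector file path are assumptions, and the paper's actual algorithm differs in detail.

```python
# Ego-network sense induction sketch: cluster a word's nearest neighbours in
# a static embedding space into components, one component per induced sense.
import networkx as nx
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("cc.en.300.vec")  # assumed fastText path

def induce_senses(word: str, k: int = 30, threshold: float = 0.5):
    neighbours = [w for w, _ in vectors.most_similar(word, topn=k)]
    g = nx.Graph()
    g.add_nodes_from(neighbours)
    for i, a in enumerate(neighbours):
        for b in neighbours[i + 1:]:
            if vectors.similarity(a, b) > threshold:
                g.add_edge(a, b)
    return list(nx.connected_components(g))  # each component ~ one sense

for cluster in induce_senses("bank"):
    print(sorted(cluster))
```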