Semantic Specialization for Knowledge-based Word Sense Disambiguation
- URL: http://arxiv.org/abs/2304.11340v1
- Date: Sat, 22 Apr 2023 07:40:23 GMT
- Title: Semantic Specialization for Knowledge-based Word Sense Disambiguation
- Authors: Sakae Mizuki and Naoaki Okazaki
- Abstract summary: A promising approach for knowledge-based Word Sense Disambiguation (WSD) is to select the sense whose contextualized embeddings, computed for its definition sentence, are closest to those computed for a target word in a given sentence.
We propose a semantic specialization for WSD where contextualized embeddings are adapted to the WSD task using solely lexical knowledge.
- Score: 12.573927420408365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A promising approach for knowledge-based Word Sense Disambiguation (WSD) is
to select the sense whose contextualized embeddings computed for its definition
sentence are closest to those computed for a target word in a given sentence.
This approach relies on the similarity of the sense and context embeddings
computed by a pre-trained language model. We
propose a semantic specialization for WSD where contextualized embeddings are
adapted to the WSD task using solely lexical knowledge. The key idea is, for a
given sense, to bring semantically related senses and contexts closer and send
different/unrelated senses farther away. We realize this idea as the joint
optimization of the Attract-Repel objective for sense pairs and the
self-training objective for context-sense pairs while controlling deviations
from the original embeddings. The proposed method outperformed previous studies
that adapt contextualized embeddings. It achieved state-of-the-art performance
on knowledge-based WSD when combined with the reranking heuristic that uses the
sense inventory. We found that the similarity characteristics of specialized
embeddings conform to the key idea. We also found that the (dis)similarity of
embeddings between the related/different/unrelated senses correlates well with
the performance of WSD.
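To make the nearest-sense baseline and the Attract-Repel idea concrete, here is a minimal sketch in Python. The encoder choice (bert-base-uncased), mean pooling over subwords, and the margin value are illustrative assumptions, not the paper's configuration; the paper's deviation control and self-training objective are omitted.

```python
# Minimal sketch: nearest-sense WSD over WordNet glosses, plus an
# Attract-Repel style hinge loss. Assumes bert-base-uncased and mean
# pooling (pip install torch transformers nltk; nltk.download("wordnet")).
import torch
from transformers import AutoModel, AutoTokenizer
from nltk.corpus import wordnet as wn

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    """Mean-pooled contextualized embedding of a piece of text."""
    batch = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

def disambiguate(sentence: str, target: str) -> str:
    """Pick the WordNet sense whose gloss embedding is closest to the context."""
    context = embed(sentence)
    scores = {
        s.name(): torch.cosine_similarity(context, embed(s.definition()), dim=0).item()
        for s in wn.synsets(target)
    }
    return max(scores, key=scores.get)

def attract_repel(anchor, related, unrelated, margin=0.4):
    """Hinge loss pulling related senses closer than unrelated ones (margin assumed)."""
    pos = torch.cosine_similarity(anchor, related, dim=0)
    neg = torch.cosine_similarity(anchor, unrelated, dim=0)
    return torch.clamp(margin - pos + neg, min=0.0)

print(disambiguate("He deposited the cash at the bank.", "bank"))
```

In the paper's full method, this Attract-Repel term is jointly optimized with a self-training objective over context-sense pairs while deviations from the original embeddings are controlled; only the loss shape is shown here.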
Related papers
- Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings [19.12036493733793]
We propose the first-ever meta-sense embedding method -- Neighbour Preserving Meta-Sense Embeddings.
Our proposed method can combine source sense embeddings that cover different sets of word senses.
Experimental results on Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the proposed meta-sense embedding method consistently outperforms several competitive baselines.
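A rough sketch of the combination step this summary describes might look as follows. The random linear maps stand in for learned projections into a shared space, and the neighbour-preservation objective that gives the method its name is not implemented here.

```python
# Hypothetical combination of source sense embeddings that cover
# different sense inventories: project each source into a shared space
# and average whichever sources cover a given sense.
import numpy as np

def meta_sense_embeddings(
    sources: list[dict[str, np.ndarray]], dim: int, seed: int = 0
) -> dict[str, np.ndarray]:
    rng = np.random.default_rng(seed)
    # One random linear map per source stands in for a learned projection.
    maps = [
        rng.standard_normal((dim, next(iter(src.values())).shape[0]))
        for src in sources
    ]
    all_senses = set().union(*(src.keys() for src in sources))
    meta = {}
    for sense in all_senses:
        # Average however many sources actually cover this sense.
        views = [m @ src[sense] for m, src in zip(maps, sources) if sense in src]
        meta[sense] = np.mean(views, axis=0)
    return meta

# Two toy sources covering overlapping but different sense sets.
a = {"bank.n.01": np.ones(4), "bank.n.09": np.zeros(4)}
b = {"bank.n.01": np.ones(6), "bank.v.02": np.ones(6)}
print(sorted(meta_sense_embeddings([a, b], dim=8)))
```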
arXiv Detail & Related papers (2023-05-30T14:53:44Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- A Quadratic 0-1 Programming Approach for Word Sense Disambiguation [0.0]
Word Sense Disambiguation (WSD) is the task of determining the sense of an ambiguous word in a given context.
We argue that a major difficulty in finding the right patterns lies in the interactions between the senses of different target words.
In this work, we model these interactions with a Quadratic Integer Programming (QIP) formulation that maximizes the objective of a WSD problem.
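The quadratic 0-1 formulation can be illustrated on a toy instance: choose one sense per target word so that the total pairwise sense relatedness is maximized. All relatedness values below are invented for illustration; a real system would estimate them from a knowledge source.

```python
# Brute-force illustration of a quadratic 0-1 WSD instance: pick one
# sense per target word to maximize total pairwise sense relatedness.
from itertools import product

# senses[i] lists the candidate senses of the i-th target word.
senses = [["bank.n.01", "bank.n.09"], ["deposit.v.02", "deposit.v.04"]]

# Symmetric pairwise relatedness scores between candidate senses (toy values).
relatedness = {
    ("bank.n.01", "deposit.v.02"): 0.9,
    ("bank.n.09", "deposit.v.02"): 0.2,
    ("bank.n.01", "deposit.v.04"): 0.1,
    ("bank.n.09", "deposit.v.04"): 0.3,
}

def objective(assignment: tuple[str, ...]) -> float:
    """Sum of relatedness over all pairs of chosen senses (the x^T Q x term)."""
    total = 0.0
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            pair = (assignment[i], assignment[j])
            total += relatedness.get(pair, relatedness.get(pair[::-1], 0.0))
    return total

best = max(product(*senses), key=objective)
print(best)  # -> ('bank.n.01', 'deposit.v.02')
```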
arXiv Detail & Related papers (2022-01-13T10:46:06Z)
- Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories [47.03271152494389]
Word Sense Disambiguation aims to automatically identify the exact meaning of a word according to its context.
Existing supervised models struggle to make correct predictions on rare word senses due to limited training data.
We propose a gloss alignment algorithm that can align definition sentences with the same meaning from different sense inventories to collect rich lexical knowledge.
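A simplified version of such an alignment step might pair the most similar definition sentences across two inventories. TF-IDF similarity, the greedy matching, and the threshold are assumptions standing in for the paper's gloss alignment algorithm.

```python
# Rough sketch of aligning glosses across two sense inventories by
# similarity of their definition sentences.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def align_glosses(inv_a: list[str], inv_b: list[str], threshold: float = 0.5):
    vec = TfidfVectorizer().fit(inv_a + inv_b)
    sims = cosine_similarity(vec.transform(inv_a), vec.transform(inv_b))
    order = sorted(
        ((sims[i, j], i, j) for i in range(len(inv_a)) for j in range(len(inv_b))),
        reverse=True,
    )
    pairs, used_a, used_b = [], set(), set()
    for score, i, j in order:  # take highest-similarity pairs first
        if score >= threshold and i not in used_a and j not in used_b:
            pairs.append((inv_a[i], inv_b[j], round(score, 3)))
            used_a.add(i)
            used_b.add(j)
    return pairs

wordnet_glosses = ["a financial institution that accepts deposits",
                   "sloping land beside a body of water"]
other_glosses = ["an institution for receiving and lending money",
                 "the land alongside a river or lake"]
print(align_glosses(wordnet_glosses, other_glosses, threshold=0.1))
```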
arXiv Detail & Related papers (2021-10-27T00:04:33Z)
- Large Scale Substitution-based Word Sense Induction [48.49573297876054]
We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora.
The result is a corpus which is sense-tagged according to a corpus-derived sense inventory and where each sense is associated with indicative words.
Evaluation on English Wikipedia that was sense-tagged using our method shows that both the induced senses and the per-instance sense assignments are of high quality, even when compared to WSD methods such as Babelfy.
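The substitution idea can be sketched as follows: represent each occurrence of a word by the masked language model's top replacement candidates, then cluster the occurrences. Model choice, k, and the clustering step are illustrative assumptions, not the paper's pipeline.

```python
# Sketch of substitution-based sense induction via MLM top-k candidates.
import torch
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import CountVectorizer
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def substitutes(sentence: str, target: str, k: int = 10) -> list[str]:
    """Top-k MLM predictions for the target word's masked position."""
    batch = tok(sentence.replace(target, tok.mask_token, 1), return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**batch).logits
    pos = (batch["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    return [tok.decode(int(i)).strip() for i in logits[0, pos].topk(k).indices]

def induce_senses(sentences: list[str], target: str, n_senses: int = 2):
    """Cluster occurrences by their bags of substitute words."""
    docs = [" ".join(substitutes(s, target)) for s in sentences]
    features = CountVectorizer().fit_transform(docs).toarray()
    return AgglomerativeClustering(n_clusters=n_senses).fit_predict(features)

sents = ["I sat on the bank of the river.",
         "The bank approved my loan.",
         "She deposited money in the bank."]
print(induce_senses(sents, "bank"))  # e.g. [0, 1, 1]
```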
arXiv Detail & Related papers (2021-10-14T19:40:37Z)
- Learning Sense-Specific Static Embeddings using Contextualised Word Embeddings as a Proxy [26.385418377513332]
We propose Context Derived Embeddings of Senses (CDES).
CDES extracts sense-related information from contextualised embeddings and injects it into static embeddings to create sense-specific static embeddings.
We show that CDES can accurately learn sense-specific static embeddings, achieving performance comparable to the current state-of-the-art sense embeddings.
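In spirit, the extraction-and-injection pipeline might look like the sketch below, where sense vectors are mean-pooled contextual embeddings of sense-tagged occurrences and injection is a simple interpolation; both simplifications are assumptions, not CDES's actual architecture.

```python
# Simplified extraction-and-injection for sense-specific static vectors.
import numpy as np

def sense_vectors(tagged: list[tuple[np.ndarray, str]]) -> dict[str, np.ndarray]:
    """tagged: (contextual embedding of a target occurrence, sense id) pairs."""
    buckets: dict[str, list[np.ndarray]] = {}
    for vec, sense in tagged:
        buckets.setdefault(sense, []).append(vec)
    return {sense: np.mean(vecs, axis=0) for sense, vecs in buckets.items()}

def inject(static: np.ndarray, sense_vec: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend static word information with sense-specific information (alpha assumed)."""
    return alpha * static + (1.0 - alpha) * sense_vec

occurrences = [(np.array([1.0, 0.0]), "bank.n.01"),
               (np.array([0.8, 0.2]), "bank.n.01"),
               (np.array([0.0, 1.0]), "bank.n.09")]
senses = sense_vectors(occurrences)
print(inject(np.array([0.5, 0.5]), senses["bank.n.01"]))
```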
arXiv Detail & Related papers (2021-10-05T17:50:48Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper addresses the issue with a mask-and-predict strategy.
We take the words in the longest common subsequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
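A minimal sketch of the mask-and-predict comparison: mask the same shared word in two overlapping sentences and compare the MLM's predicted distributions at that position. A symmetrised KL divergence stands in here for the paper's NDD aggregation over neighboring positions.

```python
# Mask-and-predict sketch over one shared word position.
import torch
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def mask_distribution(sentence: str, word: str) -> torch.Tensor:
    """Log-distribution the MLM predicts at the masked position of `word`."""
    batch = tok(sentence.replace(word, tok.mask_token, 1), return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**batch).logits
    pos = (batch["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    return F.log_softmax(logits[0, pos], dim=-1)

def divergence(text_a: str, text_b: str, shared_word: str) -> float:
    p = mask_distribution(text_a, shared_word)
    q = mask_distribution(text_b, shared_word)
    kl_pq = F.kl_div(q, p, log_target=True, reduction="sum")  # KL(p || q)
    kl_qp = F.kl_div(p, q, log_target=True, reduction="sum")  # KL(q || p)
    return (0.5 * (kl_pq + kl_qp)).item()

print(divergence("He fished from the bank.",
                 "He withdrew cash from the bank.", "bank"))
```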
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition [59.52434325897716]
We propose a solution, named DMUE, to address the problem of annotation ambiguity from two perspectives: latent distribution mining and pairwise uncertainty estimation.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationships of semantic features between instances are fully exploited to estimate the extent of ambiguity in the instance space.
arXiv Detail & Related papers (2021-04-01T03:21:57Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for multi-sense embeddings (M-SE) from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
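A graph walk over WordNet senses, the ingredient named in the title, can be sketched with nltk as below. The relations followed and the walk length are assumptions, and the downstream embedding training over the walk "sentences" is omitted.

```python
# Illustrative random walk over the WordNet sense graph; the resulting
# sequences of synset ids could feed a word2vec-style trainer.
import random
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def neighbours(synset):
    """Adjacent senses via a few WordNet relations (choice assumed)."""
    return synset.hypernyms() + synset.hyponyms() + synset.also_sees()

def walk(start, length: int = 10, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    path, node = [start.name()], start
    for _ in range(length - 1):
        nxt = neighbours(node)
        if not nxt:
            break
        node = rng.choice(nxt)
        path.append(node.name())
    return path

print(walk(wn.synset("bank.n.01")))
```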
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- SensPick: Sense Picking for Word Sense Disambiguation [1.1429576742016154]
We use both the context and the related gloss information of a target word to model the semantic relationship between the word and the set of glosses.
We propose SensPick, a stacked bidirectional Long Short-Term Memory (LSTM) network, to perform the WSD task.
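A toy version of a stacked BiLSTM context-gloss scorer is sketched below; vocabulary size, dimensions, mean pooling, and the dot-product scoring head are illustrative assumptions rather than SensPick's exact architecture.

```python
# Toy stacked BiLSTM scorer for (context, gloss) pairs.
import torch
import torch.nn as nn

class GlossScorer(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 64, hidden: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Two stacked bidirectional LSTM layers shared by context and gloss.
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=2,
                            bidirectional=True, batch_first=True)

    def encode(self, ids: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(self.emb(ids))  # (batch, seq_len, 2 * hidden)
        return out.mean(dim=1)             # mean-pool over tokens

    def forward(self, context_ids: torch.Tensor, gloss_ids: torch.Tensor) -> torch.Tensor:
        # Broadcasting scores one context against several candidate glosses.
        return (self.encode(context_ids) * self.encode(gloss_ids)).sum(dim=-1)

model = GlossScorer(vocab_size=1000)
context = torch.randint(0, 1000, (1, 12))  # one tokenized context sentence
glosses = torch.randint(0, 1000, (3, 8))   # three tokenized candidate glosses
print(model(context, glosses))             # one score per gloss
```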
arXiv Detail & Related papers (2021-02-10T04:52:42Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely used large-scale dataset for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.