Context-Aware Semantic Similarity Measurement for Unsupervised Word
Sense Disambiguation
- URL: http://arxiv.org/abs/2305.03520v4
- Date: Wed, 13 Dec 2023 07:13:19 GMT
- Title: Context-Aware Semantic Similarity Measurement for Unsupervised Word
Sense Disambiguation
- Authors: Jorge Martinez-Gil
- Abstract summary: This research proposes a new context-aware approach to unsupervised word sense disambiguation.
It provides a flexible mechanism for incorporating contextual information into the similarity measurement process.
Our findings underscore the significance of integrating contextual information in semantic similarity measurements.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The issue of word sense ambiguity poses a significant challenge in natural
language processing due to the scarcity of annotated data to feed machine
learning models to face the challenge. Therefore, unsupervised word sense
disambiguation methods have been developed to overcome that challenge without
relying on annotated data. This research proposes a new context-aware approach
to unsupervised word sense disambiguation, which provides a flexible mechanism
for incorporating contextual information into the similarity measurement
process. We experiment with a popular benchmark dataset to evaluate the
proposed strategy and compare its performance with state-of-the-art
unsupervised word sense disambiguation techniques. The experimental results
indicate that our approach substantially enhances disambiguation accuracy and
surpasses the performance of several existing techniques. Our findings
underscore the significance of integrating contextual information in semantic
similarity measurements to manage word sense ambiguity in unsupervised
scenarios effectively.
Related papers
- Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts.
We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - Embarrassingly Simple Unsupervised Aspect Based Sentiment Tuple Extraction [0.6429156819529861]
We propose a simple and novel unsupervised approach to extract opinion terms and the corresponding sentiment polarity for aspect terms in a sentence.
Our experimental evaluations, conducted on four benchmark datasets, demonstrate compelling performance to extract the oriented aspect opinion words.
arXiv Detail & Related papers (2024-04-21T19:20:42Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - Interpretable Automatic Fine-grained Inconsistency Detection in Text
Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
arXiv Detail & Related papers (2023-05-23T22:11:47Z) - Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty
Estimation for Facial Expression Recognition [59.52434325897716]
We propose a solution, named DMUE, to address the problem of annotation ambiguity from two perspectives.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationship of semantic feature between instances are fully exploited to estimate the ambiguity extent in the instance space.
arXiv Detail & Related papers (2021-04-01T03:21:57Z) - Disambiguation of weak supervision with exponential convergence rates [88.99819200562784]
In supervised learning, data are annotated with incomplete yet discriminative information.
In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets.
We propose an empirical disambiguation algorithm to recover full supervision from weak supervision.
arXiv Detail & Related papers (2021-02-04T18:14:32Z) - Multi-sense embeddings through a word sense disambiguation process [2.2344764434954256]
Most Suitable Sense.
(MSSA) disambiguates and annotates each word by its specific sense, considering the semantic effects of its context.
We test our approach on six different benchmarks for the word similarity task, showing that our approach can produce state-of-the-art results.
arXiv Detail & Related papers (2021-01-21T16:22:34Z) - A Novel Word Sense Disambiguation Approach Using WordNet Knowledge Graph [0.0]
This paper presents a knowledge-based word sense disambiguation algorithm, namely Sequential Contextual Similarity Matrix multiplication (SCSMM)
The SCSMM algorithm combines semantic similarity, knowledge, and document context to respectively exploit the merits of local context.
The proposed algorithm outperformed all other algorithms when disambiguating nouns on the combined gold standard datasets.
arXiv Detail & Related papers (2021-01-08T06:47:32Z) - Detecting Word Sense Disambiguation Biases in Machine Translation for
Model-Agnostic Adversarial Attacks [84.61578555312288]
We introduce a method for the prediction of disambiguation errors based on statistical data properties.
We develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors.
Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
arXiv Detail & Related papers (2020-11-03T17:01:44Z) - Towards Semantic Noise Cleansing of Categorical Data based on Semantic
Infusion [4.825584239754082]
We formalize semantic noise as a sequence of terms that do not contribute to the narrative of the text.
We present a novel Semantic Infusion technique to associate meta-data with the categorical corpus text.
We propose an unsupervised text-preprocessing framework to filter the semantic noise using the context of the terms.
arXiv Detail & Related papers (2020-02-06T13:11:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.