Context-gloss Augmentation for Improving Word Sense Disambiguation
- URL: http://arxiv.org/abs/2110.07174v1
- Date: Thu, 14 Oct 2021 06:27:19 GMT
- Title: Context-gloss Augmentation for Improving Word Sense Disambiguation
- Authors: Guan-Ting Lin, Manuel Giambi
- Abstract summary: The goal of Word Sense Disambiguation (WSD) is to identify the sense of a polysemous word in a specific context.
We show that both sentence-level and word-level augmentation methods are effective strategies for WSD.
We also find that performance can be improved by adding hypernyms' glosses obtained from a lexical knowledge base.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of Word Sense Disambiguation (WSD) is to identify the sense of a
polysemous word in a specific context. Deep-learning techniques using BERT have
achieved very promising results in the field and different methods have been
proposed to integrate structured knowledge to enhance performance. At the same
time, an increasing number of data augmentation techniques have been proven to
be useful for NLP tasks. Building upon previous works leveraging BERT and
WordNet knowledge, we explore different data augmentation techniques on
context-gloss pairs to improve the performance of WSD. In our experiment, we
show that both sentence-level and word-level augmentation methods are effective
strategies for WSD. We also find that performance can be improved by adding
hypernyms' glosses obtained from a lexical knowledge base. We compare and
analyze different context-gloss augmentation techniques, and the results show
that applying back translation to the glosses performs best.
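To make the setup concrete, the sketch below pairs a context sentence with the gloss of every WordNet sense of the target word and appends each sense's hypernym gloss, the knowledge-base augmentation the abstract describes. This is a minimal illustration using NLTK's WordNet interface, not the authors' released code; the function name and pair format are assumptions.

```python
# Minimal sketch of context-gloss pair construction with hypernym glosses.
# Assumptions: NLTK's WordNet stands in for the paper's lexical knowledge
# base, and the pair format is illustrative. Run nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn


def context_gloss_pairs(context, target, add_hypernym_gloss=True):
    """Return (context, gloss, sense) triples for every WordNet sense of target."""
    pairs = []
    for synset in wn.synsets(target):
        gloss = f"{target}: {synset.definition()}"
        if add_hypernym_gloss:
            # Append each direct hypernym's definition, so bank.n.01
            # ("sloping land ...") also carries the gloss of slope.n.01.
            for hyper in synset.hypernyms():
                gloss += " ; " + hyper.definition()
        pairs.append((context, gloss, synset.name()))
    return pairs


for _, gloss, sense in context_gloss_pairs("He sat on the bank of the river.", "bank"):
    print(sense, "->", gloss)
```

The best-performing augmentation in the paper, back translation applied to glosses, can be approximated with off-the-shelf MarianMT checkpoints. The pivot language and model choice below are assumptions; the abstract does not pin them down.

```python
# Hedged sketch of gloss back translation (English -> German -> English).
# Model checkpoints and pivot language are assumptions, not the paper's setup.
from transformers import MarianMTModel, MarianTokenizer


def translate(texts, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**batch)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in outputs]


def back_translate_glosses(glosses):
    pivot = translate(glosses, "Helsinki-NLP/opus-mt-en-de")
    return translate(pivot, "Helsinki-NLP/opus-mt-de-en")


print(back_translate_glosses(["sloping land (especially the slope beside a body of water)"]))
```

Each back-translated gloss is a paraphrase that can replace or complement the original gloss in a context-gloss pair before fine-tuning BERT.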
Related papers
- TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings
We propose Text Guided LLaVA (TG-LLaVA) to optimize vision-language models (VLMs).
We use learnable latent embeddings as a bridge to analyze the textual instruction and add the analysis results to the vision encoder as guidance.
With the guidance of text, the vision encoder can extract text-related features, similar to how humans focus on the most relevant parts of an image when considering a question.
arXiv Detail & Related papers (2024-09-15T00:38:34Z)
- BERTer: The Efficient One
We explore advanced fine-tuning techniques to boost BERT's performance in sentiment analysis, paraphrase detection, and semantic textual similarity.
Our findings reveal substantial improvements in model efficiency and effectiveness when combining multiple fine-tuning architectures.
arXiv Detail & Related papers (2024-07-19T05:33:09Z)
- FecTek: Enhancing Term Weight in Lexicon-Based Retrieval with Feature Context and Term-level Knowledge
We introduce an innovative method built on FEature Context and TErm-level Knowledge modules.
To enrich the feature context representations of term weights, a Feature Context Module (FCM) is introduced.
We also develop a Term-level Knowledge Guidance Module (TKGM) that uses term-level knowledge to guide the modeling of term weights.
arXiv Detail & Related papers (2024-04-18T12:58:36Z)
- LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
LLM-DA is a novel data augmentation technique based on large language models (LLMs) for the few-shot NER task.
Our approach employs 14 contextual rewriting strategies, designs entity replacements of the same type, and incorporates noise injection to enhance robustness.
arXiv Detail & Related papers (2024-02-22T14:19:56Z)
- Distributional Data Augmentation Methods for Low Resource Language
Easy data augmentation (EDA) augments the training data by injecting and replacing synonyms and randomly permuting sentences.
One major obstacle with EDA is the need for versatile and complete synonym dictionaries, which cannot be easily found in low-resource languages.
We propose two extensions, easy distributional data augmentation (EDDA) and type specific similar word replacement (TSSR), which use semantic word context information and part-of-speech tags for word replacement and augmentation (a generic word-level replacement sketch appears after this list).
arXiv Detail & Related papers (2023-09-09T19:01:59Z)
- Iterative Prompt Learning for Unsupervised Backlit Image Enhancement
We propose a novel unsupervised backlit image enhancement method, abbreviated as CLIP-LIT.
We show that the open-world CLIP prior aids in distinguishing between backlit and well-lit images.
Our method alternates between updating the prompt learning framework and enhancement network until visually pleasing results are achieved.
arXiv Detail & Related papers (2023-03-30T17:37:14Z)
- CMSBERT-CLR: Context-driven Modality Shifting BERT with Contrastive Learning for linguistic, visual, acoustic Representations
We present a Context-driven Modality Shifting BERT with Contrastive Learning for linguistic, visual, acoustic Representations (CMSBERT-CLR).
CMSBERT-CLR incorporates the whole context's non-verbal and verbal information and aligns modalities more effectively through contrastive learning.
In our experiments, we demonstrate that our approach achieves state-of-the-art results.
arXiv Detail & Related papers (2022-08-21T08:21:43Z)
- Syntax-driven Data Augmentation for Named Entity Recognition
In low resource settings, data augmentation strategies are commonly leveraged to improve performance.
We compare simple masked language model replacement and an augmentation method using constituency tree mutations to improve named entity recognition.
arXiv Detail & Related papers (2022-08-15T01:24:55Z)
- Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase
We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks.
We show our method performs strongly on domain and intent classification tasks for a voice assistant and in a user-study focused on utterance naturalness and semantic similarity.
arXiv Detail & Related papers (2021-04-16T17:53:58Z)
- SDA: Improving Text Generation with Self Data Augmentation
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- Syntax-aware Data Augmentation for Neural Machine Translation
We propose a novel data augmentation strategy for neural machine translation.
We set sentence-specific probabilities for word selection by considering the words' roles in the sentence.
Our proposed method is evaluated on WMT14 English-to-German dataset and IWSLT14 German-to-English dataset.
arXiv Detail & Related papers (2020-04-29T13:45:30Z)
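As referenced in the distributional-augmentation entry above, word-level augmentation in the EDA family typically amounts to probabilistic synonym replacement. The sketch below is a generic illustration; the replacement probability and function name are assumptions and are taken from none of the listed papers.

```python
# Hedged EDA-style word-level augmentation: random WordNet synonym replacement.
# The replacement probability and helper names are illustrative assumptions.
import random

from nltk.corpus import wordnet as wn


def synonym_replace(sentence, p=0.2, seed=0):
    """Replace each word with a random WordNet synonym with probability p."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        # Collect alternative lemmas across all senses of the word.
        lemmas = {
            lemma.name().replace("_", " ")
            for synset in wn.synsets(word)
            for lemma in synset.lemmas()
            if lemma.name().lower() != word.lower()
        }
        if lemmas and rng.random() < p:
            out.append(rng.choice(sorted(lemmas)))
        else:
            out.append(word)
    return " ".join(out)


print(synonym_replace("He sat on the bank of the river"))
```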
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.