FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary
- URL: http://arxiv.org/abs/2102.07983v1
- Date: Tue, 16 Feb 2021 07:13:34 GMT
- Title: FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary
- Authors: Terra Blevins, Mandar Joshi, and Luke Zettlemoyer
- Abstract summary: Current models for Word Sense Disambiguation (WSD) struggle to disambiguate rare senses.
This paper introduces FEWS, a new low-shot WSD dataset automatically extracted from example sentences in Wiktionary.
- Score: 43.32179344258548
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current models for Word Sense Disambiguation (WSD) struggle to disambiguate
rare senses, despite reaching human performance on global WSD metrics. This
stems from a lack of data for both modeling and evaluating rare senses in
existing WSD datasets. In this paper, we introduce FEWS (Few-shot Examples of
Word Senses), a new low-shot WSD dataset automatically extracted from example
sentences in Wiktionary. FEWS has high sense coverage across different natural
language domains and provides: (1) a large training set that covers many more
senses than previous datasets and (2) a comprehensive evaluation set containing
few- and zero-shot examples of a wide variety of senses. We establish baselines
on FEWS with knowledge-based and neural WSD approaches and present transfer
learning experiments demonstrating that models additionally trained with FEWS
better capture rare senses in existing WSD datasets. Finally, we find humans
outperform the best baseline models on FEWS, indicating that FEWS will support
significant future work on low-shot WSD.
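The baselines mentioned in the abstract include knowledge-based approaches. As an illustration of the kind of setup involved, below is a minimal sketch of a simplified Lesk gloss-overlap baseline together with the few-/zero-shot split evaluation that a low-shot WSD dataset supports. The data structures (Example, sense_inventory) are hypothetical and do not reflect the actual FEWS file format.

```python
from collections import namedtuple

# Hypothetical data structure; not the actual FEWS file format.
Example = namedtuple("Example", ["sentence", "word", "gold_sense"])

def lesk_predict(sentence, candidate_glosses):
    """Pick the sense whose dictionary gloss shares the most words
    with the example sentence (a simplified Lesk baseline)."""
    context = set(sentence.lower().split())
    return max(
        candidate_glosses,
        key=lambda sense: len(context & set(candidate_glosses[sense].lower().split())),
    )

def accuracy(examples, sense_inventory):
    """Fraction of examples whose predicted sense matches the gold sense."""
    correct = sum(
        lesk_predict(ex.sentence, sense_inventory[ex.word]) == ex.gold_sense
        for ex in examples
    )
    return correct / len(examples)

# Low-shot evaluation reports scores separately on few- and zero-shot subsets:
# print(accuracy(few_shot_examples, inventory), accuracy(zero_shot_examples, inventory))
```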
Related papers
- Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation.
Our approach can be applied to existing datasets by automatically generating hard negative test captions.
Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Retrieval-based Disentangled Representation Learning with Natural Language Supervision [61.75109410513864]
We present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
Our approach employs a bi-encoder model to represent both data and natural language in a vocabulary space, enabling the model to distinguish the intrinsic dimensions that capture characteristics within the data through their natural language counterparts, thus achieving disentanglement.
arXiv Detail & Related papers (2022-12-15T10:20:42Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond [2.9005223064604078]
Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information.
We introduce a more principled approach to leverage information from all layers of NLMs, informed by a probing analysis on 14 NLM variants.
We also emphasize the versatility of these sense embeddings, in contrast to task-specific models, by applying them to several sense-related tasks besides WSD.
arXiv Detail & Related papers (2021-05-26T10:14:22Z)
- SensPick: Sense Picking for Word Sense Disambiguation [1.1429576742016154]
We use both the context and the related gloss information of a target word to model the semantic relationship between the word and its set of glosses.
We propose SensPick, a stacked bidirectional Long Short-Term Memory (LSTM) network, to perform the WSD task (a minimal sketch of this setup follows this list).
arXiv Detail & Related papers (2021-02-10T04:52:42Z)
- mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract) [46.61843360106884]
We propose a novel approach to induce invariance using adversarial forgetting (AF).
Our initial experiments on learning invariant features such as accent on the STT task achieve better generalization in terms of word error rate (WER) compared to traditional models.
arXiv Detail & Related papers (2020-11-25T19:02:13Z)
- Analysis and Evaluation of Language Models for Word Sense Disambiguation [18.001457030065712]
Transformer-based language models have taken many fields in NLP by storm.
BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense.
BERT and its derivatives dominate most of the existing evaluation benchmarks.
arXiv Detail & Related papers (2020-08-26T15:07:07Z)
- Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
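As noted in the SensPick entry above, the model scores a target word's context against candidate glosses with stacked bidirectional LSTMs. Below is a loose sketch of that context-gloss setup; the embedding dimensions, mean-pooling, and dot-product scoring head are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ContextGlossScorer(nn.Module):
    """Scores a (context sentence, candidate gloss) pair; a hypothetical
    architecture loosely following the described context-gloss setup."""

    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Stacked bidirectional LSTM shared by the context and gloss encoders.
        self.encoder = nn.LSTM(emb_dim, hidden, num_layers=2,
                               bidirectional=True, batch_first=True)

    def encode(self, token_ids):
        out, _ = self.encoder(self.embed(token_ids))  # (batch, time, 2 * hidden)
        return out.mean(dim=1)                        # mean-pool over time steps

    def forward(self, context_ids, gloss_ids):
        # Dot-product similarity between the context and gloss encodings.
        return (self.encode(context_ids) * self.encode(gloss_ids)).sum(dim=-1)

# At inference, score the context against every candidate gloss and
# predict the highest-scoring sense:
# scores = torch.cat([model(context_ids, g) for g in candidate_gloss_batches])
# predicted = scores.argmax().item()
```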