Meta-Learning with Variational Semantic Memory for Word Sense
Disambiguation
- URL: http://arxiv.org/abs/2106.02960v1
- Date: Sat, 5 Jun 2021 20:40:01 GMT
- Title: Meta-Learning with Variational Semantic Memory for Word Sense
Disambiguation
- Authors: Yingjun Du, Nithin Holla, Xiantong Zhen, Cees G.M. Snoek, Ekaterina
Shutova
- Abstract summary: We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
- Score: 56.830395467247016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A critical challenge faced by supervised word sense disambiguation (WSD) is
the lack of large annotated datasets with sufficient coverage of words in their
diversity of senses. This inspired recent research on few-shot WSD using
meta-learning. While such work has successfully applied meta-learning to learn
new word senses from very few examples, its performance still lags behind its
fully supervised counterpart. Aiming to further close this gap, we propose a
model of semantic memory for WSD in a meta-learning setting. Semantic memory
encapsulates prior experiences seen throughout the lifetime of the model, which
aids better generalization in limited data settings. Our model is based on
hierarchical variational inference and incorporates an adaptive memory update
rule via a hypernetwork. We show our model advances the state of the art in
few-shot WSD, supports effective learning in extremely data scarce (e.g.
one-shot) scenarios and produces meaning prototypes that capture similar senses
of distinct words.
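For a concrete picture of the episodic setup described in the abstract, the following is a minimal, self-contained sketch (not the authors' implementation) of one few-shot WSD episode with a semantic-memory read and a gated write-back. All sizes and the numpy stand-ins are assumptions for illustration: the hierarchical variational posterior over prototypes is collapsed to a deterministic mean, a tiny untrained linear gate stands in for the hypernetwork, and nothing here is trained.

```python
# Minimal sketch (illustration only, NOT the paper's implementation) of one
# few-shot WSD episode with a semantic-memory read and a gated write-back.
# The hierarchical variational posterior is collapsed to a deterministic mean,
# and a tiny untrained linear gate stands in for the hypernetwork.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16                      # embedding dimension (assumed)
N_SENSES, K_SHOT = 3, 2       # 3-way, 2-shot episode (assumed sizes)
MEM_SLOTS = 8                 # number of semantic-memory slots (assumed)

memory = rng.normal(size=(MEM_SLOTS, DIM))   # long-term semantic memory

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(key, mem):
    """Attention-style read: address the memory with a class prototype."""
    weights = softmax(mem @ key)             # (MEM_SLOTS,)
    return weights @ mem, weights            # readout vector, addressing weights

def hyper_gate(prototype, readout):
    """Stand-in for the hypernetwork: maps (prototype, readout) to an update rate in (0, 1)."""
    w = rng.normal(size=2 * DIM) * 0.1       # untrained weights, illustration only
    return 1.0 / (1.0 + np.exp(-(w @ np.concatenate([prototype, readout]))))

# Support set: contextual embeddings of the target word, grouped by sense.
support = rng.normal(size=(N_SENSES, K_SHOT, DIM))
query = rng.normal(size=DIM)                 # one query occurrence to disambiguate

enriched_protos = []
for c in range(N_SENSES):
    proto = support[c].mean(axis=0)          # class prototype from the support set
    readout, addr = memory_read(proto, memory)
    enriched_protos.append(0.5 * (proto + readout))   # simplified "posterior mean"
    rate = hyper_gate(proto, readout)        # adaptive, input-dependent update rate
    memory += rate * np.outer(addr, proto - addr @ memory)  # gated memory write

# Classify the query by its nearest enriched prototype.
dists = [np.linalg.norm(query - p) for p in enriched_protos]
print("predicted sense index:", int(np.argmin(dists)))
```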
Related papers
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the state of the art while being nearly 220 times faster in computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
- Continuously Learning New Words in Automatic Speech Recognition [56.972851337263755]
We propose a self-supervised continual learning approach to recognize new words.
We use a memory-enhanced Automatic Speech Recognition model from previous work.
We show that with this approach, performance on the new words improves as they occur more frequently.
arXiv Detail & Related papers (2024-01-09T10:39:17Z)
- Context-Aware Meta-Learning [52.09326317432577]
We propose a meta-learning algorithm that emulates Large Language Models by learning new visual concepts during inference without fine-tuning.
Our approach exceeds or matches the state-of-the-art algorithm, P>M>F, on 8 out of 11 meta-learning benchmarks.
arXiv Detail & Related papers (2023-10-17T03:35:27Z)
- Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning [38.37682598345653]
We introduce a multimodal meta-learning approach to bridge the gap between vision and language models.
We define a meta-mapper network, acting as a meta-learner, to efficiently bridge frozen large-scale vision and language models.
We evaluate our approach on recently proposed multimodal few-shot benchmarks, measuring how rapidly the model can bind novel visual concepts to words.
arXiv Detail & Related papers (2023-02-28T17:46:18Z)
- DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection [118.36746273425354]
This paper presents a paralleled visual-concept pre-training method for open-world detection by resorting to knowledge enrichment from a designed concept dictionary.
By enriching the concepts with their descriptions, we explicitly build relationships among the various concepts to facilitate open-domain learning.
The proposed framework demonstrates strong zero-shot detection performance: e.g., on the LVIS dataset, our DetCLIP-T outperforms GLIP-T by 9.9% mAP and obtains a 13.5% improvement on rare categories.
arXiv Detail & Related papers (2022-09-20T02:01:01Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for multi-sense embeddings (M-SE) from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary [43.32179344258548]
Current models for Word Sense Disambiguation (WSD) struggle to disambiguate rare senses.
This paper introduces FEWS, a new low-shot WSD dataset automatically extracted from example sentences in Wiktionary.
arXiv Detail & Related papers (2021-02-16T07:13:34Z)
- SensPick: Sense Picking for Word Sense Disambiguation [1.1429576742016154]
We use both the context and the related gloss information of a target word to model the semantic relationship between the word and its set of glosses.
We propose SensPick, a stacked bidirectional Long Short-Term Memory (LSTM) network, to perform the WSD task (a minimal sketch of this context-gloss scoring setup appears after this list).
arXiv Detail & Related papers (2021-02-10T04:52:42Z)
- Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
- Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation [5.8523859781812435]
We introduce the UWA (Unambiguous Word Annotations) dataset and show how a state-of-the-art propagation-based model can use it to extend the coverage and quality of its word sense embeddings.
arXiv Detail & Related papers (2020-04-29T16:51:21Z)
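The SensPick entry above describes picking a sense by relating a target word's context to the glosses of its candidate senses. Below is a minimal, self-contained sketch of that context-gloss scoring setup; it is not SensPick itself, since the stacked bidirectional LSTM encoders are replaced by mean pooling over toy random word vectors (so the prediction here is arbitrary), but it shows the data flow: encode the context, encode each gloss, and pick the highest-scoring sense.

```python
# Minimal sketch of context-gloss scoring for WSD (illustration only, NOT the
# SensPick architecture): the stacked BiLSTM encoders are replaced by mean
# pooling over toy random word vectors, so the chosen sense below is arbitrary.
import numpy as np

rng = np.random.default_rng(1)
DIM = 12
_vocab = {}  # word -> toy embedding

def embed(word):
    if word not in _vocab:
        _vocab[word] = rng.normal(size=DIM)
    return _vocab[word]

def encode(tokens):
    """Stand-in encoder: mean of toy word vectors (replaces the stacked BiLSTM)."""
    return np.mean([embed(t) for t in tokens], axis=0)

def pick_sense(context_tokens, sense_glosses):
    """Score each candidate gloss against the context; return the best sense and all scores."""
    ctx = encode(context_tokens)
    scores = {}
    for sense, gloss in sense_glosses.items():
        g = encode(gloss.split())
        scores[sense] = float(ctx @ g / (np.linalg.norm(ctx) * np.linalg.norm(g)))
    return max(scores, key=scores.get), scores

# Toy example (glosses paraphrased for illustration).
context = "she deposited the cheque at the bank on monday".split()
glosses = {
    "bank%financial": "an institution that accepts deposits and lends money",
    "bank%river": "sloping land beside a body of water",
}
best, scores = pick_sense(context, glosses)
print(best, scores)
```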