Decomposing Word Embedding with the Capsule Network
- URL: http://arxiv.org/abs/2004.13844v2
- Date: Tue, 30 Jun 2020 01:58:27 GMT
- Title: Decomposing Word Embedding with the Capsule Network
- Authors: Xin Liu, Qingcai Chen, Yan Liu, Joanna Siebert, Baotian Hu, Xiangping
Wu and Buzhou Tang
- Abstract summary: We propose CapsDecE2S, a capsule network-based method that
Decomposes the unsupervised word Embedding of an ambiguous word into a
context-specific Sense embedding.
With attention operations, CapsDecE2S integrates the word context to reconstruct the multiple morpheme-like vectors into the context-specific sense embedding.
In this method, sense learning is cast as a binary classification that explicitly learns the relation between senses through matching and non-matching labels.
- Score: 23.294890047230584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word sense disambiguation aims to identify the appropriate sense of
an ambiguous word in a given context. Existing pre-trained language models and
multi-embedding methods do not sufficiently exploit the power of unsupervised
word embeddings.
In this paper, we present a capsule network-based approach, taking advantage
of capsule networks' potential for recognizing highly overlapping features and
handling segmentation. We propose a Capsule network-based method to Decompose
the unsupervised word Embedding of an ambiguous word into a context-specific
Sense embedding, called CapsDecE2S. In this approach, the unsupervised
ambiguous embedding is fed into a capsule network to produce multiple
morpheme-like vectors, defined as the basic semantic units of meaning.
With attention operations, CapsDecE2S integrates the word context to
reconstruct the morpheme-like vectors into the context-specific sense
embedding. To train CapsDecE2S, we propose a sense matching training method
that casts sense learning as a binary classification, explicitly learning the
relation between senses through matching and non-matching labels. CapsDecE2S
was evaluated on two sense learning tasks, i.e., word in context and word
sense disambiguation. Results on the two public benchmarks, Word-in-Context
and English all-words Word Sense Disambiguation, show that CapsDecE2S
achieves a new state of the art on both tasks.
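To make the pipeline above concrete, here is a minimal NumPy sketch of the decompose-then-reconstruct idea: a capsule-style layer splits one ambiguous embedding into K morpheme-like vectors, attention conditioned on a context vector reassembles them into a sense embedding, and a sigmoid over the cosine of two sense embeddings plays the role of the matching/non-matching classifier. The layer shapes, single-step squash, and scaled-cosine head are illustrative assumptions, not the paper's implementation.

```python
# A minimal NumPy sketch of the CapsDecE2S idea; shapes and details are
# assumptions for illustration, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)
D, K = 300, 8          # embedding size, number of morpheme-like capsules

def squash(v, eps=1e-8):
    """Standard capsule squashing: keeps direction, bounds the norm in [0, 1)."""
    n2 = np.sum(v * v, axis=-1, keepdims=True)
    return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + eps)

# One linear map per capsule decomposes the ambiguous embedding into
# K morpheme-like vectors (a stand-in for the capsule layer).
W_caps = rng.normal(scale=0.1, size=(K, D, D))

def decompose(e):
    return squash(np.einsum("kij,j->ki", W_caps, e))   # (K, D)

def attend(morphemes, context):
    """Attention over the morpheme-like vectors, conditioned on the context,
    reconstructs a context-specific sense embedding."""
    scores = morphemes @ context / np.sqrt(D)
    alpha = np.exp(scores - scores.max()); alpha /= alpha.sum()
    return alpha @ morphemes                            # (D,)

def match_prob(sense_a, sense_b):
    """Sense-matching head: binary classification over a pair of senses."""
    cos = sense_a @ sense_b / (np.linalg.norm(sense_a) * np.linalg.norm(sense_b) + 1e-8)
    return 1.0 / (1.0 + np.exp(-5.0 * cos))            # sigmoid over scaled cosine

e = rng.normal(size=D)                                  # unsupervised word embedding
ctx1, ctx2 = rng.normal(size=D), rng.normal(size=D)     # two context vectors
s1 = attend(decompose(e), ctx1)
s2 = attend(decompose(e), ctx2)
print("P(same sense) =", match_prob(s1, s2))
```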
Related papers
- To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models [0.9176056742068812]
Polysemy and synonymy are crucial facets of lexical ambiguity.
In this paper, we introduce Concept Induction, the unsupervised task of learning a soft clustering among words.
We propose a bi-level approach to Concept Induction that leverages both a local lemma-centric view and a global cross-lexicon perspective.
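As a rough illustration of the soft-clustering formulation named above (not the paper's bi-level, lemma-and-lexicon method), a toy soft k-means can assign each word a distribution over induced concepts; all data below are random stand-ins:

```python
# Toy soft k-means over stand-in contextualized embeddings; cluster count
# and temperature are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))                        # stand-in word embeddings
C = X[rng.choice(100, size=5, replace=False)].copy()  # 5 concept centroids

for _ in range(20):
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)   # squared distances
    logits = -d2 / 2.0                                    # temperature = 2
    logits -= logits.max(axis=1, keepdims=True)
    R = np.exp(logits); R /= R.sum(axis=1, keepdims=True) # soft assignments
    C = (R.T @ X) / R.sum(axis=0)[:, None]                # re-estimate concepts

print("soft membership of word 0 across concepts:", np.round(R[0], 3))
```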
arXiv Detail & Related papers (2024-06-28T17:07:06Z)
- Word sense extension [8.939269057094661]
We present a paradigm of word sense extension (WSE) that enables words to spawn new senses in novel contexts.
We develop a framework that simulates novel word sense extension by partitioning a polysemous word type into two pseudo-tokens that mark its different senses.
Our framework combines cognitive models of chaining with a learning scheme that transforms a language model embedding space to support various types of word sense extension.
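The pseudo-token partitioning step lends itself to a small sketch: occurrences of one polysemous type are split by clustering their contextual embeddings and relabelled as two pseudo-tokens. The plain 2-means below is an illustrative stand-in, not the paper's chaining-based framework:

```python
# Pseudo-token partitioning via 2-means over stand-in contextual embeddings.
import numpy as np

rng = np.random.default_rng(1)
occ = rng.normal(size=(40, 32))                 # contextual embeddings of "bank"
c = occ[rng.choice(40, 2, replace=False)].copy()
for _ in range(10):                              # plain 2-means
    lab = ((occ[:, None] - c[None]) ** 2).sum(-1).argmin(1)
    c = np.stack([occ[lab == k].mean(0) if (lab == k).any() else c[k]
                  for k in (0, 1)])

tokens = [f"bank_{k}" for k in lab]              # pseudo-tokens marking senses
print(tokens[:8])
```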
arXiv Detail & Related papers (2023-06-09T00:54:21Z)
- Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings [19.12036493733793]
We propose the first-ever meta-sense embedding method -- Neighbour Preserving Meta-Sense Embeddings.
Our proposed method can combine source sense embeddings that cover different sets of word senses.
Experimental results on Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the proposed meta-sense embedding method consistently outperforms several competitive baselines.
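One way to picture a meta-sense embedding is concatenation of aligned source embeddings; the hypothetical sketch below imputes senses missing from one source via a nearest neighbour, whereas the actual method learns a neighbour-preserving projection. All names and dimensions are assumptions:

```python
# Simplified meta-sense embedding by concatenation with nearest-neighbour
# imputation for senses absent from one source (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
src_a = {"bank%1": rng.normal(size=16), "bank%2": rng.normal(size=16)}
src_b = {"bank%1": rng.normal(size=24)}          # covers fewer senses

def meta(sense):
    a = src_a[sense]
    if sense in src_b:
        b = src_b[sense]
    else:                                        # impute from nearest neighbour
        nn = min(src_b, key=lambda s: np.linalg.norm(src_a[s] - a))
        b = src_b[nn]
    return np.concatenate([a, b])                # 40-dim meta-sense embedding

print(meta("bank%2").shape)
```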
arXiv Detail & Related papers (2023-05-30T14:53:44Z)
- Word Sense Induction with Knowledge Distillation from BERT [6.88247391730482]
This paper proposes a method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in a context.
Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings.
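The "attention over the senses of a word in a context" mechanism reduces to a softmax-weighted mixture of sense vectors; in this sketch, random vectors stand in for BERT states and the sense inventory is assumed given:

```python
# Attention over sense vectors given one contextual embedding (stand-ins
# replace BERT; this illustrates the mechanism, not the full pipeline).
import numpy as np

rng = np.random.default_rng(3)
senses = rng.normal(size=(3, 64))      # 3 candidate sense vectors for one word
h_ctx = rng.normal(size=64)            # contextual embedding of the occurrence

scores = senses @ h_ctx / np.sqrt(64)
w = np.exp(scores - scores.max()); w /= w.sum()
distilled = w @ senses                  # context-weighted sense mixture
print("attention over senses:", np.round(w, 3))
```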
arXiv Detail & Related papers (2023-04-20T21:05:35Z)
- Latent Topology Induction for Understanding Contextualized Representations [84.7918739062235]
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
arXiv Detail & Related papers (2022-06-03T11:22:48Z)
- Learning Sense-Specific Static Embeddings using Contextualised Word Embeddings as a Proxy [26.385418377513332]
We propose Context Derived Embeddings of Senses (CDES).
CDES extracts sense related information from contextualised embeddings and injects it into static embeddings to create sense-specific static embeddings.
We show that CDES can accurately learn sense-specific static embeddings reporting comparable performance to the current state-of-the-art sense embeddings.
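The extract-and-inject recipe can be sketched in a few lines: average the contextualised embeddings of occurrences tagged with each sense, then mix that signal into the static vector. The sense tagging, encoder, and 0.5 mixing weight below are illustrative assumptions:

```python
# Toy extract-and-inject: per-sense means of contextual embeddings are
# interpolated with the static vector (not the paper's exact method).
import numpy as np

rng = np.random.default_rng(4)
static = rng.normal(size=64)                         # static vector for "bank"
ctx_by_sense = {"bank%finance": rng.normal(size=(10, 64)),
                "bank%river":   rng.normal(size=(7, 64))}

sense_static = {s: 0.5 * static + 0.5 * occ.mean(0)  # inject sense signal
                for s, occ in ctx_by_sense.items()}
print({s: v[:3].round(2) for s, v in sense_static.items()})
```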
arXiv Detail & Related papers (2021-10-05T17:50:48Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
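At the shape level, the two encoders combine as follows; the mean-pooled global vector, width-3 convolution, and max-pooling below are stand-ins for BERT and the paper's CNN encoder, and the relation-of-relation task is omitted:

```python
# Shape-level sketch of fusing a global sentence vector with local
# n-gram features from a 1-D convolution (stand-ins throughout).
import numpy as np

rng = np.random.default_rng(5)
tokens = rng.normal(size=(12, 64))          # token embeddings of one sentence
g = tokens.mean(0)                          # stand-in for BERT's global view

K = rng.normal(scale=0.1, size=(3, 64, 64)) # width-3 conv kernel
local = np.stack([sum(tokens[i + j] @ K[j] for j in range(3))
                  for i in range(len(tokens) - 2)])
l = np.maximum(local, 0).max(0)             # ReLU + max-pool over positions

sent = np.concatenate([g, l])               # fused global + local representation
print(sent.shape)                           # (128,)
```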
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
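A compact sketch of the shift measure: occurrences from two periods are clustered jointly, and the divergence between the periods' cluster distributions scores the shift. Plain k-means and Jensen-Shannon divergence are assumed here in place of the paper's exact setup:

```python
# Joint clustering of two periods' occurrence embeddings, scored by JSD
# between their cluster distributions (illustrative stand-in).
import numpy as np

rng = np.random.default_rng(6)
old = rng.normal(loc=0.0, size=(30, 32))    # occurrence embeddings, period 1
new = rng.normal(loc=0.5, size=(30, 32))    # occurrence embeddings, period 2
X = np.vstack([old, new])

c = X[rng.choice(len(X), 3, replace=False)].copy()
for _ in range(10):                          # plain 3-means
    lab = ((X[:, None] - c[None]) ** 2).sum(-1).argmin(1)
    c = np.stack([X[lab == k].mean(0) if (lab == k).any() else c[k]
                  for k in range(3)])

p = np.bincount(lab[:30], minlength=3) / 30.0
q = np.bincount(lab[30:], minlength=3) / 30.0
m = 0.5 * (p + q)
kl = lambda a, b: float(np.sum([x * np.log(x / y) for x, y in zip(a, b) if x > 0]))
print("semantic shift (JSD):", 0.5 * kl(p, m) + 0.5 * kl(q, m))
```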
arXiv Detail & Related papers (2020-10-02T08:38:40Z)
- Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders [79.38278330678965]
A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed.
We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense.
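The bi-encoder's scoring rule is simple to sketch: embed the target word in context and each gloss independently, then pick the gloss with the highest dot product. Random vectors stand in for the two BERT encoders:

```python
# Bi-encoder scoring: independent context and gloss embeddings, dot-product
# ranking (random stand-ins replace the trained encoders).
import numpy as np

rng = np.random.default_rng(7)
ctx_vec = rng.normal(size=64)                       # encoder 1: word + context
glosses = {"bank%finance": rng.normal(size=64),     # encoder 2: sense glosses
           "bank%river":   rng.normal(size=64)}

scores = {s: float(ctx_vec @ g) for s, g in glosses.items()}
print("predicted sense:", max(scores, key=scores.get))
```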
arXiv Detail & Related papers (2020-05-06T04:21:45Z)
- Word Sense Disambiguation for 158 Languages using Word Embeddings Only [80.79437083582643]
Disambiguation of word senses in context is easy for humans, but a major challenge for automatic approaches.
We present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory.
We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings.
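Sense induction from static embeddings alone can be sketched via ego-network clustering: the target's nearest neighbours form a graph, edges link mutually similar neighbours, and connected components act as induced senses. The neighbourhood size and similarity threshold below are assumptions, with random vectors in place of fastText:

```python
# Ego-network sense induction over stand-in static embeddings.
import numpy as np

rng = np.random.default_rng(8)
vocab = [f"w{i}" for i in range(200)]
E = rng.normal(size=(200, 32))
E /= np.linalg.norm(E, axis=1, keepdims=True)

sims = E @ E[0]                                  # ego word = vocab[0]
nbrs = np.argsort(-sims)[1:13]                   # 12 nearest neighbours

adj = (E[nbrs] @ E[nbrs].T) > 0.2                # edges between similar neighbours
comp, seen = {}, set()
for i in range(len(nbrs)):                       # connected components = senses
    if i in seen:
        continue
    stack, cid = [i], len(comp)
    while stack:
        j = stack.pop()
        if j in seen:
            continue
        seen.add(j)
        comp.setdefault(cid, []).append(vocab[nbrs[j]])
        stack += [k for k in range(len(nbrs)) if adj[j, k] and k not in seen]
print(comp)                                      # each component = one sense
```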
arXiv Detail & Related papers (2020-03-14T14:50:04Z)
- CASE: Context-Aware Semantic Expansion [68.30244980290742]
This paper defines and studies a new task called Context-Aware Semantic Expansion (CASE).
Given a seed term in a sentential context, we aim to suggest other terms that fit the context as well as the seed does.
We show that annotations for this task can be harvested at scale from existing corpora, in a fully automatic manner.
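As a toy stand-in for the expansion step, candidates can be ranked by how well their embeddings fit a context encoding; note that CASE itself is trained on automatically harvested annotations rather than this heuristic:

```python
# Rank candidate expansion terms by dot product with a context encoding
# (random stand-ins; illustrative only).
import numpy as np

rng = np.random.default_rng(9)
emb = {w: rng.normal(size=32) for w in
       ["seattle", "portland", "denver", "banana", "tokyo"]}
context = rng.normal(size=32)                    # stand-in context encoding

ranked = sorted(emb, key=lambda w: -float(emb[w] @ context))
print("expansion candidates for the seed, best first:", ranked)
```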
arXiv Detail & Related papers (2019-12-31T06:38:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.