Dense Embeddings Preserving the Semantic Relationships in WordNet
- URL: http://arxiv.org/abs/2004.10863v2
- Date: Sun, 12 Jun 2022 17:53:15 GMT
- Title: Dense Embeddings Preserving the Semantic Relationships in WordNet
- Authors: Canlin Zhang and Xiuwen Liu
- Abstract summary: We provide a novel way to generate low-dimensional vector embeddings for noun and verb synsets in WordNet.
We call this embedding the Sense Spectrum (plural: Sense Spectra).
To create suitable labels for the training of sense spectra, we designed a new similarity measurement for noun and verb synsets in WordNet.
- Score: 2.9443230571766854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we provide a novel way to generate low-dimensional vector
embeddings for the noun and verb synsets in WordNet, where the hypernym-hyponym
relationship is preserved in the embeddings. We call this embedding the Sense
Spectrum (plural: Sense Spectra). In order to create suitable labels
for the training of sense spectra, we designed a new similarity measurement for
noun and verb synsets in WordNet. We call this similarity measurement the
Hypernym Intersection Similarity (HIS), since it compares the common and unique
hypernyms between two synsets. Our experiments show that on the noun and verb
pairs of the SimLex-999 dataset, HIS outperforms the three similarity
measurements in WordNet. Moreover, to the best of our knowledge, the sense
spectra provide the first dense synset embeddings that preserve the semantic
relationships in WordNet.
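The abstract does not spell out the HIS formula, but its key ingredients, the common and unique hypernym sets of two synsets, can be computed directly with NLTK's WordNet interface. The sketch below is an illustrative stand-in: the Jaccard-style score over shared hypernyms is a placeholder assumption, not the authors' exact definition.

```python
# A minimal sketch of a hypernym-intersection similarity in the spirit of HIS.
# The exact HIS formula is not given here, so the Jaccard-style score below is
# an illustrative assumption, not the authors' definition.
from nltk.corpus import wordnet as wn

def all_hypernyms(synset):
    """All transitive hypernyms of a synset, the synset itself included."""
    closure = set(synset.closure(lambda s: s.hypernyms()))
    closure.add(synset)
    return closure

def his_like_similarity(s1, s2):
    h1, h2 = all_hypernyms(s1), all_hypernyms(s2)
    # HIS compares the common hypernyms (h1 & h2) with the unique ones
    # (h1 - h2 and h2 - h1); here we fold them into a single ratio.
    return len(h1 & h2) / len(h1 | h2)

print(his_like_similarity(wn.synset('dog.n.01'), wn.synset('cat.n.01')))
```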
Related papers
- Self-Supervised Speech Representations are More Phonetic than Semantic [52.02626675137819]
Self-supervised speech models (S3Ms) have become an effective backbone for speech applications.
We seek a more fine-grained analysis of the word-level linguistic properties encoded in S3Ms.
Our study reveals that S3M representations consistently and significantly exhibit more phonetic than semantic similarity.
arXiv Detail & Related papers (2024-06-12T20:04:44Z)
- Homonymy Information for English WordNet [9.860944032009847]
We exploit recent advances in language modelling to synthesise homonymy annotation for Princeton WordNet.
We pair definitions based on their proximity in an embedding space produced by a Transformer model.
Despite the simplicity of this approach, our best model attains an F1 of 0.97 on an evaluation set that we annotate.
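As a rough illustration of the pairing step described above, one can embed WordNet glosses with an off-the-shelf Transformer sentence encoder and compare them pairwise; the model name and the similarity threshold below are illustrative assumptions, not the paper's choices.

```python
# Sketch: pair WordNet definitions by proximity in a Transformer embedding
# space. The model ("all-MiniLM-L6-v2") and the 0.5 threshold are assumptions.
from itertools import combinations
from nltk.corpus import wordnet as wn
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
glosses = [s.definition() for s in wn.synsets("bank", pos=wn.NOUN)]
emb = model.encode(glosses, convert_to_tensor=True)

# Close definitions are treated as one polysemous sense group;
# distant ones as candidate homonyms.
for i, j in combinations(range(len(glosses)), 2):
    sim = util.cos_sim(emb[i], emb[j]).item()
    label = "same sense group" if sim > 0.5 else "candidate homonyms"
    print(f"{sim:.2f} {label}: {glosses[i][:40]!r} | {glosses[j][:40]!r}")
```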
arXiv Detail & Related papers (2022-12-16T10:23:26Z)
- Word Embeddings Are Capable of Capturing Rhythmic Similarity of Words [0.0]
Word embedding systems such as Word2Vec and GloVe are well-known in deep learning approaches to NLP.
In this work, we investigate their usefulness in capturing the rhythmic similarity of words instead.
The results show that the vectors these embeddings assign to rhyming words are more similar to each other than to those of non-rhyming words.
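A quick way to reproduce the flavour of this test with pretrained vectors; the model and the word lists below are illustrative assumptions.

```python
# Sketch: do pretrained embeddings place rhyming words closer together than
# re-paired control words? Model choice and word lists are assumptions.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")  # small pretrained GloVe vectors

rhyming = [("cat", "hat"), ("light", "night"), ("bear", "chair")]
control = [("cat", "night"), ("light", "chair"), ("bear", "hat")]

mean_sim = lambda pairs: sum(model.similarity(a, b) for a, b in pairs) / len(pairs)
print("mean cosine, rhyming pairs:", mean_sim(rhyming))
print("mean cosine, control pairs:", mean_sim(control))
```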
arXiv Detail & Related papers (2022-04-11T02:33:23Z)
- Interval Probabilistic Fuzzy WordNet [8.396691008449704]
We present an algorithm for constructing the Interval Probabilistic Fuzzy (IPF) synsets in any language.
We constructed and published the IPF synsets of WordNet for the English language.
arXiv Detail & Related papers (2021-04-04T17:28:37Z)
- SensPick: Sense Picking for Word Sense Disambiguation [1.1429576742016154]
We use both context and related gloss information of a target word to model the semantic relationship between the word and the set of glosses.
We propose SensPick, a stacked bidirectional Long Short-Term Memory (LSTM) network, to perform the WSD task.
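The summary only names the architecture, so the sketch below is a minimal stand-in: a stacked BiLSTM encodes the context and each candidate gloss, and a dot product scores context-gloss pairs. All dimensions and the scoring function are assumptions, not the paper's exact design.

```python
# Minimal sketch in the spirit of SensPick: score each candidate gloss of a
# target word against its context with a stacked BiLSTM encoder.
import torch
import torch.nn as nn

class GlossScorer(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # "Stacked" BiLSTM: two layers, bidirectional.
        self.encoder = nn.LSTM(emb_dim, hidden, num_layers=2,
                               bidirectional=True, batch_first=True)

    def encode(self, token_ids):
        out, _ = self.encoder(self.embed(token_ids))
        return out.mean(dim=1)  # mean-pool over time steps

    def forward(self, context_ids, gloss_ids):
        # Higher score = the gloss better matches the context.
        return (self.encode(context_ids) * self.encode(gloss_ids)).sum(-1)

model = GlossScorer()
context = torch.randint(0, 10000, (1, 12))  # token ids for the sentence
glosses = torch.randint(0, 10000, (3, 8))   # token ids for 3 candidate glosses
scores = model(context.expand(3, -1), glosses)
print("predicted sense:", scores.argmax().item())
```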
arXiv Detail & Related papers (2021-02-10T04:52:42Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
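One plausible reading of "semantic co-occurrences from BERT" is to reuse self-attention weights as soft co-occurrence counts in place of fixed context windows. The sketch below illustrates that reading; it is an assumption, not the paper's exact procedure.

```python
# Sketch: derive soft co-occurrence weights from BERT self-attention.
# Model choice and the layer/head averaging scheme are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("the bank approved the loan", return_tensors="pt")
with torch.no_grad():
    attn = bert(**inputs).attentions  # tuple of (1, heads, seq, seq)

# Average attention over layers and heads as a soft co-occurrence matrix.
cooc = torch.stack(attn).mean(dim=(0, 2)).squeeze(0)
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
i, j = tokens.index("bank"), tokens.index("loan")
print(f"soft co-occurrence(bank, loan) = {cooc[i, j].item():.4f}")
```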
arXiv Detail & Related papers (2020-12-30T15:38:26Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
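A minimal sketch of the two-view encoding described above, assuming bert-base-uncased, a single Conv1d with kernel size 3, and max pooling; these choices are illustrative, not the paper's configuration.

```python
# Sketch: BERT gives a global sentence view; a 1-D CNN over the token states
# gives a local keyword/phrase view. Dimensions are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
cnn = nn.Conv1d(in_channels=768, out_channels=128, kernel_size=3, padding=1)

inputs = tok("a sentence to match", return_tensors="pt")
with torch.no_grad():
    hidden = bert(**inputs).last_hidden_state            # (1, seq, 768)

global_vec = hidden[:, 0]                                # [CLS]: global view
local_vec = cnn(hidden.transpose(1, 2)).max(dim=2).values  # local n-gram view
sentence_repr = torch.cat([global_vec, local_vec], dim=-1)
print(sentence_repr.shape)  # torch.Size([1, 896])
```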
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse trees to automatically utilize syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
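The summary hinges on traversing a syntactic parse tree to obtain a feature sequence; the sketch below shows one such traversal over a toy constituency parse. The bracketed parse and the depth-first scheme are illustrative assumptions.

```python
# Sketch: derive a syntactic feature sequence by depth-first traversal of a
# constituency parse tree. The toy parse below is an invented example.
from nltk.tree import Tree

parse = Tree.fromstring("(S (NP (DT the) (NN dog)) (VP (VBD barked)))")

def traverse(tree, depth=0, out=None):
    """Depth-first traversal emitting (label, depth) for every node."""
    if out is None:
        out = []
    if isinstance(tree, Tree):
        out.append((tree.label(), depth))
        for child in tree:
            traverse(child, depth + 1, out)
    else:
        out.append((tree, depth))  # leaf word
    return out

for label, depth in traverse(parse):
    print("  " * depth + label)
```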
arXiv Detail & Related papers (2020-12-13T05:52:07Z)
- SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery [66.24624547470175]
SynSetExpan is a novel framework that enables the two tasks, entity set expansion and synonym discovery, to mutually enhance each other.
We create the first large-scale Synonym-Enhanced Set Expansion dataset via crowdsourcing.
Experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.
arXiv Detail & Related papers (2020-09-29T07:32:17Z)
- Using Distributional Thesaurus Embedding for Co-hyponymy Detection [11.165092545013799]
We investigate whether the network embedding of a distributional thesaurus can be effectively utilized to detect co-hyponymy relations.
We show that the vector representation obtained by applying node2vec to the distributional thesaurus network outperforms state-of-the-art models for binary classification of co-hyponymy vs. hypernymy.
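A toy version of this pipeline, with an invented miniature thesaurus graph, invented pair labels, and an elementwise-product pair feature (all assumptions):

```python
# Sketch: embed a toy distributional-thesaurus graph with node2vec, then train
# a binary classifier on word-pair features for co-hyponymy vs. hypernymy.
import networkx as nx
from node2vec import Node2Vec
from sklearn.linear_model import LogisticRegression

g = nx.Graph([("dog", "cat"), ("dog", "wolf"), ("cat", "lion"),
              ("dog", "animal"), ("cat", "animal"), ("wolf", "animal")])
n2v = Node2Vec(g, dimensions=16, walk_length=5, num_walks=50, quiet=True)
wv = n2v.fit(window=3, min_count=1).wv

pairs = [("dog", "cat", 1), ("cat", "wolf", 1),       # co-hyponyms
         ("dog", "animal", 0), ("cat", "animal", 0)]  # hypernymy
X = [wv[a] * wv[b] for a, b, _ in pairs]
y = [label for _, _, label in pairs]
clf = LogisticRegression().fit(X, y)
print(clf.predict([wv["wolf"] * wv["lion"]]))  # 1 = co-hyponyms expected
```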
arXiv Detail & Related papers (2020-02-24T20:11:35Z)
- Multiplex Word Embeddings for Selectional Preference Acquisition [70.33531759861111]
We propose a multiplex word embedding model, which can be easily extended according to various relations among words.
Our model can effectively distinguish words with respect to different relations without introducing unnecessary sparseness.
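One common reading of "multiplex" embeddings is a shared base vector per word plus a small relation-specific offset, so adding a new relation costs few extra parameters. The sketch below follows that reading; the dimensions and relation names are invented.

```python
# Sketch: a shared base embedding per word plus a small per-relation offset.
import torch
import torch.nn as nn

class MultiplexEmbedding(nn.Module):
    def __init__(self, vocab=1000, base_dim=100, rel_dim=10,
                 relations=("subj", "obj", "amod")):
        super().__init__()
        self.base = nn.Embedding(vocab, base_dim)  # shared across relations
        self.offsets = nn.ModuleDict(
            {r: nn.Embedding(vocab, rel_dim) for r in relations})

    def forward(self, word_ids, relation):
        # Relation-specific view = shared base vector + small offset.
        return torch.cat([self.base(word_ids),
                          self.offsets[relation](word_ids)], dim=-1)

emb = MultiplexEmbedding()
ids = torch.tensor([3, 7])
print(emb(ids, "subj").shape)  # torch.Size([2, 110])
```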
arXiv Detail & Related papers (2020-01-09T04:47:14Z)