Interval Probabilistic Fuzzy WordNet
- URL: http://arxiv.org/abs/2104.10660v1
- Date: Sun, 4 Apr 2021 17:28:37 GMT
- Title: Interval Probabilistic Fuzzy WordNet
- Authors: Yousef Alizadeh-Q, Behrouz Minaei-Bidgoli, Sayyed-Ali Hossayni,
Mohammad-R Akbarzadeh-T, Diego Reforgiato Recupero, Mohammad-Reza Rajati,
Aldo Gangemi
- Abstract summary: We present an algorithm for constructing the Interval Probabilistic Fuzzy (IPF) synsets in any language.
We constructed and published the IPF synsets of WordNet for the English language.
- Score: 8.396691008449704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The WordNet lexical database groups English words into sets of synonyms called
"synsets." Synsets are utilized for several applications in the field of
text-mining. However, synsets have been criticized: in reality, not all
members of a synset represent its meaning to the same degree, yet in practice
they are treated as equal members of the synset. Thus, fuzzy versions of
synsets, called fuzzy synsets (or fuzzy word-sense classes), were proposed
and studied. In this study, we
discuss why (type-1) fuzzy synsets (T1 F-synsets) do not properly model the
membership uncertainty, and propose an upgraded version of fuzzy synsets in
which membership degrees of word-senses are represented by intervals, similar
to those in Interval Type 2 Fuzzy Sets (IT2 FS). We then argue that the IT2 FS
theoretical framework is insufficient for the analysis and design of such synsets,
and propose a new concept, called Interval Probabilistic Fuzzy (IPF) sets. Then
we present an algorithm for constructing the IPF synsets in any language, given
a corpus and a word-sense-disambiguation system. Using our algorithm, the
Open American National Corpus (OANC), and the UKB word-sense-disambiguation
system, we constructed and published the IPF synsets of WordNet for the English language.
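To make the idea of interval-valued membership concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not spell out. It assumes that a word-sense-disambiguation system has already tagged corpus tokens with synset IDs, takes the share of a synset's occurrences realized by each word as a point membership degree, and widens that point into an interval with a Wilson score interval. The function names, the synset ID, the toy counts, and the choice of the Wilson interval are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed stand-in for an IPF interval)."""
    if total == 0:
        return (0.0, 1.0)  # no evidence: the membership degree is fully uncertain
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def interval_memberships(tagged_tokens):
    """Map each synset to {word: (lower, upper)} from (word, synset_id) pairs.

    tagged_tokens is a hypothetical WSD output: one pair per corpus token.
    """
    per_synset = defaultdict(Counter)
    for word, synset_id in tagged_tokens:
        per_synset[synset_id][word] += 1

    result = {}
    for synset_id, counts in per_synset.items():
        total = sum(counts.values())
        result[synset_id] = {
            word: wilson_interval(count, total) for word, count in counts.items()
        }
    return result


# Toy, made-up tagging of 11 tokens for a single hypothetical synset ID.
TAGGED = (
    [("car", "car.n.01")] * 8
    + [("auto", "car.n.01")] * 2
    + [("machine", "car.n.01")] * 1
)

if __name__ == "__main__":
    for synset_id, words in interval_memberships(TAGGED).items():
        for word, (low, high) in sorted(words.items()):
            print(f"{synset_id}  {word:<8} [{low:.3f}, {high:.3f}]")
```

With few observations the intervals are wide, and they narrow as the corpus supplies more tagged occurrences. That is the kind of membership uncertainty a single type-1 fuzzy degree cannot express.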
Related papers
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics
Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms [0.0]
Given a mono/multilingual synset and a threshold (a fuzzy value in [0, 1]), our goal is to extract new synonyms above this threshold from existing lexicons.
The dataset consists of 3K candidate synonyms for 500 synsets.
Our evaluations show that the algorithm behaves like a linguist and its fuzzy values are close to those proposed by linguists.
arXiv Detail & Related papers (2023-02-04T20:30:32Z)
- PropSegmEnt: A Large-Scale Corpus for Proposition-Level Segmentation and
Entailment Recognition [63.51569687229681]
We argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
We propose PropSegmEnt, a corpus of over 45K propositions annotated by expert human raters.
Our dataset structure resembles the tasks of (1) segmenting sentences within a document to the set of propositions, and (2) classifying the entailment relation of each proposition with respect to a different yet topically-aligned document.
arXiv Detail & Related papers (2022-12-21T04:03:33Z)
- Latent Topology Induction for Understanding Contextualized
Representations [84.7918739062235]
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
arXiv Detail & Related papers (2022-06-03T11:22:48Z)
- FastKASSIM: A Fast Tree Kernel-Based Syntactic Similarity Metric [48.66580267438049]
We present FastKASSIM, a metric for utterance- and document-level syntactic similarity.
It pairs and averages the most similar dependency parse trees between a pair of documents based on tree kernels.
It runs up to 5.2 times faster than our baseline method over the documents in the r/ChangeMyView corpus.
arXiv Detail & Related papers (2022-03-15T22:33:26Z)
- Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech [27.657676278734534]
This paper proposes a methodology for constructing corpora of child-directed speech paired with sentential logical forms.
The approach enforces a cross-linguistically consistent representation, building on recent advances in dependency representation and semantic parsing.
arXiv Detail & Related papers (2021-09-22T18:17:06Z)
- Syntactic representation learning for neural network based TTS with
syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse trees to automatically utilize the syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z)
- SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and
Synonym Discovery [66.24624547470175]
SynSetExpan is a novel framework that enables two tasks to mutually enhance each other.
We create the first large-scale Synonym-Enhanced Set Expansion dataset via crowdsourcing.
Experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.
arXiv Detail & Related papers (2020-09-29T07:32:17Z)
- An Algorithm for Fuzzification of WordNets, Supported by a Mathematical
Proof [3.684688928766659]
We present an algorithm for constructing fuzzy versions of WordNet-like lexical databases (WLDs) of any language.
We publish online the fuzzified version of English WordNet (FWN).
arXiv Detail & Related papers (2020-06-07T04:47:40Z)
- Dense Embeddings Preserving the Semantic Relationships in WordNet [2.9443230571766854]
We provide a novel way to generate low dimensional vector embeddings for noun and verb synsets in WordNet.
We call this embedding the Sense Spectrum (and Sense Spectra for embeddings).
In order to create suitable labels for the training of sense spectra, we designed a new similarity measurement for noun and verb synsets in WordNet.
arXiv Detail & Related papers (2020-04-22T21:09:47Z)