Related papers: Semi-automatic WordNet Linking using Word Embeddings

URL: http://arxiv.org/abs/2201.01747v1
Date: Wed, 5 Jan 2022 18:15:55 GMT
Title: Semi-automatic WordNet Linking using Word Embeddings
Authors: Kevin Patel, Diptesh Kanojia, Pushpak Bhattacharyya
Abstract summary: Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. We propose an approach to link wordnets. Given a synset of the source language, the approach returns a ranked list of potential candidate synsets. Our technique is able to retrieve a winner synset in the top 10 ranked list for 60% of all synsets and 70% of noun synsets.
Score: 33.15250956247636
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are treated as a gold standard or oracle. Thus, it is crucial that these resources hold correct information. Therefore, they are created by human experts. However, manual maintenance of such resources is a tedious and costly affair. Thus, techniques that can aid the experts are desirable. In this paper, we propose an approach to link wordnets. Given a synset of the source language, the approach returns a ranked list of potential candidate synsets in the target language from which the human expert can choose the correct one(s). Our technique is able to retrieve a winner synset in the top 10 ranked list for 60% of all synsets and 70% of noun synsets.
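
The abstract does not say how candidates are scored, so the following is only a minimal sketch of the general idea of returning a ranked list and measuring top-10 retrieval. It assumes source and target word vectors already live in a shared cross-lingual space, represents a synset as the mean of its member word vectors, and ranks by cosine similarity. All function names, synset ids, and toy vectors are hypothetical and are not taken from the paper.

```python
import math

def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding; None if none do."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(source_words, target_synsets, src_emb, tgt_emb, k=10):
    """Return up to k target synset ids, most similar to the source synset first."""
    query = mean_vector(source_words, src_emb)
    if query is None:
        return []
    scored = []
    for synset_id, words in target_synsets.items():
        vec = mean_vector(words, tgt_emb)
        if vec is not None:
            scored.append((cosine(query, vec), synset_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [synset_id for _, synset_id in scored[:k]]

def hit_at_k(ranked_lists, gold, k=10):
    """Fraction of source synsets whose gold target id appears in the top k."""
    if not ranked_lists:
        return 0.0
    hits = sum(1 for sid, ranked in ranked_lists.items() if gold[sid] in ranked[:k])
    return hits / len(ranked_lists)

# Toy data (assumption): 2-dimensional vectors in an already aligned space.
src_emb = {"paani": [0.9, 0.1], "jal": [0.8, 0.2]}
tgt_emb = {"water": [1.0, 0.0], "fire": [0.0, 1.0], "flame": [0.1, 0.9]}
target_synsets = {"water.n": ["water"], "fire.n": ["fire", "flame"]}

ranked = rank_candidates(["paani", "jal"], target_synsets, src_emb, tgt_emb)
score = hit_at_k({"jal.n": ranked}, {"jal.n": "water.n"}, k=1)
```

In a workflow like the one the abstract describes, the human expert would inspect `ranked` and pick the correct synset or synsets; `hit_at_k` mirrors the kind of top-10 retrieval rate the abstract reports, computed here only on toy data.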

Related papers

OpenGloss: A Synthetic Encyclopedic Dictionary and Semantic Knowledge Graph [0.0]
OpenGloss is a synthetic encyclopedic dictionary and semantic knowledge graph for English. It integrates lexicographic definitions, encyclopedic context, etymological histories, and semantic relationships in a unified resource. The entire resource was produced in under one week for under $1,000.
arXiv Detail & Related papers (2025-11-23T21:33:53Z)
Automatically constructing Wordnet synsets [2.363388546004777]
We propose approaches to generate Wordnet synsets for languages both resource-rich and resource-poor. Our algorithms translate synsets of existing Wordnets to a target language T, then apply a ranking method on the translation candidates to find best translations in T.
arXiv Detail & Related papers (2022-08-08T02:02:18Z)
Towards Automatic Construction of Filipino WordNet: Word Sense Induction and Synset Induction Using Sentence Embeddings [0.7214142393172727]
This study proposes a method for word sense induction and synset induction using only two linguistic resources. The resulting sense inventory and synonym sets can be used in automatically creating a wordnet. This study empirically shows that 30% of the induced word senses are valid and 40% of the induced synsets are valid, of which 20% are novel synsets.
arXiv Detail & Related papers (2022-04-07T06:50:37Z)
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation [133.7313847857935]
Our study highlights how NLP methods can be adapted to thousands more languages that are under-served by current technology. For 19 under-represented languages across 3 tasks, our methods lead to consistent improvements of up to 5 and 15 points with and without extra monolingual text respectively.
arXiv Detail & Related papers (2022-03-17T16:48:22Z)
Indian Language Wordnets and their Linkages with Princeton WordNet [38.50911435531732]
We release mappings of 18 Indian language wordnets linked with Princeton WordNet. We believe that the availability of such resources will have a direct impact on the progress in NLP for these languages.
arXiv Detail & Related papers (2022-01-09T10:12:31Z)
Multilingual Irony Detection with Dependency Syntax and Neural Models [61.32653485523036]
It focuses on the contribution from syntactic knowledge, exploiting linguistic resources where syntax is annotated according to the Universal Dependencies scheme. The results suggest that fine-grained dependency-based syntactic information is informative for the detection of irony.
arXiv Detail & Related papers (2020-11-11T11:22:05Z)
Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education [59.003956312175795]
From a linguistic perspective, threshold concepts are instances of specialized vocabularies, exhibiting particular linguistic features. The profiles of 63 threshold concepts from business education have been investigated in textbooks, newspapers, and Wikipedia. The three kinds of resources can indeed be distinguished in terms of their threshold concepts' profiles.
arXiv Detail & Related papers (2020-08-05T12:56:16Z)
An Algorithm for Fuzzification of WordNets, Supported by a Mathematical Proof [3.684688928766659]
We present an algorithm for constructing fuzzy versions of WLDs of any language. We publish online the fuzzified version of English WordNet (FWN).
arXiv Detail & Related papers (2020-06-07T04:47:40Z)
Word Sense Disambiguation for 158 Languages using Word Embeddings Only [80.79437083582643]
Disambiguation of word senses in context is easy for humans, but a major challenge for automatic approaches. We present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory. We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings.
arXiv Detail & Related papers (2020-03-14T14:50:04Z)
Automatic Compilation of Resources for Academic Writing and Evaluating with Informal Word Identification and Paraphrasing System [24.42822218256954]
We present the first approach to automatically building resources for academic writing. The aim is to build a writing aid system that automatically edits a text so that it better adheres to the academic style of writing.
arXiv Detail & Related papers (2020-03-05T22:55:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.