Semeval-2022 Task 1: CODWOE -- Comparing Dictionaries and Word
Embeddings
- URL: http://arxiv.org/abs/2205.13858v1
- Date: Fri, 27 May 2022 09:40:33 GMT
- Title: Semeval-2022 Task 1: CODWOE -- Comparing Dictionaries and Word
Embeddings
- Authors: Timothee Mickus and Kees van Deemter and Mathieu Constant and Denis
Paperno
- Abstract summary: We focus on relating opaque word vectors with human-readable definitions.
This problem naturally divides into two subtasks: converting definitions into embeddings, and converting embeddings into definitions.
This task was conducted in a multilingual setting, using comparable sets of embeddings trained homogeneously.
- Score: 1.5293427903448025
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Word embeddings have advanced the state of the art in NLP across numerous
tasks. Understanding the contents of dense neural representations is of utmost
interest to the computational semantics community. We propose to focus on
relating these opaque word vectors with human-readable definitions, as found in
dictionaries. This problem naturally divides into two subtasks: converting
definitions into embeddings, and converting embeddings into definitions. This
task was conducted in a multilingual setting, using comparable sets of
embeddings trained homogeneously.
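To make the first subtask (converting definitions into embeddings) concrete, here is a minimal sketch of a reverse-dictionary lookup. It is not the shared task's baseline or any participant's system. The vectors in `TOY_VECTORS` and `TARGETS`, the averaging encoder, and the function names are all hypothetical and chosen only for illustration; actual systems would learn the encoder from data. The opposite subtask (generating a definition from an embedding) is not covered here.

```python
import math

# Hypothetical 3-dimensional vectors for words that may occur in definitions.
TOY_VECTORS = {
    "small": [1.0, 0.0, 0.0],
    "large": [-1.0, 0.0, 0.0],
    "animal": [0.0, 1.0, 0.0],
    "feline": [0.0, 0.8, 0.6],
    "canine": [0.0, 0.8, -0.6],
}

# Hypothetical target embeddings for the headwords we want to retrieve.
TARGETS = {
    "cat": [0.5, 0.9, 0.6],
    "dog": [0.5, 0.9, -0.6],
    "elephant": [-1.0, 1.0, 0.0],
}


def embed_definition(definition):
    """Map a definition to a vector by averaging the vectors of its known words."""
    words = [w for w in definition.lower().split() if w in TOY_VECTORS]
    if not words:
        raise ValueError("definition contains no known words")
    dim = len(next(iter(TOY_VECTORS.values())))
    return [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def reverse_dictionary(definition):
    """Rank headwords by similarity between their embedding and the predicted one."""
    predicted = embed_definition(definition)
    return sorted(TARGETS, key=lambda w: cosine(predicted, TARGETS[w]), reverse=True)


if __name__ == "__main__":
    print(reverse_dictionary("a small feline animal"))
    print(reverse_dictionary("a large animal"))
```

Ranking by cosine similarity is one common way to compare a predicted vector with reference embeddings; other measures could be substituted without changing the overall shape of the sketch.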
Related papers
- Domain Embeddings for Generating Complex Descriptions of Concepts in
Italian Language [65.268245109828]
We propose a Distributional Semantic resource enriched with linguistic and lexical information extracted from electronic dictionaries.
The resource comprises 21 domain-specific matrices, one comprehensive matrix, and a Graphical User Interface.
Our model facilitates the generation of reasoned semantic descriptions of concepts by selecting matrices directly associated with concrete conceptual knowledge.
(2024-02-26T15:04:35Z)
- Monolingual alignment of word senses and definitions in lexicographical
resources [0.0]
The focus of this thesis is broadly on the alignment of lexicographical data, particularly dictionaries.
The first task aims to find an optimal alignment given the sense definitions of a headword in two different monolingual dictionaries.
This benchmark can be used for evaluation purposes of word-sense alignment systems.
(2022-09-06T13:09:52Z)
- IRB-NLP at SemEval-2022 Task 1: Exploring the Relationship Between Words
and Their Semantic Representations [0.0]
We present our findings based on the descriptive, exploratory, and predictive data analysis conducted on the CODWOE dataset.
We give a detailed overview of the systems that we designed for Definition Modeling and Reverse Dictionary tasks.
(2022-05-13T18:15:20Z)
- A Survey On Neural Word Embeddings [0.4822598110892847]
The study of meaning in natural language processing relies on the distributional hypothesis.
The revolutionary idea of distributed representation for a concept is close to the working of a human mind.
Neural word embeddings transformed the whole field of NLP by introducing substantial improvements in all NLP tasks.
(2021-10-05T03:37:57Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to measure the semantic distance between such overlapped texts with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the resulting metric, NDD, to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
(2021-10-04T03:59:15Z)
- RAW-C: Relatedness of Ambiguous Words--in Context (A New Lexical
Resource for English) [2.792030485253753]
We evaluate how well contextualized embeddings accommodate the continuous, dynamic nature of word meaning.
We show that cosine distance systematically underestimates how similar humans find uses of the same sense of a word to be.
We propose a synthesis between psycholinguistic theories of the mental lexicon and computational models of lexical semantics.
(2021-05-27T16:07:13Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
(2020-12-30T15:38:26Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in
BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements between the obtained clusters naturally make it possible to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both when measured separately (per language) and overall, where we surpass all provided SemEval baselines.
(2020-10-02T08:38:40Z)
- Interactive Re-Fitting as a Technique for Improving Word Embeddings [0.0]
We make it possible for humans to adjust portions of a word embedding space by moving sets of words closer to one another.
Our approach allows users to trigger selective post-processing as they interact with and assess potential bias in word embeddings.
(2020-09-30T21:54:22Z)
- A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
(2020-09-23T15:45:32Z)
- RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the
Russian language [70.27072729280528]
This paper describes the results of the first shared task on taxonomy enrichment for the Russian language.
Sixteen teams participated in the task, demonstrating high results, with more than half of them outperforming the provided baseline.
(2020-05-22T13:30:37Z)