RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language
- URL: http://arxiv.org/abs/2005.11176v1
- Date: Fri, 22 May 2020 13:30:37 GMT
- Title: RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language
- Authors: Irina Nikishina and Varvara Logacheva and Alexander Panchenko and Natalia Loukachevitch
- Abstract summary: This paper describes the results of the first shared task on taxonomy enrichment for the Russian language.
16 teams participated in the task, demonstrating strong results, with more than half of them outperforming the provided baseline.
- Score: 70.27072729280528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes the results of the first shared task on taxonomy
enrichment for the Russian language. The participants were asked to extend an
existing taxonomy with previously unseen words: for each new word their systems
should provide a ranked list of possible (candidate) hypernyms. In comparison
to the previous tasks for other languages, our competition has a more realistic
task setting: new words were provided without definitions. Instead, we provided
a textual corpus where these new terms occurred. For this evaluation campaign,
we developed a new evaluation dataset based on unpublished RuWordNet data. The
shared task features two tracks: "nouns" and "verbs". 16 teams participated in
the task, demonstrating strong results, with more than half of them
outperforming the provided baseline.
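The task setup described in the abstract (each system returns a ranked list of candidate hypernyms for every new word) is typically scored with ranking metrics. The sketch below computes Mean Reciprocal Rank over such rankings; the function name, metric choice, and toy data are illustrative assumptions, not the campaign's official evaluation code.

```python
# Minimal MRR scoring sketch for ranked hypernym candidates (hypothetical
# helper, not the shared task's official evaluation script).

def mean_reciprocal_rank(predictions, gold):
    """predictions: word -> ranked list of candidate hypernyms.
    gold: word -> set of correct hypernyms."""
    total = 0.0
    for word, ranked in predictions.items():
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in gold.get(word, set()):
                total += 1.0 / rank  # credit only the first correct hit
                break
    return total / len(predictions)

# Toy example: correct hypernym at rank 1 for one word, rank 2 for the other.
preds = {"apricot": ["fruit", "tree"], "sprint": ["walk", "run"]}
gold = {"apricot": {"fruit"}, "sprint": {"run"}}
print(mean_reciprocal_rank(preds, gold))  # 0.75
```

A system that ranks a correct hypernym higher thus receives proportionally more credit, which is why the task asks for ranked candidate lists rather than single guesses.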
Related papers
- Presence or Absence: Are Unknown Word Usages in Dictionaries? [6.185216877366987]
We evaluate our system in the AXOLOTL-24 shared task for Finnish, Russian and German languages.
We use a graph-based clustering approach to predict mappings between unknown word usages and dictionary entries.
Our system ranks first in Finnish and German, and second in Russian, on the Subtask 2 test phase leaderboard.
arXiv Detail & Related papers (2024-06-02T07:57:45Z)
- Sõnajaht: Definition Embeddings and Semantic Search for Reverse Dictionary Creation [0.21485350418225246]
We present an information-retrieval-based reverse dictionary system using modern pre-trained language models and approximate nearest-neighbor search algorithms.
The proposed approach is applied to an existing Estonian language lexicon resource, Sonaveeb (word web), with the purpose of enhancing and enriching it by introducing cross-lingual reverse dictionary functionality powered by semantic search.
arXiv Detail & Related papers (2024-04-30T10:21:14Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models [77.45934004406283]
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
arXiv Detail & Related papers (2023-05-23T16:32:27Z)
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z)
- IRB-NLP at SemEval-2022 Task 1: Exploring the Relationship Between Words and Their Semantic Representations [0.0]
We present our findings based on the descriptive, exploratory, and predictive data analysis conducted on the CODWOE dataset.
We give a detailed overview of the systems that we designed for Definition Modeling and Reverse Dictionary tasks.
arXiv Detail & Related papers (2022-05-13T18:15:20Z)
- Be More with Less: Hypergraph Attention Networks for Inductive Text Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task.
Despite this success, their performance can be largely jeopardized in practice, since they are unable to capture high-order interactions between words.
We propose a principled model, hypergraph attention networks (HyperGAT), which obtains more expressive power with lower computational cost for text representation learning.
arXiv Detail & Related papers (2020-11-01T00:21:59Z)
- BRUMS at SemEval-2020 Task 3: Contextualised Embeddings for Predicting the (Graded) Effect of Context in Word Similarity [9.710464466895521]
This paper presents the team BRUMS submission to SemEval-2020 Task 3: Graded Word Similarity in Context.
The system utilises state-of-the-art contextualised word embeddings with task-specific adaptations, including stacked embeddings and average embeddings.
In the final rankings, our approach places within the top 5 solutions for each language while holding 1st position in Finnish subtask 2.
arXiv Detail & Related papers (2020-10-13T10:25:18Z)
- NEMO: Frequentist Inference Approach to Constrained Linguistic Typology Feature Prediction in SIGTYP 2020 Shared Task [83.43738174234053]
We employ frequentist inference to represent correlations between typological features and use this representation to train simple multi-class estimators that predict individual features.
Our best configuration achieved a micro-averaged accuracy of 0.66 on 149 test languages.
arXiv Detail & Related papers (2020-10-12T19:25:43Z)
- CIRCE at SemEval-2020 Task 1: Ensembling Context-Free and Context-Dependent Word Representations [0.0]
We present an ensemble model that makes predictions based on context-free and context-dependent word representations.
The key findings are that (1) context-free word representations are a powerful and robust baseline, (2) a sentence classification objective can be used to obtain useful context-dependent word representations, and (3) combining those representations increases performance on some datasets while decreasing performance on others.
arXiv Detail & Related papers (2020-04-30T13:18:29Z)
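The CIRCE entry's third finding, that combining context-free and context-dependent representations can increase performance, can be illustrated with a minimal sketch. The vector sizes and the concatenation step are assumptions for illustration, not the paper's exact architecture.

```python
# Illustrative sketch: ensemble a context-free (static) word vector with a
# context-dependent (contextual) one by concatenation, so a downstream
# classifier sees both signals. Dimensions are assumed, not from the paper.

def combine(static_vec, contextual_vec):
    return list(static_vec) + list(contextual_vec)

static = [0.0] * 300      # stand-in for a word2vec-style vector
contextual = [0.0] * 768  # stand-in for a transformer hidden state
print(len(combine(static, contextual)))  # 1068
```

Concatenation is the simplest such combination; the paper's findings suggest the gain from combining depends on the dataset.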
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.