HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings
- URL: http://arxiv.org/abs/2411.10724v2
- Date: Thu, 28 Nov 2024 22:37:57 GMT
- Title: HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings
- Authors: Anton Alekseev, Gulnara Kabaeva
- Abstract summary: This work introduces the first 'silver standard' dataset for constructing word vector representations in the Kyrgyz language. We train corresponding models and validate the dataset's suitability through quality evaluation metrics.
- Score: 1.1920184024241331
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the key tasks in modern applied computational linguistics is constructing word vector representations (word embeddings), which are widely used to address natural language processing tasks such as sentiment analysis, information extraction, and more. To choose an appropriate method for generating these word embeddings, quality assessment techniques are often necessary. A standard approach involves calculating distances between vectors for words with expert-assessed 'similarity'. This work introduces the first 'silver standard' dataset for such tasks in the Kyrgyz language, alongside training corresponding models and validating the dataset's suitability through quality evaluation metrics.
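The evaluation approach described in the abstract (comparing vector distances with expert-assessed similarity) can be illustrated roughly as follows. This is a minimal sketch, not the authors' code: it assumes cosine similarity as the vector measure and Spearman rank correlation as the quality metric, which are common choices for this kind of dataset but are not confirmed by the summary above. The function names, the toy vectors, and the toy word pairs are all invented for illustration and do not come from HJ-Ky-0.1.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, pairs):
    """Correlate model similarities with human scores.

    Pairs containing a word that is missing from `embeddings` are skipped.
    Returns (correlation, number_of_pairs_used).
    """
    model_scores, human_scores = [], []
    for w1, w2, score in pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores), len(human_scores)


# Toy data, invented for illustration only (not taken from the dataset).
toy_vectors = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.3, 0.9],
}
toy_pairs = [
    ("cat", "dog", 9.0),
    ("car", "bus", 8.5),
    ("cat", "car", 2.0),
    ("dog", "bus", 3.0),
    ("cat", "tree", 1.0),  # "tree" has no vector, so this pair is skipped
]

rho, used = evaluate(toy_vectors, toy_pairs)
print(f"Spearman correlation: {rho:.3f} over {used} pairs")
```

A higher correlation suggests that the embedding model orders word pairs more like the human annotators do. In practice one would load trained vectors and the dataset's scored pairs in place of the toy values.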
Related papers
- Automated Collection of Evaluation Dataset for Semantic Search in Low-Resource Domain Language [4.5224851085910585]
Domain-specific languages that use a lot of specific terminology often fall into the category of low-resource languages.
This study addresses the challenge of automatically collecting test datasets to evaluate semantic search in a low-resource, domain-specific German language setting.
arXiv Detail & Related papers (2024-12-13T09:47:26Z)
- ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language [40.4052848203136]
Implicit language is essential for natural language processing systems to achieve precise text understanding and facilitate natural interactions with users.
This paper develops a scalar metric that quantifies the implicitness level of language without relying on external references.
ImpScore is trained using pairwise contrastive learning on a specially curated dataset comprising 112,580 (implicit sentence, explicit sentence) pairs.
arXiv Detail & Related papers (2024-11-07T20:23:29Z)
- Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion [78.76867266561537]
The evaluation process still heavily relies on closed-set metrics without considering the similarity between predicted and ground truth categories.
To tackle this issue, we first survey eleven similarity measurements between two categorical words.
We designed novel evaluation metrics, namely Open mIoU, Open AP, and Open PQ, tailored for three open-vocabulary segmentation tasks.
arXiv Detail & Related papers (2023-11-06T18:59:01Z)
- Assessing Word Importance Using Models Trained for Semantic Tasks [0.0]
We derive word significance from models trained to solve semantic tasks: Natural Language Inference and Paraphrase Identification.
We evaluate their relevance using a so-called cross-task evaluation.
Our method can be used to identify important words in sentences without any explicit word importance labeling in training.
arXiv Detail & Related papers (2023-05-31T09:34:26Z)
- CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models [77.45934004406283]
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
arXiv Detail & Related papers (2023-05-23T16:32:27Z)
- A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches [5.065947993017158]
We present the characteristics of existing word embedding approaches and analyze them with regard to many classification tasks.
Traditional approaches mostly use matrix factorization to produce word representations, and they are not able to capture the semantic and syntactic regularities of the language very well.
On the other hand, neural-network-based approaches can capture sophisticated regularities of the language and preserve word relationships in the generated word representations.
arXiv Detail & Related papers (2023-03-13T15:34:19Z)
- Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks [95.06087720086133]
Natural-Instructions v2 is a collection of 1,600+ diverse language tasks and their expert-written instructions.
The benchmark covers 70+ distinct task types, such as tagging, in-filling, and rewriting.
This benchmark enables large-scale evaluation of cross-task generalization of the models.
arXiv Detail & Related papers (2022-04-16T03:12:30Z)
- SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech [44.68649535280397]
We propose a suite of benchmark tasks for Spoken Language Understanding Evaluation (SLUE).
SLUE consists of limited-size labeled training sets and corresponding evaluation sets.
We present the first phase of the SLUE benchmark suite, consisting of named entity recognition, sentiment analysis, and ASR on the corresponding datasets.
We provide new transcriptions and annotations on subsets of the VoxCeleb and VoxPopuli datasets, evaluation metrics and results for baseline models, and an open-source toolkit to reproduce the baselines and evaluate new models.
arXiv Detail & Related papers (2021-11-19T18:59:23Z)
- SAT Based Analogy Evaluation Framework for Persian Word Embeddings [0.0]
In recent years there has been a special interest in word embeddings as a new approach to convert words to vectors.
Evaluating an application end-to-end in order to judge the quality of the embedding model it uses is costly.
In this paper we introduce an evaluation framework that includes a hand-crafted, SAT-based Persian analogy dataset.
arXiv Detail & Related papers (2021-06-29T18:43:06Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study assesses existing language models on distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)