SAT Based Analogy Evaluation Framework for Persian Word Embeddings
- URL: http://arxiv.org/abs/2106.15674v1
- Date: Tue, 29 Jun 2021 18:43:06 GMT
- Title: SAT Based Analogy Evaluation Framework for Persian Word Embeddings
- Authors: Seyyed Ehsan Mahmoudi and Mehrnoush Shamsfard
- Abstract summary: In recent years there has been special interest in word embeddings as a new approach to converting words to vectors.
Evaluating an application end-to-end in order to identify the quality of the underlying embedding model is costly.
In this paper we introduce an evaluation framework including a hand-crafted Persian SAT-based analogy dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years there has been special interest in word embeddings as a new
approach to converting words to vectors. A focal point has been understanding
how much of the semantics of the words is transferred into the embedding
vectors. This matters because the embedding serves as the basis
for downstream NLP applications, and it is costly to evaluate the
application end-to-end in order to identify the quality of the embedding
model. Generally, word embeddings are evaluated through a number of tests,
including the analogy test. In this paper we propose a test framework for Persian
embedding models. Persian is a low-resource language, and there is no rich
semantic benchmark to evaluate word embedding models for this language. We
introduce an evaluation framework including a hand-crafted Persian SAT-based
analogy dataset, a colloquial test set (specific to Persian), and a
benchmark to study the impact of various parameters on the semantic evaluation
task.
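The analogy test the abstract refers to can be illustrated with the standard vector-offset approach: the relation between a pair of words is approximated by the difference of their embedding vectors, and a SAT-style question is answered by picking the candidate pair whose offset is most similar to the stem pair's offset. The sketch below uses tiny hypothetical 2-d embeddings and plain Python (no external libraries); it is an illustration of the general technique, not the paper's exact scoring procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def offset(emb, a, b):
    """Difference vector emb[b] - emb[a], encoding the relation of the pair (a, b)."""
    return [y - x for x, y in zip(emb[a], emb[b])]

def answer_sat_analogy(emb, stem, choices):
    """Return the choice pair whose offset is most similar to the stem pair's offset."""
    stem_rel = offset(emb, *stem)
    scored = [(cosine(stem_rel, offset(emb, a, b)), (a, b)) for a, b in choices]
    return max(scored)[1]

# Toy embeddings with made-up values, for illustration only.
emb = {
    "king": [0.9, 0.8], "queen": [0.9, 0.2],
    "man":  [0.1, 0.8], "woman": [0.1, 0.2],
    "cat":  [0.5, 0.5], "dog":   [0.6, 0.5],
}

# Stem pair king:queen; the pair man:woman shares the same offset direction.
best = answer_sat_analogy(emb, ("king", "queen"), [("man", "woman"), ("cat", "dog")])
```

A framework like the one proposed would run many such questions over a real embedding model (e.g. one trained on Persian text) and report accuracy, with harder SAT-style items offering multiple distractor pairs per question.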
Related papers
- HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings [1.1920184024241331]
This work introduces the first 'silver standard' dataset for constructing word vector representations in the Kyrgyz language.
We train corresponding models and validate the dataset's suitability through quality evaluation metrics.
arXiv Detail & Related papers (2024-11-16T07:14:32Z) - FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts [0.0]
This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks.
It is pre-trained on approximately 104 million Persian informal short texts from social networks, making it one of a kind in the Persian language.
It has been demonstrated that our proposed model outperforms ParsBERT, laBSE, and multilingual BERT in the Pearson and Spearman's coefficient criteria.
arXiv Detail & Related papers (2024-07-27T05:04:49Z) - A Comprehensive Analysis of Static Word Embeddings for Turkish [0.058520770038704165]
There are basically two types of word embedding models which are non-contextual (static) models and contextual models.
We compare and evaluate the performance of several contextual and non-contextual models in both intrinsic and extrinsic evaluation settings for Turkish.
The results of the analyses provide insights about the suitability of different embedding models in different types of NLP tasks.
arXiv Detail & Related papers (2024-05-13T14:23:37Z) - Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language
Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z) - Language Model Classifier Aligns Better with Physician Word Sensitivity
than XGBoost on Readmission Prediction [86.15787587540132]
We introduce sensitivity score, a metric that scrutinizes models' behaviors at the vocabulary level.
Our experiments compare the decision-making logic of clinicians and classifiers based on rank correlations of sensitivity scores.
arXiv Detail & Related papers (2022-11-13T23:59:11Z) - Just Rank: Rethinking Evaluation with Word and Sentence Similarities [105.5541653811528]
Intrinsic evaluation for embeddings lags far behind, with no significant update in the past decade.
This paper first points out the problems using semantic similarity as the gold standard for word and sentence embedding evaluations.
We propose a new intrinsic evaluation method called EvalRank, which shows a much stronger correlation with downstream tasks.
arXiv Detail & Related papers (2022-03-05T08:40:05Z) - Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study fulfils an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - XL-WiC: A Multilingual Benchmark for Evaluating Semantic
Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z) - Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z) - A novel approach to sentiment analysis in Persian using discourse and
external semantic information [0.0]
Many approaches have been proposed to extract the sentiment of individuals from documents written in natural languages.
The majority of these approaches have focused on English, while resource-lean languages such as Persian suffer from the lack of research work and language resources.
Due to this gap, the current work introduces new methods for sentiment analysis applied to Persian.
arXiv Detail & Related papers (2020-07-18T18:40:40Z)