RuSentNE-2023: Evaluating Entity-Oriented Sentiment Analysis on Russian News Texts
- URL: http://arxiv.org/abs/2305.17679v1
- Date: Sun, 28 May 2023 10:04:15 GMT
- Title: RuSentNE-2023: Evaluating Entity-Oriented Sentiment Analysis on Russian News Texts
- Authors: Anton Golubev, Nicolay Rusnachenko, Natalia Loukachevitch
- Abstract summary: The paper describes the RuSentNE-2023 evaluation devoted to targeted sentiment analysis in Russian news texts.
The dataset for the RuSentNE-2023 evaluation is based on the Russian news corpus RuSentNE, which has rich sentiment-related annotation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The paper describes the RuSentNE-2023 evaluation devoted to targeted
sentiment analysis in Russian news texts. The task is to predict sentiment
towards a named entity in a single sentence. The dataset for the RuSentNE-2023
evaluation is based on the Russian news corpus RuSentNE, which has rich
sentiment-related annotation. The corpus is annotated with named entities and
sentiments towards these entities, along with related effects and emotional
states. The evaluation was organized using the CodaLab competition framework.
The main evaluation measure was the macro-averaged F-measure over the positive
and negative classes. The best result achieved was a 66% macro F-measure
(positive and negative classes). We also tested ChatGPT on the test set from
our evaluation and found that its zero-shot answers reached a 60% F-measure,
which corresponds to 4th place in the evaluation. ChatGPT also provided
detailed explanations of its conclusions. This result can be considered quite
high for a zero-shot application.
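For context, the main metric rewards correct positive and negative predictions while leaving the neutral class out of the averaging. Below is a minimal sketch of such a measure; the string labels "positive", "negative", "neutral" and the toy examples are illustrative assumptions, not the official label names or data of the competition.

```python
# Sketch of a macro F-measure restricted to the positive and negative classes,
# as described in the abstract. Label names and sample data are assumptions.

def f1_for_class(y_true, y_pred, cls):
    """Per-class F1 from exact-match counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def posneg_macro_f1(y_true, y_pred):
    """Average the F1 of the positive and negative classes only;
    neutral predictions still influence their precision and recall."""
    return (f1_for_class(y_true, y_pred, "positive") +
            f1_for_class(y_true, y_pred, "negative")) / 2

if __name__ == "__main__":
    gold = ["positive", "negative", "neutral", "negative", "positive"]
    pred = ["positive", "neutral", "neutral", "negative", "negative"]
    print(f"PosNeg macro F1: {posneg_macro_f1(gold, pred):.3f}")  # 0.583
```

The same value can be obtained with scikit-learn's f1_score(gold, pred, labels=["positive", "negative"], average="macro"), assuming the same label encoding.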
Related papers
- Implicit Sentiment Analysis Based on Chain of Thought Prompting [1.4582633500696451]
This paper introduces a Sentiment Analysis of Thinking (SAoT) framework.
The framework first analyzes the implicit aspects and opinions in the text using common sense and thinking chain capabilities.
The model is evaluated on the SemEval 2014 dataset, consisting of 1120 restaurant reviews and 638 laptop reviews.
arXiv Detail & Related papers (2024-08-22T06:55:29Z)
- Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation [50.60733773088296]
We conduct a comprehensive human evaluation of the results of several shared tasks from the last International Workshop on Spoken Language Translation (IWSLT 2023)
We propose an effective evaluation strategy based on automatic resegmentation and direct assessment with segment context.
Our analysis revealed that: 1) the proposed evaluation strategy is robust and its scores correlate well with other types of human judgements; 2) automatic metrics are usually, but not always, well correlated with direct assessment scores; and 3) COMET is a slightly stronger automatic metric than chrF.
arXiv Detail & Related papers (2024-06-06T09:18:42Z)
- Can ChatGPT evaluate research quality? [3.9627148816681284]
ChatGPT-4 can produce plausible document summaries and quality evaluation rationales that match REF criteria.
Overall, ChatGPT does not yet seem to be accurate enough to be trusted for any formal or informal research quality evaluation tasks.
arXiv Detail & Related papers (2024-02-08T10:00:40Z)
- Beyond Sentiment: Leveraging Topic Metrics for Political Stance Classification [1.0878040851638]
This study introduces topic metrics, dummy variables converted from extracted topics, as both an alternative and complement to sentiment metrics in stance classification.
The experiment results show that BERTopic improves coherence scores by 17.07% to 54.20% when compared to traditional approaches.
Our findings suggest topic metrics are especially effective for context-rich texts and corpora where stance and sentiment correlations are weak.
arXiv Detail & Related papers (2023-10-24T00:50:33Z)
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models [66.12432440863816]
We propose Prometheus, a fully open-source Large Language Model (LLM) that is on par with GPT-4's evaluation capabilities.
Prometheus scores a Pearson correlation of 0.897 with human evaluators when evaluating with 45 customized score rubrics.
Prometheus achieves the highest accuracy on two human preference benchmarks.
arXiv Detail & Related papers (2023-10-12T16:50:08Z)
- INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained Feedback [80.57617091714448]
We present InstructScore, an explainable evaluation metric for text generation.
We fine-tune a text evaluation metric based on LLaMA, producing a score for generated text and a human readable diagnostic report.
arXiv Detail & Related papers (2023-05-23T17:27:22Z)
- Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study [31.719155787410685]
ChatGPT has drawn great attention from both the research community and the public.
We provide a preliminary evaluation of ChatGPT on the understanding of opinions, sentiments, and emotions contained in the text.
arXiv Detail & Related papers (2023-04-10T00:55:59Z)
- Is ChatGPT a Good NLG Evaluator? A Preliminary Study [121.77986688862302]
We provide a preliminary meta-evaluation on ChatGPT to show its reliability as an NLG metric.
Experimental results show that compared with previous automatic metrics, ChatGPT achieves state-of-the-art or competitive correlation with human judgments.
We hope our preliminary study could prompt the emergence of a general-purpose, reliable NLG metric.
arXiv Detail & Related papers (2023-03-07T16:57:20Z)
- Just Rank: Rethinking Evaluation with Word and Sentence Similarities [105.5541653811528]
Intrinsic evaluation for embeddings lags far behind, with no significant update in the past decade.
This paper first points out the problems using semantic similarity as the gold standard for word and sentence embedding evaluations.
We propose a new intrinsic evaluation method called EvalRank, which shows a much stronger correlation with downstream tasks.
arXiv Detail & Related papers (2022-03-05T08:40:05Z)
- An analysis of full-size Russian complexly NER labelled corpus of Internet user reviews on the drugs based on deep learning and language neural nets [94.37521840642141]
We present the full-size Russian complexly NER-labeled corpus of Internet user reviews.
A set of advanced deep learning neural networks is used to extract pharmacologically meaningful entities from Russian texts.
arXiv Detail & Related papers (2021-04-30T19:46:24Z)