You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools
- URL: http://arxiv.org/abs/2410.14626v1
- Date: Fri, 18 Oct 2024 17:27:38 GMT
- Title: You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools
- Authors: Daniel Baumartz, Mevlüt Bagci, Alexander Henlein, Maxim Konca, Andy Lücking, Alexander Mehler,
- Abstract summary: We show that sentiment analysis tools disagree on the same dataset.
We show that the sentiment tool used for sentiment annotation can even be predicted from its outcome.
- Score: 74.98850427240464
- Abstract: If sentiment analysis tools were valid classifiers, one would expect them to provide comparable results for sentiment classification on different kinds of corpora and for different languages. In line with the results of previous studies, we show that sentiment analysis tools disagree on the same dataset. Going beyond previous studies, we show that the sentiment tool used for sentiment annotation can even be predicted from its outcome, revealing an algorithmic bias of sentiment analysis. Based on Twitter, Wikipedia and different news corpora from the English, German and French languages, our classifiers separate sentiment tools with an averaged F1-score of 0.89 (for the English corpora). We therefore warn against taking sentiment annotations at face value and argue for the need for more, and more systematic, NLP evaluation studies.
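The idea of predicting a tool from its output, and of scoring that prediction with an averaged F1, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tool names `tool_a` and `tool_b`, the toy label batches, the single "share of neutral labels" feature, and the nearest-centroid rule are hypothetical and are not the authors' classifiers or data. The sketch also assumes that "averaged" means an unweighted (macro) mean over tools, which the abstract does not specify.

```python
def neutral_share(labels):
    """Fraction of 'neu' labels in one batch of annotations."""
    return labels.count("neu") / len(labels)


def fit_centroids(batches):
    """Mean neutral share per tool, from (tool, labels) pairs."""
    sums, counts = {}, {}
    for tool, labels in batches:
        sums[tool] = sums.get(tool, 0.0) + neutral_share(labels)
        counts[tool] = counts.get(tool, 0) + 1
    return {tool: sums[tool] / counts[tool] for tool in sums}


def predict_tool(centroids, labels):
    """Pick the tool whose mean neutral share is closest to this batch."""
    share = neutral_share(labels)
    return min(centroids, key=lambda tool: abs(centroids[tool] - share))


def macro_f1(true_tools, predicted_tools):
    """Unweighted mean of the per-tool F1 scores."""
    pairs = list(zip(true_tools, predicted_tools))
    scores = []
    for c in sorted(set(true_tools) | set(predicted_tools)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: each batch is the label sequence one tool produced.
train = [
    ("tool_a", ["pos", "neu", "neu", "neu"]),
    ("tool_a", ["neu", "neu", "neg", "neu"]),
    ("tool_b", ["pos", "neg", "pos", "neu"]),
    ("tool_b", ["neg", "pos", "neg", "pos"]),
]
test = [
    ("tool_a", ["neu", "neu", "pos", "neu"]),
    ("tool_b", ["pos", "pos", "neg", "neg"]),
    ("tool_a", ["pos", "neg", "neu", "neu"]),
    ("tool_b", ["neu", "neu", "neg", "pos"]),
]

centroids = fit_centroids(train)
true_tools = [tool for tool, _ in test]
predicted = [predict_tool(centroids, labels) for _, labels in test]
score = macro_f1(true_tools, predicted)
print(predicted, round(score, 3))
```

If a tool's outputs carried no tool-specific trace, a classifier like this would do no better than chance; a high averaged F1 is what signals that the annotations depend on the tool as well as on the text.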
Related papers
- Lexicon-Based Sentiment Analysis on Text Polarities with Evaluation of Classification Models [1.342834401139078]
This work uses a lexicon-based method to perform sentiment analysis and shows an evaluation of classification models trained over textual data.
The lexicon-based methods identify the intensity of emotion and subjectivity at word levels.
This work is based on a multi-class problem of text being labeled as positive, negative, or neutral.
arXiv Detail & Related papers (2024-09-19T15:31:12Z)
- Leveraging ChatGPT As Text Annotation Tool For Sentiment Analysis [6.596002578395151]
ChatGPT is a new product of OpenAI and has emerged as the most popular AI product.
This study explores the use of ChatGPT as a tool for data labeling for different sentiment analysis tasks.
arXiv Detail & Related papers (2023-06-18T12:20:42Z)
- Sentiment analysis and opinion mining on E-commerce site [0.0]
The goal of this study is to solve the sentiment polarity classification challenges in sentiment analysis.
A general technique for categorizing sentiment polarity is presented, along with comprehensive process explanations.
arXiv Detail & Related papers (2022-11-28T16:43:33Z)
- Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis [64.70116276295609]
SentiWSP is a Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks.
SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks.
arXiv Detail & Related papers (2022-10-18T12:25:29Z)
- Causal Intervention Improves Implicit Sentiment Analysis [67.43379729099121]
We propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).
We first review sentiment analysis from a causal perspective and analyze the confounders existing in this task.
Then, we introduce an instrumental variable to eliminate the confounding causal effects, thus extracting the pure causal effect between sentence and sentiment.
arXiv Detail & Related papers (2022-08-19T13:17:57Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as the informal and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks.
This study carries out an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- SentiQ: A Probabilistic Logic Approach to Enhance Sentiment Analysis Tool Quality [13.450001922002478]
SentiQ is an unsupervised Markov Logic Network-based approach that injects the semantic dimension into the tools through rules.
Preliminary experimental results demonstrate the usefulness of SentiQ.
arXiv Detail & Related papers (2020-08-19T14:30:00Z)
- Tweets Sentiment Analysis via Word Embeddings and Machine Learning Techniques [1.345251051985899]
This paper aims to perform sentiment analysis of real-time 2019 election Twitter data using the feature selection model word2vec and the machine learning algorithm random forest for sentiment classification.
Word2vec improves the quality of features by considering the contextual semantics of words in a text, hence improving the accuracy of machine learning and sentiment analysis.
arXiv Detail & Related papers (2020-07-05T08:10:30Z)
- SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis [69.80296394461149]
We introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks.
With the help of automatically-mined knowledge, SKEP conducts sentiment masking and constructs three sentiment knowledge prediction objectives.
Experiments on three kinds of sentiment tasks show that SKEP significantly outperforms a strong pre-training baseline.
arXiv Detail & Related papers (2020-05-12T09:23:32Z)