Tweet Sentiment Quantification: An Experimental Re-Evaluation
- URL: http://arxiv.org/abs/2011.08091v3
- Date: Sat, 18 Sep 2021 00:07:20 GMT
- Title: Tweet Sentiment Quantification: An Experimental Re-Evaluation
- Authors: Alejandro Moreo and Fabrizio Sebastiani
- Abstract summary: Sentiment quantification is the task of training, by means of supervised learning, estimators of the relative frequency (also called ``prevalence'') of sentiment-related classes.
We re-evaluate those quantification methods following a now consolidated and much more robust experimental protocol.
Results are dramatically different from those obtained by Gao and Sebastiani, and they provide a different, much more solid understanding of the relative strengths and weaknesses of different sentiment quantification methods.
- Score: 88.60021378715636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentiment quantification is the task of training, by means of supervised
learning, estimators of the relative frequency (also called ``prevalence'') of
sentiment-related classes (such as \textsf{Positive}, \textsf{Neutral},
\textsf{Negative}) in a sample of unlabelled texts. This task is especially
important when these texts are tweets, since the final goal of most sentiment
classification efforts carried out on Twitter data is actually quantification
(and not the classification of individual tweets). It is well-known that
solving quantification by means of ``classify and count'' (i.e., by classifying
all unlabelled items by means of a standard classifier and counting the items
that have been assigned to a given class) is less than optimal in terms of
accuracy, and that more accurate quantification methods exist. Gao and
Sebastiani (2016) carried out a systematic comparison of quantification methods
on the task of tweet sentiment quantification. In hindsight, we observe that
the experimental protocol followed in that work was weak, and that the
reliability of the conclusions that were drawn from the results is thus
questionable. We now re-evaluate those quantification methods (plus a few more
modern ones) on exactly the same datasets, this time following a now
consolidated and much more robust experimental protocol (which also involves
simulating the presence, in the test data, of class prevalence values very
different from those of the training set). This experimental protocol (even
without counting the newly added methods) involves a number of experiments
5,775 times larger than that of the original study. The results of our
experiments are dramatically different from those obtained by Gao and
Sebastiani, and they provide a different, much more solid understanding of the
relative strengths and weaknesses of different sentiment quantification
methods.
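As an illustration of the ``classify and count'' baseline and of the prevalence-shift simulation described in the abstract, the following minimal Python sketch (our own toy data and variable names, not from the paper; assumes scikit-learn is available) contrasts plain classify-and-count (CC) with adjusted classify-and-count (ACC), a standard correction based on the classifier's true- and false-positive rates, on a test sample whose class prevalence differs sharply from the training one.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_data(n, pos_prev):
    """Toy 1-D data: two class-conditional Gaussians (illustration only)."""
    y = (rng.random(n) < pos_prev).astype(int)
    X = rng.normal(loc=1.5 * y, scale=1.0).reshape(-1, 1)
    return X, y

# Train at prevalence 0.5; test at a deliberately very different prevalence,
# mimicking the prevalence-shift simulation the protocol calls for.
X_tr, y_tr = make_data(2000, pos_prev=0.5)
X_te, y_te = make_data(2000, pos_prev=0.1)

# Hold out part of the training data to estimate the classifier's error rates.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_tr, y_tr, test_size=0.4, random_state=0)
clf = LogisticRegression().fit(X_fit, y_fit)

# Classify and count (CC): predicted prevalence = fraction classified positive.
cc = clf.predict(X_te).mean()

# Adjusted classify and count (ACC): correct CC using the true-positive and
# false-positive rates estimated on the held-out validation split.
val_pred = clf.predict(X_val)
tpr = val_pred[y_val == 1].mean()  # P(predicted pos | truly pos)
fpr = val_pred[y_val == 0].mean()  # P(predicted pos | truly neg)
acc = np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0)

print(f"true test prevalence: {y_te.mean():.3f}")
print(f"CC estimate:          {cc:.3f}")
print(f"ACC estimate:         {acc:.3f}")
```

On runs like this, CC typically drifts toward the training prevalence when the test prevalence shifts, while ACC recovers much of the gap provided the tpr/fpr estimates are accurate; the paper's protocol repeats this kind of comparison across thousands of test samples with systematically varied prevalence values.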
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Multi-Label Quantification [78.83284164605473]
Quantification, variously called "supervised prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies of the classes of interest in unlabelled data samples.
We propose methods for inferring estimators of class prevalence values that strive to leverage the dependencies among the classes of interest in order to predict their relative frequencies more accurately.
arXiv Detail & Related papers (2022-11-15T11:29:59Z)
- Estimating Confidence of Predictions of Individual Classifiers and Their Ensembles for the Genre Classification Task [0.0]
Genre identification is a subclass of non-topical text classification.
Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks.
arXiv Detail & Related papers (2022-06-15T09:59:05Z)
- On Quantitative Evaluations of Counterfactuals [88.42660013773647]
This paper consolidates work on evaluating visual counterfactual examples through an analysis and experiments.
We find that while most metrics behave as intended for sufficiently simple datasets, some fail to tell the difference between good and bad counterfactuals when the complexity increases.
We propose two new metrics, the Label Variation Score and the Oracle score, which are both less vulnerable to these failure modes.
arXiv Detail & Related papers (2021-10-30T05:00:36Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- Identifying Spurious Correlations for Robust Text Classification [9.457737910527829]
We propose a method to distinguish spurious and genuine correlations in text classification.
We use features derived from treatment effect estimators to distinguish spurious correlations from "genuine" ones.
Experiments on four datasets suggest that using this approach to inform feature selection also leads to more robust classification.
arXiv Detail & Related papers (2020-10-06T03:49:22Z)
- Cooperative Bi-path Metric for Few-shot Learning [50.98891758059389]
We make two contributions to investigate the few-shot classification problem.
We report a simple and effective baseline trained on base classes in the way of traditional supervised learning.
We propose a cooperative bi-path metric for classification, which leverages the correlations between base classes and novel classes to further improve the accuracy.
arXiv Detail & Related papers (2020-08-10T11:28:52Z)
- Quantifying With Only Positive Training Data [0.5735035463793008]
Quantification is the research field that studies methods for counting the number of data points that belong to each class in an unlabeled sample.
This article closes the gap between Positive and Unlabeled Learning (PUL) and One-class Quantification (OCQ).
We compare our method, Passive Aggressive Threshold (PAT), against PUL methods and show that PAT generally is the fastest and most accurate algorithm.
arXiv Detail & Related papers (2020-04-22T01:18:25Z)
- An interpretable semi-supervised classifier using two different strategies for amended self-labeling [0.0]
Semi-supervised classification techniques combine labeled and unlabeled data during the learning phase.
We present an interpretable self-labeling grey-box classifier that uses a black box to estimate the missing class labels and a white box to explain the final predictions.
arXiv Detail & Related papers (2020-01-26T19:37:41Z)