How Will Your Tweet Be Received? Predicting the Sentiment Polarity of
Tweet Replies
- URL: http://arxiv.org/abs/2104.10513v1
- Date: Wed, 21 Apr 2021 13:08:45 GMT
- Title: How Will Your Tweet Be Received? Predicting the Sentiment Polarity of
Tweet Replies
- Authors: Soroosh Tayebi Arasteh, Mehrpad Monajem, Vincent Christlein, Philipp
Heinrich, Anguelos Nicolaou, Hamidreza Naderi Boldaji, Mahshad Lotfinia,
Stefan Evert
- Abstract summary: We propose a new task: predicting the predominant sentiment among (first-order) replies to a given tweet.
We create RETWEET, a large dataset of tweets and replies manually annotated with sentiment labels.
We use the automatically labeled data for supervised training of a neural network to predict reply sentiment from the original tweets.
- Score: 3.5263924621989196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Twitter sentiment analysis, which often focuses on predicting the polarity of
tweets, has attracted increasing attention in recent years, in particular
with the rise of deep learning (DL). In this paper, we propose a new task:
predicting the predominant sentiment among (first-order) replies to a given
tweet. To this end, we created RETWEET, a large dataset of tweets and replies
manually annotated with sentiment labels. As a strong baseline, we propose a
two-stage DL-based method: first, we create automatically labeled training data
by applying a standard sentiment classifier to tweet replies and aggregating
its predictions for each original tweet; our rationale is that individual
errors made by the classifier are likely to cancel out in the aggregation step.
Second, we use the automatically labeled data for supervised training of a
neural network to predict reply sentiment from the original tweets. The
resulting classifier is evaluated on the new RETWEET dataset, showing promising
results, especially considering that it has been trained without any manually
labeled data. Both the dataset and the baseline implementation are publicly
available.
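The first stage of the baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: `classify_reply` stands in for any off-the-shelf reply-level sentiment classifier, and the aggregation is assumed to be a simple majority vote.

```python
from collections import Counter

def aggregate_reply_sentiment(reply_labels):
    """Derive a tweet-level label by majority vote over reply-level
    predictions; individual classifier errors tend to cancel out."""
    if not reply_labels:
        raise ValueError("tweet has no replies to aggregate")
    label, _ = Counter(reply_labels).most_common(1)[0]
    return label

def auto_label(tweets_with_replies, classify_reply):
    """Map each original tweet to the predominant sentiment of its
    replies, producing automatically labeled training data."""
    return {
        tweet: aggregate_reply_sentiment([classify_reply(r) for r in replies])
        for tweet, replies in tweets_with_replies.items()
    }
```

The resulting tweet-to-label mapping can then serve as training data for a network that predicts reply sentiment from the original tweet text alone (stage two).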
Related papers
- Real-Time Summarization of Twitter [9.034423337410274]
We focus on the real-time push notification scenario, which requires a system to monitor the stream of sampled tweets and return those relevant to given interest profiles.
We employ a Dirichlet score with very little smoothing (baseline) to classify whether a tweet is relevant to a given interest profile.
Redundant tweets are also removed from the push queue.
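The abstract does not give the scoring formula, but a Dirichlet-smoothed score in this setting is conventionally the query-likelihood model, sketched below. The function names and the small default `mu` (i.e. "very little smoothing") are assumptions for illustration.

```python
import math
from collections import Counter

def dirichlet_score(profile_terms, tweet_terms, background, mu=10.0):
    """Dirichlet-smoothed query likelihood of an interest profile given
    a tweet: sum over profile terms of
    log((tf(w, tweet) + mu * p(w|C)) / (len(tweet) + mu)),
    where `background` maps terms to collection probabilities p(w|C)."""
    tf = Counter(tweet_terms)
    n = len(tweet_terms)
    score = 0.0
    for w in profile_terms:
        p_c = background.get(w, 1e-9)  # floor for terms unseen in the collection
        score += math.log((tf[w] + mu * p_c) / (n + mu))
    return score
```

A tweet sharing terms with the profile scores higher than an unrelated one, so thresholding this score yields the relevance decision.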
arXiv Detail & Related papers (2024-07-11T01:56:31Z) - Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
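The weighting idea can be illustrated as follows. The exact weighting scheme is not specified in the summary; the sketch below assumes a simple variant in which each distantly supervised example is weighted by its mean predicted confidence in the distant label across training epochs, rather than being kept or dropped by a hard threshold.

```python
def dynamics_weights(confidence_history):
    """Importance weight per distantly supervised example: the mean
    confidence the classifier assigned to the distant label across
    epochs. `confidence_history[i][e]` is example i's confidence at
    epoch e."""
    return [sum(h) / len(h) for h in confidence_history]

def weighted_loss(losses, weights):
    """Importance-weighted average of per-example losses, so noisy
    examples contribute less instead of being filtered out entirely."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```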
arXiv Detail & Related papers (2024-06-20T18:35:47Z) - Context-Based Tweet Engagement Prediction [0.0]
This thesis investigates how well context alone may be used to predict tweet engagement likelihood.
We employed the Spark engine on TU Wien's Little Big Data Cluster to create scalable data preprocessing, feature engineering, feature selection, and machine learning pipelines.
We also found that factors such as the prediction algorithm, training dataset size, training dataset sampling method, and feature selection significantly affect the results.
arXiv Detail & Related papers (2023-09-28T08:36:57Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - Taureau: A Stock Market Movement Inference Framework Based on Twitter
Sentiment Analysis [0.0]
We propose Taureau, a framework that leverages Twitter sentiment analysis for predicting stock market movement.
We first utilize Tweepy and getOldTweets to obtain historical tweets indicating public opinions for a set of top companies.
We correlate the temporal dimensions of the obtained sentiment scores with monthly stock price movement data.
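Correlating the two monthly series reduces to a standard correlation computation; the summary does not name the measure, so the sketch below assumes Pearson correlation between aggregated sentiment scores and price movements.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two aligned series, e.g. monthly
    sentiment scores and monthly stock price movements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```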
arXiv Detail & Related papers (2023-03-30T19:12:08Z) - Identification of Twitter Bots based on an Explainable ML Framework: the
US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z) - Exploiting Twitter as Source of Large Corpora of Weakly Similar Pairs
for Semantic Sentence Embeddings [3.8073142980733]
We propose a language-independent approach to building large datasets of weakly similar pairs of informal texts.
We exploit Twitter's intrinsic powerful signals of relatedness: replies and quotes of tweets.
Our model learns classical Semantic Textual Similarity, but also excels on tasks where pairs of sentences are not exact paraphrases.
arXiv Detail & Related papers (2021-10-05T13:21:40Z) - Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study provides an assessment of existing language models in distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z) - Hate Speech Detection and Racial Bias Mitigation in Social Media based
on BERT model [1.9336815376402716]
We introduce a transfer learning approach for hate speech detection based on an existing pre-trained language model called BERT.
We evaluate the proposed model on two publicly available datasets annotated for racism, sexism, hate or offensive content on Twitter.
arXiv Detail & Related papers (2020-08-14T16:47:25Z) - Semi-Supervised Models via Data Augmentationfor Classifying Interactive
Affective Responses [85.04362095899656]
We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.
For labeled sentences, we performed data augmentation to make the label distributions uniform and computed a supervised loss during training.
For unlabeled sentences, we explored self-training by regarding low-entropy predictions over unlabeled sentences as pseudo labels.
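The low-entropy selection step can be sketched as follows. This is an illustrative implementation under assumptions: the entropy threshold value and function names are hypothetical, and the paper may use a different confidence criterion.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution;
    low entropy means the model is confident."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_pseudo_labels(predictions, max_entropy=0.3):
    """Keep only low-entropy predictions over unlabeled sentences and
    turn them into pseudo labels for the next self-training round.
    `predictions` is a list of (sentence, class_probabilities) pairs."""
    selected = []
    for sentence, probs in predictions:
        if entropy(probs) <= max_entropy:
            label = max(range(len(probs)), key=lambda i: probs[i])
            selected.append((sentence, label))
    return selected
```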
arXiv Detail & Related papers (2020-04-23T05:02:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.